I published my first academic paper thirty years ago.
Overnight, a researcher contacted me to ask if I had the data for the main figure in the paper so that they could reproduce it.
Could I help? The data was from the time before Windows, 5.25-inch floppy discs, and graphs sent from a Unix environment directly to a printer for redrawing for publication by a cartographer.
You bet! In my archive, there was the original input data and code (Fortran77). And now shared.
Using the #kaggle CLI in my https://noteable.io notebook is super easy by adding 2 secret environment variables! This avoids the need to download to my computer and upload again 🙏
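For anyone wanting to try the same thing, here is a minimal sketch of how that setup might look from a notebook cell. It assumes the two secrets are the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables that the Kaggle CLI reads for credentials; the helper names and the `owner/dataset-name` slug are hypothetical placeholders, not anything from the post.

```python
import os
import subprocess

# The two environment variables the Kaggle CLI reads for credentials.
# Assumption: these are the "2 secret environment variables" set in the notebook.
REQUIRED_VARS = ("KAGGLE_USERNAME", "KAGGLE_KEY")


def missing_kaggle_credentials(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def build_download_command(dataset, target_dir="data"):
    """Build the CLI call that downloads and unzips a dataset into target_dir."""
    return ["kaggle", "datasets", "download", "-d", dataset, "-p", target_dir, "--unzip"]


def download(dataset, target_dir="data"):
    """Hypothetical helper: check the secrets first, then call the CLI."""
    missing = missing_kaggle_credentials()
    if missing:
        raise RuntimeError("Missing secrets: " + ", ".join(missing))
    subprocess.run(build_download_command(dataset, target_dir), check=True)
```

Calling `download("owner/dataset-name")` would then fetch the files straight into the notebook's working directory, with no local download and re-upload.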
Do you use Continuous Integration in your #bioinformatics or #datascience projects or know of projects that do? If so, can you provide the link to the example? If not, why? Do you feel it would take too much time?
The slides, notes, code and further-reading pointers for my talk "Legal Data Science: Der moderne Weg zur Wahrheit" at Digitale Richterschaft are now online!
I helped my wife with some pre-course application last night (she's applying for a #DataScience bootcamp).
Later, she said it was nice to "see me use my brain". It wasn't an insult; all my previous jobs have been quite mindless, whereas her job is very challenging.
I then started reading my new book: The Computers That Made Britain from @Raspberry_Pi press. I got stuck on a word in the second paragraph 🤣🤣
How Academic Bullying Led This Data Scientist to Open Science.
In her insightful article, @pcmasuzzo shares how she was bullied into producing bad science, how she became fed up with academic culture and what she had to do to rediscover her love for science.
"Animal face-off: Pandas vs. Polars – The battle of data analysis libraries in Python" - Arturo Regalado https://pythonaberdeen.github.io/
In this talk, Arturo will present a relatively new data analysis library: Pola-rs. The library claims to be a lightning-fast data frame library and a substitute for the industry-standard Pandas. He will provide data analysis examples and challenges to discover the use cases of each library. #Python #DataScience #Pandas
For the evening crowd
Is it time for a change..? New positions advertised, full time and permanent, working right at the cutting edge of both basic climate and ice sheet research, and importantly, climate services, including drought, #Attribution + #SeaLevelRise.
Mainly stuff about British comics (ok it's #2000AD), #retrogaming from the early to mid 80s UK microcomputer boom, popular science / rationalism, will share the UK popular cultural archive but not a poem remembering the Milk In A Bottle, #DoctorWho always, #DataAnalysis and #DataScience for work and sometimes for fun. Formerly @mstdn.social, migrated to @mastodonapp.uk 26/04/23 -- I find the Local timeline is a bit more UK-relatable #Introduction
Posit announced this week that the Shiny #Python 🐍 version is moving out from alpha stage to general availability. The Shiny package is one of the great tools in R for building interactive and complex dashboards without #JS or #HTML knowledge. It has a HUGE ecosystem, mainly due to community contribution, and it is great to see it expanding to the Python community.
Next level :python: :qgis: #DataScience proof of concept:
Turned a simple QGIS Processing model into a #DVC data-versioned #geoprocessing workflow
"Why is this great?" you may ask ❓
DVC tracks data, parameters, and code. If anything changes, we simply rerun the process and DVC will figure out which stages need to be recomputed and which can be skipped by re-using cached results.
This can lead to huge time savings compared to re-running the whole model 👩💻🥳
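The skip-or-recompute idea can be illustrated with a toy sketch. This is not DVC's implementation or API, just a hedged illustration of the principle: each stage is fingerprinted from its input data, parameters, and a code version label, and a stage is rerun only when that fingerprint changes. All names here (`StageCache`, `scale`, `threshold`, `code_version`) are hypothetical.

```python
import hashlib
import json


def fingerprint(data, params, code_version):
    """Hash everything a stage depends on: data, parameters, and code version."""
    payload = json.dumps(
        {"data": data, "params": params, "code": code_version}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class StageCache:
    """Toy cache: reuse a stage's result when its fingerprint is unchanged."""

    def __init__(self):
        self._results = {}
        self.log = []  # records (stage name, "recomputed" or "skipped")

    def run(self, name, func, data, params, code_version="v1"):
        key = (name, fingerprint(data, params, code_version))
        if key in self._results:
            self.log.append((name, "skipped"))
        else:
            self._results[key] = func(data, **params)
            self.log.append((name, "recomputed"))
        return self._results[key]


def scale(values, factor):
    return [v * factor for v in values]


def threshold(values, minimum):
    return [v for v in values if v >= minimum]


def pipeline(cache, raw, factor, minimum):
    """Two chained stages; the second depends on the output of the first."""
    scaled = cache.run("scale", scale, raw, {"factor": factor})
    return cache.run("threshold", threshold, scaled, {"minimum": minimum})
```

Running `pipeline` twice on the same cache with only `minimum` changed would reuse the cached `scale` result and recompute only `threshold`, which is where the time savings come from on long-running stages.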