We are 133 people now (68 active), with increased growth in the past few weeks. Our server hosts a community for #DataScience, broadly defined. See below our place in the universe of servers.
Just four weeks until our Health Hackathon on 17-18 June! The number of registered attendees has more than doubled in the last week, and more challenges are being added each week.
Thanks to NHS Grampian and University of Aberdeen for sponsoring the event and to Robert Gordon University for participating too. Also thanks to ONE Tech Hub for hosting us! See you there!
I published my first academic paper thirty years ago.
Overnight, a researcher contacted me to ask if I had the data for the main figure in the paper so that they could reproduce it.
Could I help? The data dated from the time before Windows: the era of 5.25-inch floppy discs, when graphs were sent from a Unix environment directly to a printer and then redrawn for publication by a cartographer.
You bet! My archive still held the original input data and code (Fortran 77), and both are now shared.
Using the #kaggle CLI in my https://noteable.io notebook is super easy by adding 2 secret environment variables! This avoids the need to download to my computer and upload again 🙏
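A minimal sketch of what this setup could look like from a notebook cell. It assumes the two secrets are `KAGGLE_USERNAME` and `KAGGLE_KEY`, the environment variables the Kaggle CLI reads for credentials; the post does not name them. The helper names and the `data` destination folder are hypothetical.

```python
import os
import subprocess

# Assumption: these are the two secrets mentioned in the post.
# The Kaggle CLI reads its credentials from them when no kaggle.json exists.
REQUIRED_VARS = ("KAGGLE_USERNAME", "KAGGLE_KEY")


def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def download_command(dataset, dest="data"):
    """Build the CLI call that downloads and unzips a dataset into dest."""
    return ["kaggle", "datasets", "download", "-d", dataset, "-p", dest, "--unzip"]


def download(dataset, dest="data"):
    """Fail early with a clear message, then hand over to the Kaggle CLI."""
    missing = missing_credentials()
    if missing:
        raise RuntimeError("Missing secrets: " + ", ".join(missing))
    subprocess.run(download_command(dataset, dest), check=True)
```

Calling `download("owner/dataset-name")` with a real dataset slug would then fetch the files straight into the notebook's working directory, with no round trip through a local machine.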
Do you use Continuous Integration in your #bioinformatics or #datascience projects or know of projects that do? If so, can you provide the link to the example? If not, why? Do you feel it would take too much time?
@MrHedmad Automated tests or CI for ad-hoc analyses? No, I test as I go. I do rely on tools that have themselves been tested via CI, so there is no need to retest those. There are cases where you need to code the tests, but mostly as assertions that run as you run your analysis, for example if you randomly split some data and want to check that both datasets have the correct properties. Mileage can vary. But unless you semi-automate or fully automate the analysis, there is little need for CI.
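As an illustration of assertions that run alongside the analysis, here is a small sketch of a random split followed by checks on both parts. The function names, the `(id, label)` row layout, and the specific properties checked are assumptions made for the example, not something described in the post.

```python
import random


def split_rows(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and split it into (train, test)."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def check_split(rows, train, test):
    """Inline sanity checks; rows are assumed to be (id, label) tuples."""
    train_ids = {row[0] for row in train}
    test_ids = {row[0] for row in test}
    # No row may be lost, duplicated, or shared between the two parts.
    assert len(train) + len(test) == len(rows), "row count changed"
    assert train_ids.isdisjoint(test_ids), "train and test overlap"
    assert train_ids | test_ids == {row[0] for row in rows}, "ids differ"
    # Both parts must be usable at all.
    assert train and test, "one side of the split is empty"
    # Every label seen in the full data should be available for training.
    assert {row[1] for row in train} == {row[1] for row in rows}, "label missing"


# Toy data: 100 rows with two alternating labels.
rows = [(i, i % 2) for i in range(100)]
train, test = split_rows(rows)
check_split(rows, train, test)
```

If any check fails, the analysis stops at that point with a message, which is usually enough for one-off work where a CI pipeline would be overkill.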
The slides, notes, code, and further-reading suggestions for my talk "Legal Data Science: Der moderne Weg zur Wahrheit" ("Legal Data Science: The Modern Path to Truth") at Digitale Richterschaft are now online!
I helped my wife with some pre-course application last night (she's applying for a #DataScience bootcamp).
Later, she said it was nice to "see me use my brain". It wasn't an insult: all my previous jobs have been quite mindless, whereas her job is very challenging.
I then started reading my new book: The Computers That Made Britain from @Raspberry_Pi press. I got stuck on a word in the second paragraph 🤣🤣
How Academic Bullying Led This Data Scientist to Open Science.
In her insightful article, @pcmasuzzo shares how she was bullied into producing bad science, how she became fed up with academic culture and what she had to do to rediscover her love for science.
"Animal face-off: Pandas vs. Polars – The battle of data analysis libraries in Python" - Arturo Regalado https://pythonaberdeen.github.io/
In this talk, Arturo will present a relatively new data analysis library: Polars. The library claims to be a lightning-fast data frame library and a substitute for the industry-standard Pandas. He will provide data analysis examples and challenges to discover the use cases of each library. #Python #DataScience #Pandas