Does anyone else maintain #changelog(s) for their #computer(s)?
I enter all configuration adjustments and #update(s) in a #markdown file for each machine.
This might seem like unnecessary extra work, but it has paid off several times for traceability and #reproducibility. 🤓 #musicproduction #linuxaudio
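A minimal sketch of what one entry in such a per-machine changelog might look like (the date, package names, and file paths here are made up for illustration, not taken from the original post):

```markdown
## 2024-03-12 (studio-desktop)
- Upgraded `pipewire` 1.0.3 → 1.0.4 via distro update
- Set `vm.swappiness = 10` in `/etc/sysctl.d/99-audio.conf`
- Installed `yabridge` for Windows VST plugin support
```

Dated headings per machine make it easy to answer "what changed before this broke?" months later.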
From a Washington Post article on evidence that humans were in North America earlier than previously thought. I myself have a mixed-feelings, middle-ground view on peer review, but I'm in a very different field.
"The peer-review process is designed to help validate scientific claims, but Lowery argues that in archaeology it often leads to a circle-the-wagon mentality, allowing scientists to wave away evidence that doesn’t support the dominant paradigm. He says he isn’t seeking formal publishing routes because “life’s too short,” comparing this aspect of academic science to “the dumbest game I’ve ever played.”"
#ML #Science #Transparency #Reproducibility: "Machine learning (ML) methods are proliferating in scientific research. However, the adoption of these methods has been accompanied by failures of validity, reproducibility, and generalizability. These failures can hinder scientific progress, lead to false consensus around invalid claims, and undermine the credibility of ML-based science. ML methods are often applied and fail in similar ways across disciplines. Motivated by this observation, our goal is to provide clear recommendations for conducting and reporting ML-based science. Drawing from an extensive review of past literature, we present the REFORMS checklist (recommendations for machine-learning-based science). It consists of 32 questions and a paired set of guidelines. REFORMS was developed on the basis of a consensus of 19 researchers across computer science, data science, mathematics, social sciences, and biomedical sciences. REFORMS can serve as a resource for researchers when designing and implementing a study, for referees when reviewing papers, and for journals when enforcing standards for transparency and reproducibility." https://www.science.org/doi/10.1126/sciadv.adk3452
During meeting no. 9, we'll learn how to streamline your analysis using the {targets} package, based on chapter 13 of “Building reproducible analytical pipelines with R” by Bruno Rodrigues (https://raps-with-r.dev/targets.html).
One important lesson from the #xz situation is that we should not allow binary blobs to enter the build process because they can't be audited. (In the case of xz-utils, most of the malicious code was hidden in a binary test archive.)
Some time ago, I banned all binaries from the revision control system we use for our papers. That means no PDFs, PNGs, etc. In our case the motivation is not malicious code but #reproducibility; nevertheless, the challenges are quite similar. (1/3)
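One way to enforce such a ban is a pre-commit hook that rejects staged files Git would treat as binary. The post doesn't describe its actual setup, so this is only a sketch of the general idea; it relies on `grep -I`, which reports no match for files it classifies as binary data:

```shell
#!/bin/sh
# Hypothetical pre-commit hook: refuse to commit binary files.

# is_text succeeds only if grep finds a text match; with -I,
# grep treats binary files as containing no matches at all.
is_text() {
    grep -Iq . "$1" 2>/dev/null
}

# Check every added or modified file in the index.
check_staged() {
    status=0
    for f in $(git diff --cached --name-only --diff-filter=AM); do
        if ! is_text "$f"; then
            echo "Refusing binary file: $f" >&2
            status=1
        fi
    done
    return $status
}

check_staged
```

Saved as `.git/hooks/pre-commit` (and made executable), this blocks commits containing PDFs, PNGs, and other binary blobs; figures can then be regenerated from source instead of stored.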
On April 4th, we'll learn how to package your code, based on chapter 11 of “Building reproducible analytical pipelines with R” by Bruno Rodrigues (https://raps-with-r.dev/packages.html).
Watching with interest: "Backed by 250,000 Swiss francs, or roughly $285,000,… from the University of Bern, [a new program] will pay reviewers to root out mistakes in influential papers, beginning with a handful in #psychology. The more errors found, and the more severe they are, the more the sleuths stand to make." https://www.chronicle.com/article/wanted-scientific-errors-cash-reward
(#paywalled)
I wish more of the #reproducibility discussions in #rstats focused on ensuring that analyses can be repeated with current, maintained R packages (and system libraries), rather than just taking a snapshot of your current environment. Pinning your environment just accrues technical/analytical debt, so "production" use cases aside (interpret "production" how you wish), I don't think it is as useful for research/science as some make out. There is of course more nuance to this, but still ... 1/1
Just learned that I needed to go back a version in #Rstats to keep using the groundhog 📦 and load the package versions I had been using. Future me better appreciate this... #reproducibility
I love the renv #rstats package for #reproducibility.
But I often receive feedback from frustrated colleagues who get package installation errors.
I have written a blog post diving into the issues that can arise when using renv, where they come from (spoiler: it's not renv's fault), and possible solutions.
It allows you to predefine the environment (e.g., a specific Python or R version) and then boot up an entire terminal with those versions prepackaged. This ensures you run code with the appropriate versions preloaded!*
*still in the process of testing and caveats will surely follow
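The post doesn't name the tool, but conda is one widely used example of this "declare the environment, then boot a terminal with it" pattern; a hypothetical environment file (all names and version pins here are illustrative, not from the post):

```yaml
# environment.yml — hypothetical pinned environment, conda-style
name: paper-analysis
channels:
  - conda-forge
dependencies:
  - python=3.11   # pin the interpreter version
  - r-base=4.3    # pin the R version alongside it
```

With conda, `conda env create -f environment.yml` builds the environment once, and `conda activate paper-analysis` drops you into a shell with those exact versions on the path.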
I make it part of my job to educate my graduate students about open access, open-source software, open data, practices for #reproducibility, and #OpenScience