ElenLeFoll, French
@ElenLeFoll@fediscience.org

I am super excited about the mini-conference that I am organising this evening: four of my M.A. students will be reporting on their attempts to reproduce the results of four published quantitative linguistics papers for which the data is available, but not the code!

Colleagues, they have a lot of things to report! So, if you're in the area (Cologne), do come along! There will be tea and Christmas biscuits! 🍵 🍪

ElenLeFoll,

As promised, here are some of the (anonymised) highlights from my students' attempts to reproduce the results of four published studies using the authors' original data, which my students brilliantly presented at our mini-conference yesterday.

1/ One student perfectly replicated the statistics and plots for RQ1 of the paper she chose, but could not replicate RQ2 because some data for this was missing. She contacted the authors. They never replied.

ElenLeFoll,

2/ This same student also spent quite some time trying to work out what the error bars on the original study's plots actually represented. She showed how she worked it out by trying various options (standard error, standard deviation, 95% confidence interval, etc.) and how much of a difference these options make. She was surprised that the meaning of the error bars was not mentioned anywhere in the paper!
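To give a rough idea of how much these options can differ, here is a minimal sketch in Python. The scores are made up for illustration and are not from the paper, the function name is hypothetical, and the 95% interval uses a normal approximation (a t-based interval would be wider for small samples).

```python
import math
import statistics


def error_bar_half_widths(values):
    """Return three candidate error-bar half-widths for one group of observations."""
    n = len(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    se = sd / math.sqrt(n)         # standard error of the mean
    # Normal-approximation critical value (about 1.96); an assumption of this sketch.
    z = statistics.NormalDist().inv_cdf(0.975)
    ci95 = z * se                  # half-width of an approximate 95% confidence interval
    return {"sd": sd, "se": se, "ci95": ci95}


# Made-up scores, purely illustrative.
scores = [4.0, 5.0, 6.0, 7.0, 8.0]
bars = error_bar_half_widths(scores)
for name, half_width in bars.items():
    print(f"{name}: mean ± {half_width:.2f}")
```

With the same data, the three choices give bars of clearly different lengths, which is why a plot that does not say what its error bars show is hard to reproduce.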

ElenLeFoll,

3/ Another student faced initial challenges because the dataset linked in the paper contained misleading variable names and some unlikely figures, e.g., for some children, the age of arrival in the L2 country was later than their age at the first data collection point, which was in the L2 country... They also found some discrepancies between the descriptive statistics in their analysis of the data and those printed in the paper, suggesting that two data points were removed from the analyses.
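A plausibility check of this kind can be sketched in a few lines of Python. The field names and values below are hypothetical and not taken from the dataset in question; the sketch only assumes that both ages are recorded in the same unit.

```python
# Hypothetical records; field names and values are invented for illustration.
records = [
    {"child_id": "c01", "age_of_arrival": 3.0, "age_at_first_test": 5.5},
    {"child_id": "c02", "age_of_arrival": 6.0, "age_at_first_test": 4.5},
    {"child_id": "c03", "age_of_arrival": 2.5, "age_at_first_test": 2.5},
]


def implausible_arrivals(rows):
    """Return IDs of children who supposedly arrived after they were first tested."""
    return [
        row["child_id"]
        for row in rows
        if row["age_of_arrival"] > row["age_at_first_test"]
    ]


flagged = implausible_arrivals(records)
print(flagged)
```

Running such a check before any modelling makes it easier to decide whether odd rows are data-entry errors or a sign that a variable name means something other than what it suggests.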

ElenLeFoll,

4/ They also wrote to the author, who replied but was unable to help. The student nonetheless managed to reproduce the inferential statistics almost perfectly, except for a sizeable difference in the marginal R² of the main linear regression model: 40% in the original study vs. 49% in the replication! We believe that this can only be explained by a typographical error. After all, 9 and 0 are close together on most keyboards...

paezha,
@paezha@mastodon.online

@ElenLeFoll

Yeah, but the authors should have caught this at the stage of proofs.

Virginicus,

@ElenLeFoll This is a good pedagogical exercise, apart from the interest in reproducibility. Bravi!

otsoa,

@ElenLeFoll hi! Why is the code not available? Is that normal practice? In your experience, do authors tend to share their code with other scientists when asked, or is that unlikely?

ElenLeFoll,

@otsoa Hi and thanks for asking! Sadly, sharing code is far from common in linguistics. Here are some stats from Bochynska et al. 2023 (https://doi.org/10.5070/G6011239). Admittedly the most recent data is from 2018-2019, but my impression is that not much has changed... 😢 I have been trying to ask for code when reviewing papers and have been told by editors that "they don't require reviewers to review code and will therefore not ask the authors for it". 😬

paezha,

@ElenLeFoll @otsoa

Sharing the data and code is still far from the norm also in transportation and geography. Sometimes this is due to legitimate reasons (like a reasonable expectation of privacy), but in many other cases it is not justified.

In a recent special issue in @JGeoSys, only two of six papers were open and reproducible; see:

https://rdcu.be/duwaF
