benmarwick

@benmarwick@mastodon.social

#Rstats enthusiast, Professor of #Archaeology at the University of Washington

#Evolution, #ReproducibleResearch, #OpenScience, #Datascience

benmarwick, to random

The Journal of Archaeological Science (JAS) now does a "reproducibility review" as part of the submission process: https://www.elsevier.com/connect/how-reproducibility-is-gaining-first-class-status-in-scientific-research If your manuscript mentions using an open source programming language, it will get this review, in addition to the normal peer reviews.

benmarwick, to random

Here is our poster "Careers in Ruins: Academic Archaeology Job Trends From 2013 - 2023"

It was presented today by undergrad Anne Poole. Co-authors include Ailin Zhang, Setareh Shafizadeh & Jess Beck.

Data & code at https://github.com/benmarwick/archyjobads

johannes_lehmann, to Futurology

Where would you share code (R/quarto) underlying analyses in a research manuscript?

I was thinking Zenodo as I planned to host the non-sequencing data there. I saw that Zenodo interfaces with GitHub, which many papers use to host their code and which the publisher guide lists first. Any benefit to also putting the code on GitHub (I didn’t use git for version control & the code is only useful for reproducing the analyses, not itself innovative)?

benmarwick,

@jperkel @johannes_lehmann Yes, Zenodo is an excellent option to ensure the long-term availability of code & data in an easy-to-cite form (because it gives a DOI).

benmarwick,

@jperkel @johannes_lehmann As many of the other replies note, GitHub is also a good place to host research materials because it has features that Zenodo lacks. For example, it's much easier for users to search and browse your materials there, and to communicate with you via the issue tracker. You can also integrate your materials easily with other services like continuous integration and Binder.

benmarwick,

@jperkel @johannes_lehmann Many people develop code on GitHub, then copy it to Zenodo for long term archiving, get the DOI and cite that in their paper as the canonical link to the version of the code that matches the paper. They might continue developing the code on GitHub. So they have both a GitHub repo and a Zenodo deposit. I think this is a great way to share research materials so readers can feel confident about what they're looking at.

benmarwick, to random

"Best Paper awards miss the global calls for greater transparency and equitable access to academic recognition" https://www.biorxiv.org/content/10.1101/2023.12.11.571170v1 and https://www.nature.com/articles/d41586-023-04027-w.epdf

benmarwick, to random

The Journal of Archaeological Science has just introduced a 'Reproducibility Prize for papers that share in a transparent, clear and detailed manner their data, protocols and/or code', and announced the first winners. Congratulations Andrew McLean and @xrubio!

What a fantastic initiative to recognise the huge effort that researchers put into their work to make it open. I hope other journals will similarly encourage this commitment to reproducibility.

https://www.sciencedirect.com/journal/journal-of-archaeological-science/about/jas-reproducibility-prize

benmarwick, to random

I spoke today in the workshop 'Socio-Politics of Knowledge Production in Anthropology & Archaeology' organised by Jess Beck and Laura Heath-Stout

My slides, code & data are here: https://github.com/benmarwick/wg-sociopolitics-of-knowledge-production-in-archaeology-workshop

Here are some of the key findings...

benmarwick,

My question was: do 'high prestige' journals show more evidence of questionable research practices (QRPs)? Can we spot any signs of researchers' quest for status and minimization of reputational risk, or other factors involved in the low levels of using and sharing code and data?

benmarwick,

To answer this I looked at distributions of p-values in articles in two journals that are very similar, but with different Impact Factors as a proxy for prestige.

benmarwick,

I focussed specifically on p-hacking as a relevant QRP that is relatively straightforward to investigate.

I followed the approach of Hirschauer et al. (2017), Pitfalls of significance testing and p-value variability: Implications for statistical inference, https://www.uni-goettingen.de/de/document/download/eab4afb417c371b5c39c9e6f30189a90.pdf/Musshoff_07_2017.pdf and Head, @RobLanfear et al. (2015), The Extent and Consequences of P-Hacking in Science, PLOS Biology, 13(3), e1002106. https://doi.org/10.1371/journal.pbio.1002106

benmarwick,

I used the caliper test to see if there was an unusually high number of p-values just below 0.05, which might suggest p-hacking to get significant results where none really exist.

I followed the example of @chartgerink (2016). Distributions of p-values smaller than .05 in psychology: What is going on? PeerJ, 4, e1935. https://doi.org/10.7717/peerj.1935

benmarwick,

Querying the Elsevier API with a request for 1000 papers from each journal, I got a roughly equal number of papers and p-values from the two journals.
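
For a sense of what that step looks like, here is a minimal R sketch of the p-value harvesting, assuming the article full texts have already been retrieved (e.g. via the rscopus package); the regular expression and the `fulltexts` object are illustrative, not the original code:

```r
library(stringr)

# fulltexts: a character vector of article full texts (one element per paper)
# Match reported p-values such as "p = 0.032", "p < .05", or "P<0.001"
p_pattern <- "[Pp]\\s*[=<>]\\s*0?\\.\\d+"

p_values <- fulltexts |>
  str_extract_all(p_pattern) |>  # all matches in each article
  unlist() |>
  str_extract("0?\\.\\d+") |>    # keep just the numeric part
  as.numeric()

length(p_values)  # how many p-values were harvested
summary(p_values)
```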

benmarwick,

Here’s a look at the distribution of p-values from 0 to 1 in our ~1000 articles in the two journals (a sketch of the plotting code follows below the list). Several interesting results here:

  • The distribution is highly skewed, which suggests that archaeological data have evidential value. That’s good
  • It also suggests that we might have publication bias: non-significant results are not being published. That’s not good
  • There are periodic peaks that probably result from rounding of p-values, for example to 0.01, 0.02, 0.03, etc.
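
A minimal sketch of this kind of plot, assuming a data frame `pvals` with a `journal` label and a numeric `p` column (both names are illustrative):

```r
library(ggplot2)

# One bar per 0.01-wide bin, one panel per journal
ggplot(pvals, aes(x = p)) +
  geom_histogram(binwidth = 0.01, boundary = 0) +
  facet_wrap(~ journal, ncol = 1) +
  labs(x = "Reported p-value", y = "Count")
```
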
benmarwick,

Zooming in on the region around p = 0.05, we see a very high number of p-values just under 0.05, far more than at the other ‘whole numbers’ in that range, which suggests more than just simple rounding is going on here.

Here are the results of our caliper test, which is a binomial test of the counts of p-values in two bins, one right near 0.05 and the other right near 0.04. We can see from the results of the binomial test that the 0.05 bin has vastly more p-values in it than expected. That’s not good news.
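
In R, the caliper test boils down to a one-sided binomial test on the two bin counts. A sketch, with illustrative bin edges (see Head et al. 2015 and Hartgerink 2016 for the published bin choices):

```r
# Counts of p-values in two equal-width bins
n_near_05 <- sum(p_values >= 0.045 & p_values < 0.050)  # just under 0.05
n_near_04 <- sum(p_values >= 0.035 & p_values < 0.040)  # just under 0.04

# Under the null of no p-hacking, a p-value is equally likely to land in
# either bin, so test whether the 0.05 bin holds more than its fair share
binom.test(n_near_05, n_near_05 + n_near_04,
           p = 0.5, alternative = "greater")
```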

benmarwick,

These data do support the more prestigious journal having a worse case of p-hacking: its far smaller binomial-test p-value can be interpreted as stronger evidence for p-hacking at 0.05.

Are researchers doing data manipulation, for example, removing data from their analysis, so they get p < 0.05?

Or are they leaving their data untouched, doing the tests, then rounding down their p-values to below 0.05?

benmarwick,

To tackle these questions I used the https://github.com/MicheleNuijten/statcheck package by @MicheleNuijten et al. This scans text for statistical test results presented in the format recommended by the American Psychological Association (APA). Statcheck takes the test statistic and degrees of freedom, recomputes the p-value, and then checks the published p-value against what it has computed.
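
Running it from R is straightforward; a sketch, assuming `fulltexts` is the same character vector of article texts as above (output column names vary across statcheck versions):

```r
# install.packages("statcheck")
library(statcheck)

# Scan each text for APA-formatted results such as "t(28) = 2.20, p = .04",
# recompute the p-value from the test statistic and df, and flag mismatches
res <- statcheck(fulltexts)

head(res)  # one row per detected test, reported vs recomputed p-values
```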

benmarwick,

I hypothesized that if archaeologists are manipulating data, then the APA-formatted result of that test will be correct, or maybe have some minor inconsistency due to rounding.

On the other hand, if archaeologists are conducting a statistical test and then, after they get the result, rounding the p-value down, that is detectable as a ‘decision error’.

In the JASR papers, 23 contained APA-formatted statistical tests, and only 1 had a decision error. Pretty good!

benmarwick,

But in the high prestige journal, statcheck found only 2 papers with APA-formatted statistical tests, a total of 3 tests, and they were all correctly reported.

A key finding here is how few statistical tests in JAS are reported in a standardized way that enables verification.

Could this be a strategy by authors to do p-hacking undetected?

For broader context, I looked at 3000 PLOS One papers on archaeology: statcheck found 587 statistical tests, 28 of which had decision errors.
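
Comparing the corpora is then just a matter of tabulating the flags; a sketch, assuming statcheck's output has a logical `decision_error` column and that a `journal` label was added when the texts were collected (both names are illustrative):

```r
library(dplyr)

res |>
  group_by(journal) |>
  summarise(n_tests = n(),
            decision_errors = sum(decision_error),
            error_rate = mean(decision_error))
```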

benmarwick,

To summarise, our initial hypothesis seems to be supported by the data. The high prestige journal has signs of worse QRP than the low prestige journal.

A surprising finding is how resistant statistical tests in the high prestige journal were to further investigation using statcheck.

Neither of these journals has any instructions to authors on how to format statistical tests, which is convenient for controlling that variable, but hopefully that will change soon! What else should we do?

benmarwick,

In addition to giving authors instructions on how to communicate their statistical test results, journals could include data and code in the peer review process.

An Associate Editor for Data could coordinate review of these materials as part of the peer review process. Great example of this here: https://doi.org/10.1061/(ASCE)WR.1943-5452.0001368

Finally, prestige metrics such as the Impact Factor should be de-emphasized in professional evaluations, removing the motivation to quest for high-status publications.

benmarwick, to random

This looks like it could work for archaeology journals, and many other fields also:

"Implementing code review in the scientific workflow: Insights from ecology and evolutionary biology"

https://onlinelibrary.wiley.com/doi/full/10.1111/jeb.14230

benmarwick, to random

New paper! "Women in the Lab, Men in the Field? Correlations between Gender and Research Topics at Three Major Archaeology Conferences":

https://www.tandfonline.com/eprint/S6XGCSD7K8IFUB3BD6SI/full?target=10.1080/00934690.2023.2261083

Open access pre-print: https://osf.io/fsujr

Code & data: https://doi.org/10.17605/OSF.IO/ZFB36
