benmarwick

@benmarwick@mastodon.social

#Rstats enthusiast, Professor of #Archaeology at the University of Washington

#Evolution, #ReproducibleResearch, #OpenScience, #Datascience

benmarwick, to random

The Journal of Archaeological Science (JAS) now does a "reproducibility review" as part of the submission process: https://www.elsevier.com/connect/how-reproducibility-is-gaining-first-class-status-in-scientific-research If your manuscript mentions using an open source programming language, it will get this review, in addition to the normal peer reviews.

benmarwick, to random

Here is our poster "Careers in Ruins: Academic Archaeology Job Trends From 2013 - 2023"

It was presented today by undergrad Anne Poole. Co-authors include Ailin Zhang, Setareh Shafizadeh & Jess Beck.

Data & code at https://github.com/benmarwick/archyjobads

johannes_lehmann, to Futurology

Where would you share code (R/quarto) underlying analyses in a research manuscript?

I was thinking Zenodo as I planned to host the non-sequencing data there. I saw that Zenodo interfaces with GitHub, which many papers use to host their code and which the publisher guide lists first. Any benefit to also putting the code on GitHub (I didn’t use git for version control & the code is only useful for reproducing the analyses, not itself innovative)?

benmarwick,

@jperkel @johannes_lehmann Yes, Zenodo is an excellent option to ensure the long-term availability of code & data in an easy-to-cite form (because it gives a DOI).

benmarwick,

@jperkel @johannes_lehmann As many of the other replies note, GitHub is also a good place to host research materials because it has features that Zenodo lacks. For example, it's much easier for users to search and browse your materials there, and to communicate with you via the issue tracker. You can also integrate your materials easily with other services like continuous integration and Binder.

benmarwick,

@jperkel @johannes_lehmann Many people develop code on GitHub, then copy it to Zenodo for long term archiving, get the DOI and cite that in their paper as the canonical link to the version of the code that matches the paper. They might continue developing the code on GitHub. So they have both a GitHub repo and a Zenodo deposit. I think this is a great way to share research materials so readers can feel confident about what they're looking at.

benmarwick, to random

"Best Paper awards miss the global calls for greater transparency and equitable access to academic recognition" https://www.biorxiv.org/content/10.1101/2023.12.11.571170v1 and https://www.nature.com/articles/d41586-023-04027-w.epdf

benmarwick, to random

The Journal of Archaeological Science has just introduced a 'Reproducibility Prize for papers that share in a transparent, clear and detailed manner their data, protocols and/or code', and announced the first winners. Congratulations Andrew McLean and @xrubio!

What a fantastic initiative to recognise the huge effort that researchers put into their work to make it open. I hope other journals will similarly encourage this commitment to reproducibility.

https://www.sciencedirect.com/journal/journal-of-archaeological-science/about/jas-reproducibility-prize

benmarwick, to random

I spoke today in the workshop 'Socio-Politics of Knowledge Production in Anthropology & Archaeology' organised by Jess Beck and Laura Heath-Stout

My slides, code & data are here: https://github.com/benmarwick/wg-sociopolitics-of-knowledge-production-in-archaeology-workshop

Here are some of the key findings...

benmarwick,

My question was: do 'high prestige' journals show more evidence of questionable research practices (QRPs)? Can we spot any signs of researchers' quest for status and minimization of reputational risk, or other factors involved in the low levels of using and sharing code and data?

benmarwick,

To answer this I looked at distributions of p-values in articles in two journals that are very similar, but with different Impact Factors as a proxy for prestige.

benmarwick,

I focussed specifically on p-hacking as a relevant QRP that is relatively straightforward to investigate.

I followed the approach of Hirschauer et al. (2017), Pitfalls of significance testing and p-value variability: Implications for statistical inference, https://www.uni-goettingen.de/de/document/download/eab4afb417c371b5c39c9e6f30189a90.pdf/Musshoff_07_2017.pdf and Head, @RobLanfear et al. (2015), The Extent and Consequences of P-Hacking in Science, PLOS Biology, 13(3), e1002106. https://doi.org/10.1371/journal.pbio.1002106

benmarwick,

I used the caliper test to see if there was an unusually high number of p-values just below 0.05, which might suggest p-hacking to get significant results where none really exist.

I followed the example of @chartgerink (2016). Distributions of p-values smaller than .05 in psychology: What is going on? PeerJ, 4, e1935. https://doi.org/10.7717/peerj.1935

benmarwick,

Querying the Elsevier API with a request for 1000 papers from each journal, I got a roughly equal number of papers and p-values from the two journals.
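
For a sense of what that step looks like, here is a minimal R sketch of the p-value harvesting, assuming the article full texts have already been retrieved (e.g. via the rscopus package); the regular expression and the `fulltexts` object are illustrative, not the original code:

```r
library(stringr)

# fulltexts: a character vector of article full texts (one element per paper)
# Match reported p-values such as "p = 0.032", "p < .05", or "P<0.001"
p_pattern <- "[Pp]\\s*[=<>]\\s*0?\\.\\d+"

p_values <- fulltexts |>
  str_extract_all(p_pattern) |>  # all matches in each article
  unlist() |>
  str_extract("0?\\.\\d+") |>    # keep just the numeric part
  as.numeric()

length(p_values)  # how many p-values were harvested
summary(p_values)
```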

benmarwick,

Here’s a look at the distribution of p-values from 0 to 1 in our ~1000 articles in the two journals (a sketch of the plotting code follows below the list). Several interesting results here:

  • The distribution is highly skewed, which suggests that archaeological data have evidential value. That’s good
  • It also suggests that we might have publication bias: non-significant results are not being published. That’s not good
  • There are periodic peaks that probably result from rounding of p-values, for example to 0.01, 0.02, 0.03, etc.
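
A minimal sketch of this kind of plot, assuming a data frame `pvals` with a `journal` label and a numeric `p` column (both names are illustrative):

```r
library(ggplot2)

# One bar per 0.01-wide bin, one panel per journal
ggplot(pvals, aes(x = p)) +
  geom_histogram(binwidth = 0.01, boundary = 0) +
  facet_wrap(~ journal, ncol = 1) +
  labs(x = "Reported p-value", y = "Count")
```
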
benmarwick,

Zooming in on the region around p = 0.05, we see a very high number of p-values just under 0.05, far more than at the other ‘whole numbers’ in that range, which suggests more than just simple rounding is going on here.

Here are the results of our caliper test, which is a binomial test of the counts of p-values in two bins, one right near 0.05 and the other right near 0.04. We can see from the results of the binomial test that the 0.05 bin has vastly more p-values in it than expected. That’s not good news.
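
In R, the caliper test boils down to a one-sided binomial test on the two bin counts. A sketch, with illustrative bin edges (see Head et al. 2015 and Hartgerink 2016 for the published bin choices):

```r
# Counts of p-values in two equal-width bins
n_near_05 <- sum(p_values >= 0.045 & p_values < 0.050)  # just under 0.05
n_near_04 <- sum(p_values >= 0.035 & p_values < 0.040)  # just under 0.04

# Under the null of no p-hacking, a p-value is equally likely to land in
# either bin, so test whether the 0.05 bin holds more than its fair share
binom.test(n_near_05, n_near_05 + n_near_04,
           p = 0.5, alternative = "greater")
```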

benmarwick,

These data do support the more prestigious journal having a worse case of p-hacking: its far smaller binomial-test p-value can be interpreted as stronger evidence for p-hacking at 0.05.

Are researchers doing data manipulation, for example, removing data from their analysis, so they get p < 0.05?

Or are they leaving their data untouched, doing the tests, then rounding down their p-values to below 0.05?

benmarwick,

To tackle these questions I used the https://github.com/MicheleNuijten/statcheck package by @MicheleNuijten et al. This scans text for statistical test results presented in the format recommended by the American Psychological Association (APA). Statcheck takes the test statistic and degrees of freedom, recomputes the p-value, and then checks the published p-value against what it has computed.
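
Running it from R is straightforward; a sketch, assuming `fulltexts` is the same character vector of article texts as above (output column names vary across statcheck versions):

```r
# install.packages("statcheck")
library(statcheck)

# Scan each text for APA-formatted results such as "t(28) = 2.20, p = .04",
# recompute the p-value from the test statistic and df, and flag mismatches
res <- statcheck(fulltexts)

head(res)  # one row per detected test, reported vs recomputed p-values
```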

benmarwick,

I hypothesized that if archaeologists are manipulating data, then the APA-formatted result of that test will be correct, or maybe have some minor inconsistency due to rounding.

On the other hand, if archaeologists are conducting a statistical test and then, after they get the result, rounding the p-value down, that is detectable as a ‘decision error’.

In the JASR papers, 23 contained APA-formatted statistical tests, and only 1 had a decision error. Pretty good!

benmarwick,

But in the high prestige journal, statcheck found only 2 papers with APA-formatted statistical tests, a total of 3 tests, and they were all correctly reported.

A key finding here is how few statistical tests in JAS are reported in a standardized way that enables verification.

Could this be a strategy by authors to do p-hacking undetected?

For broader context, I looked at 3000 PLOS One papers on archaeology: statcheck found 587 statistical tests, 28 of which had decision errors.
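
Comparing the corpora is then just a matter of tabulating the flags; a sketch, assuming statcheck's output has a logical `decision_error` column and that a `journal` label was added when the texts were collected (both names are illustrative):

```r
library(dplyr)

res |>
  group_by(journal) |>
  summarise(n_tests = n(),
            decision_errors = sum(decision_error),
            error_rate = mean(decision_error))
```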

benmarwick,

To summarise, our initial hypothesis seems to be supported by the data. The high prestige journal has signs of worse QRP than the low prestige journal.

A surprising finding is how resistant statistical tests in the high prestige journal were to further investigation using statcheck.

Neither of these journals has any instructions to authors on how to format statistical tests, which is convenient for controlling that variable, but hopefully that will change soon! What else should we do?

benmarwick,

In addition to giving authors instructions on how to communicate their statistical test results, journals could include data and code in the peer review process.

An Associate Editor for Data could coordinate review of these materials as part of the peer review process. Great example of this here: https://doi.org/10.1061/(ASCE)WR.1943-5452.0001368

Finally, prestige metrics such as the Impact Factor should be de-emphasized in professional evaluations, removing the motivation to quest for high-status publications.

benmarwick, to random

This looks like it could work for archaeology journals, and many other fields also:

"Implementing code review in the scientific workflow: Insights from ecology and evolutionary biology"

https://onlinelibrary.wiley.com/doi/full/10.1111/jeb.14230

benmarwick, to random

New paper! "Women in the Lab, Men in the Field? Correlations between Gender and Research Topics at Three Major Archaeology Conferences":

https://www.tandfonline.com/eprint/S6XGCSD7K8IFUB3BD6SI/full?target=10.1080/00934690.2023.2261083

Open access pre-print: https://osf.io/fsujr

Code & data: https://doi.org/10.17605/OSF.IO/ZFB36
