Raising the Reproducibility Bar

Joseph Wonsil, Rúbia Guerra, Adam Pocock, Jack Sullivan, Margo Seltzer

29 July 2025

Current reproducibility research addresses the challenge of reexecution (executing an artifact someone else prepared). This work has chipped away at the reproducibility crisis, and in doing so it has revealed a comprehensibility crisis. Our community is uniquely positioned to address this crisis by focusing on the question of why we care about reproducibility: to validate and peer review others’ work. The introduction of computation to the scientific method added new burdens for scientists, who now need both domain knowledge and computational knowledge to review a study successfully. We can relieve them of that burden by providing push-button reproducibility, but we then inadvertently remove the “burden” of needing to understand how the analysis works. An opaque experiment that deterministically spits out the exact same number across different platforms, even when operated by other users, is still an opaque experiment. Ideally, a research artifact should produce a roughly deterministic outcome and demonstrate that the computation matches the methodologies and analyses claimed in its corresponding publication. We propose embracing comprehensibility as a necessary facet of reproducibility and call upon our community to explore how we can improve the comprehensibility of computational experiments. We suggest research directions that combine emerging technologies such as LLMs with existing research in provenance and virtualization to enable more scientists to generate comprehensible artifacts.


Venue : ACM REP 2025

File Name : DRAFT_2025_04_02_Raising_the_Reproducibility_Bar.pdf