Published July 2, 2019 | Version v1
Journal article Open

Meta-Research: Centralized scientific communities are less likely to generate replicable results

  • 1. University of Chicago

Description

Concerns have been expressed about the robustness of experimental findings in several areas of science, but these matters have not been evaluated at scale. Here we identify a large sample of published drug-gene interaction claims curated in the Comparative Toxicogenomics Database (for example, benzo(a)pyrene decreases expression of SLC22A3) and evaluate these claims by connecting them with high-throughput experiments from the LINCS L1000 program. Our sample included 60,159 supporting findings and 4,253 opposing findings about 51,292 drug-gene interaction claims in 3,363 scientific articles. We show that claims reported in a single paper replicate 19.0% (95% confidence interval [CI], 16.9–21.2%) more frequently than expected, while claims reported in multiple papers replicate 45.5% (95% CI, 21.8–74.2%) more frequently than expected. We also analyze the subsample of interactions with two or more published findings (2,493 claims; 6,272 supporting findings; 339 opposing findings; 1,282 research articles), and show that centralized scientific communities, which use similar methods and involve shared authors who contribute to many articles, propagate less replicable claims than decentralized communities, which use more diverse methods and contain more independent teams. Our findings suggest how policies that foster decentralized collaboration will increase the robustness of scientific findings in biomedical research.

Data availability

The datasets generated or analysed during this study are included in the manuscript and supporting files and have been made available at OSF (http://dx.doi.org/10.17605/OSF.IO/XMVDA).

The following data sets were generated:

Valentin Danchev, Andrey Rzhetsky, James A Evans (2019) Open Science Framework Centralized scientific communities are less likely to generate replicable results. https://doi.org/10.17605/OSF.IO/XMVDA

The following previously published data sets were used:

The Broad Institute (2015) NCBI Gene Expression Omnibus ID GSE70138. L1000 Connectivity Map perturbational profiles from Broad Institute LINCS Center for Transcriptomics LINCS PHASE II (n=354,123; updated March 30, 2017). https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE70138

MDI Biological Laboratory, NC State University (2016) Comparative Toxicogenomics Database Comparative Toxicogenomics Database. http://ctdbase.org/downloads/

Files

elife-43094-v1.pdf

Files (10.1 MB)

Name Size
Article
md5:c3b8a7475e79c08fdeb12d0c0c9a3e29
819.2 kB
md5:fd5afd6faccd8ee94f025fe37c6a25fa
9.2 MB

Additional details

Identifiers

DOI
10.7554/eLife.43094
Other
oai:uchicago.tind.io:9882

Funding

Defense Advanced Research Projects Agency
Big Mechanism
National Science Foundation
SciSIP
Air Force Office of Scientific Research
FA9550-15-1-0162

UChicago Information

Division(s)
Biological Sciences Division, Social Sciences Division
Department(s)
Human Genetics, Medicine, Sociology
Center(s) or Institute(s)
Institute for Genomics and Systems Biology