Benefits of Data Management, Sharing, and Reuse
Accelerates scientific progress: Data sharing allows researchers to access and understand
others’ data and re-use them for their own scientific purposes, thereby speeding up the rate of
new discoveries and avoiding unnecessary, expensive data collection.
• Numerous research organizations in Canada have recognized this potential and have
adopted open research practices (e.g., Tanenbaum Open Science Institute (McGill U.),18
Centre for Biodiversity Genomics (U. Guelph)).19
• Milham et al. (2018) use the International Neuroimaging Data-Sharing Initiative to provide
direct evidence for the impact of data sharing on the scale of related studies. They
estimate that, for the nearly 1,000 papers included in their analysis, the saved cost of de
novo data generation is between $893M and $1,707M.20
• Figures published by the National Research Council estimate that more articles
are currently being published using preserved archival data from the Hubble Space
Telescope than using new observations.21
Enhances collaboration: Enables researchers to collaborate with each other by sharing data,
research environments, and tools.
• Open data enables recombination of data from heterogeneous sources spanning multiple
times and places to ask new questions.22
• When data are created, organized, described, and preserved using the same standards,
they become more interoperable and can be integrated into common tools.
• For example, by taking advantage of common neuroscience data formats, the
McGill Centre for Integrative Neuroscience is developing software tools and
platforms that are openly available for use by the Canadian and international
research community.23
• A 2016 review of the open data made available by the European Bioinformatics
Institute estimates a direct efficiency impact of between £1 billion and £5 billion per
annum.24
Increases visibility and impact of research: Data made discoverable and accessible through
a data repository can dramatically increase the impact of that research.
• Publishing research data has been associated with higher citation rates. For instance,
publications from clinical trials that shared underlying data were found to be cited up to
70% more frequently than those that did not.25
• Initiatives like the Federated Research Data Repository (FRDR) drive increased visibility
of repositories and their data and present a national snapshot of Canadian data assets.26
Enables reproducibility of research results: When data are archived and shared, results can
be re-examined and data can be used for re-analysis, thereby improving reproducibility and
trustworthiness of published results.
• There are both tangible and intangible impacts of the current reproducibility crisis:
• Freedman et al. (2015) estimate that the total prevalence of irreproducible
biomedical research in the U.S. exceeds 50%, resulting in $28B annually spent on
preclinical research that is not reproducible.27
• Meanwhile, this lack of scientific reproducibility plays a significant role in reducing
public trust in science (G7, 2019).28 Results of a 2019 Pew Research poll surveying
the American public’s trust in scientific experts highlighted open data as a factor