In a report by Nature, 52 percent of surveyed researchers said they believe there is a significant crisis in the reproducibility of research. From engineering to biology, there is at least some concern about whether a given study’s results can be reproduced and therefore used in another study. To overcome this challenge, computational scientists from five research universities, including the University of Notre Dame, are developing a cyberinfrastructure and supporting tools that allow researchers to conduct and track their work – including data and methodologies – in a reproducible way.
At the end of February, Notre Dame hosted about 20 computer and social scientists and software developers at McCourtney Hall to work on the multi-university collaboration, called Whole Tale. The meeting gave the principal investigators an opportunity to discuss the latest developments of the project, and it gave the software developers from the participating research institutions a chance to discuss the project’s larger programming challenges. One of these challenges is finding a balance between the goal of the Whole Tale system – making research more easily reproducible – and keeping the platform user-friendly enough to serve a wide range of researchers.
“Although the vast majority of research published today is computationally-based, a gap has developed in how the scientific community shares and therefore reproduces research,” said Jarek Nabrzyski, director of the Center for Research Computing (CRC) at the University of Notre Dame and co-investigator on the project. “Whole Tale is designed as a platform to fill this gap, by providing an outlet where researchers can carry out and disseminate their research.”
The goal of Whole Tale is to enable the creation of “living publications” that integrate and link data, computations and scholarly articles. This way, once research is completed, another scientist or engineer could more effectively recreate the study using the same data sets, methods and anything else that is necessary. To create these living publications, researchers will work in the Whole Tale platform, an online environment that allows users to register, store data and compute data. More importantly, though, scientists will also be able to upload their research as a “tale.”
Bertram Ludäscher, professor at the School of Information Sciences at the University of Illinois at Urbana-Champaign and lead principal investigator on the project, explained, “A tale captures the whole story of a user’s research. This means that when someone else goes to view that research tale, they will be able to see the source data used in analyses and models as well as the resulting data and variations all within the working environment the research was conducted in.”
Once a tale is created, it could potentially be linked from a corresponding published journal article. This ensures that other researchers who would like to build on or reproduce a study’s results have transparent access to essential information. Currently, scientists and researchers are unlikely to gain access to the data sets used in a study, let alone the methodologies and original working environment.
“Whole Tale could impact different facets of the research ecosystem, from what it means to be a good citizen scientist – responsibly tracking your work for the research community so that it can be reproduced – to how professional journals publish research and universities set standards for faculty hiring,” said Victoria Stodden, associate professor of information sciences at the University of Illinois Urbana-Champaign and co-investigator on the project.
The Whole Tale online platform is currently in alpha, meaning it is still under development. Since the platform is intended to address real scientific cases, the project is allowing a number of science and cyberinfrastructure working groups to work in the platform. The dashboard will open to the research community at a later date. Until then, those interested can learn more about the project by visiting http://wholetale.org.
The Whole Tale team consists of researchers and programmers, who engage with the community through a number of working groups. Other co-investigators include Kyle Chard, fellow at the computational institute at the University of Chicago; Niall Gaffney, director of data intensive computing at the Texas Advanced Computing Center at the University of Texas at Austin; Matthew Jones, director of informatics research and development at the National Center for Ecological Analysis and Synthesis at the University of California, Santa Barbara; and Matthew Turk, assistant professor at the University of Illinois Urbana-Champaign.
Additional University of Notre Dame contributors include Ian Taylor, distributed computing and data science research professor, and two software developers at the CRC: Sebastian Wyngaard and Adam Brinckman.
Originally published by research.nd.edu on April 10.