Approachable Case Studies Support Learning and Reproducibility in Data Science: An Example from Evolutionary Biology

Luna L. Sanchez Reyes, Emily Jane McTavish
DOI: https://doi.org/10.1080/26939169.2022.2099487
2022-09-02
Journal of Statistics and Data Science Education
Abstract: Research reproducibility is essential for scientific development. Yet, rates of reproducibility are low. As increasingly more research relies on computers and software, efforts for improving reproducibility rates have focused on making research products digitally available, such as publishing analysis workflows as computer code, and raw and processed data in computer readable form. However, research products that are digitally available are not necessarily friendly for learners and interested parties with little to no experience in the field. This renders research products unapproachable, counteracts their availability, and hinders scientific reproducibility. To improve both short- and long-term adoption of reproducible scientific practices, research products need to be made approachable for learners, the researchers of the future. Using a case study within evolutionary biology, we identify aspects of research workflows that make them unapproachable to the general audience: use of highly specialized language; unclear goals and high cognitive load; and lack of trouble-shooting examples. We propose principles to improve the unapproachable aspects of research workflows and illustrate their application using an online teaching resource. We elaborate on the general application of these principles for documenting research products and teaching materials, to provide present learners and future researchers with tools for successful scientific reproducibility. Supplementary materials for this article are available online.