Characterizing Data Practices in Research Papers Across Four Disciplines.

Sanwoo Lee, Wenqi Li, Pengyi Zhang, Jun Wang
DOI: https://doi.org/10.1007/978-3-031-28035-1_26
2023-01-01
Abstract: Research Data Practices (RDP) refer to research activities conducted across the lifespan of data. Characterizing RDP in disciplinary contexts gives data stakeholders the practical understanding of RDP they need to design data curation services tailored to researchers’ needs. In this paper, we focus on the five most common types of RDP: collecting data, processing data, analyzing data, representing data, and publishing or citing data. First, we compared the distributions of the five types of RDP across disciplines and observed noticeable differences between disciplines. In addition, we examined the characteristics of each type of RDP in different disciplinary contexts by developing a discipline-specific RDP vocabulary using the tf-idf approach. Based on the common terms as well as the discipline-specific ones, we found that the five types of RDP can be distinctly conceptualized, while each type of RDP varies by discipline in terms of its action, object, and instrument.
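The tf-idf step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each discipline is treated as one pooled "document" and uses the plain weighting tf × log(N / df); the abstract does not state the paper's exact weighting, tokenization, or corpus, so the toy sentences and the function names `tf_idf` and `top_terms` below are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy corpus: one pooled bag of words per discipline.
# These sentences are invented for illustration, not taken from the paper.
docs = {
    "biology": "we collected tissue samples and sequenced samples".split(),
    "sociology": "we collected survey responses and coded interview responses".split(),
    "physics": "we collected detector events and simulated detector noise".split(),
}

def tf_idf(docs):
    """Return {discipline: {term: score}} with score = tf * log(N / df)."""
    n_docs = len(docs)
    # Document frequency: number of disciplines in which each term occurs.
    df = Counter(term for words in docs.values() for term in set(words))
    scores = {}
    for name, words in docs.items():
        counts = Counter(words)
        total = len(words)
        scores[name] = {
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
    return scores

def top_terms(scores, k=2):
    """Top-k terms per discipline; ties are broken alphabetically."""
    return {
        name: [t for t, _ in sorted(s.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]
        for name, s in scores.items()
    }

scores = tf_idf(docs)
top = top_terms(scores)
for name, terms in top.items():
    print(name, terms)
```

In this sketch, terms shared by every discipline (such as "collected") receive a score of zero, while terms concentrated in one discipline rank highest, which is the property that makes tf-idf usable for separating common from discipline-specific vocabulary.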