Abstract: In this chapter, we propose a non-traditional RCR training in data science that is grounded in a virtue theory framework. First, we delineate the approach in more theoretical detail by discussing how the goal of RCR training is to foster the cultivation of certain moral abilities. Second, we specify the nature of these abilities: while the ideal is the cultivation of virtues, the limited space allowed by RCR modules can only facilitate the cultivation of superficial abilities or proto-virtues, which help students to familiarize themselves with moral and political issues in the data science environment. Third, we operationalize our approach by stressing that (proto-)virtue acquisition (like skill acquisition) occurs through the technical and social tasks of daily data science activities, where these repetitive tasks provide the opportunities to develop (proto-)virtue capacity and to support the development of ethically robust data systems. Finally, we discuss a concrete example of how this approach has been implemented. In particular, we describe how this method is applied to teach data ethics to students participating in the CODATA-RDA Data Science Summer Schools.

What problem does this paper attempt to address?

The main problem this paper addresses is the set of deficiencies in current Responsible Conduct of Research (RCR) training in the field of data science. Specifically, the authors argue that traditional RCR training has the following problems:

1. **Limitations of Traditional RCR Training**: Traditional RCR training often focuses on rule-following and compliance, so researchers come to view it as a "task that must be completed" rather than engaging with the ethical and moral significance behind it. This runs contrary to the original motivation of RCR training, which is to promote research integrity by changing behavior, character, and judgment.
2. **Unique Challenges in Data Science**: As a new and interdisciplinary field, data science raises many new ethical, political, and social issues. These issues involve not only the creation, analysis, and reuse of data but also algorithmic bias and its potential social impacts. Ethical training designed specifically for data scientists is therefore needed to help them deal with these challenges.
3. **Poor Fit of Existing RCR Training Models**: Directly applying existing RCR training models to data science may overlook the technical and repetitive tasks specific to the daily work of data scientists. This mismatch may lead to poor training results and fail to improve the ethical awareness of data scientists.

To address these problems, the authors propose a new RCR training method based on the framework of virtue ethics. The method aims to cultivate the moral capabilities (or "proto-virtues") of data scientists through daily data science activities, enabling them to gradually internalize ethical norms in practice. The authors also emphasize the importance of "micro-ethics", that is, paying attention to the ethical nuances of daily decision-making and turning that attention into a habit through repeated practice.
To illustrate the application of this new method, the authors present their teaching experience in the CODATA-RDA Data Science Summer Schools as a case study. This project has successfully integrated ethical thinking into the daily practice of data scientists by embedding ethical training into specific technical courses.

Character comes from practice: longitudinal practice-based ethics training in data science
