Character comes from practice: longitudinal practice-based ethics training in data science

Louise Bezuidenhout, Emanuele Ratti
2024-01-09
Abstract: In this chapter, we propose a non-traditional RCR training in data science that is grounded in a virtue theory framework. First, we delineate the approach in more theoretical detail, by discussing how the goal of RCR training is to foster the cultivation of certain moral abilities. Second, we specify the nature of these abilities: while the ideal is the cultivation of virtues, the limited space allowed by RCR modules can only facilitate the cultivation of superficial abilities or proto-virtues, which help students familiarize themselves with moral and political issues in the data science environment. Third, we operationalize our approach by stressing that (proto-)virtue acquisition (like skill acquisition) occurs through the technical and social tasks of daily data science activities, where these repetitive tasks provide the opportunities to develop (proto-)virtue capacity and to support the development of ethically robust data systems. Finally, we discuss a concrete example of how this approach has been implemented. In particular, we describe how this method is applied to teach data ethics to students participating in the CODATA-RDA Data Science Summer Schools.
Computers and Society
What problem does this paper attempt to address?
The paper addresses shortcomings in current Responsible Conduct of Research (RCR) training as applied to data science. The authors identify three problems:

1. **Limitations of Traditional RCR Training**: Traditional RCR training often focuses on rule-following and compliance, so researchers come to see it as a box to tick rather than a way of understanding the ethical significance of their work. This runs contrary to the original motivation of RCR training, which is to promote research integrity by changing behavior, character, and judgment.
2. **Unique Challenges in Data Science**: As a new and interdisciplinary field, data science raises new ethical, political, and social issues. These concern not only the creation, analysis, and reuse of data but also algorithmic bias and its potential social impacts. Data scientists therefore need ethics training designed specifically for these challenges.
3. **Poor Fit of Existing RCR Training Models**: Applying existing RCR training models directly to data science may overlook the technical, repetitive tasks that make up the daily work of data scientists. This mismatch may weaken the training and leave the ethical awareness of data scientists largely unchanged.

To address these problems, the authors propose a new RCR training method grounded in virtue ethics. The method aims to cultivate the moral abilities (or "proto-virtues") of data scientists through daily data science activities, so that they gradually internalize ethical norms in practice. The authors also emphasize the importance of "micro-ethics": attending to the ethical nuances of everyday decision-making and turning that attention into a habit through repeated practice.
To illustrate how the method works in practice, the authors draw on their experience teaching data ethics at the CODATA-RDA Data Science Summer Schools as a case study. There, ethics training is embedded in specific technical courses, so that ethical thinking becomes part of the daily practice of data scientists.