Whose AI Dream? In search of the aspiration in data annotation

Ding Wang, Shantanu Prabhat, Nithya Sambasivan
DOI: https://doi.org/10.48550/arXiv.2203.10748
2022-03-21
Abstract: This paper presents the practice of data annotation from the perspective of the annotators. Data is fundamental to ML models. This paper investigates the work practices concerning data annotation as performed in the industry, in India. Previous investigations have largely focused on annotator subjectivity, bias and efficiency. We present a wider perspective of data annotation: following a grounded approach, we conducted three sets of interviews with 25 annotators, 10 industry experts and 12 ML practitioners. Our results show that the work of annotators is dictated by the interests, priorities and values of others above their station. More than technical, we contend that data annotation is a systematic exercise of power through organizational structure and practice. We propose a set of implications for how we can cultivate and encourage better practice to balance the tension between the need for high quality data at low cost and the annotator aspiration for well being, career perspective, and active participation in building the AI dream.
Human-Computer Interaction, Computers and Society
What problem does this paper attempt to address?
This paper explores the practice of data annotation work in the Indian industry, focusing on the working conditions of data annotators and the power structures behind them. Through in-depth interviews with 25 data annotators, 10 industry experts and 12 ML practitioners, the paper reveals the following problems in data annotation work:

1. **Working Conditions of Data Annotators**: Although the work has shifted from crowdsourcing platforms to full-time employment, the working conditions of data annotators remain problematic. For example, the pursuit of high-quality output has led to multi-level reviews, which put heavy pressure on annotators. In addition, annotators' jobs are still unstable, and career development opportunities are limited.
2. **Limitations of Career Development**: The career paths of data annotators are discontinuous, and annotators are anxious about employment and performance. Formal employment brings some benefits (such as pensions and insurance), but opportunities for career growth are lacking. Organizational control exacerbates these problems, and unpaid overtime has become the norm.
3. **Impact of Power Structure**: The paper argues that data annotation is not only a technical task but also a systematic exercise of power through organizational structures and practices. The work of annotators is shaped by the interests, priorities and values of other stakeholders (such as company management and clients).
4. **Ethical Issues in the Industry**: The paper discusses how to balance the need for high-quality data at low cost against annotators' aspirations for well-being, career prospects and active participation in building the AI dream. It puts forward suggestions for improving practice, aiming to promote a fairer and more sustainable data annotation industry.
Overall, this paper aims to reveal the problems in the data annotation industry through empirical research and to propose improvements that support a more human-centered and sustainable industry.