Examining Racial Stereotypes in YouTube Autocomplete Suggestions

Eunbin Ha, Haein Kong, Shagun Jhaver
2024-10-04
Abstract: Autocomplete is a popular search feature that predicts queries based on user input and guides users to a set of potentially relevant suggestions. In this study, we examine how YouTube autocompletes serve as an information source for users exploring information about race. We perform an algorithm output audit of autocomplete suggestions for input queries about four racial groups and examine the stereotypes they embody. Using critical discourse analysis, we identify five major sociocultural contexts in which racial biases manifest -- Appearance, Ability, Culture, Social Equity, and Manner. Our results show evidence of aggregated discrimination and interracial tensions in the autocompletes we collected and highlight their potential risks in othering racial minorities. We call for urgent innovations in content moderation policy design and enforcement to address these biases in search outputs.
Computers and Society
What problem does this paper attempt to address?
The problem this paper addresses is: **How do YouTube's autocomplete suggestions reflect and spread racial stereotypes when users search for information about race, and what sociocultural impact might this have?** Specifically, by auditing algorithmic outputs, the authors examine whether and how YouTube's autocomplete suggestions reflect stereotypes about four racial groups (White, Black, Asian, and Hispanic).

The main research questions include:

1. **How does the information provided by YouTube shape users' understanding of race?**
2. **Do these autocomplete suggestions spread negative racial stereotypes?**
3. **How can the impact of these biases be reduced by improving content moderation policies?**

### Research Background and Motivation

Autocomplete improves search efficiency, but as the feature has become widespread it may also expose users to racial stereotypes. Through repeated exposure, users can gradually internalize these stereotypes, which in turn affects their long-term beliefs and behaviors. The authors therefore argue that autocomplete may inadvertently reinforce racial biases that already exist in society, and they aim to document the problem in order to promote better algorithm design and content moderation policies.

### Methods and Results

The authors conducted the research through the following steps:

- **Constructing Input Queries**: Four racial categories (White, Black, Asian, and Hispanic) were selected, and a series of input queries based on these categories was constructed.
- **Data Collection**: A Python script was used to automate the collection of YouTube's autocomplete suggestions, keeping each query independent so that personalization and browsing history did not influence the results.
- **Data Analysis**: Inductive analysis and critical discourse analysis (CDA) were used to classify and interpret the collected data.

The results show that YouTube's autocomplete suggestions reflect racial biases in five main contexts: appearance, ability, culture, social equity, and manner. For example, in terms of appearance, only the autocomplete suggestions about Black people involve personal hygiene; in terms of ability, the suggestions show stereotypes about the musical, dancing, and athletic talents of different races; in terms of culture, the suggestions reflect the phenomenon of cultural appropriation.

### Conclusions and Recommendations

The authors call for stronger content moderation of autocomplete suggestions to reduce the risk of spreading racial biases. The research also emphasizes the important role social media platforms play in information dissemination and calls for more research on algorithmic bias on these platforms. Through this work, the authors hope to draw the attention of the academic community and the wider public to algorithmic bias and to promote fairer, more transparent algorithm design and content moderation mechanisms.
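
As an illustration of the query-construction and data-collection steps described under Methods and Results, the following is a minimal Python sketch, not the authors' script. Several things here are assumptions: the query templates are hypothetical placeholders (the paper's actual query list is not reproduced in this summary), the suggestion endpoint is an unofficial and undocumented URL that may change or be rate-limited, and the response is assumed to be a JSON array whose second element is the list of suggestions. Each request is issued without cookies or login state, which is one plausible way to keep queries independent of personalization.

```python
import json
from itertools import product
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical inputs for illustration; not the paper's actual query list.
GROUPS = ["white", "black", "asian", "hispanic"]
TEMPLATES = ["why do {group} people", "{group} people are"]

# Assumption: unofficial, undocumented suggestion endpoint.
SUGGEST_URL = "https://suggestqueries.google.com/complete/search"


def build_queries(groups, templates):
    """Cross every racial category with every query template."""
    return [t.format(group=g) for g, t in product(groups, templates)]


def parse_suggestions(payload):
    """Extract suggestion strings from an assumed payload of the form
    ["<query>", ["suggestion 1", "suggestion 2", ...], ...]."""
    data = json.loads(payload)
    return [s for s in data[1] if isinstance(s, str)]


def fetch_suggestions(query, opener=urlopen):
    """Issue one stateless request (no cookies, no login) for one query."""
    params = urlencode({"client": "firefox", "ds": "yt", "q": query})
    with opener(f"{SUGGEST_URL}?{params}", timeout=10) as resp:
        text = resp.read().decode("utf-8", errors="replace")
    return parse_suggestions(text)


def collect(queries, fetch=fetch_suggestions):
    """Map each input query to its list of autocomplete suggestions."""
    return {q: fetch(q) for q in queries}
```

Calling `collect(build_queries(GROUPS, TEMPLATES))` would issue one request per query. The `fetch` parameter is injectable so the pipeline can be exercised offline with a stub before any live requests are made.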