A comparison of online search engine autocompletion in Google and Baidu

Geng Liu, Pietro Pinoli, Stefano Ceri, Francesco Pierri
2024-05-03
Abstract: Warning: This paper contains content that may be offensive or upsetting. Online search engine auto-completions make it faster for users to search and access information. However, they also have the potential to reinforce and promote stereotypes and negative opinions about a variety of social groups. We study the characteristics of search auto-completions in two different linguistic and cultural contexts: Baidu and Google. We find differences between the two search engines in the way they suppress or modify original queries, and we highlight a concerning presence of negative suggestions across all social groups. Our study highlights the need for more refined, culturally sensitive moderation strategies in current language technologies.
Computers and Society
What problem does this paper attempt to address?
This paper examines how search engine autocompletion contributes to the spread of stereotypes and negative views about social groups across different languages and cultural contexts. Specifically, the authors study how Baidu (the main search engine in China) and Google (a widely used search engine in the West) handle user queries, and how the differences show up in the autocomplete suggestions for various social groups.

### Main problems:

1. **Influence of the autocomplete function**: Autocompletion speeds up searching for information, but it may also reinforce and spread stereotypes and negative views about different social groups.
2. **Differences across cultural and linguistic contexts**: The study compares how Baidu and Google handle autocomplete content in Chinese and English, and examines their different strategies for suppressing or modifying original queries.
3. **Presence of negative suggestions**: The study finds a concerning presence of negative suggestions across all social groups, indicating that current language technologies need more refined and culturally sensitive moderation strategies.

### Research questions:

- **RQ1**: How do Baidu and Google differ in suppressing or modifying autocomplete content?
- **RQ2**: How do the sentiment tendencies of Baidu's and Google's autocomplete content differ?

### Methods:

- **Data collection**: More than 2,000 autocomplete results were collected from Baidu and Google, covering 146 unique social groups in eight categories: age, gender, lifestyle, nationality, population, political tendency, religion, and sexual orientation.
- **Sentiment analysis**: The GPT-4 model was used to score each autocomplete suggestion as positive, neutral, or negative.
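
Once each suggestion has a sentiment label, the per-category comparison comes down to counting labels and dividing by the total. The following is a minimal sketch of that aggregation step, not the authors' code: the record layout, the `sentiment_shares` helper, and the sample rows are all hypothetical and exist only for illustration, and the labels stand in for whatever a sentiment scorer returns.

```python
from collections import Counter, defaultdict

# Hypothetical records: (engine, category, sentiment_label).
# These rows are made-up illustrations, not data from the study.
records = [
    ("google", "age", "negative"),
    ("google", "age", "neutral"),
    ("baidu", "age", "negative"),
    ("baidu", "gender", "positive"),
    ("baidu", "gender", "negative"),
]


def sentiment_shares(rows):
    """Return {(engine, category): {label: share}} for the given rows."""
    counts = defaultdict(Counter)
    for engine, category, label in rows:
        counts[(engine, category)][label] += 1

    shares = {}
    for key, counter in counts.items():
        total = sum(counter.values())
        # Convert raw label counts into fractions of that group's total.
        shares[key] = {label: n / total for label, n in counter.items()}
    return shares


shares = sentiment_shares(records)
for (engine, category), dist in sorted(shares.items()):
    print(engine, category, dist)
```

Grouping by the `(engine, category)` pair keeps the two search engines directly comparable within each category, which is the comparison RQ2 asks for.
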

### Results:

- **Proportion of unresponsive queries**: Google showed a higher proportion of unresponsive queries in all categories, especially in the sexual orientation category, where it provided no autocomplete suggestions at all.
- **Proportion of inconsistent results**: Both Baidu and Google returned a large number of inconsistent autocomplete results, especially in the political tendency, religion, and sexual orientation categories.
- **Sentiment analysis**: On both Baidu and Google, more than 50% of the autocomplete content was negative, most noticeably in the age and gender categories.

### Conclusion:

The study reveals significant differences between Baidu and Google in how they handle autocomplete content, especially in suppressing negative content and maintaining cultural sensitivity. It calls on language technology providers to adopt more transparent and effective moderation measures to reduce the spread of stereotypes and negative views.