Attributions toward Artificial Agents in a modified Moral Turing Test

Eyal Aharoni, Sharlene Fernandes, Daniel J. Brady, Caelan Alexander, Michael Criner, Kara Queen, Javier Rando, Eddy Nahmias, Victor Crespo
2024-04-03
Abstract: Advances in artificial intelligence (AI) raise important questions about whether people view moral evaluations by AI systems similarly to human-generated moral evaluations. We conducted a modified Moral Turing Test (m-MTT), inspired by Allen and colleagues' (2000) proposal, by asking people to distinguish real human moral evaluations from those made by a popular advanced AI language model: GPT-4. A representative sample of 299 U.S. adults first rated the quality of moral evaluations when blinded to their source. Remarkably, they rated the AI's moral reasoning as superior in quality to humans' along almost all dimensions, including virtuousness, intelligence, and trustworthiness, consistent with passing what Allen and colleagues call the comparative MTT. Next, when tasked with identifying the source of each evaluation (human or computer), people performed significantly above chance levels. Although the AI did not pass this test, this was not because of its inferior moral reasoning but, potentially, its perceived superiority, among other possible explanations. The emergence of language models capable of producing moral responses perceived as superior in quality to humans' raises concerns that people may uncritically accept potentially harmful moral guidance from AI. This possibility highlights the need for safeguards around generative language models in matters of morality.
Computers and Society, Artificial Intelligence, Computation and Language
What problem does this paper attempt to address?
The core question this paper addresses is: **Do people regard moral evaluations generated by AI systems as similar to, or even better than, those generated by humans?** Specifically, the researchers used a modified Moral Turing Test (m-MTT) to explore two questions:

1. **Will people consider AI-generated moral evaluations to be of higher quality than human-generated ones?**
   - The research hypothesis was that, without knowing the source, participants would rate AI-generated moral evaluations as superior to human-generated ones along multiple dimensions (such as virtuousness, intelligence, and trustworthiness).
2. **Can people distinguish between AI-generated and human-generated moral evaluations?**
   - The research hypothesis was that, once informed that one of the evaluations was AI-generated, participants would not identify it at a rate significantly better than chance.

### Main Research Background

With advances in artificial intelligence (AI), and especially the emergence of large language models (LLMs) such as GPT-4, the question of whether AI can perform human-like moral reasoning has become particularly important. This question bears not only on ethics and philosophy but also on how the technology is applied in society.

### Research Methods

The researchers designed a modified version of the Moral Turing Test (m-MTT), which consisted of two main parts:

1. **Comparative Evaluation (cMTT):**
   - Participants first rated the quality of a series of moral evaluations without knowing their source. Some evaluations were written by humans and others were generated by the AI.
2. **Source Identification (MTT):**
   - Next, participants were told that some of the evaluations were generated by AI and were asked to identify which were AI-generated and which were human-generated.
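
The source-identification step comes down to asking whether participants label evaluations correctly more often than a coin flip would. One common way to check this is a one-sided exact binomial test against a 50% chance rate. The sketch below is only an illustration of that idea: the function name `binomial_tail_p` and the counts are hypothetical, and this is not necessarily the analysis the authors ran.

```python
from math import comb


def binomial_tail_p(successes: int, trials: int, chance: float = 0.5) -> float:
    """One-sided exact binomial p-value: P(X >= successes) at the chance rate."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical counts for illustration only; NOT data from the study.
correct, total = 14, 20
p_value = binomial_tail_p(correct, total)
print(f"accuracy = {correct / total:.2f}, one-sided p = {p_value:.4f}")
```

A small p-value would indicate that accuracy this high is unlikely under pure guessing. With many participants each judging several items, responses are not independent, so a real analysis would likely need a model that accounts for repeated measures.
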
### Research Results

- **Quality evaluation**:
  - Overall, participants rated AI-generated moral evaluations as superior to human-generated ones along almost all dimensions. For example, AI evaluations were rated as more intelligent, rational, and trustworthy.
- **Source identification**:
  - Participants distinguished AI-generated from human-generated evaluations at a rate significantly above chance, so the AI did not pass this test. This was not because its moral reasoning was inferior but, potentially, because it was perceived as superior, among other possible explanations.

### Significance and Impact

This study sheds light on how people perceive AI-generated moral evaluations and raises the following points:

1. **Potential Risks**:
   - If people uncritically accept moral guidance provided by AI, that guidance may cause harm. Safeguards and oversight are therefore needed when generative language models are applied to moral matters.
2. **Future Development Directions**:
   - The results give developers and users a reference point for understanding the moral language capabilities of current LLMs, and give scholars new insight into the characteristics of ordinary human moral intelligence.

In conclusion, this paper experimentally examined the potential and limitations of AI in moral evaluation and emphasized the need for caution toward AI-generated content in practical applications.