Attributions toward artificial agents in a modified Moral Turing Test

Eyal Aharoni, Sharlene Fernandes, Daniel J. Brady, Caelan Alexander, Michael Criner, Kara Queen, Javier Rando, Eddy Nahmias, Victor Crespo
DOI: https://doi.org/10.1038/s41598-024-58087-7
IF: 4.6
2024-05-02
Scientific Reports
Abstract: Advances in artificial intelligence (AI) raise important questions about whether people view moral evaluations by AI systems similarly to human-generated moral evaluations. We conducted a modified Moral Turing Test (m-MTT), inspired by the proposal of Allen et al. (Exp Theor Artif Intell 352:24–28, 2004), by asking people to distinguish real human moral evaluations from those made by a popular advanced AI language model: GPT-4. A representative sample of 299 U.S. adults first rated the quality of moral evaluations when blinded to their source. Remarkably, they rated the AI's moral reasoning as superior in quality to humans' along almost all dimensions, including virtuousness, intelligence, and trustworthiness, consistent with passing what Allen and colleagues call the comparative MTT. Next, when tasked with identifying the source of each evaluation (human or computer), people performed significantly above chance levels. Although the AI did not pass this test, this was not because of its inferior moral reasoning but, potentially, its perceived superiority, among other possible explanations. The emergence of language models capable of producing moral responses perceived as superior in quality to humans' raises concerns that people may uncritically accept potentially harmful moral guidance from AI. This possibility highlights the need for safeguards around generative language models in matters of morality.
multidisciplinary sciences
What problem does this paper attempt to address?
The paper examines how people evaluate moral judgments generated by an advanced AI language model (specifically GPT-4) compared with those produced by humans. Using a modified version of the Moral Turing Test (m-MTT), the study addresses the following core questions:

1. **Comparison of the Quality of Moral Judgments**: Do people, when blinded to the source, perceive AI-generated moral reasoning as better than human moral reasoning?
2. **Source Identification Ability**: Can people distinguish moral judgments generated by AI from those made by humans?
3. **Potential Social Impact**: What do these assessments imply about the influence of AI-generated moral guidance on human users, especially when such guidance may be harmful?

The main findings of the study include:

- Participants rated the AI-generated moral reasoning as higher in quality along almost all dimensions, including virtuousness, intelligence, and trustworthiness.
- Participants identified the source of moral judgments at rates significantly above chance. This was not because the AI's moral reasoning was inferior but, potentially, because it was perceived as superior, among other possible explanations.
- These findings raise concerns that people may uncritically accept potentially harmful moral guidance from AI language models, and they highlight the need for safeguards around such models in matters of morality.
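
As an illustration of what "above chance" means in a two-option source-identification task, the following is a minimal sketch of a one-sided exact binomial test against a 50% guessing rate. It is not the authors' analysis, which is not described here; the function name `binomial_p_above_chance` and the counts in the example are hypothetical and chosen only for demonstration.

```python
from math import comb


def binomial_p_above_chance(correct: int, total: int, chance: float = 0.5) -> float:
    """One-sided exact binomial p-value: P(X >= correct) if every answer were a guess.

    `chance` is the assumed probability of a correct guess (0.5 for a
    human-vs-computer choice).
    """
    if not 0 <= correct <= total:
        raise ValueError("correct must be between 0 and total")
    # Sum the binomial probabilities of every outcome at least as good as observed.
    return sum(
        comb(total, k) * chance**k * (1 - chance) ** (total - k)
        for k in range(correct, total + 1)
    )


# Hypothetical counts for illustration only; these are NOT the study's data.
hypothetical_correct = 65
hypothetical_total = 100

p_value = binomial_p_above_chance(hypothetical_correct, hypothetical_total)
print(f"one-sided p = {p_value:.4f}")
```

A small p-value here would indicate that the observed identification rate is unlikely under pure guessing. Real data with repeated judgments per participant would likely call for a model that accounts for that dependence, which this sketch does not attempt.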