Human or Not? A Gamified Approach to the Turing Test

Daniel Jannai, Amos Meron, Barak Lenz, Yoav Levine, Yoav Shoham
2023-06-01
Abstract: We present "Human or Not?", an online game inspired by the Turing test, that measures the capability of AI chatbots to mimic humans in dialog, and of humans to tell bots from other humans. Over the course of a month, the game was played by over 1.5 million users who engaged in anonymous two-minute chat sessions with either another human or an AI language model which was prompted to behave like humans. The task of the players was to correctly guess whether they spoke to a person or to an AI. This largest scale Turing-style test conducted to date revealed some interesting facts. For example, overall users guessed the identity of their partners correctly in only 68% of the games. In the subset of the games in which users faced an AI bot, users had even lower correct guess rates of 60% (that is, not much higher than chance). This white paper details the development, deployment, and results of this unique experiment. While this experiment calls for many extensions and refinements, these findings already begin to shed light on the inevitable near future which will commingle humans and AI.
Artificial Intelligence, Computation and Language, Computers and Society, Human-Computer Interaction
What problem does this paper attempt to address?
This paper evaluates how well modern AI chatbots mimic human conversation and how reliably humans can tell them apart from real people. It introduces a gamified experiment called "Human or Not?", in which participants hold an anonymous two-minute chat with either another human or an AI language model and then guess which of the two they spoke to. The experiment shows both how far AI has advanced in mimicking human dialog and how limited human detection is. After a brief interaction, users guessed their partner's identity correctly in only 68% of all games, and in only 60% of the games in which the partner was an AI bot, which is not much higher than chance. These findings shed light on a near future in which humans and AI increasingly interact, and may inform the development of AI language models.
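The two headline figures are correct-guess rates computed over different sets of games: one over all games, the other over only the games played against a bot. The following is a minimal sketch of that calculation, not the authors' analysis code. The function name `correct_guess_rates`, the `(partner, guess)` record layout, and the toy data are all assumptions made for illustration; the toy numbers are not taken from the study.

```python
from collections import Counter


def correct_guess_rates(games):
    """Return the share of correct guesses overall and per partner type.

    `games` is an iterable of (partner, guess) pairs, where each value is
    either "human" or "ai". This record layout is hypothetical.
    """
    total = Counter()
    correct = Counter()
    for partner, guess in games:
        # Count every game both in its partner-type bucket and overall.
        total[partner] += 1
        total["overall"] += 1
        if guess == partner:
            correct[partner] += 1
            correct["overall"] += 1
    return {key: correct[key] / total[key] for key in total}


# Hypothetical toy data: 5 games against a bot, 5 against a human.
toy_games = [
    ("ai", "ai"), ("ai", "human"), ("ai", "ai"), ("ai", "human"), ("ai", "ai"),
    ("human", "human"), ("human", "human"), ("human", "ai"),
    ("human", "human"), ("human", "human"),
]

rates = correct_guess_rates(toy_games)
print(rates)
```

Because the overall rate pools both kinds of games, it can sit above the bot-only rate whenever players are better at recognizing fellow humans than at spotting bots, which is the pattern the reported 68% and 60% figures suggest.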