Can Large Language Models Serve as Rational Players in Game Theory? A Systematic Analysis

Caoyun Fan, Jindou Chen, Yaohui Jin, Hao He
DOI: https://doi.org/10.48550/arXiv.2312.05488
2023-12-13
Abstract: Game theory, as an analytical tool, is frequently utilized to analyze human behavior in social science research. With the high alignment between the behavior of Large Language Models (LLMs) and humans, a promising research direction is to employ LLMs as substitutes for humans in game experiments, enabling social science research. However, despite numerous empirical researches on the combination of LLMs and game theory, the capability boundaries of LLMs in game theory remain unclear. In this research, we endeavor to systematically analyze LLMs in the context of game theory. Specifically, rationality, as the fundamental principle of game theory, serves as the metric for evaluating players' behavior -- building a clear desire, refining belief about uncertainty, and taking optimal actions. Accordingly, we select three classical games (dictator game, Rock-Paper-Scissors, and ring-network game) to analyze to what extent LLMs can achieve rationality in these three aspects. The experimental results indicate that even the current state-of-the-art LLM (GPT-4) exhibits substantial disparities compared to humans in game theory. For instance, LLMs struggle to build desires based on uncommon preferences, fail to refine belief from many simple patterns, and may overlook or modify refined belief when taking actions. Therefore, we consider that introducing LLMs into game experiments in the field of social science should be approached with greater caution.
Artificial Intelligence, Computation and Language, Computer Science and Game Theory
What problem does this paper attempt to address?
### The Problem the Paper Attempts to Solve

This paper aims to systematically analyze the capability boundaries of large language models (LLMs) in game theory. Specifically, the paper evaluates whether LLMs can achieve the three characteristics of rational players through three classic games (Dictator Game, Rock-Paper-Scissors Game, Ring Network Game):

1. **Constructing Clear Desires**: Establishing specific opinions on each outcome in the game based on preferences.
2. **Refining Beliefs about Uncertainty**: Extracting the probability distribution of opponents' behaviors from game information.
3. **Taking Optimal Actions**: Choosing the best action based on desires and beliefs.

### Research Background

Game theory, as a mathematical tool for analyzing human behavior, is widely used in social sciences (such as economics, psychology, sociology, etc.). With the development of large language models, their high consistency with human behavior has led researchers to consider using LLMs as substitutes for humans in social science research. However, despite many empirical studies combining LLMs and game theory, the capability boundaries of LLMs in game theory remain unclear.

### Research Methods

1. **Dictator Game**: Used to evaluate whether LLMs can construct clear desires based on different preferences. Experimental results show that LLMs perform well under common preferences but poorly under uncommon preferences.
2. **Rock-Paper-Scissors Game**: Used to evaluate whether LLMs can refine beliefs from simple patterns. Experimental results show that even the most advanced GPT-4 finds it difficult to refine beliefs from many simple patterns.
3. **Ring Network Game**: Used to evaluate whether LLMs can take optimal actions given certain beliefs. Experimental results show that LLMs can improve their ability to take optimal actions in some cases but still tend to ignore or modify already refined beliefs.

### Main Findings

1. **Constructing Clear Desires**: LLMs can construct clear desires under common preferences but perform poorly under uncommon preferences.
2. **Refining Beliefs about Uncertainty**: LLMs find it difficult to refine beliefs from many simple patterns, especially in game experiments that require handling complex beliefs.
3. **Taking Optimal Actions**: LLMs can improve their ability to take optimal actions in some cases but still tend to ignore or modify already refined beliefs.

### Conclusion

The paper systematically explores the capability boundaries of LLMs in game theory and points out that caution should be exercised when introducing LLMs into social science research. Although GPT-4 performs well in some aspects, overall, LLMs still have significant shortcomings in handling complex beliefs and taking optimal actions.
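
To make the three aspects of rationality described above concrete (desire, belief, action), here is a minimal Python sketch of an idealized rational player in Rock-Paper-Scissors. It is an illustration only, not the paper's experimental protocol. The payoff values (+1 win, 0 tie, -1 loss), the additive-smoothing `prior`, and all function names are assumptions introduced for this example.

```python
from collections import Counter

ACTIONS = ("rock", "paper", "scissors")
# Each key beats its value.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def utility(mine: str, theirs: str) -> int:
    """Desire: an assumed payoff of +1 for a win, 0 for a tie, -1 for a loss."""
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1


def refine_belief(history: list[str], prior: float = 1.0) -> dict[str, float]:
    """Belief: smoothed empirical distribution over the opponent's next action.

    `prior` is a hypothetical additive-smoothing constant so that actions not
    yet observed still receive non-zero probability.
    """
    counts = Counter(history)
    total = len(history) + prior * len(ACTIONS)
    return {a: (counts[a] + prior) / total for a in ACTIONS}


def expected_utility(action: str, belief: dict[str, float]) -> float:
    """Expected payoff of `action` against an opponent drawn from `belief`."""
    return sum(p * utility(action, theirs) for theirs, p in belief.items())


def best_action(belief: dict[str, float]) -> str:
    """Action: pick the action that maximizes expected utility under the belief."""
    return max(ACTIONS, key=lambda a: expected_utility(a, belief))


# An opponent who has mostly played rock so far.
history = ["rock"] * 8 + ["paper"] * 2
belief = refine_belief(history)
print(best_action(belief))  # paper
```

In this framing, the failures the summary reports correspond to an incorrect `utility` (desire), an incorrect `refine_belief` (belief), or an action that does not follow from the belief already formed (action).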