Abstract: We explore the self-play training procedure of large language models (LLMs) in a two-player adversarial language game called Adversarial Taboo. In this game, an attacker and a defender communicate around a target word only visible to the attacker. The attacker aims to induce the defender to speak the target word unconsciously, while the defender tries to infer the target word from the attacker's utterances. To win the game, both players should have sufficient knowledge about the target word and high-level reasoning ability to infer and express in this information-reserved conversation. Hence, we are curious about whether LLMs' reasoning ability can be further enhanced by self-play in this adversarial language game (SPAG). With this goal, we select several open-source LLMs and let each act as the attacker and play with a copy of itself as the defender on an extensive range of target words. Through reinforcement learning on the game outcomes, we observe that the LLMs' performances uniformly improve on a broad range of reasoning benchmarks. Furthermore, iteratively adopting this self-play process can continuously promote LLMs' reasoning abilities. The code is at

What problem does this paper attempt to address?

The paper addresses the limited reasoning ability of large language models (LLMs) in complex problem-solving. Although existing LLMs such as GPT-4 and Gemini perform well in natural language understanding, text generation, machine translation, and programming, they still face challenges in reasoning, especially in terms of correctness and faithfulness. Researchers have tried various remedies, such as Chain-of-Thought (CoT) prompting and auxiliary reasoning tools, but these methods often require additional prompt design and are sensitive to different prompt patterns and LLM checkpoints.

To overcome these limitations, the paper proposes a self-training method that enhances the reasoning ability of LLMs by letting them play against themselves in an adversarial language game. Specifically, the method uses a language game named "Adversarial Taboo", in which two players (an attacker and a defender) hold a conversation around a target word known only to the attacker. The attacker aims to induce the defender to say the target word unconsciously, while the defender tries to infer the target word from the conversation history. To win the game, both sides need sufficient knowledge about the target word and a high level of reasoning ability. With this setup, the researchers explore whether the reasoning ability of LLMs can be further enhanced through self-play in this adversarial language game (SPAG).

The experimental results show that after multiple rounds of self-play training, the performance of LLMs on multiple reasoning benchmarks improves significantly, especially in logical reasoning, common-sense understanding, and complex problem-solving. In addition, iteratively applying this self-play process continues to improve the reasoning ability of LLMs, opening new possibilities for developing more advanced LLM capabilities.
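
The win conditions described above can be made concrete with a small outcome judge. The following Python sketch is illustrative only and is not the authors' implementation. Several details are assumptions not stated in the text: the guess marker `GUESS_PREFIX`, the rule that an attacker who utters the target word loses, the rule that a wrong guess loses the game for the defender, the `max_turns` limit, and the tie when nobody wins within that limit. The names `judge` and `contains_word` are hypothetical.

```python
import re
from typing import List, Tuple

# Hypothetical marker the defender uses to make an explicit guess (assumption).
GUESS_PREFIX = "i know the word! it is"


def contains_word(utterance: str, target: str) -> bool:
    """Return True if `target` appears in `utterance` as a whole word (case-insensitive)."""
    pattern = rf"\b{re.escape(target.lower())}\b"
    return re.search(pattern, utterance.lower()) is not None


def judge(target: str, turns: List[Tuple[str, str]], max_turns: int = 5) -> str:
    """Decide the outcome of one game: 'attacker', 'defender', or 'tie'.

    `turns` is a list of (role, utterance) pairs, role in {'attacker', 'defender'}.
    Only the first 2 * max_turns utterances are considered.
    """
    for role, text in turns[: 2 * max_turns]:
        lowered = text.lower()
        if role == "attacker":
            # Assumed rule: the attacker may not say the target word itself.
            if contains_word(text, target):
                return "defender"
        elif role == "defender":
            if GUESS_PREFIX in lowered:
                # Explicit guess: the defender wins only if the guess is correct
                # (assumed rule: a wrong guess ends the game in the attacker's favor).
                guess = lowered.split(GUESS_PREFIX, 1)[1]
                return "defender" if contains_word(guess, target) else "attacker"
            # The defender said the target word without guessing: attacker wins.
            if contains_word(text, target):
                return "attacker"
    # Nobody won within the turn limit (assumed tie rule).
    return "tie"
```

In a self-play setup, an outcome like this could serve as the game signal that reinforcement learning is applied to, with the same model generating both the attacker and defender utterances. How the outcome is turned into a reward is not specified in the text above and is left open here.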

Self-playing Adversarial Language Game Enhances LLM Reasoning

Adversarial Language Games for Advanced Natural Language Intelligence

Enhance Reasoning for Large Language Models in the Game Werewolf

Can Large Language Models Play Games? A Case Study of A Self-Play Approach

Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing

GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations

Large Language Model Sentinel: Advancing Adversarial Robustness by LLM Agent

Probing the Multi-turn Planning Capabilities of LLMs via 20 Question Games

LLMs are Superior Feedback Providers: Bootstrapping Reasoning for Lie Detection with Self-Generated Feedback

Can LLMs Reason in the Wild with Programs?

Large Language Models Can Self-Improve in Long-context Reasoning

Language Agents with Reinforcement Learning for Strategic Play in the Werewolf Game

Large Language Models as Agents in Two-Player Games

LLM Self Defense: By Self Examination, LLMs Know They Are Being Tricked

Large Language Model Sentinel: LLM Agent for Adversarial Purification

Strategic Reasoning with Language Models

From Text to Tactic: Evaluating LLMs Playing the Game of Avalon

Evaluating and Enhancing LLMs Agent based on Theory of Mind in Guandan: A Multi-Player Cooperative Game under Imperfect Information

Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback

Explore the Reasoning Capability of LLMs in the Chess Testbed

Leveraging Word Guessing Games to Assess the Intelligence of Large Language Models