Abstract: In recent years, transformer-based language representation models (LRMs) have achieved state-of-the-art results on difficult natural language understanding problems, such as question answering and text summarization. As these models are integrated into real-world applications, evaluating their ability to make rational decisions is an important research agenda, with practical ramifications. This article investigates LRMs' rational decision-making ability through a carefully designed set of decision-making benchmarks and experiments. Inspired by classic work in cognitive science, we model the decision-making problem as a bet. We then investigate an LRM's ability to choose outcomes that have optimal, or at minimum, positive expected gain. Through a robust body of experiments on four established LRMs, we show that a model is only able to 'think in bets' if it is first fine-tuned on bet questions with an identical structure. Modifying the bet question's structure, while still retaining its fundamental characteristics, decreases an LRM's performance by more than 25%, on average, although absolute performance remains well above random. LRMs are also found to be more rational when selecting outcomes with non-negative expected gain, rather than optimal or strictly positive expected gain. Our results suggest that LRMs could potentially be applied to tasks that rely on cognitive decision-making skills, but that more research is necessary before they can robustly make rational decisions.

What problem does this paper attempt to address?

This paper explores whether transformer-based language representation models (LRMs) are capable of rational decision-making, especially when facing decisions with uncertain outcomes. Specifically, through a series of carefully designed decision-making experiments and benchmarks, the researchers evaluate how well these models choose outcomes with optimal, or at least positive, expected gain.

#### Research Background

In recent years, transformer-based language representation models (such as BERT, RoBERTa, DeBERTa, and BigBird) have made remarkable progress on natural language understanding tasks, such as question answering and text summarization. As these models are increasingly deployed in practical settings, evaluating their ability to make rational decisions has become an important research direction with practical significance.

#### Research Questions

The paper poses three research questions (RQs) about the preferences and decision-making abilities of language representation models:

1. **RQ1 (Preference Elicitation)**: Can language representation models be trained to prefer high-value items (such as diamonds) over low-value items (such as plastic pens), where "value" is meant in the common-sense economic sense?
2. **RQ2 (Bet-thinking without Task-specific Fine-tuning)**: Can language representation models rationally choose outcomes with higher expected gain without being fine-tuned on specific bet questions?
3. **RQ3 (Bet-thinking after Task-specific Fine-tuning)**: Can language representation models choose outcomes with higher expected gain more effectively after being fine-tuned on specific bet questions?

#### Experimental Design

To answer these questions, the researchers constructed a new set of decision-making and preference-elicitation benchmarks and designed detailed experimental methods. The experiments test four established language representation models (BERT, RoBERTa, DeBERTa, BigBird) under different conditions.

#### Main Findings

- A model can "think in bets" only if it is first fine-tuned on bet questions with an identical structure. Modifying the structure of the bet question, while retaining its fundamental characteristics, decreases performance by more than 25% on average, although absolute performance remains well above random.
- The models behave more rationally when selecting outcomes with non-negative expected gain than when selecting outcomes with optimal or strictly positive expected gain.
- The results suggest that language representation models could potentially be applied to tasks that rely on cognitive decision-making skills, but more research is needed before they can robustly make rational decisions.

In conclusion, this paper empirically characterizes the abilities and limitations of language representation models in rational decision-making, providing a reference point for future research.
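As a rough illustration of the three rationality criteria named in the findings above (optimal, strictly positive, and non-negative expected gain), the sketch below computes expected gain for a hypothetical three-option bet. The `Outcome` fields, the option names, and the payoffs are assumptions made for illustration; they are not taken from the paper's benchmark, whose exact question format is not given here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    """Hypothetical bet option: win `gain` with probability `p_win`, else lose `loss`."""
    name: str
    p_win: float
    gain: float
    loss: float


def expected_gain(o: Outcome) -> float:
    # Expected gain = P(win) * gain - P(lose) * loss
    return o.p_win * o.gain - (1.0 - o.p_win) * o.loss


def rational_choices(outcomes):
    """Group option names by which rationality criterion they satisfy."""
    gains = {o.name: expected_gain(o) for o in outcomes}
    best = max(gains.values())
    return {
        "optimal": [n for n, g in gains.items() if g == best],
        "strictly_positive": [n for n, g in gains.items() if g > 0],
        "non_negative": [n for n, g in gains.items() if g >= 0],
    }


# Illustrative bet; probabilities are chosen to be exact in binary floating point.
bet = [
    Outcome("A", p_win=0.5, gain=10.0, loss=10.0),   # 5.0 - 5.0 = 0.0
    Outcome("B", p_win=0.25, gain=20.0, loss=4.0),   # 5.0 - 3.0 = 2.0
    Outcome("C", p_win=0.5, gain=2.0, loss=6.0),     # 1.0 - 3.0 = -2.0
]

choices = rational_choices(bet)
print(choices)
```

In this toy bet, the non-negative criterion accepts two options (A and B) while the optimal and strictly positive criteria accept only B, which shows why the non-negative criterion is the most permissive of the three.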

Can Language Representation Models Think in Bets?
