Abstract: Do large language models (LLMs) solve reasoning tasks by learning robust generalizable algorithms, or do they memorize training data? To investigate this question, we use arithmetic reasoning as a representative task. Using causal analysis, we identify a subset of the model (a circuit) that explains most of the model's behavior for basic arithmetic logic and examine its functionality. By zooming in on the level of individual circuit neurons, we discover a sparse set of important neurons that implement simple heuristics. Each heuristic identifies a numerical input pattern and outputs corresponding answers. We hypothesize that the combination of these heuristic neurons is the mechanism used to produce correct arithmetic answers. To test this, we categorize each neuron into several heuristic types, such as neurons that activate when an operand falls within a certain range, and find that the unordered combination of these heuristic types is the mechanism that explains most of the model's accuracy on arithmetic prompts. Finally, we demonstrate that this mechanism appears as the main source of arithmetic accuracy early in training. Overall, our experimental results across several LLMs show that LLMs perform arithmetic using neither robust algorithms nor memorization; rather, they rely on a "bag of heuristics".
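The idea of an unordered combination of heuristics can be illustrated with a toy sketch. The code below is not the paper's method or code: the three heuristics, their names, and the vote-counting rule are assumptions made up for illustration, and the prompts are restricted to single-digit addition. Each hypothetical heuristic recognizes an input pattern and boosts a set of candidate answers; no single one identifies the sum, but their combined votes do.

```python
# Toy "bag of heuristics" for prompts of the form "a+b=" with 0 <= a, b <= 9.
# All heuristics here are hypothetical illustrations, not neurons from a real model.

ANSWERS = range(19)  # every possible sum of two single-digit operands


def operand_range(a, b):
    """Boost the answer range implied by how many operands are >= 5."""
    low = 5 * ((a >= 5) + (b >= 5))
    return set(range(low, low + 9))


def same_parity(a, b):
    """Boost even answers when the operands share parity, odd answers otherwise."""
    parity = 0 if a % 2 == b % 2 else 1
    return {ans for ans in ANSWERS if ans % 2 == parity}


def residue_mod_5(a, b):
    """Boost answers whose residue mod 5 matches the operands' residues."""
    residue = (a % 5 + b % 5) % 5
    return {ans for ans in ANSWERS if ans % 5 == residue}


ALL_HEURISTICS = [operand_range, same_parity, residue_mod_5]


def predict(a, b, heuristics):
    """Sum the votes of all heuristics; the order of the heuristics is irrelevant."""
    scores = [0] * len(ANSWERS)
    for heuristic in heuristics:
        for ans in heuristic(a, b):
            scores[ans] += 1
    # Ties are broken in favor of the smallest answer.
    return max(ANSWERS, key=lambda ans: scores[ans])


def accuracy(heuristics):
    """Fraction of the 100 single-digit addition prompts answered correctly."""
    prompts = [(a, b) for a in range(10) for b in range(10)]
    correct = sum(predict(a, b, heuristics) == a + b for a, b in prompts)
    return correct / len(prompts)


if __name__ == "__main__":
    print("all heuristics:", accuracy(ALL_HEURISTICS))
    print("without parity:", accuracy([operand_range, residue_mod_5]))
```

In this toy setting the full set answers every prompt, while dropping one heuristic leaves some prompts ambiguous, which loosely mirrors the ablation logic described in the abstract. Real heuristic neurons are far noisier and cover prompts only partially.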

What problem does this paper attempt to address?

### Problems the Paper Attempts to Solve

This paper explores whether large language models (LLMs) solve reasoning tasks by learning robust, generalizable algorithms or merely by memorizing training data. Specifically, the authors choose arithmetic reasoning as a representative task, use causal analysis to identify a small subset of model components (a "circuit") responsible for basic arithmetic logic, and examine what these components do.

### Main Research Content

1. **Research Background**:
   - Arithmetic reasoning is a task that can be solved in various ways, including learning known algorithms, developing new methods, or memorizing a large number of input-output pairs.
   - The core question posed by the authors is: Do LLMs perform arithmetic correctly through robust algorithms, similar to how children learn column addition, or do they merely memorize a large number of arithmetic prompts?
2. **Research Methods**:
   - **Circuit Discovery**: The authors use activation patching experiments to identify and evaluate the circuit components within the model responsible for arithmetic calculations.
   - **Neuron Analysis**: By analyzing individual neurons within the circuit, the authors discover a sparse set of important neurons, each implementing a simple heuristic rule.
   - **Heuristic Classification**: Each neuron is classified into several heuristic types, such as neurons that activate when an operand falls within a specific range.
3. **Main Findings**:
   - **Heuristic Mechanism**: LLMs rely on neither robust algorithms nor memorization; instead, they complete arithmetic tasks through a set of simple heuristic rules (a "bag of heuristics").
   - **Heuristic Combination**: Successful prompt completion relies on the unordered combination of multiple heuristic types.
   - **Evolution During Training**: The heuristic mechanism appears early in training and gradually converges to the state found in the final model.
4. **Experimental Results**:
   - Experiments on several LLMs show that the heuristic mechanism explains most of the models' accuracy on arithmetic prompts.
   - Ablation experiments confirm the causal importance of heuristic neurons in promoting the probability of correct answers.
5. **Limitations**:
   - The heuristic mechanism is not perfect and sometimes fails, either because there are too few heuristic rules or because their recall is low.

### Conclusion

Overall, through detailed experiments and analysis, this paper shows that LLMs solve arithmetic tasks with a "bag of heuristics" rather than with robust algorithms or memorization. This finding helps to better understand the capabilities and limitations of LLMs in arithmetic reasoning and may have implications for other reasoning tasks as well.
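The activation patching step mentioned in the summary can be sketched on a toy model. This is a minimal illustration under stated assumptions, not the paper's experimental setup: the two-layer network, its weights (`W_IN`, `W_OUT`), the inputs, and the `patching_effect` metric are all hypothetical. The idea is to run the model on a corrupted input, overwrite one hidden activation with its value from a clean run, and measure how much of the clean output is restored.

```python
# Toy activation patching on a tiny hand-written network (all values hypothetical).

W_IN = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # one row of input weights per hidden unit
W_OUT = [2.0, 0.0, 0.1]                       # output weight of each hidden unit


def forward(x, patch=None):
    """Run the model; `patch` maps hidden-unit index -> activation to force."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W_IN]
    for index, value in (patch or {}).items():
        hidden[index] = value
    output = sum(w * h for w, h in zip(W_OUT, hidden))
    return output, hidden


def patching_effect(clean_x, corrupted_x):
    """For each hidden unit, the fraction of the clean output restored by patching it."""
    clean_out, clean_hidden = forward(clean_x)
    corrupted_out, _ = forward(corrupted_x)
    effects = []
    for index, clean_value in enumerate(clean_hidden):
        patched_out, _ = forward(corrupted_x, patch={index: clean_value})
        effects.append((patched_out - corrupted_out) / (clean_out - corrupted_out))
    return effects


if __name__ == "__main__":
    for index, effect in enumerate(patching_effect([1.0, 1.0], [0.0, 1.0])):
        print(f"hidden unit {index}: effect {effect:.3f}")
```

Units with a large effect would be candidates for the circuit, while units with an effect near zero would be left out. In a real LLM the patched components are attention heads and MLP layers or neurons, and the metric is typically defined over the probability or logit of the correct answer token.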

Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics

Unraveling Arithmetic in Large Language Models: The Role of Algebraic Structures

Language Models are Symbolic Learners in Arithmetic

Interpreting and Improving Large Language Models in Arithmetic Calculation

Language Models Do Hard Arithmetic Tasks Easily and Hardly Do Easy Arithmetic Tasks

Language Models Trained to do Arithmetic Predict Human Risky and Intertemporal Choice

Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process

Arithmetic Reasoning with LLM: Prolog Generation & Permutation

Do Large Language Models Truly Grasp Mathematics? An Empirical Exploration From A Psychological Perspective

Relating the Seemingly Unrelated: Principled Understanding of Generalization for Generative Models in Arithmetic Reasoning Tasks

Do Large Language Models Truly Grasp Mathematics? An Empirical Exploration From Cognitive Psychology

Interpreting Arithmetic Mechanism in Large Language Models through Comparative Neuron Analysis

When Do Program-of-Thought Works for Reasoning?

How well do Large Language Models perform in Arithmetic tasks?

Small Language Models are Equation Reasoners

Do Language Models Exhibit the Same Cognitive Biases in Problem Solving as Human Learners?

Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines

Self-training Language Models for Arithmetic Reasoning

Arithmetic with Language Models: from Memorization to Computation

A Careful Examination of Large Language Model Performance on Grade School Arithmetic