Abstract: Chain-of-thought (CoT) has emerged as a powerful technique to elicit reasoning in large language models and improve performance on a variety of downstream tasks. CoT mainly demonstrates excellent performance in English, but its usage in low-resource languages is constrained by poor language generalization. To bridge the gap among different languages, we propose a cross-lingual instruction fine-tuning framework (xCoT) to transfer knowledge from high-resource languages to low-resource languages. Specifically, the multilingual instruction training data (xCoT-INSTRUCT) is created to encourage the semantic alignment of multiple languages. We introduce cross-lingual in-context few-shot learning (xICL) to accelerate multilingual agreement in instruction tuning, where some fragments of source languages in examples are randomly substituted by their counterpart translations in target languages. During multilingual instruction tuning, we adopt the random online CoT strategy to enhance the multilingual reasoning ability of the large language model by first translating the query to another language and then answering in English. To further facilitate language transfer, we leverage the high-resource CoT to supervise the training of low-resource languages with cross-lingual distillation. Experimental results on previous benchmarks demonstrate the superior performance of xCoT, highlighting its potential to reduce the cross-lingual gap.

### What problem does this paper attempt to solve?

This paper addresses the performance gap between languages in cross-lingual reasoning. Chain-of-Thought (CoT) prompting works well in large language models (LLMs), especially on complex reasoning tasks, but it has mostly been applied to high-resource languages such as English and performs poorly in low-resource languages. To narrow this gap, the authors propose a cross-lingual instruction fine-tuning framework (xCoT) that transfers knowledge from high-resource languages to low-resource languages.

The main contributions of the paper are:

1. **Constructing multilingual instruction data**: English instruction data is translated into 10 other languages (such as German, French, and Spanish) to create a new multilingual instruction dataset (xCoT-INSTRUCT) for training cross-lingual chain-of-thought reasoning.
2. **Random online CoT (Random-CoT)**: During fine-tuning, the query is first translated into a randomly chosen language and then answered in English, which strengthens the multilingual reasoning ability of the LLM.
3. **Cross-lingual distillation**: The high-quality reasoning paths of high-resource languages supervise the training of low-resource languages, further improving low-resource performance.
4. **Code-switched learning**: Fragments of different languages are mixed within the examples, which encourages the model to understand and align the representations of different languages.

### Formula presentation

The formulas involved in the paper are as follows:

- **Probability model of cross-lingual CoT**:
  \[ P(a|q, c)=\prod_{j = 1}^{n}P(a_j|a_{<j};q, c, M) \]
  where \(q\) is the question, \(c\) is the corresponding example, \(a\) is the answer, and \(M\) is the language model.
- **Loss function of cross-lingual instruction fine-tuning**:
  \[ L_x=-\sum_{i = 1}^{K}\mathbb{E}_{c_{L_i},q_{L_i},a_{L_j}\sim D_{L_i}}\left[\log P(a_{L_j}|q_{L_i},c_{L_i};M)\right] \]
- **Loss function of cross-lingual distillation**:
  \[ L_d =-\frac{1}{n}\sum_{t = 1}^{n}\left[P_t^{\text{high}}\log P_t^{\text{low}}\right] \]
  where \(P_t^{\text{high}}\) and \(P_t^{\text{low}}\) are the predicted distributions for the high-resource and low-resource languages at the \(t\)-th token, respectively.

Through these methods, the xCoT framework significantly improves performance on multilingual reasoning tasks, especially in low-resource languages, thereby narrowing the performance gap between languages.
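
The cross-lingual distillation loss \(L_d\) can be illustrated with a small Python sketch. This is not the authors' implementation. It assumes that the product \(P_t^{\text{high}}\log P_t^{\text{low}}\) is summed over the vocabulary at each token position (that is, a per-token cross-entropy), and that both distributions are already given as lists of probabilities. The function name `distillation_loss`, the `eps` smoothing term, and the toy distributions are hypothetical and chosen only for illustration.

```python
import math


def distillation_loss(p_high, p_low, eps=1e-12):
    """Sketch of L_d = -(1/n) * sum_t [ P_t^high * log P_t^low ].

    p_high, p_low: lists of n per-token probability distributions
    (each a list of floats over the same vocabulary).
    Assumption: the bracketed term is summed over vocabulary entries.
    eps is a hypothetical smoothing constant to avoid log(0).
    """
    if len(p_high) != len(p_low):
        raise ValueError("both inputs must cover the same n token positions")
    n = len(p_high)
    total = 0.0
    for teacher_t, student_t in zip(p_high, p_low):
        # Cross-entropy term at token position t.
        total += sum(ph * math.log(pl + eps)
                     for ph, pl in zip(teacher_t, student_t))
    return -total / n


# Toy example: 2 token positions, vocabulary of 3 entries (made-up numbers).
p_high = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]   # high-resource distributions
p_low = [[0.5, 0.3, 0.2], [0.2, 0.6, 0.2]]    # low-resource distributions

loss = distillation_loss(p_high, p_low)
self_loss = distillation_loss(p_high, p_high)
print(f"L_d(high, low)  = {loss:.4f}")
print(f"L_d(high, high) = {self_loss:.4f}")
```

Under this reading, the loss is smallest when the low-resource distribution matches the high-resource one, which is why minimizing it pulls the low-resource predictions toward the high-resource ones.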

xCoT: Cross-lingual Instruction Tuning for Cross-lingual Chain-of-Thought Reasoning

