LogiCoT: Logical Chain-of-Thought Instruction-Tuning

Hanmeng Liu, Zhiyang Teng, Leyang Cui, Chaoli Zhang, Qiji Zhou, Yue Zhang

2023-10-28

Abstract: Generative Pre-trained Transformer 4 (GPT-4) demonstrates impressive chain-of-thought reasoning ability. Recent work on self-instruction tuning, such as Alpaca, has focused on enhancing the general proficiency of models. These instructions enable the model to achieve performance comparable to GPT-3.5 on general tasks like open-domain text generation and paraphrasing. However, they fall short of helping the model handle complex reasoning tasks. To bridge the gap, this paper presents LogiCoT, a new instruction-tuning dataset for Logical Chain-of-Thought reasoning with GPT-4. We elaborate on the process of harvesting instructions for prompting GPT-4 to generate chain-of-thought rationales. LogiCoT serves as an instruction set for teaching models of logical reasoning and elicits general reasoning skills.

Computation and Language, Artificial Intelligence

What problem does this paper attempt to address?

The paper addresses the underperformance of current large language models (LLMs) on logical reasoning tasks, particularly their weak multi-step logical reasoning. Existing self-instruction tuning methods (such as Alpaca) improve performance on general tasks but remain insufficient for complex reasoning tasks. To close this gap, the paper proposes LogiCoT, a new instruction-tuning dataset designed to strengthen a model's Chain-of-Thought (CoT) reasoning ability. The main contributions of the paper are:

1. **Constructing the LogiCoT dataset**: Using the generation capabilities of GPT-4, logical reasoning instructions are harvested from existing logical reasoning datasets and used to prompt GPT-4 for chain-of-thought rationales, forming a high-quality Chain-of-Thought instruction-tuning dataset.
2. **Enhancing logical reasoning capabilities**: Instruction-tuning the LLaMA-7b model on LogiCoT validates the dataset's effectiveness. Experimental results show that the fine-tuned model improves significantly on logical reasoning benchmarks.
3. **Expanding the application scope**: Beyond logical reasoning tasks, the fine-tuned model also performs well on general human-centric language model benchmarks, demonstrating its generalization ability.

In summary, the paper aims to address the shortcomings of existing models in logical reasoning by constructing a specialized dataset, thereby promoting the application and development of large language models in complex reasoning tasks.
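To make the idea of a chain-of-thought instruction-tuning record more concrete, here is a minimal sketch of how one such example might be represented and rendered into training text. The field names, the Alpaca-like prompt template, and the toy syllogism are all assumptions made for illustration; they are not taken from the LogiCoT release and may differ from the paper's actual format.

```python
from dataclasses import dataclass


@dataclass
class CoTExample:
    """One instruction-tuning record (field names are illustrative assumptions)."""
    instruction: str  # what the model is asked to do
    input: str        # the premises or passage to reason over
    rationale: str    # step-by-step reasoning, e.g. written by a teacher model
    answer: str       # final conclusion


def to_training_text(ex: CoTExample) -> str:
    """Render a record into one prompt/response string for fine-tuning.

    The template is an Alpaca-like layout chosen for illustration only.
    """
    prompt = (
        f"### Instruction:\n{ex.instruction}\n\n"
        f"### Input:\n{ex.input}\n\n"
        f"### Response:\n"
    )
    # The rationale comes before the answer so the model learns to reason first.
    response = f"{ex.rationale}\nTherefore, {ex.answer}"
    return prompt + response


# A made-up toy example, not drawn from any dataset.
example = CoTExample(
    instruction="Decide whether the conclusion follows from the premises.",
    input=(
        "Premises: All birds have wings. A sparrow is a bird. "
        "Conclusion: A sparrow has wings."
    ),
    rationale=(
        "Step 1: Every bird has wings. "
        "Step 2: A sparrow is a bird, so the rule applies to it."
    ),
    answer="the conclusion follows.",
)
text = to_training_text(example)
print(text)
```

Placing the rationale ahead of the final answer in the response is the design choice that distinguishes this kind of record from a plain instruction/answer pair: the fine-tuned model is trained to produce its reasoning steps before committing to a conclusion.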

When do you need Chain-of-Thought Prompting for ChatGPT?

Evaluating the Logical Reasoning Ability of ChatGPT and GPT-4

To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning

Fine-Tuning with Divergent Chains of Thought Boosts Reasoning Through Self-Correction in Language Models

An electronic blood-cell counting machine.

Training Nonlinear Transformers for Chain-of-Thought Inference: A Theoretical Generalization Analysis

The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning

A Theoretical Understanding of Chain-of-Thought: Coherent Reasoning and Error-Aware Demonstration

Improve Vision Language Model Chain-of-thought Reasoning

Chain of Thoughtlessness? An Analysis of CoT in Planning

Towards Understanding Chain-of-Thought Prompting: An Empirical Study of What Matters

Beyond Chain-of-Thought, Effective Graph-of-Thought Reasoning in Language Models

An automatically discovered chain-of-thought prompt generalizes to novel models and datasets

MultiTool-CoT: GPT-3 Can Use Multiple External Tools with Chain of Thought Prompting

Supervised Chain of Thought

Chain-of-Thought Tuning: Masked Language Models can also Think Step By Step in Natural Language Understanding

Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs

Chain-of-Instructions: Compositional Instruction Tuning on Large Language Models

mCoT: Multilingual Instruction Tuning for Reasoning Consistency in Language Models

MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale