LaFFi: Leveraging Hybrid Natural Language Feedback for Fine-tuning Language Models

Qianxi Li, Yingyue Cao, Jikun Kang, Tianpei Yang, Xi Chen, Jun Jin, Matthew E. Taylor
2024-01-01
Abstract: Fine-tuning Large Language Models (LLMs) adapts a trained model to specific downstream tasks, significantly improving task-specific performance. Supervised Fine-Tuning (SFT) is a common approach, where an LLM is trained to produce desired answers. However, LLMs trained with SFT sometimes make simple mistakes and result in hallucinations on reasoning tasks such as question-answering. Without external feedback, it is difficult for SFT to learn a good mapping between the question and the desired answer, especially with a small dataset. This paper introduces an alternative to SFT called Natural Language Feedback for Finetuning LLMs (LaFFi). LaFFi has LLMs directly predict the feedback they will receive from an annotator. We find that requiring such reflection can significantly improve the accuracy in in-domain question-answering tasks, providing a promising direction for the application of natural language feedback in the realm of SFT LLMs. Additional ablation studies show that the portion of human-annotated data in the annotated datasets affects the fine-tuning performance.
Machine Learning, Artificial Intelligence, Computation and Language
What problem does this paper attempt to address?
This paper addresses a weakness of Supervised Fine-Tuning (SFT): large language models (LLMs) trained with SFT can make simple mistakes and hallucinate on reasoning tasks such as question answering, and the problem is most pronounced when the dataset is small. Without external feedback, it is difficult for SFT to learn an effective mapping between input questions and correct answers. The paper therefore proposes a new fine-tuning framework, LaFFi (Natural Language Feedback for Finetuning LLMs), which trains the model to predict the natural language feedback that an annotator would give on its answers. The aim is to improve accuracy on in-domain question-answering tasks, especially when data is limited, by increasing the amount of information in the training data. Specifically, the LaFFi framework consists of four key steps:
1. **Answer Prediction**: Use a pre-trained LLM to generate answers to the questions in the training dataset.
2. **Feedback Annotation**: Use the LLaMA 7B model and human annotators to write natural language feedback on the predicted answers.
3. **Supervised Feedback Prediction**: Unlike traditional SFT, LaFFi combines the paragraph, question, and predicted answer into a single input context, and the model is trained to predict the feedback that the provided answer would receive.
4. **LoRA Fine-Tuning**: Adopt LoRA, a parameter-efficient fine-tuning technique. LoRA adds trainable low-rank matrices to the multi-layer perceptron (MLP) layers of the Transformer architecture while keeping the pre-trained weights frozen, which retains as much pre-trained knowledge as possible and reduces the computational cost.
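To make the Supervised Feedback Prediction step more concrete, here is a minimal sketch of how one training pair might be assembled. The prompt template, the `FeedbackExample` class, the `build_training_pair` helper, and the toy example are all hypothetical illustrations; the summary does not give the exact format used in the paper.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class FeedbackExample:
    """One annotated item (hypothetical structure, not taken from the paper)."""
    paragraph: str
    question: str
    predicted_answer: str  # produced by the pre-trained LLM in step 1
    feedback: str          # written by a human or LLM annotator in step 2


def build_training_pair(ex: FeedbackExample) -> Tuple[str, str]:
    """Return (input_context, target).

    The context concatenates paragraph, question, and predicted answer;
    the target is the feedback the model is trained to predict.
    The template below is an assumption made for illustration.
    """
    context = (
        f"Paragraph: {ex.paragraph}\n"
        f"Question: {ex.question}\n"
        f"Predicted answer: {ex.predicted_answer}\n"
        "Feedback:"
    )
    return context, " " + ex.feedback


# Toy example, invented purely for illustration.
example = FeedbackExample(
    paragraph="The Nile flows north through Egypt into the Mediterranean Sea.",
    question="Into which sea does the Nile flow?",
    predicted_answer="The Red Sea",
    feedback="Incorrect. The paragraph states that the Nile flows into the Mediterranean Sea.",
)
context, target = build_training_pair(example)
print(context + target)
```

In a setup like this, the loss would typically be computed only on the `target` tokens, so the model learns to produce the feedback rather than to reproduce the context; whether the paper masks the loss this way is not stated in the summary.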
In this way, LaFFi improves the model's performance on small datasets and enhances its ability to capture local and global dependencies.
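The low-rank update behind the LoRA step can be sketched in a few lines of plain Python. This follows the standard LoRA formulation, in which a frozen weight matrix W is combined with a trainable product B·A scaled by alpha / rank. The dimensions, the `alpha` value, and the helper names are illustrative assumptions, not settings reported in the paper.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def effective_weight(W, A, B, alpha, rank):
    """Return W + (alpha / rank) * (B @ A); W stays frozen, only A and B are trained."""
    scale = alpha / rank
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(rw, rd)] for rw, rd in zip(W, delta)]


def lora_param_count(d_out, d_in, rank):
    """Number of trainable parameters in the two low-rank factors."""
    return rank * (d_in + d_out)


# Illustrative sizes only (assumptions, not the paper's configuration).
d_out, d_in, rank, alpha = 8, 8, 2, 4.0
rng = random.Random(0)

W = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]     # frozen pre-trained weight
A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]   # trainable, small random init
B = [[0.0] * rank for _ in range(d_out)]                               # trainable, zero init

# With B initialised to zero, the adapted layer starts out identical to the pre-trained one.
W_eff = effective_weight(W, A, B, alpha, rank)

full_params = d_out * d_in
lora_params = lora_param_count(d_out, d_in, rank)
print(full_params, lora_params)
```

The saving grows with layer size: for a 4096 x 4096 weight matrix and rank 8, the two factors hold 65,536 trainable parameters against 16,777,216 in the full matrix.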