A Three-Phases SFT Hybrid Model Integrated Strong Prior Module and Data Overlap Estimation in the Education Context

Zhangquan Chen, Chunjiang Liu, Haobin Duan
2024-03-13
Abstract: In this paper, we propose an end-to-end, prior-based, three-phase supervised fine-tuned model, which proves more competitive than traditional fine-tuning methods. More specifically, our model realizes the structural disassembly and incremental guided output of educational knowledge. To this end, we make the classification of three types of data more robust via a sampler and an overlap estimation neural network, and inject the preprocessed datasets into a pre-trained model in three batches for LoRA fine-tuning. Then, we design a prior module that couples a system prompt, vector databases, and abstract syntax tree task segmentation. Finally, a compression method and a regularization constraint are applied to the prior-based fine-tuned model, followed by a text filter at the output end to obtain incremental guided results. Our model represents the first research effort to truly embody the tutor role, with abundant educational knowledge, step-by-step incremental guided outputs, and non-disclosure of answers. Extensive experiments show that our model also achieves state-of-the-art coding ability among open-source models, reaching 75.10% on the HumanEval (pass@1) benchmark. Additionally, our model maintains strong conversational capabilities, with the 13B quantized version scoring 56.34, 50.60, and 45.27 on the MMLU, C-Eval, and AGIEval (5-shot) dialogue evaluation benchmarks, respectively.
Machine Learning, Artificial Intelligence, Computation and Language
What problem does this paper attempt to address?
The paper addresses the challenges that general large language models (LLMs) face in domain-specific applications, particularly in education. General LLMs can solve general natural language processing problems, but they lack the specialized knowledge and cognitive abilities that professional domains such as education require. The paper therefore proposes a three-stage supervised fine-tuning (SFT) model with the following goals:

1. **Data Preprocessing Optimization**: Preprocess the data with an overlap estimation network to ensure a high-quality dataset for the subsequent fine-tuning stages.
2. **Three-Stage LoRA Fine-Tuning**: Fine-tune with LoRA in three stages, training on code data, educational awareness data, and instructional dialogue data respectively, to significantly improve the model's performance in the target domain.
3. **Prior Module Design**: Design a comprehensive prior module that integrates a vector database, an abstract syntax tree (AST), and efficient system prompts to impose strong associative constraints tied to the teacher role.
4. **Model Optimization**: Optimize the educational model through regularization constraints, model compression, and text filtering, demonstrating its feasibility as a solution in educational scenarios.
5. **Achieving a True Mentor Role**: Enable the model to truly embody the role of a mentor, achieving state-of-the-art coding ability among open-source models and demonstrating strong accuracy and robustness in multiple comparative experiments.

Through these techniques, the paper aims to build a step-by-step guided output system with strong robustness, high precision, fast inference, and low GPU resource consumption, particularly suited to the vertical-task requirements of small to medium-scale models.
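
To make the abstract syntax tree idea in the prior module more concrete, the following is a minimal sketch of AST-based segmentation using Python's standard `ast` module. It is not the paper's implementation, which this summary does not describe. The function name `segment_tasks`, the `SAMPLE` snippet, and the choice to treat each top-level function, class, or statement as one "task" are all assumptions made for illustration.

```python
import ast

# Hypothetical student submission used only for illustration.
SAMPLE = (
    "import math\n"
    "\n"
    "def area(r):\n"
    "    return math.pi * r * r\n"
    "\n"
    "class Shape:\n"
    "    pass\n"
)


def segment_tasks(source: str) -> list:
    """Split Python source into top-level units.

    Assumption: one top-level function, class, or statement is one
    'task segment'. The paper's actual segmentation rule may differ.
    """
    tree = ast.parse(source)
    segments = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            kind = "function"
        elif isinstance(node, ast.ClassDef):
            kind = "class"
        else:
            kind = "statement"
        segments.append(
            {
                "kind": kind,
                # Only functions and classes carry a name attribute.
                "name": getattr(node, "name", None),
                # Recover the exact source text of this node.
                "code": ast.get_source_segment(source, node),
            }
        )
    return segments


if __name__ == "__main__":
    for seg in segment_tasks(SAMPLE):
        print(seg["kind"], seg["name"])
```

Under this assumed scheme, each segment could be handed to the model one at a time, which fits the step-by-step guided output the paper aims for; how the segments are actually combined with the system prompt and vector database is not specified here.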