Knowledgeable Preference Alignment for LLMs in Domain-specific Question Answering

Yichi Zhang, Zhuo Chen, Yin Fang, Yanxi Lu, Fangming Li, Wen Zhang, Huajun Chen
2024-06-10
Abstract: Deploying large language models (LLMs) in real-world scenarios for domain-specific question answering (QA) is a key thrust of LLM applications, and it poses numerous challenges, especially in ensuring that responses both accommodate user requirements and appropriately leverage domain-specific knowledge bases. These are two major difficulties for LLM applications that vanilla fine-tuning falls short of addressing. Combining the two, we frame them as the requirement that the model's preferences be harmoniously aligned with humans'. We therefore introduce Knowledgeable Preference AlignmenT (KnowPAT), which constructs two kinds of preference sets to tackle the two issues. In addition, we design a new alignment objective to align the LLM's preferences with different human preferences uniformly, aiming to optimize LLM performance in real-world, domain-specific QA settings. Extensive experiments and comprehensive comparisons with 15 baseline methods show that KnowPAT is a superior pipeline for real-scenario domain-specific QA with LLMs.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses the application of large language models (LLMs) to domain-specific question answering (QA), and in particular how to ensure that the model's responses both meet user needs and make effective use of domain-specific knowledge bases. The paper covers the following points:

1. **Overview of Challenges**:
   - Responses generated by LLMs must meet user requirements while appropriately using domain-specific knowledge bases.
   - Vanilla fine-tuning is inadequate for addressing these two issues.
2. **Research Objectives**:
   - Leveraging external knowledge bases to support LLMs in domain-specific QA in practical application scenarios.
   - Achieving harmony between the LLM's preferences and human expectations, i.e., Preference Alignment (PA).
3. **Proposed Method**:
   - **KnowPAT Framework**: A three-step process for handling domain-specific QA so that LLMs can be applied in practice.
     - Step 1: Unsupervised knowledge retrieval to obtain relevant knowledge from the knowledge base.
     - Step 2: Constructing the preference sets, which combine a style preference set and a knowledge preference set.
     - Step 3: Fine-tuning and preference alignment training on the constructed preference sets.
4. **Summary of Contributions**:
   - Introducing preference alignment, for the first time, into QA tasks that use LLMs together with domain-specific knowledge bases.
   - Proposing a preference alignment framework (KnowPAT) that integrates knowledge bases, balances text style preferences and knowledge preferences, and uses a new training objective to bring the model's preferences in line with human preferences.
   - Validating the method experimentally: it outperforms 15 baseline methods on two datasets.
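The first two steps listed above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the bag-of-words cosine similarity stands in for whatever unsupervised retriever the paper uses, and the names `retrieve`, `build_prompt`, and `PreferenceSet` are hypothetical, introduced here only for illustration. How the answers in each set are generated and ranked is left to the caller, since this summary does not specify it.

```python
from __future__ import annotations

import math
from collections import Counter
from dataclasses import dataclass


def bag_of_words(text: str) -> Counter:
    """Lowercased whitespace token counts (assumed stand-in for a text encoder)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question: str, knowledge_base: list[str], k: int = 2) -> list[str]:
    """Step 1 (sketch): return the k entries most similar to the question.

    No labels are used, so the retrieval is unsupervised.
    """
    q = bag_of_words(question)
    ranked = sorted(
        knowledge_base,
        key=lambda entry: cosine(q, bag_of_words(entry)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, knowledge: list[str]) -> str:
    """Hypothetical prompt template that places retrieved knowledge before the question."""
    context = "\n".join(f"- {entry}" for entry in knowledge)
    return f"Knowledge:\n{context}\n\nQuestion: {question}\nAnswer:"


@dataclass
class PreferenceSet:
    """Step 2 (sketch): answers to one question, ordered from most to least preferred."""

    question: str
    kind: str           # "style" or "knowledge"
    answers: list[str]  # the ranking itself is supplied by the caller


if __name__ == "__main__":
    kb = [
        "reset the router password from the admin page",
        "the billing cycle starts on the first day of each month",
        "firmware updates are installed from the maintenance menu",
    ]
    question = "how do I reset the router password"
    print(build_prompt(question, retrieve(question, kb, k=1)))
```

In Step 3, a training objective would consume such ordered sets so that the model scores higher-ranked answers above lower-ranked ones. That objective is the paper's contribution and is not reproduced here.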
In summary, this paper aims to solve the preference alignment problem in domain-specific QA tasks for LLMs by proposing the KnowPAT framework, ensuring that the generated responses both meet user needs and fully utilize domain-specific knowledge bases.