Pistis-RAG: Enhancing Retrieval-Augmented Generation with Human Feedback

Yu Bai, Yukai Miao, Li Chen, Dawei Wang, Dan Li, Yanyu Ren, Hongtao Xie, Ce Yang, Xuhui Cai
2024-10-31
Abstract: RAG systems face limitations when semantic relevance alone does not guarantee improved generation quality. This issue becomes particularly evident due to the sensitivity of large language models (LLMs) to the ordering of few-shot prompts, which can affect model performance. To address this challenge, aligning LLM outputs with human preferences using structured feedback, such as options to copy, regenerate, or dislike, offers a promising method for improvement. This feedback is applied to the entire list of inputs rather than giving specific ratings for individual documents, making it a Listwide Labels Learning-to-Rank task. To address this task, we propose Pistis-RAG, a new RAG framework designed with a content-centric approach to better align LLMs with human preferences. Pistis-RAG effectively utilizes human feedback, enhancing content ranking and generation quality. To validate our framework, we use public datasets to simulate human feedback, allowing us to evaluate and refine our method effectively. Experimental results indicate that Pistis-RAG improves alignment with human preferences relative to the baseline RAG system, showing a 6.06% increase in MMLU (English) and a 7.08% increase in C-EVAL (Chinese) accuracy metrics. These results highlight Pistis-RAG's effectiveness in overcoming the limitations associated with traditional RAG approaches.
Information Retrieval, Computation and Language
What problem does this paper attempt to address?
The paper addresses a limitation of existing RAG (Retrieval-Augmented Generation) systems: ranking retrieved content by semantic relevance alone does not guarantee better generation quality. In particular, large language models (LLMs) are sensitive to the order of few-shot prompts, so the order in which retrieved content is presented can affect model performance. To address this, the paper proposes a new RAG framework, Pistis-RAG, which aims to better align LLM outputs with human preferences through structured user feedback (such as copy, regenerate, or dislike options). This feedback applies to the entire input list rather than scoring individual documents, so the paper treats it as a learning-to-rank task with list-level labels. Pistis-RAG operates in two main stages: feedback alignment and online query. In the feedback alignment stage, human feedback is used through online learning to make the ranking model more sensitive to human and LLM preferences and to adapt to changing expectations. In the online query stage, Pistis-RAG uses a ranker to re-order the retrieved content according to the optimized ranking model, taking into account both semantic relevance and the order presented to the LLM, so that the final output conforms to human preferences and is consistent with the LLM's generation ability. The paper validates the method by simulating human feedback on public datasets. Relative to the baseline RAG system, Pistis-RAG improves accuracy by 6.06% on MMLU (English) and by 7.08% on C-EVAL (Chinese), which the authors present as evidence that the framework overcomes the limitations of traditional RAG methods.
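To make the two stages more concrete, the following is a minimal sketch of learning from list-level feedback and then re-ranking, written under stated assumptions. It is not the paper's implementation: the linear scorer, the position discount, the logistic loss, the label mapping in `FEEDBACK_LABEL`, and the function names `online_update` and `rerank` are all hypothetical choices made for illustration.

```python
import math
from typing import List

# Hypothetical mapping from a user action on the whole answer to a list-level label.
FEEDBACK_LABEL = {"copy": 1.0, "regenerate": 0.0, "dislike": 0.0}


def score(weights: List[float], features: List[float]) -> float:
    """Linear relevance score for one document (an assumed, simplified ranker)."""
    return sum(w * x for w, x in zip(weights, features))


def _discounts(n: int) -> List[float]:
    """Position discounts: earlier positions in the prompt count more (assumption)."""
    return [1.0 / math.log2(i + 2) for i in range(n)]


def online_update(weights: List[float], docs: List[List[float]],
                  action: str, lr: float = 0.1) -> List[float]:
    """One online gradient step from a single list-level feedback event.

    `docs` holds feature vectors in the order they were shown to the LLM.
    The label applies to the whole list, not to any individual document.
    """
    label = FEEDBACK_LABEL[action]
    disc = _discounts(len(docs))
    total = sum(disc)
    # Aggregate the list into one score, then squash to a probability.
    list_score = sum(d * score(weights, f) for d, f in zip(disc, docs)) / total
    pred = 1.0 / (1.0 + math.exp(-list_score))
    # Gradient of the logistic loss with respect to each weight.
    grad = [
        (pred - label) * sum(d * f[j] for d, f in zip(disc, docs)) / total
        for j in range(len(weights))
    ]
    return [w - lr * g for w, g in zip(weights, grad)]


def rerank(weights: List[float], docs: List[List[float]]) -> List[List[float]]:
    """Online query stage: order retrieved documents by the learned score."""
    return sorted(docs, key=lambda f: score(weights, f), reverse=True)
```

In this sketch, a "copy" on a list nudges the weights toward the features of the documents shown, with more credit going to earlier positions, while a "dislike" pushes them away. Calling `rerank` with the updated weights then changes the order in which retrieved content would be placed in the prompt.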