Awakening Augmented Generation: Learning to Awaken Internal Knowledge of Large Language Models for Question Answering

Huanxuan Liao, Shizhu He, Yao Xu, Yuanzhe Zhang, Kang Liu, Shengping Liu, Jun Zhao
2024-09-20
Abstract: Retrieval-Augmented-Generation and Generation-Augmented-Generation have been proposed to enhance the knowledge required for question answering with Large Language Models (LLMs) by leveraging richer context. However, the former relies on external resources, and both require incorporating explicit documents into the context, which increases execution costs and susceptibility to noise data during inference. Recent works indicate that LLMs model rich knowledge, but it is often not effectively activated and awakened. Inspired by this, we propose a novel knowledge-augmented framework, Awakening-Augmented-Generation (AAG), which mimics the human ability to answer questions using only thinking and recalling to compensate for knowledge gaps, thereby awaking relevant knowledge in LLMs without relying on external resources. AAG consists of two key components for awakening richer context. Explicit awakening fine-tunes a context generator to create a synthetic, compressed document that functions as symbolic context. Implicit awakening utilizes a hypernetwork to generate adapters based on the question and synthetic document, which are inserted into LLMs to serve as parameter context. Experimental results on three datasets demonstrate that AAG exhibits significant advantages in both open-domain and closed-book settings, as well as in out-of-distribution generalization. Our code will be available at https://github.com/Xnhyacinth/IAG.
Computation and Language
What problem does this paper attempt to address?
### Problems the Paper Attempts to Solve

The paper addresses the poor performance of large language models (LLMs) on knowledge-intensive tasks, particularly question answering when no external resources are available. Existing methods such as Retrieval-Augmented Generation (RAG) and Generation-Augmented Generation (GAG) improve the question-answering capabilities of LLMs, but they rely on external resources and a large number of documents, which leads to high execution costs and susceptibility to noisy data. These methods also often require retraining for different domains, tasks, and datasets, resulting in inefficient resource utilization.

The paper proposes a new knowledge-enhancement framework, Awakening-Augmented-Generation (AAG), which aims to address these issues by activating the internal knowledge of LLMs. AAG mimics the human ability to answer questions solely through thinking and recalling, without relying on external resources, thereby improving question-answering performance without increasing computational costs.

### Main Contributions

1. **Proposed a new knowledge-enhancement framework, AAG**: It awakens rich contexts (symbolic context and parametric context) more efficiently and without relying on external resources.
2. **Utilized text-conditioned hypernetworks to generate parameter-efficient modules**: The hypernetwork generates adapters from the question and a compressed virtual document, and these adapters serve as parametric context.
3. **Experimental results**: AAG shows significant advantages in both open-domain and closed-book settings while reducing inference costs.

### Method Overview

AAG mainly consists of two modules:

1. **Explicit Awakening**: Learning to generate compressed virtual documents through long-context compression, reducing input length, avoiding reliance on fixed knowledge bases, and minimizing knowledge corpus errors.
2. **Implicit Awakening**: Utilizing hypernetworks to generate lightweight LoRA modules, aligning questions with internal knowledge, enabling the model to better handle diverse and complex tasks.

### Experimental Results

- **Supervised Setting**: In the closed-book setting, AAG improves the EM score by an average of 2% over baseline methods, with the improvement becoming more pronounced as the model size increases.
- **Zero-Shot Setting**: AAG significantly enhances LLM performance in zero-shot settings. For example, Llama2-7B shows an average EM score improvement of 14% across three datasets.
- **Generalization Ability**: AAG performs well in out-of-distribution (OOD) tests, achieving performance close to or exceeding the FiD method using 10 retrieved documents, even with just one virtual document.

### Conclusion

By activating the internal knowledge of LLMs, AAG addresses the reliance on external resources and the high computational costs of existing methods, providing an efficient and general solution for knowledge-intensive tasks.
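
To make the implicit awakening idea from the Method Overview more concrete, the following is a minimal sketch of a hypernetwork that maps a context embedding to a pair of LoRA factors, which are then added to a frozen linear layer. It is not the authors' implementation. The class `LoRAHyperNetwork`, the function `adapted_linear`, the single linear head per factor, the zero-initialised `B` head, and the use of a plain vector as the embedding of the question and virtual document are all assumptions made for illustration.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def reshape(flat, rows, cols):
    """Reshape a flat list into a rows x cols matrix."""
    return [flat[i * cols:(i + 1) * cols] for i in range(rows)]


class LoRAHyperNetwork:
    """Hypothetical hypernetwork: context embedding -> LoRA factors (A, B)."""

    def __init__(self, ctx_dim, d_model, rank, seed=0):
        rng = random.Random(seed)
        self.d_model, self.rank = d_model, rank
        n_params = rank * d_model
        # Linear head producing the flattened A factor (rank x d_model).
        self.head_a = [[rng.gauss(0.0, 0.02) for _ in range(ctx_dim)]
                       for _ in range(n_params)]
        # Assumption: the B head starts at zero, so the generated update
        # is zero before training (mirrors the usual LoRA initialisation).
        self.head_b = [[0.0] * ctx_dim for _ in range(n_params)]

    def generate(self, ctx):
        """Return (A, B) conditioned on the context embedding ctx."""
        a = reshape(matvec(self.head_a, ctx), self.rank, self.d_model)
        b = reshape(matvec(self.head_b, ctx), self.d_model, self.rank)
        return a, b


def adapted_linear(weight, a, b, x, scale=1.0):
    """Compute y = W x + scale * B (A x) with the base weight W kept frozen."""
    base = matvec(weight, x)
    low_rank = matvec(b, matvec(a, x))
    return [y + scale * d for y, d in zip(base, low_rank)]
```

In this sketch only the hypernetwork heads would be trained; the base weight stays fixed, and a different low-rank update is produced for each question. Because the low-rank path costs on the order of `2 * rank * d_model` multiplications per layer, the added inference cost stays small compared with placing many retrieved documents in the prompt.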