CRKG: combining retrieval knowledge with generative language models

Fei Chen, Carter Zhang, Bo Ning
DOI: https://doi.org/10.1007/s11227-024-06728-z
IF: 3.3
2024-12-09
The Journal of Supercomputing
Abstract: Multi-turn dialogue generation relies heavily on capturing contextual information. In real-life scenarios, however, limited context alone is not enough to capture the speaker's needs accurately, so background knowledge is also necessary. Existing works use local keywords to retrieve external knowledge and simply concatenate the retrieved information with the context, which yields low-quality retrieved knowledge and a redundant context that is difficult to understand. To address these issues, this paper proposes the CRKG model. CRKG first uses a turn-level attention mechanism to capture the important information in the context. It then retrieves knowledge from historical dialogues, which serve as an external knowledge base, using the representation of that important information. Finally, it uses a hierarchical fusion encoder to dynamically integrate the retrieved information. We validate the proposed method on both a text-based model with a small parameter count and a large language model. Experimental results show that the proposed method achieves the best results on multiple public datasets.
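The first two steps named in the abstract (turn-level attention over the context, then retrieval from historical dialogues) can be illustrated with a toy sketch. This is not the authors' implementation: the dot-product scoring, the use of the last utterance as the attention query, the cosine-similarity ranking, and every name and vector below are assumptions made for illustration, and the hierarchical fusion encoder is left out.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    # Subtract the max for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cosine(a, b):
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na and nb else 0.0

def turn_level_attention(turn_vecs, query_vec):
    """Weight each dialogue turn by its relevance to the query (assumed
    here to be the last utterance) and pool the turns into one vector."""
    weights = softmax([dot(t, query_vec) for t in turn_vecs])
    pooled = [
        sum(w * t[i] for w, t in zip(weights, turn_vecs))
        for i in range(len(query_vec))
    ]
    return weights, pooled

def retrieve(context_vec, knowledge_base, k=1):
    """Return the k historical entries most similar to the pooled context.
    knowledge_base is a list of (text, vector) pairs (hypothetical format)."""
    ranked = sorted(
        knowledge_base,
        key=lambda item: cosine(context_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]

# Toy 2-dimensional "embeddings"; real systems would use an encoder.
turns = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
query = [1.0, 0.0]
weights, context = turn_level_attention(turns, query)

history = [
    ("booking a flight", [1.0, 0.1]),
    ("weather small talk", [0.0, 1.0]),
]
top = retrieve(context, history, k=1)
print(weights, top)
```

In this toy setup the off-topic second turn receives the smallest weight, so the pooled context vector, rather than a single local keyword, determines which historical dialogue is retrieved.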
computer science, theory & methods; engineering, electrical & electronic, hardware & architecture