LB-KBQA: Large-language-model and BERT based Knowledge-Based Question and Answering System

Yan Zhao, Zhongyun Li, Yushan Pan, Jiaxing Wang, Yihong Wang
2024-02-09
Abstract: Generative Artificial Intelligence (AI), because of its emergent abilities, has empowered various fields. One typical application of generative AI is large language models (LLMs), whose natural language understanding capability is dramatically improved compared with conventional AI-based methods. Natural language understanding has always been a barrier to the intent recognition performance of Knowledge-Based Question and Answering (KBQA) systems, owing to linguistic diversity and newly appearing intents. Conventional AI-based methods for intent recognition can be divided into semantic parsing-based and model-based approaches. However, both suffer from limited resources in intent recognition. To address this issue, we propose a novel KBQA system based on a Large Language Model (LLM) and BERT (LB-KBQA). With the help of generative AI, our proposed method can detect newly appearing intents and acquire new knowledge. In experiments on financial domain question answering, our model has demonstrated superior effectiveness.
Artificial Intelligence
What problem does this paper attempt to address?
The paper primarily addresses the issue of intent recognition in Knowledge Base Question Answering (KBQA) systems, particularly how to handle unseen intents. In traditional KBQA systems, the diversity of language and the emergence of new intents often cause intent recognition failures, which directly affect the performance of the QA system.

To solve this problem, the paper proposes a Knowledge Base Question Answering system based on Large Language Models (LLMs) and BERT, named LB-KBQA. The LB-KBQA system aims to improve the recognition of unseen intents by introducing large language models, thereby enhancing the overall performance of the KBQA system. Specifically, the system includes the following key components:

1. **Language Preprocessing Module**: Removes irrelevant information such as stop words and punctuation from the input text to improve search efficiency and accuracy.
2. **Intent Recognition Module**: Includes a rule-based model, high-dimensional semantic representation based on BERT, and a part that handles unseen intents using pre-trained language models. These components work together to address intent recognition failures caused by language diversity.
3. **Response Generation Module**: Generates highly readable answers based on the user's intent.
4. **Adaptive Learning Module**: Gradually understands the user's true intent through multi-turn conversations with the user and updates the intent library.
5. **Query Library Expansion Module**: Allows users to expand the existing knowledge graph to build domain-specific knowledge graphs and integrate them into the QA system.

Through experiments on a financial domain dataset, the authors demonstrate the effectiveness of the LB-KBQA system, especially in handling unseen intents.
The experimental results show that the BERT model is crucial for discovering unseen intents, and combining large language models with the adaptive learning module can significantly improve the system's ability to cope with language diversity and recognize unseen intents.
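
The way the preprocessing, intent recognition, and adaptive learning modules fit together can be illustrated with a small sketch. This is not the authors' implementation, only one plausible reading of the summary above: a question is cleaned, checked against rules, then matched by embedding similarity, and only if both fail is it handed to a language model, whose answer is stored in the intent library. Everything named here is an assumption made for illustration: the `IntentRecognizer` class, the stop-word list, the `threshold` value, and the `llm_fallback` callable are hypothetical, and a bag-of-words vector stands in for the BERT representation so the example runs with the standard library alone.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; the paper's actual list is not given.
STOP_WORDS = {"the", "a", "an", "is", "of", "my", "what", "please", "how"}


def preprocess(text):
    """Lowercase the text, drop punctuation, and remove stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def embed(tokens):
    """Stand-in for a BERT sentence embedding: a bag-of-words count vector."""
    return Counter(tokens)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


class IntentRecognizer:
    """Three-stage cascade: rules, embedding similarity, then an LLM fallback."""

    def __init__(self, rules, examples, threshold=0.5, llm_fallback=None):
        self.rules = rules  # keyword -> intent
        self.library = {
            intent: [embed(preprocess(q)) for q in questions]
            for intent, questions in examples.items()
        }
        self.threshold = threshold        # assumed value, not from the paper
        self.llm_fallback = llm_fallback  # hypothetical callable: question -> intent or None

    def learn(self, intent, question):
        """Adaptive learning step: add a confirmed question to the intent library."""
        self.library.setdefault(intent, []).append(embed(preprocess(question)))

    def recognize(self, question):
        tokens = preprocess(question)

        # Stage 1: rule-based keyword match.
        for keyword, intent in self.rules.items():
            if keyword in tokens:
                return intent, "rule"

        # Stage 2: nearest stored example by cosine similarity.
        vector = embed(tokens)
        best_intent, best_score = None, 0.0
        for intent, vectors in self.library.items():
            for example in vectors:
                score = cosine(vector, example)
                if score > best_score:
                    best_intent, best_score = intent, score
        if best_intent is not None and best_score >= self.threshold:
            return best_intent, "similarity"

        # Stage 3: ask the fallback model, then remember its answer.
        if self.llm_fallback is not None:
            intent = self.llm_fallback(question)
            if intent:
                self.learn(intent, question)
                return intent, "llm"
        return None, "unknown"


if __name__ == "__main__":
    recognizer = IntentRecognizer(
        rules={"balance": "check_balance"},
        examples={"exchange_rate": ["What is the exchange rate of the dollar?"]},
        # A toy stand-in for a real LLM call.
        llm_fallback=lambda q: "loan_interest" if "loan" in q.lower() else None,
    )
    for q in ["What is my balance?", "Dollar exchange rate please",
              "How much interest on a loan?", "loan interest much"]:
        print(q, "->", recognizer.recognize(q))
```

In this sketch the fallback is consulted only when the cheaper stages fail, and its result is written back to the library, so a later paraphrase of the same question can be resolved by the similarity stage without another model call. In a real system the `embed` function would be replaced by an actual BERT encoder and the fallback by a prompt to an LLM, with the multi-turn confirmation described above deciding whether `learn` is called.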