Using Context Information to Enhance Simple Question Answering

Lin Li, Mengjing Zhang, Zhaohui Chao, Jianwen Xiang
DOI: https://doi.org/10.48550/arXiv.1905.01995
2019-04-27
Abstract: With the rapid development of knowledge bases (KBs), question answering (QA) based on KBs has become a hot research issue. In this paper, we propose two frameworks (i.e., pipeline framework, an end-to-end framework) to focus answering single-relation factoid question. In both of two frameworks, we study the effect of context information on the quality of QA, such as the entity's notable type, out-degree. In the end-to-end framework, we combine char-level encoding and self-attention mechanisms, using weight sharing and multi-task strategies to enhance the accuracy of QA. Experimental results show that context information can get better results of simple QA whether it is the pipeline framework or the end-to-end framework. In addition, we find that the end-to-end framework achieves results competitive with state-of-the-art approaches in terms of accuracy and take much shorter time than them.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses simple question answering (QA) over a knowledge base (KB). Specifically, the authors focus on single-relation factoid questions, which can be answered by a single fact in the knowledge base. The paper proposes two frameworks, a pipeline framework and an end-to-end framework, that aim to improve the quality of simple QA by using the context information of entities (such as an entity's notable type and out-degree).

### Background and Objectives of the Paper

With the rapid development of knowledge bases, question answering based on knowledge bases has become a research hotspot. The main contribution of this paper lies in exploring the role of context information in simple QA and proposing a way to incorporate that information to improve the accuracy of answer selection. Specifically:

1. **Pipeline Framework**: This framework is divided into two steps, entity detection and relation detection. Through these steps, the system identifies the entities in the question and finds the relations associated with them, thus forming candidate answers.
2. **End-to-End Framework**: This framework combines char-level encoding and self-attention mechanisms, and improves QA accuracy through weight sharing and multi-task strategies.

### Main Methods

- **Utilization of Context Information**:
  - **Out-Degree Information**: The out-degree of an entity is the number of triples in the knowledge base in which the entity appears as the subject. In the paper, the entity candidate set is ranked by out-degree to select the most likely candidate.
  - **Notable Type Information**: The notable type of an entity is its most prominent category label in the knowledge base. A matching score between the question and the entity type is calculated and used to further filter the entity candidate set.
- **Model Structure**:
  - **Pipeline Framework**:
    - **Entity Detection**: A bidirectional LSTM network (Bi-LSTM) performs entity recognition to generate an entity candidate set.
    - **Relation Detection**: A semantic matching model selects the best-matching relation by calculating the similarity between the question and each relation.
  - **End-to-End Framework**: Combines char-level encoding and self-attention mechanisms, and improves model performance through multi-task learning strategies.

### Experimental Results

The experimental results show that incorporating context information improves the accuracy of simple question answering in both the pipeline framework and the end-to-end framework. In particular, the end-to-end framework reaches accuracy competitive with existing state-of-the-art methods while taking much less running time.

### Formulas

- **Loss Function**:
  - **Entity Recognition**:
    \[ C = -\frac{1}{n}\sum_x [y\ln a+(1 - y)\ln(1 - a)] \]
    where \(y\) is the expected output, \(a\) is the actual output, and the sum runs over the \(n\) training examples \(x\).
  - **Relation Matching and Question-Type Information Matching**: the same cross-entropy form is used:
    \[ C = -\frac{1}{n}\sum_x [y\ln a+(1 - y)\ln(1 - a)] \]
- **Comprehensive Matching Score**:
  \[ S = S_t+S_r \]
  where \(S_t\) is the matching score between the entity type information and the question, and \(S_r\) is the matching score between the relation and the question.

### Summary

This paper improves the accuracy of simple question answering over knowledge bases by introducing context information (such as the out-degree and notable type of entities). In the end-to-end framework in particular, combining char-level encoding and self-attention mechanisms keeps accuracy competitive with state-of-the-art approaches while shortening the running time considerably. These methods offer a reference point for future research on knowledge base question answering.
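
To make the context signals and formulas described above concrete, here is a minimal Python sketch. It is not the authors' implementation: the toy triples, the entity identifiers, and the function names are all hypothetical. In the paper the matching scores \(S_t\) and \(S_r\) come from neural models, whereas here they are assumed to be plain numbers supplied by the caller.

```python
from collections import Counter
from math import log

# Toy knowledge base of (subject, relation, object) triples.
# All identifiers here are hypothetical and chosen only for illustration.
TRIPLES = [
    ("e1", "place_of_birth", "o1"),
    ("e1", "profession", "o2"),
    ("e1", "nationality", "o3"),
    ("e2", "place_of_birth", "o4"),
]


def out_degrees(triples):
    """Count, for each entity, the triples in which it appears as the subject."""
    return Counter(subject for subject, _, _ in triples)


def rank_by_out_degree(candidates, degrees):
    """Order entity candidates from highest to lowest out-degree."""
    return sorted(candidates, key=lambda e: degrees.get(e, 0), reverse=True)


def combined_score(s_t, s_r):
    """S = S_t + S_r: type-question score plus relation-question score."""
    return s_t + s_r


def cross_entropy(targets, outputs, eps=1e-12):
    """C = -(1/n) * sum_x [y ln a + (1 - y) ln(1 - a)] over n examples."""
    n = len(targets)
    total = 0.0
    for y, a in zip(targets, outputs):
        a = min(max(a, eps), 1.0 - eps)  # clamp so that log() stays finite
        total += y * log(a) + (1 - y) * log(1 - a)
    return -total / n


degrees = out_degrees(TRIPLES)
# "e3" never appears as a subject, so its out-degree defaults to 0.
ranked = rank_by_out_degree(["e2", "e1", "e3"], degrees)
print(ranked)
print(combined_score(0.5, 0.25))
print(cross_entropy([1, 0], [0.5, 0.5]))
```

In this sketch, out-degree acts as a simple popularity prior over entity candidates, and the combined score adds the two matching scores with equal weight, as in \(S = S_t + S_r\).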