Abstract: Question answering (QA) has become a popular way for humans to access billion-scale knowledge bases. Unlike web search, QA over a knowledge base gives out accurate and concise results, provided that natural language questions can be understood and mapped precisely to structured queries over the knowledge base. The challenge, however, is that a human can ask one question in many different ways. Previous approaches have natural limits due to their representations: rule based approaches only understand a small set of "canned" questions, while keyword based or synonym based approaches cannot fully understand the questions. In this paper, we design a new kind of question representation: templates, over a billion scale knowledge base and a million scale QA corpora. For example, for questions about a city's population, we learn templates such as What's the population of $city?, How many people are there in $city?. We learned 27 million templates for 2782 intents. Based on these templates, our QA system KBQA effectively supports binary factoid questions, as well as complex questions which are composed of a series of binary factoid questions. Furthermore, we expand predicates in RDF knowledge base, which boosts the coverage of knowledge base by 57 times. Our QA system beats all other state-of-the-art works on both effectiveness and efficiency over QALD benchmarks.

What problem does this paper attempt to address?

The paper sets out to build an efficient and accurate question-answering system over a large-scale knowledge base. Specifically, it focuses on how to understand natural-language questions and map them precisely to structured queries so that accurate answers can be retrieved from the knowledge base. This involves two main challenges:

1. **Representation Design**: How to represent natural-language questions so that they can be understood. Questions cover thousands of intents, and each intent may have thousands of different expressions. For example, "How many people are there in Honolulu?" and "What is the population of Honolulu?" have the same meaning despite their different wording. A representation is therefore needed that identifies questions with the same semantics and distinguishes between different intents.
2. **Semantic Matching**: Once the representation of a question is fixed, how to map it to a structured query over the knowledge base. For binary factoid questions (BFQs), the structured query depends mainly on a predicate in the knowledge base. However, because of the gap between natural-language questions and knowledge-base predicates, this mapping is hard to find. For example, in the paper's Table 1, the system must know that "How many people are there in Honolulu?" corresponds to the predicate "population". In addition, many binary relations are represented not by a single edge in the RDF graph but by a multi-edge path. For example, the "spouse" relationship is represented by the path "marriage → person → name" in the paper's Figure 1.

To address these challenges, the paper proposes a new template-based method. It learns a large number of templates to represent natural-language questions and maps these templates to predicates in the knowledge base. This lets it handle simple binary factoid questions as well as complex questions, which are answered by decomposing them into a series of binary factoid questions. The templates and their mappings to knowledge-base predicates are learned from Yahoo! Answers, which is what makes question answering over a large-scale knowledge base effective.
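The template idea described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the triples, the entity-to-concept dictionary, the hard-coded template-to-predicate table, and the names `to_template`, `follow_path`, and `answer` are all assumptions made for illustration. In particular, the fixed dictionary stands in for the mapping that the paper learns from a QA corpus, and the entities "Alice" and "Bob" and all stored values are toy data.

```python
# Toy triples (subject, predicate, object); all values are illustrative only.
TRIPLES = [
    ("Honolulu", "population", "390K"),
    ("Alice", "marriage", "m1"),   # intermediate node for a marriage
    ("m1", "person", "p1"),        # intermediate node for the spouse
    ("p1", "name", "Bob"),
]

# Hypothetical entity -> concept dictionary used to abstract a question.
ENTITY_CONCEPT = {"Honolulu": "$city", "Alice": "$person"}

# Hypothetical stand-in for the learned template -> predicate mapping.
# A predicate is a path of one or more edges, so "spouse" can be expressed
# as the three-edge path marriage -> person -> name.
TEMPLATE_PREDICATE = {
    "how many people are there in $city?": ("population",),
    "what is the population of $city?": ("population",),
    "who is the spouse of $person?": ("marriage", "person", "name"),
}


def to_template(question):
    """Replace the first known entity mention with its concept."""
    for entity, concept in ENTITY_CONCEPT.items():
        if entity in question:
            return question.replace(entity, concept).lower(), entity
    return None, None


def follow_path(entity, path):
    """Walk a predicate path edge by edge and return the reachable values."""
    frontier = {entity}
    for predicate in path:
        frontier = {o for s, p, o in TRIPLES if s in frontier and p == predicate}
    return frontier


def answer(question):
    """Answer a binary factoid question via template lookup and path traversal."""
    template, entity = to_template(question)
    path = TEMPLATE_PREDICATE.get(template)
    if path is None:
        return set()  # no known template for this question
    return follow_path(entity, path)


print(answer("How many people are there in Honolulu?"))
print(answer("What is the population of Honolulu?"))
print(answer("Who is the spouse of Alice?"))
```

Both population questions reduce to different templates that map to the same predicate, which is how differently worded questions with the same intent end up with the same answer. Treating a predicate as a path rather than a single edge is what allows a relation such as "spouse" to be answered at all in this toy graph.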

KBQA: Learning Question Answering over QA Corpora and Knowledge Bases
