Knowledge Rich Natural Language Queries over Structured Biological Databases

Hasan M. Jamil

DOI: https://doi.org/10.48550/arXiv.1703.10692

2017-03-31

Abstract: Increasingly, keyword, natural language and NoSQL queries are being used for information retrieval from traditional as well as non-traditional databases such as web, document, image, GIS, legal, and health databases. While their popularity is undeniable for obvious reasons, their engineering is far from simple. For the most part, semantics and intent preserving mapping of a well understood natural language query expressed over a structured database schema to a structured query language is still a difficult task, and research to tame the complexity is intense. In this paper, we propose a multi-level knowledge-based middleware to facilitate such mappings that separate the conceptual level from the physical level. We augment these multi-level abstractions with a concept reasoner and a query strategy engine to dynamically link arbitrary natural language querying to well defined structured queries. We demonstrate the feasibility of our approach by presenting a Datalog based prototype system, called BioSmart, that can compute responses to arbitrary natural language queries over arbitrary databases once a syntactic classification of the natural language query is made.

Databases

What problem does this paper attempt to address?

This paper addresses natural-language querying over structured biological databases. Specifically, it focuses on mapping a natural-language query to a structured query language (such as SQL) in a way that preserves the query's semantics and intent, so that information can be retrieved from traditional and non-traditional databases. The paper proposes a multi-level knowledge-based middleware to facilitate this mapping. The middleware separates the conceptual level from the physical level and adds a concept reasoner and a query strategy engine to dynamically link arbitrary natural-language queries to well-defined structured queries. The paper demonstrates the feasibility of the approach with a Datalog-based prototype system, BioSmart, which can compute responses to arbitrary natural-language queries over arbitrary databases once a syntactic classification of the query has been made. In short, the main goal of the paper is a system that can understand and execute complex natural-language queries, especially over biological databases in the life sciences, making data access more convenient and efficient. This involves several technical challenges, including natural-language processing, query optimization, and the effective use of background knowledge to improve the quality of query responses.
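
The separation of conceptual and physical levels described above can be illustrated with a small sketch. This is not BioSmart's implementation (the prototype is Datalog-based); it is a minimal Python illustration under stated assumptions. The concept map, the two query classes ("list" and "count"), the SQL templates, and the toy `genes` table are all hypothetical, and the syntactic class of the question is assumed to be supplied already, as the abstract requires.

```python
import sqlite3

# Conceptual level (hypothetical): concept name -> (table, column) at the physical level.
CONCEPTS = {
    "gene": ("genes", "symbol"),
    "organism": ("genes", "organism"),
}

# Query strategies (hypothetical): one SQL template per syntactic query class.
STRATEGIES = {
    "list": "SELECT {target} FROM {table} WHERE {filter} = ? ORDER BY {target}",
    "count": "SELECT COUNT({target}) FROM {table} WHERE {filter} = ?",
}


def build_sql(query_class, target, filter_concept):
    """Resolve concepts to physical columns and fill in the class's template."""
    table, target_col = CONCEPTS[target]
    filter_table, filter_col = CONCEPTS[filter_concept]
    if table != filter_table:
        raise ValueError("this sketch handles single-table queries only")
    template = STRATEGIES[query_class]
    return template.format(target=target_col, table=table, filter=filter_col)


def answer(conn, query_class, target, filter_concept, value):
    """Run the generated query; the filter value is passed as a bound parameter."""
    sql = build_sql(query_class, target, filter_concept)
    return [row[0] for row in conn.execute(sql, (value,))]


def demo_db():
    """Build a toy in-memory database (illustrative data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE genes (symbol TEXT, organism TEXT)")
    conn.executemany(
        "INSERT INTO genes VALUES (?, ?)",
        [("TP53", "human"), ("BRCA1", "human"), ("Trp53", "mouse")],
    )
    return conn
```

With this sketch, a question classified as "list" with target concept "gene" and filter "organism = human" would be answered by `answer(demo_db(), "list", "gene", "organism", "human")`. Changing the physical schema only requires editing `CONCEPTS`, which is the point of keeping the two levels apart.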

Keyword and Natural Language Query Processing for Semi-Structured Data Sources

Towards a Natural Language Query Processing System

Interactive Natural Language Question Answering over Knowledge Graphs

Querying semantic catalogues of biomedical databases

Querying Biomedical Ontologies in Natural Language using Answer Set

Automatic Understanding of Natural Language Questions for Querying Chinese Knowledge Bases

Ontology-based Natural Language Interface to Relational Databases

Querying knowledge graphs in natural language

Large-Scale Knowledge Synthesis and Complex Information Retrieval from Biomedical Documents

Biomedical Multi-hop Question Answering Using Knowledge Graph Embeddings and Language Models

Querying Knowledge via Multi-Hop English Questions

Using Natural Language to Access Databases on the Web

Frameworks for Querying Databases Using Natural Language: A Literature Review

A State-transition Framework to Answer Complex Questions over Knowledge Base

Case-based Reasoning for Natural Language Queries over Knowledge Bases

Answering Natural Language Questions via Phrasal Semantic Parsing

A concept-driven biomedical knowledge extraction and visualization framework for conceptualization of text corpora

A Semantic Network for Modeling Biological Knowledge in Multiple Databases

Knowledge Graphs Querying

Natural Language Question/Answering with User Interaction over a Knowledge Base