How Proficient Are Large Language Models in Formal Languages? An In-Depth Insight for Knowledge Base Question Answering

Jinxin Liu,Shulin Cao,Jiaxin Shi,Tingjian Zhang,Lunyiu Nie,Linmei Hu,Lei Hou,Juanzi Li

2024-06-14

Abstract: Knowledge Base Question Answering (KBQA) aims to answer natural language questions based on facts in knowledge bases. A typical approach to KBQA is semantic parsing, which translates a question into an executable logical form in a formal language. Recent works leverage the capabilities of large language models (LLMs) for logical form generation to improve performance. However, although it is validated that LLMs are capable of solving some KBQA problems, there has been little discussion on the differences in LLMs' proficiency in formal languages used in semantic parsing. In this work, we propose to evaluate the understanding and generation ability of LLMs to deal with differently structured logical forms by examining the inter-conversion of natural and formal language through in-context learning of LLMs. Extensive experiments with models of different sizes show that state-of-the-art LLMs can understand formal languages as well as humans, but generating correct logical forms given a few examples remains a challenge. Most importantly, our results also indicate that LLMs exhibit considerable sensitivity to the choice of formal language. In general, the formal language with a lower formalization level, i.e., the more similar it is to natural language, is more friendly to LLMs.

Computation and Language

What problem does this paper attempt to address?

This paper evaluates how well large language models (LLMs) understand and generate formal languages. It focuses on two abilities:

1. **Formal language understanding**: translating logical forms (LFs) into the corresponding natural language questions (NLQs), which requires the model to interpret the provided logical form.
2. **Formal language generation**: converting natural language questions into their corresponding logical forms, which requires the model both to understand the question and to produce a correct logical form.

Using these two tasks, the paper examines how LLMs of different scales handle logical forms with different structures. The authors select three representative formal languages (Lambda DCS, SPARQL, and KoPL), which are commonly used in KBQA research and differ in their level of formalization and logical structure. Through extensive experiments, the paper aims to reveal the capability boundaries of LLMs in formal language processing and to provide guidance for choosing appropriate models and formal languages.
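The two tasks differ only in the direction of the conversion, so a few-shot prompt for either one can be assembled from the same demonstration pairs. The following is a minimal sketch of that idea, not the authors' actual prompt format: the function name `build_prompt`, the field labels, and the demonstration pairs (written in a simplified SPARQL style) are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical (question, logical form) demonstrations; not taken from the paper.
DEMOS: List[Tuple[str, str]] = [
    ("Who directed Inception?",
     'SELECT ?x WHERE { ?f rdfs:label "Inception" . ?f :director ?x . }'),
    ("What is the capital of France?",
     'SELECT ?x WHERE { ?c rdfs:label "France" . ?c :capital ?x . }'),
]


def build_prompt(demos: List[Tuple[str, str]], query: str, direction: str) -> str:
    """Assemble a few-shot prompt for one of the two evaluation directions.

    direction == "understanding": logical form -> natural language question
    direction == "generation":    natural language question -> logical form
    """
    if direction == "understanding":
        src_tag, tgt_tag = "Logical form", "Question"
        pairs = [(lf, nlq) for nlq, lf in demos]  # swap so the LF is the input
    elif direction == "generation":
        src_tag, tgt_tag = "Question", "Logical form"
        pairs = list(demos)
    else:
        raise ValueError(f"unknown direction: {direction!r}")

    # One block per demonstration, then the unanswered query block.
    blocks = [f"{src_tag}: {src}\n{tgt_tag}: {tgt}" for src, tgt in pairs]
    blocks.append(f"{src_tag}: {query}\n{tgt_tag}:")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    print(build_prompt(DEMOS, "Who wrote Dune?", "generation"))
```

Swapping the demonstration set for pairs written in Lambda DCS or KoPL would leave the prompt structure unchanged, which is what makes a comparison across formal languages possible under the same in-context learning setup.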
