Tabular reasoning via two-stage knowledge injection

Qi Shi, Yu Zhang, Ting Liu
DOI: https://doi.org/10.1007/s13042-023-02073-4
2024-02-10
International Journal of Machine Learning and Cybernetics
Abstract: Tabular reasoning requires understanding natural language queries in the context of provided tables, which is challenging mainly because of the complex logical operations involved. Pre-trained language models have demonstrated their capabilities in various tasks. However, pre-training specifically for tabular reasoning is difficult because the task demands a diverse range of reasoning abilities beyond contextual understanding. In this work, we propose Tabular Reasoning with Two-stage Knowledge Injection (TsKI). TsKI consists of two components. The first component incorporates symbolic knowledge into pre-trained language models by utilizing synthesized programs. It begins by generating high-quality programs using a specific program synthesis algorithm. Next, it conducts pre-training on the automatically generated corpus, enabling the model to learn how to query tables using the generated programs. The second component injects step-wise knowledge into the model. It starts by decomposing natural language queries into multiple sub-queries using heuristic rules and a constituency parser. Then, it employs pre-trained language models themselves to query tables with the obtained sub-queries, obtaining intermediate results that facilitate step-wise tabular reasoning. Experimental results demonstrate the effectiveness of our proposed approach. TsKI achieves significant improvements on two well-known tabular reasoning datasets, TabFact and WikiTableQuestions, with both components. Furthermore, in-depth analysis validates the effectiveness of each component of our approach. The code is available at https://github.com/qshi95/TsKI.
computer science, artificial intelligence
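The step-wise idea in the abstract (split a query into sub-queries, answer each against the table, and keep the intermediate results) can be illustrated with a toy sketch. This is not the authors' implementation: the table data, the helper names `filter_eq`, `max_of`, and `run_steps`, and the hand-written decomposition are all assumptions made for illustration. In the paper, sub-queries come from heuristic rules and a constituency parser and are answered by a pre-trained language model; here, deterministic Python functions stand in for both.

```python
# Toy table: each row is a dict keyed by column name (illustrative data only).
TABLE = [
    {"team": "Reds", "year": 2001, "wins": 10},
    {"team": "Blues", "year": 2001, "wins": 14},
    {"team": "Reds", "year": 2002, "wins": 12},
]


def filter_eq(column, value):
    """Hypothetical step: keep the rows where row[column] equals value."""
    return lambda rows: [r for r in rows if r[column] == value]


def max_of(column):
    """Hypothetical step: return the largest value found in a column."""
    return lambda rows: max(r[column] for r in rows)


def run_steps(table, steps):
    """Apply the steps in order and record every intermediate result."""
    intermediates = []
    current = table
    for step in steps:
        current = step(current)
        intermediates.append(current)
    return current, intermediates


# "What is the highest number of wins for the Reds?", decomposed by hand
# into two sub-queries: (1) select the Reds rows, (2) take the maximum wins.
steps = [filter_eq("team", "Reds"), max_of("wins")]
answer, trace = run_steps(TABLE, steps)

print("intermediate rows:", trace[0])
print("answer:", answer)
```

The `trace` list plays the role of the intermediate results mentioned in the abstract: each entry is the output of one sub-query, which a downstream model could consume alongside the original question.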