NL2SQL with Partial Missing Metadata Based on Multi-View Metadata Graph Compensation and Reasoning

Jie Lin, Yulong Liang, Jiyan Li, Yi Bai, Yong Wang
DOI: https://doi.org/10.1007/s10489-023-05221-z
IF: 5.3
2024-01-01
Applied Intelligence
Abstract: The performance of metadata-dependent NL2SQL models degrades seriously when the metadata is incomplete or distorted. To address this problem, we propose a metadata compensation approach that represents the question, the SQL query, data cell value relevance, and the incomplete schema data as a global metadata graph, and applies knowledge graph reasoning to complete that graph. This global metadata graph is a multi-graph. An improved TransR model is proposed to represent the multi-graph by integrating the contributions of the multiple relationships between two nodes. Based on the compensated metadata graph, new end-to-end and preprocessing-improvement frameworks are constructed to suit different metadata-dependent NL2SQL systems. The new models were evaluated on the Spider dataset with artificially simulated partial metadata relation deficiency or metadata distortion. In addition to ablation comparisons, the new models were compared with several existing approaches and demonstrated improved performance.
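The abstract does not spell out how the improved TransR combines several relationships between the same pair of nodes, so the following is only a minimal sketch under stated assumptions. The per-relation score is the standard TransR energy, ||M_r h + r - M_r t||^2. The function `multi_relation_score` and its weighted-mean aggregation are hypothetical illustrations of "integrating the contributions from multiple relationships", not the authors' formulation.

```python
from typing import Dict, List, Optional, Sequence

Vector = List[float]
Matrix = List[List[float]]


def project(matrix: Matrix, vec: Sequence[float]) -> Vector:
    """Map an entity vector into a relation-specific space (M_r times vec)."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def transr_score(head: Vector, tail: Vector,
                 rel_vec: Vector, rel_matrix: Matrix) -> float:
    """Standard TransR energy ||M_r h + r - M_r t||^2 (lower = more plausible)."""
    h = project(rel_matrix, head)
    t = project(rel_matrix, tail)
    return sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, rel_vec, t))


def multi_relation_score(head: Vector, tail: Vector,
                         relations: List[Dict],
                         weights: Optional[List[float]] = None) -> float:
    """ASSUMPTION: combine the energies of all relations linking one node pair
    by a weighted mean. The paper's actual integration rule may differ."""
    if not relations:
        raise ValueError("at least one relation is required")
    if weights is None:
        weights = [1.0] * len(relations)
    total = sum(weights)
    energy = sum(w * transr_score(head, tail, r["vec"], r["matrix"])
                 for w, r in zip(weights, relations))
    return energy / total


# Toy multi-graph edge pair: two relations between the same two nodes.
identity = [[1.0, 0.0], [0.0, 1.0]]
question_token = [1.0, 0.0]   # hypothetical embedding of a question node
schema_column = [0.0, 1.0]    # hypothetical embedding of a column node
rel_a = {"vec": [-1.0, 1.0], "matrix": identity}  # fits the pair exactly
rel_b = {"vec": [0.0, 0.0], "matrix": identity}   # fits the pair poorly
combined = multi_relation_score(question_token, schema_column, [rel_a, rel_b])
```

In this toy case `rel_a` has zero energy and `rel_b` has energy 2, so the equally weighted combination is 1. Metadata completion would then rank candidate links by such a combined score, with the weights either fixed or learned.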