Xinyi He, Mengyu Zhou, Xinrun Xu, Xiaojun Ma, Rui Ding, Lun Du, Yan Gao, Ran Jia, Xu Chen, Shi Han, Zejian Yuan, Dongmei Zhang
Abstract: Tabular data analysis is crucial in various fields, and large language models show promise in this area. However, current research mostly focuses on rudimentary tasks like Text2SQL and TableQA, neglecting advanced analysis like forecasting and chart generation. To address this gap, we developed the Text2Analysis benchmark, incorporating advanced analysis tasks that go beyond the SQL-compatible operations and require more in-depth analysis. We also develop five innovative and effective annotation methods, harnessing the capabilities of large language models to enhance data quality and quantity. Additionally, we include unclear queries that resemble real-world user questions to test how well models can understand and tackle such challenges. Finally, we collect 2249 query-result pairs with 347 tables. We evaluate five state-of-the-art models using three different metrics and the results show that our benchmark presents a considerable challenge in the field of tabular data analysis, paving the way for more advanced research opportunities.
What problem does this paper attempt to address?
The paper aims to address two major gaps in current research on tabular data analysis:
1. **Lack of advanced data analysis tasks**: Existing benchmarks, such as the Text2SQL and TableQA datasets, mainly cover basic descriptive operations (e.g., simple queries and summaries) and neglect tasks that require deeper analytical capabilities, such as forecasting and chart generation.
2. **Handling unclear queries**: In practice, users' queries are often vague or omit required parameters, which poses challenges for automated data analysis tools.
To address these issues, the paper proposes a new benchmark dataset named Text2Analysis, which includes both advanced data analysis tasks and unclear queries. Specifically, the benchmark covers:
- **Advanced data analysis tasks**: Basic insights (e.g., ranking and trends), forecasting (predicting future values from historical data), and chart generation (recommending and constructing charts).
- **Unclear queries**: These queries lack key information needed to perform a specific task, so the model must not only understand natural language but also have enough data analysis capability to recommend an appropriate analysis.
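To make the distinction between clear and unclear queries concrete, the sketch below treats a query as unclear when some parameter its task needs is missing. This is only an illustration: the `Query` class, the `REQUIRED_PARAMS` table, and all parameter names are hypothetical and are not taken from the Text2Analysis benchmark's actual schema.

```python
from dataclasses import dataclass, field

# Hypothetical parameter requirements per task type.
# These names are assumptions for illustration, not the benchmark's schema.
REQUIRED_PARAMS = {
    "forecasting": {"target_column", "time_column", "horizon"},
    "chart_generation": {"chart_type", "x_column", "y_column"},
    "basic_insight": {"insight_type", "target_column"},
}


@dataclass
class Query:
    """A natural-language query plus whatever parameters it specifies."""
    text: str
    task: str
    params: dict = field(default_factory=dict)


def missing_params(query: Query) -> set:
    """Return the required parameters that the query does not specify."""
    return REQUIRED_PARAMS[query.task] - set(query.params)


def is_unclear(query: Query) -> bool:
    """A query counts as unclear here if any required parameter is missing."""
    return bool(missing_params(query))


clear = Query(
    text="Forecast monthly sales for the next 3 months.",
    task="forecasting",
    params={"target_column": "sales", "time_column": "month", "horizon": 3},
)
unclear = Query(
    text="What will sales look like?",
    task="forecasting",
    params={"target_column": "sales"},
)
```

Under these assumptions, `is_unclear(unclear)` is true because the time column and forecast horizon are unspecified, so a model would have to recommend sensible values for them rather than simply execute the request.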
Additionally, the paper develops five annotation methods that leverage large language models to improve annotation efficiency and increase data volume while maintaining dataset quality. The final dataset contains 2249 query-result pairs over 347 different tables. Five state-of-the-art models were evaluated using three metrics (executable code ratio, pass rate, and regression metrics), and the results show that these models perform well on clear queries but face challenges with complex libraries and unclear queries.
In summary, the goal of this paper is to advance research in tabular data analysis, particularly on advanced analysis tasks and on handling unclear user queries.
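As a rough illustration of the first two metrics, the sketch below computes an executable code ratio and a pass rate from per-query outcomes. The definitions used here are assumptions: the executable code ratio is taken as the fraction of generated programs that run without error, and the pass rate as the fraction whose result matches the reference. The `Outcome` record and the sample values are hypothetical, and the paper's exact definitions (including its regression metrics, which are not sketched here) may differ.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class Outcome:
    """Hypothetical record of running a model's generated code for one query."""
    executed: bool          # did the generated code run without raising?
    result: Optional[Any]   # value produced by the code (None if it failed)
    expected: Any           # reference result for the query


def executable_code_ratio(outcomes: List[Outcome]) -> float:
    """Fraction of queries whose generated code executed successfully."""
    if not outcomes:
        raise ValueError("outcomes must not be empty")
    return sum(o.executed for o in outcomes) / len(outcomes)


def pass_rate(outcomes: List[Outcome]) -> float:
    """Fraction of queries whose code executed and matched the reference."""
    if not outcomes:
        raise ValueError("outcomes must not be empty")
    passed = sum(o.executed and o.result == o.expected for o in outcomes)
    return passed / len(outcomes)


# Made-up outcomes for four queries, purely for illustration.
outcomes = [
    Outcome(executed=True, result=42, expected=42),        # runs and passes
    Outcome(executed=True, result=7, expected=8),          # runs but is wrong
    Outcome(executed=False, result=None, expected=3),      # fails to run
    Outcome(executed=True, result="bar", expected="bar"),  # runs and passes
]

ecr = executable_code_ratio(outcomes)
pr = pass_rate(outcomes)
```

With this definition the pass rate can never exceed the executable code ratio, since code that fails to run cannot match the reference.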