Data Interpreter: An LLM Agent For Data Science

Sirui Hong, Yizhang Lin, Bang Liu, Bangbang Liu, Binhao Wu, Ceyao Zhang, Chenxing Wei, Danyang Li, Jiaqi Chen, Jiayi Zhang, Jinlin Wang, Li Zhang, Lingyao Zhang, Min Yang, Mingchen Zhuge, Taicheng Guo, Tuo Zhou, Wei Tao, Xiangru Tang, Xiangtao Lu, Xiawu Zheng, Xinbing Liang, Yaying Fei, Yuheng Cheng, Zhibin Gou, Zongze Xu, Chenglin Wu

2024-10-15

Abstract: Large Language Model (LLM)-based agents have shown effectiveness across many applications. However, their use in data science scenarios requiring solving long-term interconnected tasks, dynamic data adjustments and domain expertise remains challenging. Previous approaches primarily focus on individual tasks, making it difficult to assess the complete data science workflow. Moreover, they struggle to handle real-time changes in intermediate data and fail to adapt dynamically to evolving task dependencies inherent to data science problems. In this paper, we present Data Interpreter, an LLM-based agent designed to automatically solve various data science problems end-to-end. Our Data Interpreter incorporates two key modules: 1) Hierarchical Graph Modeling, which breaks down complex problems into manageable subproblems, enabling dynamic node generation and graph optimization; and 2) Programmable Node Generation, a technique that refines and verifies each subproblem to iteratively improve code generation results and robustness. Extensive experiments consistently demonstrate the superiority of Data Interpreter. On InfiAgent-DABench, it achieves a 25% performance boost, raising accuracy from 75.9% to 94.9%. For machine learning and open-ended tasks, it improves performance from 88% to 95%, and from 60% to 97%, respectively. Moreover, on the MATH dataset, Data Interpreter achieves remarkable performance with a 26% improvement compared to state-of-the-art baselines. The code is available at https://github.com/geekan/MetaGPT.
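The abstract reports some results as a percentage boost and others as before/after scores, so it can help to recompute the gains both ways. The short sketch below is not from the paper; it only reworks the before/after figures quoted in the abstract, and the function and variable names are illustrative assumptions.

```python
def absolute_gain(before: float, after: float) -> float:
    """Gain in percentage points (after minus before)."""
    return after - before


def relative_gain(before: float, after: float) -> float:
    """Gain expressed as a percentage of the starting value."""
    return (after - before) / before * 100.0


# Before/after figures as quoted in the abstract (in percent).
reported = {
    "InfiAgent-DABench": (75.9, 94.9),
    "machine learning tasks": (88.0, 95.0),
    "open-ended tasks": (60.0, 97.0),
}

for name, (before, after) in reported.items():
    print(
        f"{name}: +{absolute_gain(before, after):.1f} points, "
        f"+{relative_gain(before, after):.1f}% relative"
    )
```

Read this way, the 25% boost on InfiAgent-DABench matches a relative improvement: a gain of 19 points on a starting accuracy of 75.9%.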

Artificial Intelligence, Machine Learning

What problem does this paper attempt to address?

The paper addresses the challenges that large language models (LLMs) face in data science scenarios, particularly in handling long-term interconnected tasks, dynamic data adjustments, and domain expertise. Existing methods mainly focus on single tasks, which makes it difficult to evaluate the complete data science workflow. They also struggle with real-time changes in intermediate data and with the changing task dependencies inherent in data science problems.

To address these issues, the paper proposes an LLM-based agent named **Data Interpreter**, designed to automatically solve various data science problems end-to-end. Data Interpreter introduces two key modules:

1. **Hierarchical Graph Modeling**: Decomposes complex problems into manageable sub-problems, supporting dynamic node generation and graph optimization.
2. **Programmable Node Generation**: Iteratively improves code generation results and robustness by refining and validating each sub-problem.

Experimental results show that Data Interpreter outperforms existing methods on multiple benchmarks, including a 25% performance improvement on InfiAgent-DABench (accuracy rising from 75.9% to 94.9%) and a 26% improvement on the MATH dataset. These results suggest that Data Interpreter adapts better to complex, dynamic data science tasks than the compared baselines.
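The two modules described above can be pictured with a small task graph. The following is a minimal sketch under stated assumptions, not the paper's or MetaGPT's implementation: `TaskNode`, `run_graph`, and the example tasks are hypothetical names, plain Python callables stand in for successive LLM code generations, and dynamic graph updates are left out. Each node runs only after its dependencies, and a node's output is accepted only once it passes that node's verification check.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Tuple


@dataclass
class TaskNode:
    """One sub-problem in the task graph (hypothetical structure)."""
    name: str
    depends_on: List[str]
    # Each candidate stands in for one generated code attempt.
    candidates: List[Callable[[Dict[str, object]], object]]
    # Returns True when an attempt's output is acceptable.
    verify: Callable[[object], bool]


def run_graph(nodes: List[TaskNode]) -> Tuple[Dict[str, object], Dict[str, int]]:
    """Run nodes in dependency order, retrying until verification passes."""
    by_name = {node.name: node for node in nodes}
    order = TopologicalSorter(
        {node.name: node.depends_on for node in nodes}
    ).static_order()

    results: Dict[str, object] = {}
    attempts: Dict[str, int] = {}
    for name in order:
        node = by_name[name]
        for attempt, candidate in enumerate(node.candidates, start=1):
            try:
                output = candidate(results)
            except Exception:
                continue  # a failing attempt triggers the next candidate
            if node.verify(output):
                results[name] = output
                attempts[name] = attempt
                break
        else:
            raise RuntimeError(f"no candidate passed verification for {name!r}")
    return results, attempts


# Toy workflow: load -> clean -> summarize.
example_nodes = [
    TaskNode(
        name="load",
        depends_on=[],
        candidates=[lambda r: [3.0, None, 5.0, 4.0]],
        verify=lambda out: len(out) > 0,
    ),
    TaskNode(
        name="clean",
        depends_on=["load"],
        candidates=[
            lambda r: r["load"],  # first attempt forgets to drop missing values
            lambda r: [x for x in r["load"] if x is not None],
        ],
        verify=lambda out: all(x is not None for x in out),
    ),
    TaskNode(
        name="summarize",
        depends_on=["clean"],
        candidates=[lambda r: sum(r["clean"]) / len(r["clean"])],
        verify=lambda out: isinstance(out, float),
    ),
]

results, attempts = run_graph(example_nodes)
print(results["summarize"], attempts)
```

In this toy run the `clean` node needs two attempts because its first candidate leaves a missing value in place, which loosely mirrors the refine-and-verify loop described for Programmable Node Generation.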

DataLab: A Unified Platform for LLM-Powered Business Intelligence

DS-Agent: Automated Data Science by Empowering Large Language Models with Case-Based Reasoning

An LLM Agent for Automatic Geospatial Data Analysis

Star-Agents: Automatic Data Optimization with LLM Agents for Instruction Tuning

MatPlotAgent: Method and Evaluation for LLM-Based Agentic Scientific Data Visualization

InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks

LAMBDA: A Large Model Based Data Agent

ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery

Data science through natural language with ChatGPT's Code Interpreter

SciAgent: Tool-augmented Language Models for Scientific Reasoning

AgentBench: Evaluating LLMs as Agents

WaitGPT: Monitoring and Steering Conversational LLM Agent in Data Analysis with On-the-Fly Code Visualization

Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level

DSBench: How Far Are Data Science Agents to Becoming Data Science Experts?

From Data to Story: Towards Automatic Animated Data Video Creation with LLM-based Multi-Agent Systems

Tapilot-Crossing: Benchmarking and Evolving LLMs Towards Interactive Data Analysis Agents

MAgIC: Investigation of Large Language Model Powered Multi-Agent in Cognition, Adaptability, Rationality and Collaboration

Data-Copilot: Bridging Billions of Data and Humans with Autonomous Workflow