WaitGPT: Monitoring and Steering Conversational LLM Agent in Data Analysis with On-the-Fly Code Visualization

Liwenhan Xie,Chengbo Zheng,Haijun Xia,Huamin Qu,Chen Zhu-Tian

DOI: https://doi.org/10.1145/3654777.3676374

2024-08-03

Abstract: Large language models (LLMs) support data analysis through conversational user interfaces, as exemplified in OpenAI's ChatGPT (formally known as Advanced Data Analysis or Code Interpreter). Essentially, LLMs produce code for accomplishing diverse analysis tasks. However, presenting raw code can obscure the logic and hinder user verification. To empower users with enhanced comprehension and augmented control over analysis conducted by LLMs, we propose a novel approach to transform LLM-generated code into an interactive visual representation. In the approach, users are provided with a clear, step-by-step visualization of the LLM-generated code in real time, allowing them to understand, verify, and modify individual data operations in the analysis. Our design decisions are informed by a formative study (N=8) probing into user practice and challenges. We further developed a prototype named WaitGPT and conducted a user study (N=12) to evaluate its usability and effectiveness. The findings from the user study reveal that WaitGPT facilitates monitoring and steering of data analysis performed by LLMs, enabling participants to enhance error detection and increase their overall confidence in the results.

Human-Computer Interaction

What problem does this paper attempt to address?

This paper aims to address the issues of code generated by large language models (LLMs) being difficult to understand and verify during data analysis, and proposes a new method to enhance users' monitoring and control over LLM-generated data analysis scripts. Specifically, the study identifies the following problems with current LLM tools in data analysis:

1. **Code is difficult to understand**: Directly displaying raw code makes it hard for users to grasp the logic, especially when their programming skills are limited.
2. **Reliability issues**: The code generated by LLMs may contain errors or misunderstand the user's intent, requiring users to check and correct it.
3. **Low interaction efficiency**: Modifying code through dialogue can become very cumbersome and inefficient.

To solve these problems, the paper proposes a new system called WaitGPT, which transforms LLM-generated code into a visual and interactive form. This allows users to view the data operation process in real-time, thereby better understanding, verifying, and adjusting each step. The main features of WaitGPT include:

- **Real-time visualization**: Dynamically generates visual representations of the code, enabling users to intuitively see each data operation and its results.
- **Interactive modification**: Users can directly interact with visual elements to modify data operations without needing to regenerate the entire code.
- **Step-by-step execution**: Executes the code line by line and updates visual charts, displaying intermediate states, making it easier for users to intervene at any time.

In this way, WaitGPT not only improves users' understanding and confidence in the data analysis process but also enhances their ability to detect and correct errors.
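To make the idea of turning generated code into a sequence of data operations more concrete, here is a minimal Python sketch. It is an illustration only and not the WaitGPT implementation: it assumes the generated script uses pandas-style method calls, and the `DATA_OPS` set, the `extract_operations` function, and the sample script with its `sales.csv` file name are all hypothetical. It uses the standard-library `ast` module to list recognized operations in source order, which is the kind of structure a step-by-step visualization could be built on.

```python
import ast

# Hypothetical allow-list of method names treated as "data operations".
DATA_OPS = {"read_csv", "dropna", "groupby", "mean", "sort_values", "merge", "query"}


def extract_operations(source: str) -> list[dict]:
    """Return one record per recognized method call, in source order."""
    ops = []
    for node in ast.walk(ast.parse(source)):
        # Only consider calls of the form `<something>.<name>(...)`.
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
            if node.func.attr in DATA_OPS:
                ops.append({
                    # Position where the method name ends; used for ordering,
                    # so chained calls are listed left to right.
                    "line": node.func.end_lineno,
                    "col": node.func.end_col_offset,
                    "op": node.func.attr,
                    "args": [ast.unparse(a) for a in node.args],
                    "kwargs": {k.arg: ast.unparse(k.value)
                               for k in node.keywords if k.arg},
                })
    return sorted(ops, key=lambda o: (o["line"], o["col"]))


# A made-up example of the kind of script an LLM might generate.
script = '''
import pandas as pd
df = pd.read_csv("sales.csv")
clean = df.dropna(subset=["price"])
summary = clean.groupby("region").mean()
'''

if __name__ == "__main__":
    for step in extract_operations(script):
        print(step["line"], step["op"], step["args"], step["kwargs"])
```

Each record could then be rendered as one node in a diagram, and editing a node's arguments would correspond to rewriting that single call rather than asking the model to regenerate the whole script. The sketch only reads the code statically; showing intermediate table states would additionally require executing the script and capturing results after each operation.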
