WaitGPT: Monitoring and Steering Conversational LLM Agent in Data Analysis with On-the-Fly Code Visualization

Liwenhan Xie, Chengbo Zheng, Haijun Xia, Huamin Qu, Chen Zhu-Tian
DOI: https://doi.org/10.1145/3654777.3676374
2024-08-03
Abstract: Large language models (LLMs) support data analysis through conversational user interfaces, as exemplified in OpenAI's ChatGPT (formerly known as Advanced Data Analysis or Code Interpreter). Essentially, LLMs produce code for accomplishing diverse analysis tasks. However, presenting raw code can obscure the logic and hinder user verification. To empower users with enhanced comprehension and augmented control over analysis conducted by LLMs, we propose a novel approach to transform LLM-generated code into an interactive visual representation. In the approach, users are provided with a clear, step-by-step visualization of the LLM-generated code in real time, allowing them to understand, verify, and modify individual data operations in the analysis. Our design decisions are informed by a formative study (N=8) probing into user practice and challenges. We further developed a prototype named WaitGPT and conducted a user study (N=12) to evaluate its usability and effectiveness. The findings from the user study reveal that WaitGPT facilitates monitoring and steering of data analysis performed by LLMs, enabling participants to enhance error detection and increase their overall confidence in the results.
Human-Computer Interaction
What problem does this paper attempt to address?
This paper addresses the difficulty of understanding and verifying the code that large language models (LLMs) generate during data analysis, and proposes a method that gives users more oversight of and control over LLM-generated analysis scripts. Specifically, the study identifies the following problems with current LLM tools in data analysis:

1. **Code is difficult to understand**: Displaying raw code makes it hard for users to grasp the logic, especially when their programming skills are limited.
2. **Reliability issues**: LLM-generated code may contain errors or misinterpret the user's intent, so users must check and correct it.
3. **Low interaction efficiency**: Modifying code through dialogue can be cumbersome and inefficient.

To solve these problems, the paper proposes a system called WaitGPT, which transforms LLM-generated code into a visual, interactive form. Users can view the data operations in real time and so better understand, verify, and adjust each step. The main features of WaitGPT include:

- **Real-time visualization**: Dynamically generates visual representations of the code, so users can see each data operation and its result.
- **Interactive modification**: Users can interact directly with visual elements to modify data operations without regenerating the entire code.
- **Step-by-step execution**: Executes the code line by line and updates the visual charts, displaying intermediate states so that users can intervene at any time.

In this way, WaitGPT improves users' understanding of and confidence in the data analysis process and enhances their ability to detect and correct errors.
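The summary above does not say how WaitGPT turns code into data operations, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the generated code is Python with pandas-style method chaining. The function names (`call_chain`, `extract_operations`, `run_stepwise`) and the file name `sales.csv` are hypothetical. The first part statically lists the operations in each statement. The second part runs statements one at a time and reports the intermediate state after each one.

```python
import ast

# Hypothetical LLM output in a pandas style. It is only parsed here, never run.
PANDAS_STYLE = (
    'df = pd.read_csv("sales.csv")\n'
    'summary = df.dropna().groupby("region").mean()\n'
)

# Plain-Python stand-in that can be executed without third-party packages.
PLAIN = (
    "values = [4, None, 10]\n"
    "clean = [v for v in values if v is not None]\n"
    "mean = sum(clean) / len(clean)\n"
)


def assigned_name(stmt):
    """Return the variable name for a simple `name = ...` statement, else None."""
    if isinstance(stmt, ast.Assign) and isinstance(stmt.targets[0], ast.Name):
        return stmt.targets[0].id
    return None


def call_chain(node):
    """Return the method names of a chained call, in execution order."""
    names = []
    while isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
        names.append(node.func.attr)
        node = node.func.value  # step inward to the receiver of this call
    return names[::-1]


def extract_operations(source):
    """List one record per top-level statement: line, target, method calls."""
    return [
        {
            "line": stmt.lineno,
            "target": assigned_name(stmt),
            "calls": call_chain(getattr(stmt, "value", None)),
        }
        for stmt in ast.parse(source).body
    ]


def run_stepwise(source):
    """Execute one top-level statement at a time and yield the new state.

    Assumption: the code is trusted. A real system would need to sandbox
    LLM-generated code before calling exec on it.
    """
    env = {}
    for stmt in ast.parse(source).body:
        module = ast.Module(body=[stmt], type_ignores=[])
        exec(compile(module, "<generated>", "exec"), env)
        name = assigned_name(stmt)
        yield stmt.lineno, name, env.get(name)


if __name__ == "__main__":
    for op in extract_operations(PANDAS_STYLE):
        print(op)
    for line, name, value in run_stepwise(PLAIN):
        print(line, name, value)
```

Each record from `extract_operations` could back one node in a visual representation, and each tuple from `run_stepwise` could supply the intermediate state shown for that node. Because execution proceeds one statement at a time, a caller can stop consuming the generator to pause the analysis at any step.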