How Do Analysts Understand and Verify AI-Assisted Data Analyses?

Ken Gu, Ruoxi Shang, Tim Althoff, Chenglong Wang, Steven M. Drucker
2024-03-05
Abstract: Data analysis is challenging as it requires synthesizing domain knowledge, statistical expertise, and programming skills. Assistants powered by large language models (LLMs), such as ChatGPT, can assist analysts by translating natural language instructions into code. However, AI-assistant responses and analysis code can be misaligned with the analyst's intent or be seemingly correct but lead to incorrect conclusions. Therefore, validating AI assistance is crucial and challenging. Here, we explore how analysts understand and verify the correctness of AI-generated analyses. To observe analysts in diverse verification approaches, we develop a design probe equipped with natural language explanations, code, visualizations, and interactive data tables with common data operations. Through a qualitative user study (n=22) using this probe, we uncover common behaviors within verification workflows and how analysts' programming, analysis, and tool backgrounds reflect these behaviors. Additionally, we provide recommendations for analysts and highlight opportunities for designers to improve future AI-assistant experiences.
Human-Computer Interaction
What problem does this paper attempt to address?
The core problem this paper addresses is how analysts can understand and verify the correctness of data analyses produced with AI assistants. With the development of large language models (LLMs), AI assistants can convert natural-language instructions into code, helping data analysts execute and automate their analysis tasks. However, an assistant's responses and generated analysis code may be inconsistent with the analyst's intent, or may seem correct yet lead to wrong conclusions. Verifying the correctness and reliability of AI-assisted analysis is therefore both crucial and challenging. Specifically, the paper explores this problem through the following aspects:

1. **Research Background**:
   - Data analysis is a complex task that requires combining domain knowledge, statistical expertise, and programming skills.
   - AI assistants such as ChatGPT can simplify the data analysis process by translating natural-language instructions into code, but their outputs may misinterpret the analyst's intent or contain errors.
2. **Research Objectives**:
   - Explore how analysts understand and verify AI-generated analysis results.
   - Through a qualitative user study (n = 22), observe the behavior patterns of analysts with different backgrounds when verifying AI-assisted analyses.
   - Provide recommendations for the design of future AI assistants, enabling analysts to evaluate AI-generated analysis results more effectively.
3. **Research Methods**:
   - Develop a design probe that includes natural-language explanations, code, visualizations, and interactive data tables, to support analysts' different verification needs.
   - Using qualitative methods, observe the specific behaviors of analysts when verifying AI-generated analyses, and analyze the relationship between these behaviors and the analysts' backgrounds.
4. **Main Findings**:
   - Analysts usually start with program-oriented behaviors ("What did the AI do?") and then turn to data-oriented behaviors ("Does the result data make sense?").
   - Data artifacts (such as data tables and summary visualizations) and program artifacts (such as natural-language explanations and code comments) complement each other in the verification process and jointly help analysts understand the AI's analysis process.
5. **Contributions**:
   - Reveal the common behavior patterns of analysts when verifying AI-generated analyses.
   - Propose recommendations for end-user analysts and tool developers, aiming to improve the reliability and verifiability of AI-assisted data analysis.

In conclusion, by studying how analysts understand and verify the results of AI-assisted data analysis, this paper reveals the challenges current AI assistants pose in practice and provides insights and recommendations for the design of future AI assistants.
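
To make the data-oriented question ("Does the result data make sense?") more concrete, the following is a minimal sketch of the kind of check an analyst might run on an AI-generated result. It is not taken from the paper or its design probe: the dataset, the function names `ai_generated_mean_by_group` and `data_oriented_checks`, and the specific checks are all hypothetical and chosen only for illustration.

```python
from statistics import mean

# Hypothetical raw data (an assumption for this sketch); None marks a missing score.
rows = [
    {"group": "A", "score": 4.0},
    {"group": "A", "score": None},
    {"group": "B", "score": 3.0},
    {"group": "B", "score": 5.0},
]


def ai_generated_mean_by_group(records):
    """Stand-in for AI-generated analysis code: mean score per group.

    Rows with a missing score are skipped silently, which looks correct
    but may not match the analyst's intent.
    """
    grouped = {}
    for r in records:
        if r["score"] is not None:
            grouped.setdefault(r["group"], []).append(r["score"])
    return {g: mean(values) for g, values in grouped.items()}


def data_oriented_checks(records, result):
    """Compare the result against the input data and return a list of warnings."""
    warnings = []

    # Were any rows dropped because of missing values?
    missing = sum(1 for r in records if r["score"] is None)
    if missing:
        warnings.append(f"{missing} row(s) with missing score were excluded")

    # Does every input group appear in the result, and no others?
    if set(result) != {r["group"] for r in records}:
        warnings.append("groups in result differ from groups in input")

    # Is each group mean within the range of observed scores?
    observed = [r["score"] for r in records if r["score"] is not None]
    for group, value in result.items():
        if not (min(observed) <= value <= max(observed)):
            warnings.append(f"mean for group {group} is outside the observed range")

    return warnings


result = ai_generated_mean_by_group(rows)
warnings = data_oriented_checks(rows, result)
print(result)
print(warnings)
```

Reading the body of `ai_generated_mean_by_group` corresponds to the program-oriented step, while `data_oriented_checks` inspects the data itself. Here the result looks plausible, and only the comparison with the input reveals that one row was left out.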