WatChat: Explaining perplexing programs by debugging mental models

Kartik Chandra, Katherine M. Collins, Will Crichton, Tony Chen, Tzu-Mao Li, Adrian Weller, Rachit Nigam, Joshua Tenenbaum, Jonathan Ragan-Kelley

2024-10-03

Abstract: Often, a good explanation for a program's unexpected behavior is a bug in the programmer's code. But sometimes, an even better explanation is a bug in the programmer's mental model of the language or API they are using. Instead of merely debugging our current code ("giving the programmer a fish"), what if our tools could directly debug our mental models ("teaching the programmer to fish")? In this paper, we apply recent ideas from computational cognitive science to offer a principled framework for doing exactly that. Given a "why?" question about a program, we automatically infer potential misconceptions about the language/API that might cause the user to be surprised by the program's behavior -- and then analyze those misconceptions to provide explanations of the program's behavior. Our key idea is to formally represent misconceptions as counterfactual (erroneous) semantics for the language/API, which can be inferred and debugged using program synthesis techniques. We demonstrate our framework, WatChat, by building systems for explanation in two domains: JavaScript type coercion, and the Git version control system. We evaluate WatChatJS and WatChatGit by comparing their outputs to experimentally-collected human-written explanations in these two domains: we show that WatChat's explanations exhibit key features of human-written explanation, unlike those of a state-of-the-art language model.

Programming Languages, Artificial Intelligence, Human-Computer Interaction

What problem does this paper attempt to address?

This paper addresses how to explain a program's unexpected behavior when the root cause is not a bug in the code but a misconception the programmer holds about the programming language or API. It proposes a framework, WatChat, that explains such behavior by debugging the user's mental model. Given a "why?" question about a program, WatChat automatically infers misconceptions that could make the behavior surprising, representing each as a counterfactual (erroneous) semantics for the language or API, and then analyzes those misconceptions to produce an explanation. The authors build systems in two domains, JavaScript type coercion (WatChatJS) and the Git version control system (WatChatGit), and compare the output against experimentally collected human-written explanations. They report that WatChat's explanations exhibit key features of human-written explanations, unlike those of a state-of-the-art language model. The broader aim is to correct the underlying misunderstanding rather than only fix the code at hand.
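To make the idea of "misconceptions as counterfactual semantics" concrete, here is a minimal Python sketch. It is not the WatChat implementation: it models only JavaScript's `+` on integers and strings, the two misconception names are hypothetical, and brute-force enumeration stands in for the program synthesis techniques the paper describes. Given the result a user expected, it searches for the smallest set of erroneous rules under which that expectation would have been correct.

```python
from itertools import combinations

# Hypothetical misconception names, invented for this illustration.
MISCONCEPTIONS = ["plus_always_numeric", "mixed_plus_throws"]


def evaluate(a, b, beliefs=frozenset()):
    """Toy semantics for JavaScript `a + b` on ints and strings.

    With no beliefs this follows the real rule: if either operand is a
    string, both are converted to strings and concatenated. Each belief
    swaps in a counterfactual (erroneous) rule.
    """
    mixed = isinstance(a, str) != isinstance(b, str)
    if mixed and "mixed_plus_throws" in beliefs:
        return "TypeError"  # user believes mixed-type `+` is an error
    if "plus_always_numeric" in beliefs:
        return int(a) + int(b)  # user believes `+` always adds numbers
    if isinstance(a, str) or isinstance(b, str):
        return str(a) + str(b)
    return a + b


def infer_misconception(a, b, expected):
    """Return the smallest belief set that yields `expected`, or None."""
    for size in range(1, len(MISCONCEPTIONS) + 1):
        for beliefs in combinations(MISCONCEPTIONS, size):
            if evaluate(a, b, frozenset(beliefs)) == expected:
                return set(beliefs)
    return None


actual = evaluate(1, "2")                   # real behavior: "12"
diagnosis = infer_misconception(1, "2", 3)  # user expected 3
```

In this toy setting, a user who expected `1 + "2"` to be `3` is diagnosed with the `plus_always_numeric` misconception, so an explanation can target that belief ("`+` concatenates when either operand is a string") instead of only restating the output.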

IntelliExplain: Enhancing Conversational Code Generation for Non-Professional Programmers

ChatDBG: An AI-Powered Debugging Assistant

An Inquisitive Code Editor for Addressing Novice Programmers' Misconceptions of Program Behavior

Bugsplainer: Leveraging Code Structures to Explain Software Bugs with Neural Machine Translation

Explaining Software Bugs Leveraging Code Structures in Neural Machine Translation

Explaining Explanation: An Empirical Study on Explanation in Code Reviews

Can Language Models Employ the Socratic Method? Experiments with Code Debugging

Explainable Automated Debugging via Large Language Model-driven Scientific Debugging

Debugging Pathways: Open-Ended Discrepancy Noticing, Causal Reasoning, and Intervening

Democratizing Chatbot Debugging: A Computational Framework for Evaluating and Explaining Inappropriate Chatbot Responses

Debugging Tests for Model Explanations

Toward Semantic Foundations for Program Editors

Nuances are the Key: Unlocking ChatGPT to Find Failure-Inducing Tests with Differential Prompting

Chatbots As Fluent Polyglots: Revisiting Breakthrough Code Snippets

ChatGPT Inaccuracy Mitigation during Technical Report Understanding: Are We There Yet?

Unfooling Perturbation-Based Post Hoc Explainers

Concolic Metamorphic Debugging

How to Teach Programming in the AI Era? Using LLMs as a Teachable Agent for Debugging

Integrating Personalized Parsons Problems with Multi-Level Textual Explanations to Scaffold Code Writing

Scaling CS1 Support with Compiler-Integrated Conversational AI