Instruct, Not Assist: LLM-based Multi-Turn Planning and Hierarchical Questioning for Socratic Code Debugging

Priyanka Kargupta, Ishika Agarwal, Dilek Hakkani-Tur, Jiawei Han
2024-08-20
Abstract: Socratic questioning is an effective teaching strategy, encouraging critical thinking and problem-solving. The conversational capabilities of large language models (LLMs) show great potential for providing scalable, real-time student guidance. However, current LLMs often give away solutions directly, making them ineffective instructors. We tackle this issue in the code debugging domain with TreeInstruct, an Instructor agent guided by a novel state space-based planning algorithm. TreeInstruct asks probing questions to help students independently identify and resolve errors. It estimates a student's conceptual and syntactical knowledge to dynamically construct a question tree based on their responses and current knowledge state, effectively addressing both independent and dependent mistakes concurrently in a multi-turn interaction setting. In addition to using an existing single-bug debugging benchmark, we construct a more challenging multi-bug dataset of 150 coding problems, incorrect solutions, and bug fixes -- all carefully constructed and annotated by experts. Extensive evaluation shows TreeInstruct's state-of-the-art performance on both datasets, proving it to be a more effective instructor than baselines. Furthermore, a real-world case study with five students of varying skill levels further demonstrates TreeInstruct's ability to guide students to debug their code efficiently with minimal turns and highly Socratic questioning. We provide our code and datasets at http://github.com/agarwalishika/TreeInstruct .
Computation and Language, Multiagent Systems
What problem does this paper attempt to address?
This paper addresses how to use large language models (LLMs) effectively for Socratic teaching in code debugging. Existing LLMs typically give solutions directly, which makes them poor guides in educational settings. The authors propose TreeInstruct, a method that uses state space estimation and tree-based Socratic questioning to guide students toward identifying and resolving code errors on their own.

### Main Issues

1. **Problems with Existing LLMs**:
   - Existing LLMs tend to provide direct solutions rather than guiding students to think.
   - Giving answers directly runs against the principles of Socratic teaching and does little to build students' critical thinking and problem-solving skills.
2. **Challenges in Code Debugging**:
   - Code debugging often involves multiple conceptual and syntactic errors that may be interdependent.
   - Existing methods typically assume single-turn feedback, ignoring the sub-steps a student needs in order to understand each error.

### Solution

- **TreeInstruct**:
  - **State Space Estimation**: Estimates the student's current knowledge state and uses it to dynamically construct a question tree that guides the student step by step.
  - **Tree-based Socratic Questioning**: Generates multi-turn guiding questions based on the student's responses and current knowledge state.
  - **Adaptive Dialogue Restructuring**: Dynamically adjusts the dialogue plan, including questions and teaching actions, based on the student's progress in the conversation.

### Experimental Validation

- **Datasets**:
  - Experiments use an existing single-bug debugging benchmark and a newly constructed multi-bug dataset of 150 coding problems.
- **Baseline Methods**:
  - A simple LLM baseline (Vanilla) and the BRIDGE method.
- **Evaluation Metrics**:
  - Quantitative metrics: success rate of error correction, average number of dialogue turns.
  - Qualitative metrics: relevance, indirectness, and logical coherence of questions.

### Contributions

- **Innovations**:
  - First exploration of state space estimation and dynamic tree-structured Socratic questioning.
  - Construction of a challenging multi-bug debugging dataset.
  - Extensive experiments showing that TreeInstruct outperforms the baselines, particularly on multi-bug debugging tasks.

### Conclusion

TreeInstruct guides students to resolve code errors independently by dynamically generating and adjusting question trees, in line with the principles of Socratic teaching. It outperforms the baselines on both datasets, and the case study with five students indicates that it helps them debug their code efficiently in few turns.
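
### Illustrative Sketch

The interplay between state space estimation and the question tree described under Solution can be made concrete with a small Python sketch. It is a simplified illustration, not the authors' implementation: the `QuestionNode` class, the `run_session` function, the boolean state representation, and the three callables standing in for LLM calls are all hypothetical names and assumptions introduced here. The idea shown is only the control flow: keep one state variable per thing the student must understand, ask a question about the first unresolved one, and descend to a narrower follow-up question when the reply does not resolve it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class QuestionNode:
    """One Socratic question in the tree (hypothetical structure)."""
    target: str   # state variable this question probes
    text: str     # the question shown to the student
    depth: int    # 0 = broadest question, larger = narrower follow-up
    children: List["QuestionNode"] = field(default_factory=list)


def run_session(
    state: Dict[str, bool],
    make_question: Callable[[str, int], str],
    student_reply: Callable[[str], str],
    is_resolved: Callable[[str, str], bool],
    max_depth: int = 3,
) -> List[QuestionNode]:
    """Build one question tree per unresolved state variable.

    `state` maps each concept the student must grasp to True/False.
    The three callables are stand-ins for LLM calls (assumptions).
    """
    roots: List[QuestionNode] = []
    for target in list(state):
        if state[target]:
            continue  # already understood, no question needed
        node = QuestionNode(target, make_question(target, 0), 0)
        roots.append(node)
        while True:
            reply = student_reply(node.text)
            if is_resolved(target, reply):
                state[target] = True  # update the estimated knowledge state
                break
            if node.depth + 1 >= max_depth:
                break  # stop after max_depth questions on this variable
            # Still unresolved: ask a narrower question one level deeper.
            child = QuestionNode(
                target, make_question(target, node.depth + 1), node.depth + 1
            )
            node.children.append(child)
            node = child
    return roots


# Toy usage with deterministic stubs in place of an LLM and a real student.
state = {"off_by_one_in_loop": False, "missing_return": False}
asked: List[str] = []


def make_question(target: str, depth: int) -> str:
    return f"[{target}] hint level {depth}"


def student_reply(question: str) -> str:
    asked.append(question)
    return question  # the stub student simply echoes the question


def is_resolved(target: str, reply: str) -> bool:
    # Stub verifier: the loop bug needs one follow-up, the other needs none.
    return target == "missing_return" or reply.endswith("level 1")


roots = run_session(state, make_question, student_reply, is_resolved)
```

In this toy run the first state variable needs one follow-up question and the second is resolved by its first question, so three questions are asked in total. A real system would replace the stubs with model calls and would need a richer state than one boolean per concept.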