Test Case-Informed Knowledge Tracing for Open-ended Coding Tasks

Zhangqi Duan, Nigel Fernandez, Alexander Hicks, Andrew Lan
2024-09-28
Abstract: Open-ended coding tasks, which ask students to construct programs according to certain specifications, are common in computer science education. Student modeling can be challenging since their open-ended nature means that student code can be diverse. Traditional knowledge tracing (KT) models that only analyze response correctness may not fully capture nuances in student knowledge from student code. In this paper, we introduce Test case-Informed Knowledge Tracing for Open-ended Coding (TIKTOC), a framework to simultaneously analyze and predict both open-ended student code and whether the code passes each test case. We augment the existing CodeWorkout dataset with the test cases used for a subset of the open-ended coding questions, and propose a multi-task learning KT method to simultaneously analyze and predict 1) whether a student's code submission passes each test case and 2) the student's open-ended code, using a large language model as the backbone. We quantitatively show that these methods outperform existing KT methods for coding that only use the overall score a code submission receives. We also qualitatively demonstrate how test case information, combined with open-ended code, helps us gain fine-grained insights into student knowledge.
Computers and Society, Computation and Language, Machine Learning
### What problem does this paper attempt to solve?

This paper addresses student modeling for open-ended coding tasks in computer science education. It proposes a framework named **Test case-Informed Knowledge Tracing for Open-ended Coding (TIKTOC)** that simultaneously analyzes and predicts both whether a student's code passes each test case and the student's open-ended code itself.

#### Background and problem description

1. **Challenges of open-ended programming tasks**:
   - Open-ended programming tasks require students to write programs according to given specifications, and such tasks are very common in computer science education.
   - Student code is diverse, and traditional knowledge tracing (KT) models that only analyze response correctness may not fully capture the details of student knowledge that the code reveals.
2. **Limitations of existing methods**:
   - Existing KT models mainly rely on binary correctness, that is, whether the code passes all test cases, to evaluate student code.
   - This evaluation lacks fine-grained insight into the specific errors or misunderstandings in student code.
3. **Importance of introducing test cases**:
   - Test cases can identify syntax, logic, and runtime errors in student code and provide more detailed feedback.
   - Analyzing whether student code passes each test case gives a better picture of students' programming ability and error types.

#### New methods proposed in the paper

- **TIKTOC framework**: A multi-task learning method that simultaneously predicts whether student code passes each test case and generates the student's open-ended code.
- **Dataset enhancement**: The existing CodeWorkout dataset is extended with the test cases for a subset of its problems, allowing a more comprehensive evaluation of student performance.
- **Model innovation**: A large language model (LLM) serves as the backbone, which predicts student code more accurately and supports fine-grained analysis of student knowledge through test case pass/fail results.

#### Main contributions

1. **Define a new KT task**: Analyze and predict whether student code passes each test case, which is more challenging than the traditional overall correctness evaluation.
2. **Dataset release**: Publicly release the enhanced CodeWorkout dataset to provide a benchmark for future research.
3. **Model improvement**: Significantly improve the accuracy of test case pass/fail prediction (AUC increased by 15%) and of code prediction (CodeBLEU increased by 6%) through the multi-task learning method.
4. **Application prospects**: Demonstrate how the new KT task and method can provide deeper and more fine-grained insights into student knowledge for computer science education, and discuss their potential applications and limitations.

### Summary

By introducing test case information, this paper proposes a more fine-grained knowledge tracing task, aiming to address the limited view of student knowledge states that traditional KT models provide for open-ended programming tasks. Using multi-task learning and a large language model, TIKTOC not only improves prediction accuracy but also gives educators richer tools for analyzing student code.
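
The contrast between binary correctness and per-test-case outcomes, and the idea of training on both prediction tasks at once, can be illustrated with a small sketch. This is not the authors' implementation: the function names, the weighted-sum form of the objective, and the `weight` parameter are assumptions made for illustration, and plain probabilities stand in for the outputs of an LLM backbone.

```python
import math
from typing import Sequence


def overall_correct(passes: Sequence[bool]) -> bool:
    """Binary correctness as used by traditional KT: every test case passes."""
    return all(passes)


def pass_fail_bce(probs: Sequence[float], labels: Sequence[bool]) -> float:
    """Mean binary cross-entropy over per-test-case pass predictions."""
    eps = 1e-12  # clamp to avoid log(0)
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        total += -math.log(p) if y else -math.log(1.0 - p)
    return total / len(labels)


def code_nll(token_probs: Sequence[float]) -> float:
    """Mean negative log-likelihood of the tokens in the student's code."""
    return -sum(math.log(max(p, 1e-12)) for p in token_probs) / len(token_probs)


def multitask_loss(
    probs: Sequence[float],
    labels: Sequence[bool],
    token_probs: Sequence[float],
    weight: float = 0.5,  # hypothetical trade-off between the two tasks
) -> float:
    """Assumed weighted sum of the code-prediction and test-case losses."""
    return weight * code_nll(token_probs) + (1.0 - weight) * pass_fail_bce(probs, labels)


# A submission that fails only the second test case: binary correctness
# collapses this to "incorrect", while the per-test-case labels keep the detail.
labels = [True, False, True]
print(overall_correct(labels))
print(multitask_loss([0.9, 0.2, 0.8], labels, [0.7, 0.6, 0.9]))
```

A weighted sum is only one common way to combine two task losses; the point of the sketch is that the per-test-case labels carry information that the single all-or-nothing label discards.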