SelfCode: An Annotated Corpus and a Model for Automated Assessment of Self-Explanation During Source Code Comprehension

Jeevan Chapagain,Zak Risha,Rabin Banjade,Priti Oli,Lasang Tamang,Peter Brusilovsky,Vasile RUs
DOI: https://doi.org/10.32473/flairs.36.133385
2023-05-08
The International FLAIRS Conference Proceedings
Abstract:The ability to automatically assess learners' activities is the key to user modeling and personalization in adaptive educational systems.The work presented in this paper opens an opportunity to expand the scope of automated assessment from traditional programming problems to code comprehension tasks where students are requested to explain the critical steps of a program. The ability to automatically assess these self-explanations offers a unique opportunity to understand the current state of student knowledge, recognize possible misconceptions, and provide feedback. Annotated datasets are needed to train Artificial Intelligence/Machine Learning approaches for the automated assessment of student explanations. To answer this need, we present a novel corpus called SelfCode which consists of 1,770 sentence pairs of student and expert self-explanations of Java code examples, along with semantic similarity judgments provided by experts. We also present a baseline automated assessment model that relies on textual features. The corpus is available at the GitHub repository (https://github.com/jeevanchaps/SelfCode).
What problem does this paper attempt to address?