Abstract: Programming education at scale increasingly relies on automated feedback to help students learn to program. An important form of feedback is to point out semantic errors in student programs and provide hints for program repair. Such automated feedback depends essentially on solving three tasks: classification, localization and repair of semantic errors. Although datasets exist for these tasks, we observe that none of them has annotations supporting all three. As a result, existing approaches to semantic error feedback treat error classification, localization and repair as independent tasks, resulting in sub-optimal performance on each task. Moreover, existing datasets either contain few programming assignments or have few programs for each assignment. Existing approaches therefore often rely on rule-based methods and are evaluated on a small number of programming assignments. To address these problems, we first describe the creation of a new dataset, COJ2022, that contains 5,914 C programs with semantic errors submitted for 498 different assignments in an introductory programming course, where each program is annotated with its error types and locations and is coupled with the repaired program submitted by the same student. We show the advantages of COJ2022 over existing datasets in several respects. Second, we treat semantic error classification, localization and repair as dependent tasks, and propose a novel two-stage method, ErrorCLR, to solve them. Specifically, in the first stage we train a model based on graph matching networks to jointly classify and localize potential semantic errors in student programs, and in the second stage we mask error spans in buggy programs using the error type and location information and train a CodeT5 model to predict correct spans. The predicted spans replace the error spans to form repaired programs.
Experimental results show that ErrorCLR remarkably outperforms the compared methods on all three tasks, both on COJ2022 and on other public datasets. We also conduct a case study to visualize and interpret what the graph matching network in ErrorCLR learns. We have released the source code and COJ2022 at https://github.com/DaSESmartEdu/ErrorCLR.
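
The mask-and-replace step in the second stage can be illustrated with a minimal sketch. This is not the released ErrorCLR code: the helper names `mask_spans` and `fill_spans`, the use of character offsets for error locations, and the T5-style `<extra_id_N>` sentinel format are all assumptions made for illustration, and the "predicted" span is supplied by hand rather than generated by a model.

```python
from typing import List, Tuple

# Hypothetical representation: (start, end) character offsets, end exclusive.
Span = Tuple[int, int]


def mask_spans(source: str, spans: List[Span]) -> str:
    """Replace each non-overlapping error span with a T5-style sentinel token."""
    pieces = []
    cursor = 0
    for i, (start, end) in enumerate(sorted(spans)):
        if start < cursor or end < start or end > len(source):
            raise ValueError("spans must be in range and non-overlapping")
        pieces.append(source[cursor:start])   # untouched code before the span
        pieces.append(f"<extra_id_{i}>")      # assumed sentinel format
        cursor = end
    pieces.append(source[cursor:])            # untouched code after the last span
    return "".join(pieces)


def fill_spans(masked: str, predictions: List[str]) -> str:
    """Substitute the i-th predicted span for the i-th sentinel token."""
    for i, text in enumerate(predictions):
        masked = masked.replace(f"<extra_id_{i}>", text, 1)
    return masked


# Toy example: an off-by-one loop bound marked as the error span.
buggy = "for (i = 0; i <= n; i++) sum += a[i];"
start = buggy.index("i <= n")
masked = mask_spans(buggy, [(start, start + len("i <= n"))])

# In the described method a trained model would predict this span;
# here it is hard-coded to show the replacement step only.
repaired = fill_spans(masked, ["i < n"])

print(masked)
print(repaired)
```

In this sketch the localization output from the first stage is reduced to a list of offsets; how the error type is encoded into the model input is not specified in the abstract and is left out.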

ErrorCLR: Semantic Error Classification, Localization and Repair for Introductory Programming Assignments

Peer-aided Repairer: Empowering Large Language Models to Repair Advanced Student Assignments

Automated Correction for Syntax Errors in Programming Assignments using Recurrent Neural Networks

Automated Feedback Generation for Competition-Level Code

CSEC: A Chinese Semantic Error Correction Dataset for Written Correction

CLACER: A Deep Learning-based Compilation Error Classification Method for Novice Students' Programs

Graph-based, Self-Supervised Program Repair from Diagnostic Feedback

Logic Error Localization in Student Programming Assignments Using Pseudocode and Graph Neural Networks

Data-Driven Feedback Generation for Introductory Programming Exercises

Flexible control flow graph alignment for delivering data-driven feedback to novice programming learners

CSED: A Chinese Semantic Error Diagnosis Corpus

SCG_FBS - A Code Grading Model for Students' Program in Programming Education

Jointly Learning to Repair Code and Generate Commit Message

Benchmarking Educational Program Repair

GGF: A Graph-based Method for Programming Language Syntax Error Correction

Improving Pre-trained Language Models with Syntactic Dependency Prediction Task for Chinese Semantic Error Recognition

Annotation Error Detection: Analyzing the Past and Present for a More Coherent Future

Semantic Code Repair using Neuro-Symbolic Transformation Networks

Repairing Bugs in Python Assignments Using Large Language Models

SeSICL: Semantic and Structural Integrated Contrastive Learning for Knowledge Graph Error Detection