Applying Coding Behavior Features to Student Plagiarism Detection on Programming Assignments
Zheng Li, Yuting Zhang, Yong Liu, Yonghao Wu, ShuMei Wu
DOI: https://doi.org/10.1142/s0218126623502869
2023-01-01
Journal of Circuits Systems and Computers
Abstract: In programming education, the result of plagiarism detection is a crucial criterion for assessing whether students can pass course exams. Most prevalent methods for detecting student plagiarism analyze source code: they extract features (such as tokens, abstract syntax trees and control flow graphs) from the source code, measure the similarity between programs using various similarity detection methods, and then flag plagiarism based on a predefined threshold. However, these methods have three problems. First, they are less effective at detecting code modifications related to structure. Second, they require a considerable amount of training data, which demands high computing time and space. Third, they cannot determine in a timely manner whether students have plagiarized. We propose a novel plagiarism detection method that analyzes students’ behavioral features during the coding process. Specifically, we extract five behavioral features based on students’ programming habits. Then, we use a feature ranking-based suspiciousness algorithm to estimate the likelihood of student plagiarism. Based on our proposed method, we develop the Online Integrated Programming Platform. To evaluate the accuracy of our method, we conduct a series of experiments. The experimental results indicate that our method achieves Accuracy, Precision, Recall and F1 values of 0.95, 0.90, 0.95 and 0.92, respectively. Finally, we also analyze the correlation between whether students plagiarized and their regular and final grades, which further supports the effectiveness of our proposed method.
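
As a quick consistency check on the reported metrics, the sketch below recomputes the fourth value from the reported Precision and Recall, assuming that value is the standard F1 score (the harmonic mean of precision and recall). The function name `f1_score` and the variable names are illustrative and do not come from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (standard F1)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract.
reported_precision = 0.90
reported_recall = 0.95

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 2))  # prints 0.92
```

Under that assumption, the recomputed value agrees with the reported 0.92.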