Abstract: In recent years, millions of pieces of source code have been written every day in different languages all over the world. A deep neural network-based intelligent support model for source code completion would be a great advantage in the software engineering and programming education fields. Many syntax, logic, and other critical errors that conventional compilers cannot detect persist in source code, and an intelligent evaluation methodology that does not rely on manual compilation has become essential. Even experienced programmers often need to analyze an entire program to find a single error, and so lose valuable time debugging their source code. With this in mind, we propose an intelligent model for source code completion that combines long short-term memory (LSTM) with an attention mechanism. The proposed model can detect source code errors together with their locations and then predict the correct words. In addition, it can classify source code as erroneous or not. We trained the proposed model on source code and then evaluated its performance. All of the data used in our experiments were extracted from the Aizu Online Judge (AOJ) system. The experimental results show that the proposed model achieves approximately 62% accuracy in error detection and prediction and approximately 96% accuracy in source code classification, outperforming a standard LSTM and other state-of-the-art models. Moreover, in comparison with state-of-the-art models, the proposed model was notably successful at error detection, prediction, and classification when applied to long source code sequences. Overall, these experimental results indicate the usefulness of the proposed model in the software engineering and programming education arenas.
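The workflow described in the abstract (flag an unlikely token, report its location, suggest a replacement, and classify the whole sequence) can be illustrated with a minimal sketch. This is not the authors' model: a toy bigram frequency table stands in for the trained LSTM with attention. The function names, the tiny corpus, and the probability threshold of 0.1 are all assumptions made purely for illustration.

```python
from collections import Counter, defaultdict

START = "<s>"  # hypothetical start-of-sequence marker


def train_bigram(sequences):
    """Count next-token frequencies. A toy stand-in for the trained LSTM."""
    counts = defaultdict(Counter)
    for tokens in sequences:
        for prev, nxt in zip([START] + tokens, tokens):
            counts[prev][nxt] += 1
    return counts


def next_token_probs(counts, prev):
    """Return a dict of next-token probabilities given the previous token."""
    total = sum(counts[prev].values())
    if total == 0:
        return {}
    return {tok: c / total for tok, c in counts[prev].items()}


def detect_errors(counts, tokens, threshold=0.1):
    """Return (position, found_token, suggested_token) for unlikely tokens.

    The threshold is an assumed value, not one taken from the paper.
    """
    findings = []
    prev = START
    for i, tok in enumerate(tokens):
        probs = next_token_probs(counts, prev)
        if probs and probs.get(tok, 0.0) < threshold:
            suggestion = max(probs, key=probs.get)
            findings.append((i, tok, suggestion))
        prev = tok
    return findings


def is_erroneous(counts, tokens, threshold=0.1):
    """Classify a token sequence as erroneous if any token is flagged."""
    return bool(detect_errors(counts, tokens, threshold))


# Tiny illustrative corpus of tokenized statements (assumed data).
corpus = [
    ["int", "x", "=", "0", ";"],
    ["int", "y", "=", "1", ";"],
    ["x", "=", "y", ";"],
]
model = train_bigram(corpus)

print(detect_errors(model, ["int", "x", "=", "0", "="]))  # [(4, '=', ';')]
print(is_erroneous(model, ["x", "=", "y", ";"]))          # False
```

In a real system the bigram table would be replaced by a sequence model's next-token distribution, which is where long-range context (and, per the abstract, the attention mechanism) would matter. The bigram stand-in only looks one token back.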

SelfCode: An Annotated Corpus and a Model for Automated Assessment of Self-Explanation During Source Code Comprehension

Automated Assessment of Student Self-explanation During Source Code Comprehension

Comparing Code Explanations Created by Students and Large Language Models

Automated Assessment of Students' Code Comprehension using LLMs

Explaining Code with a Purpose: An Integrated Approach for Developing Code Comprehension and Prompting Skills

CompCodeVet: A Compiler-guided Validation and Enhancement Approach for Code Dataset

Explaining Explanation: An Empirical Study on Explanation in Code Reviews

Let's Ask Students About Their Programs, Automatically

Enriching Source Code with Contextual Data for Code Completion Models: An Empirical Study

CodeQA: A Question Answering Dataset for Source Code Comprehension

A Neural Network Based Intelligent Support Model for Program Code Completion

INSPECT: Intrinsic and Systematic Probing Evaluation for Code Transformers

CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion

How Novices Use LLM-Based Code Generators to Solve CS1 Coding Tasks in a Self-Paced Learning Environment

RLCoder: Reinforcement Learning for Repository-Level Code Completion

A Language-Agnostic Model for Semantic Source Code Labeling

CoaCor: Code Annotation for Code Retrieval with Reinforcement Learning

Context-aware Code Summary Generation

CodEval: Improving Student Success In Programming Assignments

Code Generation Based Grading: Evaluating an Auto-grading Mechanism for "Explain-in-Plain-English" Questions

An Empirical Study on Self-correcting Large Language Models for Data Science Code Generation