Abstract: Software engineering workflows rely on version control systems to track changes and merge contributions from multiple developers. This complicates testing: it is impractical to test the whole codebase to ensure that each change is defect-free, yet testing only the changed files is not enough. Just-in-time software defect prediction (JIT-SDP) systems have been proposed to address this by predicting the likelihood that a code change is defective. Numerous techniques have been studied for building JIT-SDP models, but the potential of pre-trained code transformer language models for this task remains underexplored. These models have achieved human-level performance in code understanding and software engineering tasks. Inspired by that, we modeled change defect prediction as a text classification task using these pre-trained models. We investigated this idea on a recently published dataset, ApacheJIT, consisting of 44k commits. We concatenated the changed lines of each commit into one string and augmented it with the commit message and static code metrics. We performed parameter-efficient fine-tuning of four pre-trained models, JavaBERT, CodeBERT, CodeT5, and CodeReviewer, using either partially frozen layers or low-rank adaptation (LoRA). We also experimented with Local, Sparse, and Global (LSG) attention variants, which handle long commits efficiently and reduce memory consumption. As far as the authors are aware, this is the first investigation into the ability of pre-trained code models to detect defective changes in the ApacheJIT dataset. Our results show that proper fine-tuning improves the F1 scores of the chosen models on defect prediction. CodeBERT and CodeReviewer achieved 10% and 12% increases in F1 score, respectively, over the best baseline models, JITGNN and JITLine, when commit messages and code metrics are included. Our approach sheds more light on the abilities of language models in software engineering tasks, promoting their use in production environments to help keep deployed software defect-free efficiently.
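To make the input construction and the evaluation metric concrete, the following is a minimal sketch of how a commit might be flattened into a single string for a text classifier, together with a plain F1 computation. It is an illustration under stated assumptions, not the authors' implementation: the `Commit` class, the `build_input` and `f1_score` helpers, the `</s>` separator, the `+`/`-` line prefixes, the field order, and the metric names `la` and `ld` are all hypothetical choices made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Commit:
    """Hypothetical container for one commit (not the ApacheJIT schema)."""
    message: str
    added_lines: List[str]
    removed_lines: List[str]
    metrics: Dict[str, float] = field(default_factory=dict)


def build_input(commit: Commit, sep: str = " </s> ") -> str:
    """Flatten a commit into one string: message, changed lines, metrics.

    The separator and the order of the parts are assumptions for this sketch.
    """
    added = " ".join("+ " + line.strip() for line in commit.added_lines)
    removed = " ".join("- " + line.strip() for line in commit.removed_lines)
    # Render metrics as sorted key=value pairs so the output is deterministic.
    metrics = " ".join(f"{k}={v:g}" for k, v in sorted(commit.metrics.items()))
    parts = [commit.message.strip(), added, removed, metrics]
    # Skip empty parts so that no dangling separators appear.
    return sep.join(p for p in parts if p)


def f1_score(y_true: List[int], y_pred: List[int]) -> float:
    """F1 for the positive (defective = 1) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


example = Commit(
    message="Fix NPE",
    added_lines=["if (x != null) {"],
    removed_lines=["x.foo();"],
    metrics={"la": 1, "ld": 1},  # hypothetical metric names
)
text = build_input(example)
```

In such a setup, the resulting string would be tokenized and passed to the fine-tuned model, and long commits would need truncation or a long-input attention variant such as LSG.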

Parameter-efficient fine-tuning of pre-trained code models for just-in-time defect prediction
