Transductive Instance Transfer Learning for Cross-Language Defect Prediction

Rahul Chopra, Shreoshi Roy, Ruchika Malhotra
DOI: https://doi.org/10.1145/3512353.3512379
2022-01-14
Abstract: Predicting defects (bugs) is critical to increasing software quality. Many software defect prediction algorithms have been proposed, and many of them have been shown to be effective in practice. However, because existing work is largely limited to a single project, its effectiveness in predicting cross-project defects is usually poor. This is primarily due to class imbalance and discrepancies in feature distribution between the source and destination projects. Because of such disparities in distribution among datasets from different studies, developing high-quality Cross-Project Defect Prediction (CPDP) models remains difficult. In our study, instead of collecting data from a single project, we collected source code from multiple code submissions on a programming contest website and employed Natural Language Processing (NLP) models to detect software defects in them.