Empirical validation of machine learning techniques for heterogeneous cross-project change prediction and within-project change prediction

Ruchika Malhotra, Shweta Meena
DOI: https://doi.org/10.1016/j.jocs.2024.102230
IF: 3.817
2024-02-11
Journal of Computational Science
Abstract: Software change prediction plays a key role in maintaining software quality. Identifying change-prone parts of a software system in the early stages helps optimize the resources and effort required for maintenance. A change prediction model can be built by using the same project for training and testing, which is termed within-project change prediction (WPCP). Sometimes the available data is not sufficient for both training and testing; in such cases, it is beneficial to use other projects for testing. Using two different projects for training and testing to identify changes is termed Cross-Project Change Prediction (CPCP). In CPCP, the data distribution of the features differs between the projects. The scalability of prediction models can be increased by using two different projects with different features for training and testing, which is termed Heterogeneous Cross-Project Change Prediction (HCPCP). Thus, the usability of such prediction models increases for future projects with unseen data. In this study, the feasibility of HCPCP and WPCP is analyzed for open-source projects. HCPCP uses feature-type transfer learning (TL) in this study. The analysis of HCPCP and CPCP is carried out for open-source projects to enhance software quality attributes such as maintainability, reliability, and robustness. Prediction models are developed using various Machine Learning (ML) techniques. The performance of the change prediction models is analyzed using the Area Under the Curve (AUC).
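As a rough illustration of the AUC metric named in the abstract, the sketch below computes it as the fraction of (change-prone, not change-prone) pairs that a model ranks in the right order, with ties counted as half. This is not the authors' implementation; the function name `auc` and the labels and scores are hypothetical values chosen only for illustration.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    labels: 1 for a change-prone class, 0 otherwise.
    scores: predicted change-proneness, higher means more likely to change.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # pair ranked correctly
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted scores for six classes of a test project.
labels = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.7, 0.8, 0.1]
example_auc = auc(labels, scores)
print(round(example_auc, 3))
```

Here 8 of the 9 positive/negative pairs are ordered correctly, so the AUC is 8/9. The same function applies whether the scores come from a within-project or a cross-project model; only the source of the training data differs.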
computer science, theory & methods, interdisciplinary applications