Employing CNN with Spatial Pyramid Pooling for Predicting Software Defects Through Image Analysis

Zong-Yi Chen, Chin-Yu Huang, Jing-Rong Lin, Chih-Chiang Fang, William Cheng-Chung Chu
DOI: https://doi.org/10.1109/qrs62785.2024.00039
2024-01-01
Abstract: Software defect prediction (SDP) is an essential technique for identifying potential defects in software projects. SDP generally consists of two procedures: extracting features from source code and building a classification model with machine learning methods. However, SDP faces specific limitations. For example, machine learning models require a fixed input size, whereas program sizes vary widely. Another limitation is that the training data may be too scarce for machine learning, and it is extremely difficult to handle class imbalance and dataset expansion. In this study, we propose a method called spatial pyramid pooling for defect prediction (SPP-DP). First, it converts each source file into multiple images of different lengths and widths, which addresses the limitations of class imbalance handling and data augmentation. Second, it feeds these images into a convolutional neural network (CNN) to build a classifier that predicts software defects. We add a spatial pyramid pooling layer (SPP-Layer) to the CNN to relax the fixed-input-size limitation. In a comparison with other deep learning-based techniques on five datasets, the experimental results show that SPP-DP is effective: it can balance the dataset and provides better software defect identification ability.
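To illustrate why a spatial pyramid pooling layer removes the fixed-input-size constraint mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a simple byte-per-pixel encoding of a source file (the abstract does not specify the conversion), and the helper names `bytes_to_image` and `spp_max_pool` as well as the pyramid levels `(1, 2, 4)` are hypothetical. For brevity the pooling is applied directly to the pixel grid; in a CNN it would typically be applied to each channel of the last convolutional feature map.

```python
import math


def bytes_to_image(data, width):
    """Assumed encoding: one byte per pixel, rows of `width`, last row zero-padded."""
    rows = math.ceil(len(data) / width)
    padded = data + bytes(rows * width - len(data))
    return [list(padded[r * width:(r + 1) * width]) for r in range(rows)]


def spp_max_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a 2D map of any size into sum(n * n for n in levels) values."""
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Floor/ceil boundaries cover the whole map; max() keeps bins non-empty
            # even when the map has fewer rows or columns than bins.
            top = (i * height) // n
            bottom = max(top + 1, math.ceil((i + 1) * height / n))
            for j in range(n):
                left = (j * width) // n
                right = max(left + 1, math.ceil((j + 1) * width / n))
                pooled.append(max(feature_map[r][c]
                                  for r in range(top, bottom)
                                  for c in range(left, right)))
    return pooled


# One source file rendered at several widths yields images of different shapes,
# yet every pooled vector has the same length (1 + 4 + 16 = 21).
source = b"int main(void) { return 0; }\n"
for image_width in (4, 8, 16):
    image = bytes_to_image(source, image_width)
    vector = spp_max_pool(image)
    print(len(image), "x", image_width, "->", len(vector), "features")
```

Because the number of bins per level is fixed while the bin size adapts to the input, the output length depends only on the chosen levels. This is what lets a classifier head with a fixed input size accept images of varying shape.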