Abstract: We propose a semantic feature enhancement‐based defect prediction framework (SFE‐DP), which enriches the training data with enhanced semantic information from programs to improve the learning capability of the model. The framework builds on a deep transfer learning model, adds a self‐attention layer and a matching layer, and optimizes its parameters with a feature representation learning method based on a hybrid loss function. We evaluated SFE‐DP on 10 open‐source projects, where it achieved better prediction performance than traditional cross‐project defect prediction (CPDP) approaches in terms of area under the curve (AUC) and F1‐measure.

Summary: Although cross‐project defect prediction (CPDP) techniques that build defect prediction models from traditional manual features are well developed, they usually ignore the semantic and structural information inside the program and fail to capture the hidden features that are critical for program category prediction, resulting in poor defect prediction results. Researchers have proposed using deep learning to automatically extract the semantic features of programs and fuse them with traditional features as training data. In practice, however, two questions remain open: how to represent the semantic features of programs effectively, and in what ratio the two types of features should be fused to maximize the effectiveness of the model. In this paper, we propose a semantic feature enhancement‐based defect prediction framework (SFE‐DP), which applies data augmentation to the semantic feature set extracted from program code. We also introduce a self‐attention layer and a matching layer into the model structure to filter out low‐efficiency and non‐critical semantic features. Finally, we use a hybrid loss function to iteratively optimize the model parameters. Extensive experiments show that SFE‐DP outperforms the baseline approaches on 90 pairs of CPDP tasks formed from 10 open‐source projects.
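The evaluation setup described above can be illustrated with a minimal sketch. It assumes that each CPDP task is an ordered (source, target) pair of distinct projects, which is the reading under which 10 projects yield 90 tasks. The project names are hypothetical placeholders, and the `f1_measure` and `auc` helpers are textbook definitions of the two metrics, not the authors' evaluation code.

```python
from itertools import permutations

# Hypothetical placeholder names; the source does not list the 10 projects.
projects = [f"project_{i}" for i in range(1, 11)]

# Assumption: one CPDP task = train on a source project, test on a different target.
cpdp_tasks = list(permutations(projects, 2))


def f1_measure(y_true, y_pred):
    """F1 for binary labels (1 = defective, 0 = clean)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true, scores):
    """AUC as the probability that a defective instance outscores a clean one.

    Ties between a defective and a clean instance count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one instance of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    print(len(cpdp_tasks))  # number of ordered source-target pairs
    print(f1_measure([1, 0, 1, 1], [1, 0, 0, 1]))
    print(auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6]))
```

In a full experiment, each of the ordered pairs would be scored with both metrics and the results aggregated across tasks; the aggregation scheme is not specified in the text above.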

Mutation‐based data augmentation for software defect prediction

A Hybrid Sampling and Multi-Objective Optimization Approach for Enhanced Software Defect Prediction

SDP-MTF: A Composite Transfer Learning and Feature Fusion for Cross-Project Software Defect Prediction

Spotting Code Mutation for Predictive Mutation Testing

UDA-DP: Unsupervised Domain Adaptation for Software Defect Prediction

Software Defect Prediction Using Deep Q‐Learning Network‐Based Feature Extraction

A novel defect prediction method based on semantic feature enhancement

Hybrid deep architecture for software defect prediction with improved feature set

Combined Classifier for Cross-Project Defect Prediction: an Extended Empirical Study.

Optimized Deeplearning Algorithm for Software Defects Prediction

An Improved Semi-Supervised Learning Method for Software Defect Prediction.

Mitigating Data Imbalance for Software Vulnerability Assessment: Does Data Augmentation Help?

Over-sampling method for tackling class imbalance in software defect prediction based on generative adversarial networks

IMDAC: A robust intelligent software defect prediction model via multi‐objective optimization and end‐to‐end hybrid deep learning networks

Performance evaluation of software defect prediction with NASA dataset using machine learning techniques

Mutation boosted salp swarm optimizer meets rough set theory: A novel approach to software defect detection

A Novel Class-Imbalance Learning Approach for Both Within-Project and Cross-Project Defect Prediction.

An Approach to Semantic and Structural Features Learning for Software Defect Prediction

A hybrid‐ensemble model for software defect prediction for balanced and imbalanced datasets using AI‐based techniques with feature preservation: SMERKP‐XGB

An Improved Twin Support Vector Machine Based on Multi-Objective Cuckoo Search for Software Defect Prediction.

An Improved Transfer Adaptive Boosting Approach for Mixed‐project Defect Prediction