Abstract: Software defects, also referred to as software bugs, are anomalies or flaws in a computer program that cause the software to behave unexpectedly or produce incorrect results. These defects can manifest in various forms, including coding errors, design flaws, and logic mistakes, and they can emerge at any stage of the software development lifecycle. Traditional prediction models often show limited prediction performance. To address this issue, this paper proposes a novel prediction model using a Hybrid Grey Wolf Optimizer and Particle Swarm Optimization (HGWOPSO) algorithm. This research aims to determine whether the HGWOPSO model can improve the effectiveness of software defect prediction compared with the base PSO and GWO algorithms without hybridization. Furthermore, the study aims to determine the effectiveness of different gradient boosting classification algorithms when combined with HGWOPSO feature selection in predicting software defects. The study uses 13 NASA MDP datasets. Each dataset is divided into training and testing data using 10-fold cross-validation. After the data is divided, the SMOTE technique is applied to the training data. This technique generates synthetic samples to balance the dataset, with the aim of improving the performance of the predictive model. Subsequently, feature selection is conducted using the HGWOPSO algorithm. Each subset of the NASA MDP datasets is then processed by three boosting classification algorithms, namely XGBoost, LightGBM, and CatBoost. Performance evaluation is based on the area under the ROC curve (AUC). The average AUC values yielded by HGWOPSO XGBoost, HGWOPSO LightGBM, and HGWOPSO CatBoost are 0.891, 0.881, and 0.894, respectively. The results indicate that using the HGWOPSO algorithm improved AUC performance compared with the base GWO and PSO algorithms. Specifically, HGWOPSO CatBoost achieved the highest average AUC of 0.894.
This represents a 6.5% increase in AUC with a significance value of 0.00552 compared to PSO CatBoost, and a 6.3% increase in AUC with a significance value of 0.00148 compared to GWO CatBoost. The study demonstrates that HGWOPSO significantly improves the performance of software defect prediction. The implication of this research is that software defect prediction models can be enhanced by incorporating hybrid optimization techniques and combining them with gradient boosting algorithms, which can potentially identify and address defects more accurately.
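The data-preparation order described in the abstract (split first, then oversample only the training part) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which SMOTE implementation or fold-splitting routine was used, so the function names `k_fold_indices`, `smote`, and `balanced_training_folds` are hypothetical, the label `1` is assumed to mark defective modules as the minority class, and the HGWOPSO feature selection and boosting classifiers are deliberately left out.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]


def smote(minority, n_new, k_neighbors=5, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k_neighbors])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def balanced_training_folds(X, y, k=10, seed=0):
    """Yield (X_train, y_train, X_test, y_test) for each fold.

    Oversampling is applied to the training part only; the test part keeps
    its original samples. Assumes label 1 is the minority (defective) class.
    """
    folds = k_fold_indices(len(X), k, seed)
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        X_train = [X[j] for j in train_idx]
        y_train = [y[j] for j in train_idx]
        minority = [x for x, label in zip(X_train, y_train) if label == 1]
        n_new = (len(y_train) - len(minority)) - len(minority)
        if n_new > 0 and len(minority) >= 2:
            new_points = smote(minority, n_new, seed=seed + i)
            X_train = X_train + new_points
            y_train = y_train + [1] * len(new_points)
        yield X_train, y_train, [X[j] for j in test_idx], [y[j] for j in test_idx]
```

Each yielded training set would then go through feature selection and classifier fitting, with AUC computed on the matching untouched test fold. Oversampling after the split keeps synthetic samples out of the test folds, so they cannot influence the evaluation.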

Optimizing Software Defect Prediction Models: Integrating Hybrid Grey Wolf and Particle Swarm Optimization for Enhanced Feature Selection with Popular Gradient Boosting Algorithm
