An Empirical Study of the Impact of Test Strategies on Online Optimization for Ensemble-Learning Defect Prediction

Kensei Hamamoto, Masateru Tsunoda, Amjed Tahir, Kwabena Ebo Bennin, Akito Monden, Koji Toda, Keitaro Nakasai, Kenichi Matsumoto
2024-09-10
Abstract: Ensemble learning methods have been used to enhance the reliability of defect prediction models. However, there is an inconclusive stability of a single method attaining the highest accuracy among various software projects. This work aims to improve the performance of ensemble-learning defect prediction among such projects by helping select the highest accuracy ensemble methods. We employ bandit algorithms (BA), an online optimization method, to select the highest-accuracy ensemble method. Each software module is tested sequentially, and bandit algorithms utilize the test outcomes of the modules to evaluate the performance of the ensemble learning methods. The test strategy followed might impact the testing effort and prediction accuracy when applying online optimization. Hence, we analyzed the test order's influence on BA's performance. In our experiment, we used six popular defect prediction datasets, four ensemble learning methods such as bagging, and three test strategies such as testing positive-prediction modules first (PF). Our results show that when BA is applied with PF, the prediction accuracy improved on average, and the number of found defects increased by 7% on a minimum of five out of six datasets (although with a slight increase in the testing effort by about 4% from ordinal ensemble learning). Hence, BA with PF strategy is the most effective to attain the highest prediction accuracy using ensemble methods on various projects.
Software Engineering
What problem does this paper attempt to address?
This paper asks how to improve the accuracy of software defect prediction by selecting the best-performing ensemble learning method. It also asks how different test strategies affect prediction accuracy and testing effort when online optimization with multi-armed bandit algorithms (BA) is applied. Specifically, the authors focus on:

1. **Selection of Ensemble Learning Methods**: Ensemble learning methods (such as bagging, boosting, and stacking) perform very differently from project to project, and it is hard to know in advance which one will achieve the highest accuracy on a given project. A mechanism is therefore needed to dynamically select the method best suited to the current project.
2. **Impact of Testing Strategies**: Different test strategies (such as testing larger modules first, smaller modules first, or modules predicted to be defective first) may affect the performance of BA, and thus the final prediction accuracy and testing effort. The paper therefore studies the specific impact of these strategies on BA's performance.
3. **Application of Online Optimization**: BA performs online optimization by selecting the best ensemble learning method through step-by-step testing and feedback. This raises the question of how test results should be used to evaluate the ensemble learning methods and adjust the selection.

### Main Objectives of the Paper

- **Improve the Prediction Accuracy of Ensemble Learning Methods**: Apply BA to reliably select the best-performing ensemble learning method across multiple software projects.
- **Optimize Testing Strategies**: Analyze the impact of different test strategies (LF, SF, and PF) on the performance of BA, and find the strategy that improves prediction accuracy without significantly increasing the testing effort.
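
The selection loop described above can be illustrated with a minimal sketch. It is not the authors' implementation. The summary does not say which bandit algorithm is used, so epsilon-greedy stands in here as one common choice. The `Module` fields, the function names, the reward (1 when the selected method's prediction matches the test outcome), and the reading of LF and SF as "larger first" and "smaller first" are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Module:
    size: int           # module size, e.g. lines of code (hypothetical field)
    defective: bool     # true outcome, only "revealed" once the module is tested
    predictions: dict   # arm name -> True if that method predicts a defect


def next_index(remaining, strategy, arm):
    """Return the index in `remaining` of the next module to test."""
    indices = range(len(remaining))
    if strategy == "LF":   # assumed meaning: larger modules first
        return max(indices, key=lambda i: remaining[i].size)
    if strategy == "SF":   # assumed meaning: smaller modules first
        return min(indices, key=lambda i: remaining[i].size)
    if strategy == "PF":   # modules the current arm predicts defective first
        for i in indices:
            if remaining[i].predictions[arm]:
                return i
        return 0           # no positive predictions left: take the next module
    raise ValueError(f"unknown strategy: {strategy}")


def run_bandit(modules, arms, strategy="PF", epsilon=0.1, seed=0):
    """Test every module once; return per-arm (hits, pulls) counters."""
    rng = random.Random(seed)
    hits = {a: 0 for a in arms}
    pulls = {a: 0 for a in arms}

    def score(a):
        # Untried arms score infinity, so an exploit step tries them first.
        return hits[a] / pulls[a] if pulls[a] else float("inf")

    remaining = list(modules)
    while remaining:
        if rng.random() < epsilon:
            arm = rng.choice(arms)          # explore: random ensemble method
        else:
            arm = max(arms, key=score)      # exploit: best running accuracy
        module = remaining.pop(next_index(remaining, strategy, arm))
        # Assumed reward: 1 if the selected arm's prediction matched the test.
        pulls[arm] += 1
        hits[arm] += int(module.predictions[arm] == module.defective)
    return hits, pulls
```

Calling `run_bandit(modules, ["bagging", "boosting"], strategy="PF")` on a list of `Module` objects returns two dictionaries, from which each method's running accuracy is `hits[a] / pulls[a]`. This sketch updates only the selected arm after each test. A variant could score every method on each tested module, since all predictions are available; which of the two the paper uses is not stated in this summary.
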
### Main Findings

- **Impact of Testing Strategies on Accuracy and Effort**: The PF strategy (testing modules predicted to be defective first) provides the highest prediction accuracy in most cases, but may slightly increase the testing effort.
- **Effectiveness of BA**: Even when the testing effort is low (such as a testing effort ratio of 0.1), the PF strategy can still significantly improve prediction accuracy.
- **Overall Performance Improvement**: BA can stably improve the prediction accuracy of ensemble learning methods, although it slightly increases the testing effort.

### Conclusion

The paper experimentally verifies that BA combined with the PF strategy can effectively improve the accuracy of software defect prediction and has high feasibility in practical applications. Future research will further consider factors such as module complexity to enhance the reliability of the results.