A Principled Evaluation of Ensembles of Learning Machines for Software Effort Estimation

Leandro L. Minku, Xin Yao
DOI: https://doi.org/10.1145/2020390.2020399
2011-01-01
Abstract: Background: Software effort estimation (SEE) is a task of strategic importance in software management. Recently, some studies have attempted to use ensembles of learning machines for this task. Aims: We aim to (1) evaluate whether readily available ensemble methods generally improve on the SEE given by single learning machines, and which of them would be more useful; and to gain insight into (2) how to improve SEE and (3) how to choose machine learning (ML) models for SEE. Method: A principled and comprehensive statistical comparison of three ensemble methods and three single learners was carried out using thirteen data sets. Feature selection and ensemble diversity analyses were performed to gain insight into how to improve SEE based on the approaches singled out. In addition, a risk analysis was performed to investigate robustness to outliers. The insight provided by the paper is therefore based on principled experiments, not just intuition or speculation. Results: None of the compared methods is consistently the best, even though regression trees and bagging using multilayer perceptrons (MLPs) are more frequently among the best. These two approaches usually perform similarly. Regression trees place more important features in higher levels of the trees, suggesting that feature weights are important when using ML models for SEE. The analysis of bagging with MLPs suggests that a self-tuning ensemble diversity method may help improve SEE. Conclusions: Ideally, principled experiments should be done on an individual basis to choose a model. If an organisation has no resources for that, regression trees seem to be a good choice because of their simplicity. The analysis also suggests approaches to improve SEE.
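To make the comparison between a single learner and a bagging ensemble concrete, the following is a minimal, standard-library-only sketch and not the experimental setup of the paper. It assumes a hypothetical one-feature toy data set (project "size" against noisy "effort"), uses depth-1 regression trees (stumps) as a stand-in base learner instead of MLPs, and scores both models with mean absolute error; all function names and data are illustrative assumptions.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Fit a depth-1 regression tree: one threshold, one mean per side."""
    best = None
    # Skipping the smallest value guarantees both sides of the split are non-empty.
    for threshold in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < threshold]
        right = [y for x, y in zip(xs, ys) if x >= threshold]
        l_mean, r_mean = mean(left), mean(right)
        sse = (sum((y - l_mean) ** 2 for y in left)
               + sum((y - r_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, l_mean, r_mean)
    if best is None:  # every x is identical: fall back to the overall mean
        overall = mean(ys)
        return lambda x: overall
    _, t, l_mean, r_mean = best
    return lambda x: l_mean if x < t else r_mean


def fit_bagging(xs, ys, n_models=25, rng=None):
    """Bagging: train each base model on a bootstrap sample, average predictions."""
    rng = rng or random.Random(0)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: mean(m(x) for m in models)


def mae(model, xs, ys):
    """Mean absolute error of a model on (xs, ys)."""
    return mean(abs(model(x) - y) for x, y in zip(xs, ys))


# Hypothetical toy data, not taken from any SEE data set.
rng = random.Random(42)
sizes = [rng.uniform(1, 100) for _ in range(60)]
efforts = [3.0 * s + rng.gauss(0, 20) for s in sizes]
train_x, test_x = sizes[:40], sizes[40:]
train_y, test_y = efforts[:40], efforts[40:]

single = fit_stump(train_x, train_y)
ensemble = fit_bagging(train_x, train_y, n_models=25, rng=rng)

print("single stump MAE:", mae(single, test_x, test_y))
print("bagged stumps MAE:", mae(ensemble, test_x, test_y))
```

A single train/test split on one synthetic data set says little about which approach is better in general. A comparison in the spirit of the abstract would repeat this over several data sets and runs and apply statistical tests before drawing conclusions.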