Abstract: Cross-Validation (CV), and out-of-sample performance-estimation protocols in general, are often employed both for (a) selecting the optimal combination of algorithms and values of hyper-parameters (called a configuration) for producing the final predictive model, and (b) estimating the predictive performance of the final model. However, the cross-validated performance of the best configuration is optimistically biased. We present an efficient bootstrap method that corrects for the bias, called Bootstrap Bias Corrected CV (BBC-CV). BBC-CV's main idea is to bootstrap the whole process of selecting the best-performing configuration on the out-of-sample predictions of each configuration, without additional training of models. In comparison to the alternatives, namely nested cross-validation and a method by Tibshirani and Tibshirani, BBC-CV is computationally more efficient, has smaller variance and bias, and is applicable to any metric of performance (accuracy, AUC, concordance index, mean squared error). We then reuse the idea of bootstrapping the out-of-sample predictions to speed up the CV process. Specifically, using a bootstrap-based hypothesis test, we stop training models on new folds for configurations that are statistically significantly inferior. We name the resulting method Bootstrap Corrected with Early Dropping CV (BCED-CV); it is both efficient and provides accurate performance estimates.
What problem does this paper attempt to address?
The paper addresses the optimistic bias that arises when cross-validation (CV) is used both to select a model and to estimate its performance, and it proposes bootstrap-based methods that correct this bias efficiently.
CV and other out-of-sample performance-estimation protocols are commonly used for two purposes at once: selecting the best combination of algorithms and hyper-parameter values (a configuration) for the final predictive model, and estimating the predictive performance of that model. Because the best configuration is chosen precisely for having the highest cross-validated score, that score is an optimistically biased estimate of its performance. To correct this, the authors propose an efficient bootstrap method called Bootstrap Bias Corrected CV (BBC-CV).
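The source of the bias can be illustrated with a small simulation. The sketch below is not taken from the paper; it assumes a set of hypothetical configurations that all guess labels by coin flip, so each has a true accuracy of 0.5. The function name `best_observed_accuracy` and all parameter values are illustrative assumptions.

```python
import random

def best_observed_accuracy(n_samples, n_configs, rng):
    """Best observed accuracy among configurations that all guess at random."""
    labels = [rng.randint(0, 1) for _ in range(n_samples)]
    best = 0.0
    for _ in range(n_configs):
        # Each hypothetical configuration predicts by coin flip,
        # so its true accuracy is exactly 0.5.
        preds = [rng.randint(0, 1) for _ in range(n_samples)]
        acc = sum(p == y for p, y in zip(preds, labels)) / n_samples
        best = max(best, acc)
    return best

rng = random.Random(0)
# Repeat the "select the best of 20 configurations" experiment 200 times.
runs = [best_observed_accuracy(n_samples=50, n_configs=20, rng=rng)
        for _ in range(200)]
mean_best = sum(runs) / len(runs)
print(f"mean accuracy of the selected configuration: {mean_best:.3f}")
print("true accuracy of every configuration: 0.500")
```

Even though no configuration is better than chance, the reported score of the winner sits well above 0.5, because taking the maximum of several noisy estimates favors the luckiest one.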
The main idea of BBC-CV is to bootstrap the entire selection process: the step of choosing the best-performing configuration is repeated on bootstrap resamples of the out-of-sample predictions already produced for each configuration, so no additional models need to be trained. According to the authors, compared with the alternatives, namely nested cross-validation (NCV) and a method by Tibshirani and Tibshirani, BBC-CV is computationally more efficient, has smaller variance and bias, and applies to any performance metric (for example accuracy, AUC, concordance index, or mean squared error).
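A minimal sketch of this idea follows. It is an illustration under stated assumptions, not the authors' implementation: it assumes the out-of-sample predictions are pooled into one list per configuration, that the winner chosen on each bootstrap resample is scored on the samples left out of that resample, and that the metric is higher-is-better. The names `bbc_cv`, `oos_preds`, and `n_boot` are hypothetical.

```python
import random

def accuracy(preds, labels):
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def bbc_cv(oos_preds, labels, n_boot=200, metric=accuracy, seed=0):
    """Bias-corrected estimate from pooled out-of-sample predictions.

    oos_preds[c][i] is the out-of-sample prediction of configuration c
    for sample i. No model is trained here; only predictions are resampled.
    """
    rng = random.Random(seed)
    n = len(labels)

    def score(c, idx):
        return metric([oos_preds[c][i] for i in idx], [labels[i] for i in idx])

    estimates = []
    for _ in range(n_boot):
        in_bag = [rng.randrange(n) for _ in range(n)]   # sample rows with replacement
        chosen = set(in_bag)
        out_bag = [i for i in range(n) if i not in chosen]
        if not out_bag:
            continue  # extremely unlikely; nothing left to evaluate on
        # Repeat the selection step on the resample only...
        winner = max(range(len(oos_preds)), key=lambda c: score(c, in_bag))
        # ...and score the winner on rows that played no part in selecting it.
        estimates.append(score(winner, out_bag))
    return sum(estimates) / len(estimates)

# Usage on synthetic data: 30 configurations that all guess at random.
rng = random.Random(1)
labels = [rng.randint(0, 1) for _ in range(100)]
oos_preds = [[rng.randint(0, 1) for _ in range(100)] for _ in range(30)]
naive = max(accuracy(p, labels) for p in oos_preds)
corrected = bbc_cv(oos_preds, labels)
print(f"naive best-configuration accuracy: {naive:.3f}")
print(f"bootstrap-corrected estimate:      {corrected:.3f}")
```

Because selection and evaluation use disjoint rows within each resample, the averaged estimate is lower than the naive score of the winner. For a lower-is-better metric such as mean squared error, the `max` in the selection step would become `min`.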
The paper also reuses the idea of bootstrapping the out-of-sample predictions to speed up CV itself. A bootstrap-based hypothesis test identifies configurations that are statistically significantly worse than the current best, and no further models are trained for them on the remaining folds. The authors call the resulting method Bootstrap Corrected with Early Dropping CV (BCED-CV) and report that it is both efficient and provides accurate performance estimates.
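The early-dropping test can be sketched as follows. This is a hedged illustration rather than the paper's exact procedure: it assumes accuracy as the metric, a paired bootstrap over the out-of-sample predictions collected on the folds seen so far, and a significance threshold `alpha` of 0.01. The function name `keep_configuration` and its parameters are hypothetical.

```python
import random

def keep_configuration(cand_correct, best_correct, n_boot=1000, alpha=0.01, seed=0):
    """Decide whether a candidate configuration stays in the CV loop.

    cand_correct[i] and best_correct[i] are 1 if the candidate / the current
    best configuration predicted sample i correctly (folds seen so far), else 0.
    """
    rng = random.Random(seed)
    n = len(best_correct)
    not_worse = 0
    for _ in range(n_boot):
        # Resample the same rows for both configurations (paired bootstrap).
        idx = [rng.randrange(n) for _ in range(n)]
        cand_score = sum(cand_correct[i] for i in idx)
        best_score = sum(best_correct[i] for i in idx)
        if cand_score >= best_score:
            not_worse += 1
    # Fraction of resamples in which the candidate matches or beats the best.
    p_value = not_worse / n_boot
    return p_value >= alpha  # False means: drop, train no further folds

# Usage with synthetic correctness vectors over 100 out-of-sample predictions.
best = [1] * 90 + [0] * 10    # current best: 90% correct
weak = [1] * 50 + [0] * 50    # clearly inferior candidate
close = [1] * 88 + [0] * 12   # nearly as good as the best
keep_weak = keep_configuration(weak, best)
keep_close = keep_configuration(close, best)
print("keep weak candidate: ", keep_weak)
print("keep close candidate:", keep_close)
```

A small `alpha` keeps the test conservative, so a configuration is dropped only when the resampled evidence against it is strong; the appropriate threshold in practice is a design choice not specified here.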
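In summary, the study aims to make model performance estimation both more accurate and more efficient when the same cross-validation results are used to select among configurations: BBC-CV removes the optimistic bias of the selected configuration's score, and BCED-CV additionally reduces training cost by dropping inferior configurations early.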