Abstract: Cross-Validation (CV), and out-of-sample performance-estimation protocols in general, are often employed both for (a) selecting the optimal combination of algorithms and hyper-parameter values (called a configuration) for producing the final predictive model, and (b) estimating the predictive performance of the final model. However, the cross-validated performance of the best configuration is optimistically biased. We present an efficient bootstrap method that corrects for this bias, called Bootstrap Bias Corrected CV (BBC-CV). BBC-CV's main idea is to bootstrap the whole process of selecting the best-performing configuration on the out-of-sample predictions of each configuration, without additional training of models. In comparison to the alternatives, namely nested cross-validation and a method by Tibshirani and Tibshirani, BBC-CV is computationally more efficient, has smaller variance and bias, and is applicable to any performance metric (accuracy, AUC, concordance index, mean squared error). We then reuse the idea of bootstrapping the out-of-sample predictions to speed up the CV process itself. Specifically, using a bootstrap-based hypothesis test, we stop training models on new folds for configurations that are statistically significantly inferior. We name this method Bootstrap Corrected with Early Dropping CV (BCED-CV); it is both efficient and provides accurate performance estimates.
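The bootstrap step described in the abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes the pooled out-of-sample predictions of all configurations are already available as a matrix with one row per sample and one column per configuration, and that the metric is "higher is better". The names `bbc_cv`, `accuracy`, `oos_pred`, and `n_boot` are hypothetical and chosen only for this example.

```python
import numpy as np


def accuracy(y_true, y_pred):
    """Fraction of correct predictions (higher is better)."""
    return float(np.mean(y_true == y_pred))


def bbc_cv(oos_pred, y, metric, n_boot=1000, seed=0):
    """Bootstrap the configuration-selection step on out-of-sample predictions.

    oos_pred : array of shape (n_samples, n_configs), pooled CV predictions
    y        : array of shape (n_samples,), true outcomes
    metric   : callable(y_true, y_pred) -> float, higher is better (assumption)
    """
    rng = np.random.default_rng(seed)
    oos_pred = np.asarray(oos_pred)
    y = np.asarray(y)
    n, n_configs = oos_pred.shape
    out_of_bag_scores = []
    for _ in range(n_boot):
        # Resample rows with replacement; rows never drawn are out-of-bag.
        in_bag = rng.integers(0, n, size=n)
        out_bag = np.setdiff1d(np.arange(n), in_bag)
        if out_bag.size == 0:
            continue  # extremely unlikely unless n is tiny
        # Select the best configuration using the in-bag rows only.
        in_scores = [metric(y[in_bag], oos_pred[in_bag, j])
                     for j in range(n_configs)]
        best = int(np.argmax(in_scores))
        # Score the selected configuration on rows it was not selected on.
        out_of_bag_scores.append(metric(y[out_bag], oos_pred[out_bag, best]))
    return float(np.mean(out_of_bag_scores))


# Synthetic demonstration: 50 configurations that predict at random,
# so the true accuracy of every configuration is 0.5.
rng = np.random.default_rng(42)
y = rng.integers(0, 2, size=300)
preds = rng.integers(0, 2, size=(300, 50))

naive_best = max(accuracy(y, preds[:, j]) for j in range(preds.shape[1]))
bbc_estimate = bbc_cv(preds, y, accuracy, n_boot=300)
print("naive best-configuration accuracy:", naive_best)
print("bootstrap-corrected estimate:     ", bbc_estimate)
```

Because no models are retrained, the cost is dominated by re-evaluating the metric on resampled rows. In the synthetic demonstration the naive maximum typically sits above chance level purely through selection, while the bootstrap-corrected value is pulled back toward 0.5.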

What problem does this paper attempt to address?

The paper addresses the optimistic bias that arises when cross-validation (CV) is used both to select a configuration (a combination of algorithms and hyper-parameter values) and to estimate the performance of the resulting model. Because the winning configuration is chosen for having the best cross-validated score, that score overestimates how the final model will perform on new data. To correct this, the authors propose Bootstrap Bias Corrected CV (BBC-CV). Its main idea is to bootstrap the selection step itself: the pooled out-of-sample predictions of all configurations are resampled, the best configuration is re-selected on each bootstrap sample, and its performance is measured on the samples left out of that bootstrap, so no additional models need to be trained. Compared with nested cross-validation (NCV) and the method of Tibshirani and Tibshirani, BBC-CV is reported to be computationally more efficient, to have smaller variance and bias, and to apply to any performance metric (such as accuracy, AUC, concordance index, or mean squared error).

The paper also uses the same bootstrap idea to speed up cross-validation. A bootstrap-based hypothesis test identifies configurations that are statistically significantly worse than the current best, and training on further folds is stopped for them. The resulting method, Bootstrap Corrected with Early Dropping CV (BCED-CV), is both efficient and provides accurate performance estimates. In summary, the study aims to make performance estimation more accurate and more efficient, especially when searching over a large number of configurations.
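The early-dropping test can be illustrated in the same spirit. The sketch below is an assumption-laden reading of the summary rather than the paper's exact procedure: after some folds have been processed, it bootstraps the out-of-sample predictions collected so far and drops every configuration that matches or beats the current best in fewer than a fraction `alpha` of the bootstrap samples. The function name `configs_to_drop`, the default `alpha=0.01`, and the choice to count ties in favour of the candidate are all hypothetical.

```python
import numpy as np


def configs_to_drop(oos_pred, y, metric, alpha=0.01, n_boot=500, seed=0):
    """Return indices of configurations that look significantly worse than
    the current best, judged on the predictions gathered so far.

    oos_pred : array of shape (n_samples_so_far, n_configs)
    y        : array of shape (n_samples_so_far,)
    metric   : callable(y_true, y_pred) -> float, higher is better (assumption)
    """
    rng = np.random.default_rng(seed)
    oos_pred = np.asarray(oos_pred)
    y = np.asarray(y)
    n, n_configs = oos_pred.shape

    # Current best configuration on all predictions seen so far.
    full_scores = np.array([metric(y, oos_pred[:, j]) for j in range(n_configs)])
    best = int(np.argmax(full_scores))

    # Count how often each configuration matches or beats the best
    # when the same rows are resampled with replacement.
    wins = np.zeros(n_configs)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)
        scores = np.array([metric(y[idx], oos_pred[idx, j])
                           for j in range(n_configs)])
        wins += scores >= scores[best]

    p_values = wins / n_boot
    return [j for j in range(n_configs) if j != best and p_values[j] < alpha]


def accuracy(y_true, y_pred):
    """Fraction of correct predictions (higher is better)."""
    return float(np.mean(y_true == y_pred))


# Toy example: configuration 0 is always right, configuration 1 is always
# wrong, configuration 2 is a copy of configuration 0.
y_demo = np.array([0, 1] * 100)
pred_demo = np.column_stack([y_demo, 1 - y_demo, y_demo])
dropped = configs_to_drop(pred_demo, y_demo, accuracy, n_boot=100)
print("dropped configurations:", dropped)
```

In a full CV loop, a check like this would run after each fold, and dropped configurations would simply be skipped when training on the remaining folds.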

Bootstrapping the Out-of-sample Predictions for Efficient and Accurate Cross-Validation

Bootstrapping the Cross-Validation Estimate

Improvements on cross-validation: the 632+ bootstrap method

Cross-validation: what does it estimate and how well does it do it?

Is Cross-Validation the Gold Standard to Evaluate Model Performance?

Performance-Estimation Properties of Cross-Validation-Based Protocols with Simultaneous Hyper-Parameter Optimization

Bootstrap Cross-validation Improves Model Selection in Pharmacometrics

Cross-validation and the bootstrap: Estimating the error rate of a prediction rule

Subsampling Bias and The Best-Discrepancy Systematic Cross Validation

On The Smoothness of Cross-Validation-Based Estimators Of Classifier Performance

Iterative Approximate Cross-Validation

Don't Waste Your Time: Early Stopping Cross-Validation

Blocked Cross-Validation: A Precise and Efficient Method for Hyperparameter Tuning

On Neighbourhood Cross Validation

Fast Cross-Validation Algorithms for Least Squares Support Vector Machine and Kernel Ridge Regression

Penalized Regression Methods With Modified Cross‐Validation and Bootstrap Tuning Produce Better Prediction Models

Fast and Informative Model Selection using Learning Curve Cross-Validation

A scalable bootstrap for massive data

Efficient, adaptive cross-validation for tuning and comparing models, with application to drug discovery

General Approximate Cross Validation for Model Selection

Fast Cross-Validation for Kernel-Based Algorithms