Abstract: Bootstrap is a popular methodology for simulating input uncertainty. However, it can be computationally expensive when the number of samples is large. We propose a new approach called \textbf{Orthogonal Bootstrap} that reduces the number of required Monte Carlo replications. We decompose the target being simulated into two parts: the \textit{non-orthogonal part}, which has a closed-form result known as the Infinitesimal Jackknife, and the \textit{orthogonal part}, which is easier to simulate. We show theoretically and numerically that Orthogonal Bootstrap significantly reduces the computational cost of Bootstrap while improving empirical accuracy and maintaining the same width of the constructed interval.
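
One way to picture the decomposition, written here for variance estimation, is a first-order expansion of the performance metric around the empirical distribution. The notation is an assumption made for illustration and may differ from the paper's own construction: \( \psi \) is the performance metric, \( \hat{P}_n \) the empirical distribution of the data \( X_1, \dots, X_n \), \( P_n^* \) the empirical distribution of a resample \( X_1^*, \dots, X_n^* \), and \( \mathrm{IF} \) the influence function of \( \psi \) at \( \hat{P}_n \), which satisfies \( \sum_{i=1}^n \mathrm{IF}(X_i; \hat{P}_n) = 0 \).

\[
\psi(P_n^*) - \psi(\hat{P}_n) = L_n^* + R_n^*, \qquad L_n^* = \frac{1}{n} \sum_{i=1}^n \mathrm{IF}(X_i^*; \hat{P}_n)
\]

\[
\mathrm{Var}^*\big(\psi(P_n^*)\big) = \underbrace{\frac{1}{n^2} \sum_{i=1}^n \mathrm{IF}(X_i; \hat{P}_n)^2}_{\mathrm{Var}^*(L_n^*),\ \text{closed form}} + \underbrace{\mathrm{Var}^*(R_n^*) + 2\,\mathrm{Cov}^*(L_n^*, R_n^*)}_{\text{estimated by Monte Carlo}}
\]

Under this reading, the linear term \( L_n^* \) carries the leading-order variability and needs no simulation, so Monte Carlo noise only enters through the smaller remainder \( R_n^* \).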

What problem does this paper attempt to address?

The paper addresses the high computational cost of using the Bootstrap to simulate input uncertainty on large-scale data sets. The Bootstrap estimates statistical uncertainty through resampling, but when the number of samples is large it requires many Monte Carlo replications, which creates substantial computational overhead. The paper proposes a new method, Orthogonal Bootstrap, that reduces the required number of Monte Carlo replications, and with it the computational cost, while improving empirical accuracy and keeping the width of the constructed interval unchanged.

### Key points:

1. **Problem background**:
   - **Input uncertainty**: In data-driven analysis, statistical noise propagates from the data model to the subsequent output analysis and affects the accuracy and reliability of the results.
   - **Bootstrap method**: Bootstrap is a non-parametric method that estimates this uncertainty through random resampling with replacement. However, when the number of samples is large, its computational cost is very high.
2. **Proposed method**:
   - **Orthogonal Bootstrap**: This method reduces the computational cost by decomposing the target into two parts:
     - **Non-orthogonal part**: This part has a closed-form solution, known as the Infinitesimal Jackknife.
     - **Orthogonal part**: This part is easier to simulate.
   - By handling these two parts separately, Orthogonal Bootstrap significantly reduces the computational cost while improving empirical accuracy.
3. **Theoretical and empirical results**:
   - **Theoretical results**: The paper proves that, assuming the performance metric has a continuous Fréchet derivative under the Kernel Maximum Mean Discrepancy (MMD) distance, Orthogonal Bootstrap reduces the required number of Monte Carlo replications from \( \Omega(n) \) to \( O(1) \).
   - **Empirical results**: Numerical experiments on simulated and real-world data sets show significant improvements for Orthogonal Bootstrap, especially when the number of Monte Carlo replications is limited.
4. **Comparison with existing methods**:
   - **Standard Bootstrap**: When the number of Monte Carlo replications is small, the coverage probability of the standard Bootstrap is significantly lower than the nominal level.
   - **Cheap Bootstrap**: Although Cheap Bootstrap can provide a similar coverage probability, its confidence intervals are wider on average.
   - **Orthogonal Bootstrap**: When the number of Monte Carlo replications is limited, Orthogonal Bootstrap provides a higher coverage probability and also achieves the same confidence interval width as the standard Bootstrap.

### Summary:

The paper proposes a new Bootstrap method, Orthogonal Bootstrap, to address the high computational cost of the Bootstrap on large-scale data sets. By decomposing the target into non-orthogonal and orthogonal parts, the method reduces the computational cost while improving the accuracy and reliability of the estimates.
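
The idea of splitting the target into a closed-form part and a simulated part can be illustrated with a small numerical sketch. This is not the paper's algorithm: it assumes a hypothetical metric `psi(P) = exp(E_P[X])` whose influence function is known analytically, treats the linear (Infinitesimal Jackknife) term as the closed-form part, and estimates only the remainder terms by Monte Carlo. The function names and the choice of metric are illustrative assumptions.

```python
import numpy as np


def psi(mean):
    # Hypothetical smooth performance metric: psi(P) = exp(E_P[X]).
    return np.exp(mean)


def orthogonal_bootstrap_variance(x, n_reps, rng):
    """Return (decomposed estimate, standard bootstrap estimate) of Var*(psi).

    Illustrative sketch only; requires n_reps >= 2.
    """
    n = len(x)
    xbar = x.mean()
    psi_hat = psi(xbar)

    # Empirical influence function of psi at the empirical distribution:
    # IF_i = psi'(xbar) * (x_i - xbar), and psi' = psi for exp. Sums to zero.
    infl = psi_hat * (x - xbar)

    # Closed-form part: bootstrap variance of the linear term
    # (the infinitesimal jackknife variance estimate).
    var_linear = np.sum(infl ** 2) / n ** 2

    # Simulated part: draw n_reps resamples with replacement.
    idx = rng.integers(0, n, size=(n_reps, n))
    psi_star = psi(x[idx].mean(axis=1))
    linear = infl[idx].mean(axis=1)           # linear term per resample
    remainder = psi_star - psi_hat - linear   # what the linear term misses

    # Var*(psi*) = Var*(L) + Var*(R) + 2 Cov*(L, R); only the last two
    # terms are estimated from the Monte Carlo replications.
    cov = np.cov(linear, remainder)           # 2x2 sample covariance matrix
    correction = cov[1, 1] + 2.0 * cov[0, 1]

    standard = np.var(psi_star, ddof=1)       # plain bootstrap estimate
    return var_linear + correction, standard


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.exponential(size=200)
    decomposed, standard = orthogonal_bootstrap_variance(data, 10, rng)
    print("decomposed estimate:", decomposed)
    print("standard bootstrap estimate:", standard)
```

With only a handful of replications, the standard estimate inherits the full Monte Carlo noise of the sample variance, while the decomposed estimate only simulates the remainder terms, which are small for a smooth metric. Comparing both against a run with a large `n_reps` is a simple way to see the difference.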

Orthogonal Bootstrap: Efficient Simulation of Input Uncertainty
