Abstract: We introduce a general framework for large-scale model-based derivative-free optimization based on iterative minimization within random subspaces. We present a probabilistic worst-case complexity analysis for our method, where in particular we prove high-probability bounds on the number of iterations before a given optimality is achieved. This framework is specialized to nonlinear least-squares problems, with a model-based framework based on the Gauss-Newton method. This method achieves scalability by constructing local linear interpolation models to approximate the Jacobian, and computes new steps at each iteration in a subspace with user-determined dimension. We then describe a practical implementation of this framework, which we call DFBGN. We outline efficient techniques for selecting the interpolation points and search subspace, yielding an implementation that has a low per-iteration linear algebra cost (linear in the problem dimension) while also achieving fast objective decrease as measured by evaluations. Extensive numerical results demonstrate that DFBGN has improved scalability, yielding strong performance on large-scale nonlinear least-squares problems.
What problem does this paper attempt to address?
This paper aims to improve the scalability of model-based methods for large-scale derivative-free optimization (DFO). It presents a general framework for large-scale model-based DFO built on iterative minimization within random subspaces, and then specializes that framework to nonlinear least-squares problems. In the specialized method, local linear interpolation models approximate the Jacobian, and each iteration computes a new step within a subspace of user-determined dimension. The paper also provides a probabilistic worst-case complexity analysis, giving high-probability bounds on the number of iterations needed to reach a given optimality level. Finally, it describes a practical implementation of the framework, called DFBGN, which performs well on large-scale nonlinear least-squares problems.
### Main Contributions
1. **Proposing the RSDFO Framework**: Introduced Random Subspace Derivative-Free Optimization (RSDFO), a general model-based DFO framework that builds models within a subspace at each iteration. This makes model-based DFO methods applicable to large-scale problems and lets users explicitly control the subspace dimension, and thereby the linear algebra cost per iteration.
2. **Specialization for Nonlinear Least-Squares Problems**: Specialized the RSDFO framework to nonlinear least-squares problems and proposed a new algorithm, RSDFO-GN (Random Subspace DFO combined with the Gauss-Newton method). The subspace model construction in RSDFO-GN is based on the DFO Gauss-Newton method and retains the same theoretical guarantees as RSDFO.
3. **Efficient Implementation of DFBGN**: Described a practical implementation of RSDFO-GN, called DFBGN (Derivative-Free Block Gauss-Newton method). Compared with existing methods, DFBGN reduces the cost of model construction and of the initial objective function evaluations by allowing fewer interpolation points to be used per iteration. So that DFBGN keeps an evaluation efficiency similar to that of existing methods while reducing linear algebra costs, several modifications were made to the theoretical framework, especially in the selection of interpolation points and search subspaces.
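The idea behind contributions 1 and 2 can be sketched in a few lines of NumPy. This is a minimal illustration under simplifying assumptions, not the DFBGN implementation: it draws a fresh random orthonormal basis at every iteration, builds the linear interpolation model from `p` extra residual evaluations, and uses a basic accept/reject rule with a doubling and halving radius. The function name `rs_gauss_newton` and the parameters `delta_max` and `delta_min` are hypothetical, and the sketch does not reproduce the point-selection and subspace-selection techniques the paper describes for DFBGN.

```python
import numpy as np

def rs_gauss_newton(r, x0, p, delta=1.0, iters=200, seed=0,
                    delta_max=1e3, delta_min=1e-8):
    """Illustrative random-subspace derivative-free Gauss-Newton loop.

    r  : callable returning the residual vector r(x)
    x0 : starting point (length n)
    p  : subspace dimension chosen by the user (p <= n)
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    rx = r(x)
    for _ in range(iters):
        if delta < delta_min:
            break  # radius too small for meaningful interpolation
        # Random orthonormal basis Q (n x p) of the search subspace.
        Q, _ = np.linalg.qr(rng.standard_normal((x.size, p)))
        # Linear interpolation at x + delta * q_i approximates J(x) @ Q
        # using p extra residual evaluations and no derivatives.
        J_hat = np.column_stack(
            [(r(x + delta * Q[:, i]) - rx) / delta for i in range(p)]
        )
        # Gauss-Newton step in the subspace: minimize ||rx + J_hat @ s||.
        s, *_ = np.linalg.lstsq(J_hat, -rx, rcond=None)
        # Crude trust-region safeguard: clip the step to length delta.
        norm_s = np.linalg.norm(s)
        if norm_s > delta:
            s *= delta / norm_s
        x_trial = x + Q @ s
        r_trial = r(x_trial)
        if r_trial @ r_trial < rx @ rx:
            # Accept the step and enlarge the radius (up to a cap).
            x, rx = x_trial, r_trial
            delta = min(2.0 * delta, delta_max)
        else:
            # Reject the step and shrink the radius.
            delta *= 0.5
    return x
```

Each iteration of this sketch costs `p + 1` residual evaluations and works only with `n x p` and `m x p` matrices, which is how a user-chosen `p` bounds the per-iteration linear algebra cost. A call such as `rs_gauss_newton(lambda z: z - c, np.zeros(20), p=5)` for a fixed vector `c` of length 20 would exercise it on a trivial least-squares problem.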
### Theoretical Results
- **Probabilistic Worst-Case Complexity Bounds**: Derived probabilistic worst-case complexity bounds for RSDFO, specifically of the forms:
\[
P\left[\min_{j \leq k} \|\nabla f(x_j)\| \leq C k^{-1/2}\right] \geq 1 - e^{-c k}
\]
and
\[
P\left[K_\epsilon \leq C \epsilon^{-2}\right] \geq 1 - e^{-c \epsilon^{-2}}
\]
where \(K_\epsilon\) is the number of iterations needed to first reach first-order optimality \(\epsilon\), that is, \(\|\nabla f(x_k)\| \leq \epsilon\).
- **Convergence Guarantees**: Provided several methods for constructing the random subspaces and proved that constructions based on Johnson-Lindenstrauss transforms achieve convergence with a subspace dimension that is independent of the ambient dimension.
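To give some intuition for why the subspace dimension can be independent of the ambient dimension, the snippet below applies a scaled Gaussian matrix, one common Johnson-Lindenstrauss transform, to a random vector standing in for a gradient. The choice of a Gaussian sketch is an assumption made for illustration, since this summary does not say which constructions the paper analyzes, and the function name `sketched_norm_ratio` is hypothetical.

```python
import numpy as np

def sketched_norm_ratio(n, p, rng):
    """Return ||S g|| / ||g|| for a p x n scaled Gaussian sketch S."""
    g = rng.standard_normal(n)                      # stand-in for a gradient in R^n
    S = rng.standard_normal((p, n)) / np.sqrt(p)    # entries drawn from N(0, 1/p)
    return np.linalg.norm(S @ g) / np.linalg.norm(g)

rng = np.random.default_rng(0)
# Same sketch dimension p for two very different ambient dimensions n.
ratios = {n: sketched_norm_ratio(n, p=50, rng=rng) for n in (100, 10_000)}
```

For this construction the squared ratio is distributed as a chi-squared variable with `p` degrees of freedom divided by `p`, so its spread depends on `p` only and not on `n`. The sketched gradient therefore stays comparable in size to the full gradient even when `p` is much smaller than `n`.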
### Experimental Results
- **Comparison with Existing Methods**: Compared DFBGN with DFO-LS on medium-scale (about 100-dimensional) and large-scale (about 1000-dimensional) test problems. The results show that DFBGN performs better in terms of running time, especially on large-scale problems. When the subspace dimension is reduced, DFBGN can make better progress with fewer objective function evaluations.
In conclusion, by proposing the RSDFO framework and its specialization to nonlinear least-squares problems, RSDFO-GN, this paper improves the scalability and efficiency of model-based methods for large-scale derivative-free optimization.