Abstract: In this paper, we propose a new multilevel stochastic framework for the solution of optimization problems. The proposed approach uses random regularized first-order models that exploit an available hierarchical description of the problem, being either in the classical variable space or in the function space, meaning that different levels of accuracy for the objective function are available. The convergence analysis of the method is conducted and its numerical behavior is tested on the solution of finite-sum minimization problems. Indeed, the multilevel framework is tailored to the solution of such problems, resulting in a nontrivial variance reduction technique with adaptive step-size that outperforms standard approaches when solving nonconvex problems. Differently from classical deterministic multilevel methods, our stochastic method does not require the finest approximation to coincide with the original objective function. This makes it possible to avoid the evaluation of the full sum in finite-sum minimization problems, opening the way to the solution of classification problems with large data sets.

What problem does this paper attempt to address?

This paper addresses large-scale stochastic optimization problems, in particular those in which the objective function can only be evaluated with noise. It proposes a new multilevel stochastic regularized first-order method that exploits a hierarchical description of the problem, either in the classical variable space or in the function space, so that representations of the objective function with different accuracy are available at different levels.

### Main problems

1. **Limitations of existing methods**:
   - Traditional multilevel methods apply only in deterministic settings and cannot handle stochastic optimization problems.
   - Existing multilevel methods usually rely on a hierarchy in the variable space, for example a choice of grids when discretizing an infinite-dimensional problem. In modern applications, the accuracy of the function estimates is more often the limiting factor than the size of the model.
2. **Classification problems with large data sets**:
   - With large data sets, traditional methods need to evaluate the full sum, which is very time-consuming and often infeasible in practice.

### Solutions

The paper extends the multilevel method to the stochastic setting and allows the hierarchy to be built in the "function space", that is, from function approximations with different accuracy. The hierarchy can be built in the variable space, as in classical methods, or from function approximations whose noise is progressively reduced.

### Key contributions

1. **First extension of the multilevel method to the stochastic framework**: Overcomes the restriction of existing methods to the deterministic case.
2. **Allows the construction of a hierarchical structure in the function space**: Considers function approximations with different accuracy.
3. **Removes a restriction of the classical deterministic multilevel theory**: Does not require the objective function at the finest level to coincide with the original objective function, so the method remains applicable when the original objective is too expensive to evaluate.
4. **Proposes a variance reduction technique for the finite-sum minimization problem**: Includes an adaptive step-size selection mechanism and outperforms mini-batch SVRG on nonconvex problems.
5. **Provides the first stochastic analysis of the first-order adaptive regularization method**: Covers the classical single-level case.

### Application background

The method is particularly suited to classification problems on large data sets, such as the training problems common in deep learning. By reducing the variance and selecting the step size adaptively, it can speed up the solution while maintaining accuracy.

### Mathematical formulas

- The finite-sum form of the objective function:
  \[ \min_{x\in\mathbb{R}^n}\frac{1}{N}\sum_{i = 1}^{N}f_i(x) \]
  where \(f_i:\mathbb{R}^n\rightarrow\mathbb{R}\) are smooth functions that are bounded below.
- The form of the regularized model:
  \[ m_{R,\ell}^k(s)=m_\ell^k(s)+\lambda_\ell^k\|\nabla_x f_\ell(x_\ell^k)\|^2\|s\|^2 \]

With these improvements, the method performs well on large-scale stochastic optimization problems and, on nonconvex problems in particular, outperforms the existing mini-batch SVRG method.
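The idea of a hierarchy in the function space can be illustrated on the finite-sum problem with a minimal Python sketch. It is not the authors' implementation. It assumes that level \(\ell\) is defined by averaging over a nested subsample \(S_\ell\) of the data, and it uses a least-squares term \(f_i(x)=\tfrac{1}{2}(a_i^\top x-b_i)^2\) purely as a stand-in for the \(f_i\). The names `make_nested_levels` and `level_value_and_grad`, the sample sizes, and the synthetic data are all hypothetical.

```python
import numpy as np


def make_nested_levels(n_samples, sizes, seed=0):
    """Return nested index sets S_1, S_2, ... (coarsest first).

    `sizes` must be sorted in increasing order and bounded by n_samples.
    The finest set may be a strict subset of the data, so the full sum
    never has to be evaluated.
    """
    if list(sizes) != sorted(sizes) or sizes[-1] > n_samples:
        raise ValueError("sizes must be increasing and <= n_samples")
    perm = np.random.default_rng(seed).permutation(n_samples)
    # Prefixes of one permutation are nested by construction.
    return [perm[:m] for m in sizes]


def level_value_and_grad(x, A, b, idx):
    """Subsampled objective f_l(x) = (1/|S_l|) * sum_{i in S_l} f_i(x) and its gradient.

    Illustrative choice (an assumption): f_i(x) = 0.5 * (a_i^T x - b_i)^2.
    """
    residual = A[idx] @ x - b[idx]
    value = 0.5 * np.mean(residual ** 2)
    grad = A[idx].T @ residual / len(idx)
    return value, grad


# Tiny synthetic problem: N = 1000 samples, n = 5 variables.
rng = np.random.default_rng(42)
A = rng.standard_normal((1000, 5))
b = rng.standard_normal(1000)
x = np.zeros(5)

# Three levels of increasing accuracy; the finest uses only half the data.
levels = make_nested_levels(n_samples=1000, sizes=[10, 100, 500])
for ell, idx in enumerate(levels):
    value, grad = level_value_and_grad(x, A, b, idx)
    print(f"level {ell}: |S| = {len(idx)}, f = {value:.4f}, "
          f"||g|| = {np.linalg.norm(grad):.4f}")
```

Coarser levels are cheaper to evaluate but noisier. In this sketch the finest level still uses only 500 of the 1000 samples, in line with the statement that the finest approximation need not coincide with the original objective.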

A multilevel stochastic regularized first-order method with application to training
