General single-loop methods for bilevel parameter learning

Ensio Suonperä, Tuomo Valkonen

2024-08-15

Abstract: Bilevel optimisation is used in inverse problems for hyperparameter learning and experimental design. For instance, it can be used to find optimal regularisation parameters and forward operators, based on a set of training pairs. However, computationally, the process is costly. To reduce this cost, recently in bilevel optimisation research, especially as applied to machine learning, so-called single-loop approaches have been introduced. On each step of an outer optimisation method, such methods only take a single gradient descent step towards the solution of the inner problem. In this paper, we flexibilise the inner algorithm, to allow for methods more applicable to difficult inverse problems with nonsmooth regularisation, including primal-dual proximal splitting (PDPS). Moreover, as we have recently shown, significant performance improvements can be obtained in PDE-constrained optimisation by interweaving the steps of conventional iterative solvers (Jacobi, Gauss-Seidel, conjugate gradients) for both the PDE and its adjoint, with the steps of the optimisation method. In this paper we demonstrate how the adjoint equation in bilevel problems can also benefit from such interweaving with conventional linear system solvers. We demonstrate the performance of our proposed methods on learning the deconvolution kernel for image deblurring, and the subsampling operator for magnetic resonance imaging (MRI).
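
For orientation, the single-loop idea in the abstract can be written out for the simplest smooth case. This is a generic textbook-style sketch, not the paper's formulation: the symbols J (outer objective), F (inner objective), α (learned parameter), p (adjoint variable) and the step lengths τ, σ are notation assumed here, and the nonsmooth PDPS setting treated in the paper is not covered.

```latex
% Bilevel problem (assumed notation; smooth, strongly convex inner problem)
\min_{\alpha} \; J(x(\alpha))
\quad \text{subject to} \quad
x(\alpha) \in \operatorname*{arg\,min}_{x} F(x; \alpha).

% Differentiating the inner optimality condition \nabla_x F(x(\alpha); \alpha) = 0
% gives the adjoint equation and the gradient of the reduced objective:
\nabla_{xx}^2 F(x; \alpha)\, p = \nabla J(x),
\qquad
\nabla_\alpha [J \circ x](\alpha) = -\bigl[\partial_\alpha \nabla_x F(x; \alpha)\bigr]^{\top} p.

% Single-loop iteration: one step on each of the three subproblems
x^{k+1} = x^k - \tau \nabla_x F(x^k; \alpha^k), \\
p^{k+1} = \text{one linear-solver step on } \nabla_{xx}^2 F(x^{k+1}; \alpha^k)\, p = \nabla J(x^{k+1}), \\
\alpha^{k+1} = \alpha^k + \sigma \bigl[\partial_\alpha \nabla_x F(x^{k+1}; \alpha^k)\bigr]^{\top} p^{k+1}.
```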

Optimization and Control

What problem does this paper attempt to address?

This paper aims to reduce the computational cost of bilevel optimization for inverse problems. Bilevel optimization is widely used for hyperparameter learning and experimental design, for example to find optimal regularization parameters and forward operators from a set of training pairs. The process is expensive because the inner problem, usually itself a costly inverse problem, must be solved many times while searching for the optimal parameters. To reduce this cost, recent research, especially in machine learning, has introduced so-called "single-loop" methods, which take only a single gradient descent step towards the solution of the inner problem on each step of the outer optimization method. This paper makes the choice of inner algorithm more flexible, allowing methods better suited to difficult inverse problems with nonsmooth regularization, including primal-dual proximal splitting (PDPS). The authors have also recently shown that significant performance improvements can be obtained in PDE-constrained optimization by interweaving the steps of conventional iterative solvers (Jacobi, Gauss-Seidel, conjugate gradients) for both the PDE and its adjoint with the steps of the optimization method. This paper demonstrates that the adjoint equation in bilevel problems can benefit from the same interweaving with conventional linear system solvers, and evaluates the proposed methods on learning the deconvolution kernel for image deblurring and the subsampling operator for magnetic resonance imaging (MRI).
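
A minimal sketch may help make the single-loop structure concrete. It is not the authors' algorithm: it assumes a smooth toy inner problem F(x; α) = ½‖Ax − b‖² + ½α‖x‖² with a scalar regularization parameter α ≥ 0, an outer objective J(x) = ½‖x − z‖², and a plain Richardson (gradient) step standing in for the Jacobi, Gauss-Seidel or conjugate gradient steps mentioned above. The function name `single_loop_bilevel` and all step lengths are hypothetical choices made for this illustration.

```python
import numpy as np


def single_loop_bilevel(A, b, z, alpha0=0.1, tau=0.3, theta=0.3,
                        sigma=0.05, iters=2000):
    """Toy single-loop bilevel iteration (illustrative assumptions only).

    Inner problem:  min_x 0.5*||A x - b||^2 + 0.5*alpha*||x||^2
    Outer problem:  min_{alpha >= 0} 0.5*||x(alpha) - z||^2
    """
    n = A.shape[1]
    x = np.zeros(n)   # inner variable
    p = np.zeros(n)   # adjoint variable
    alpha = alpha0    # learned regularization parameter

    for _ in range(iters):
        # 1) One gradient descent step on the inner problem.
        x = x - tau * (A.T @ (A @ x - b) + alpha * x)

        # 2) One Richardson step on the adjoint equation
        #    (A^T A + alpha I) p = x - z, instead of an exact solve.
        residual = A.T @ (A @ p) + alpha * p - (x - z)
        p = p - theta * residual

        # 3) One projected gradient step on alpha. For this F the mixed
        #    derivative d/d(alpha) grad_x F equals x, so the approximate
        #    gradient of the reduced objective is -p.x.
        hypergrad = -(p @ x)
        alpha = max(alpha - sigma * hypergrad, 0.0)

    return alpha, x, p


if __name__ == "__main__":
    A = np.eye(3)
    b = np.array([1.0, 2.0, 3.0])
    z = b / 2.0   # inner solution b / (1 + alpha) matches z at alpha = 1
    alpha, x, p = single_loop_bilevel(A, b, z)
    print(alpha, x, p)
```

In this demo the inner solution is b/(1 + α), so the outer objective is minimized at α = 1 and the iterates should approach that value. None of the three updates is run to convergence within an outer step; keeping the outer step length σ small relative to τ and θ is what lets the inner and adjoint variables track the slowly changing parameter.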

Derivative-free stochastic bilevel optimization for inverse problems

A Primal-Dual Approach to Bilevel Optimization with Multiple Inner Minima

An Adaptively Inexact Method for Bilevel Learning Using Primal-Dual Style Differentiation

Efficient gradient-based methods for bilevel learning via recycling Krylov subspaces

Double Momentum Method for Lower-Level Constrained Bilevel Optimization

LancBiO: dynamic Lanczos-aided bilevel optimization via Krylov subspace

A Generalized Alternating Method for Bilevel Learning under the Polyak-Łojasiewicz Condition

On Momentum-Based Gradient Methods for Bilevel Optimization with Nonconvex Lower-Level

A Loopless Distributed Algorithm for Personalized Bilevel Optimization

Bilevel Optimization under Unbounded Smoothness: A New Algorithm and Convergence Analysis

Bilevel Optimization for Machine Learning: Algorithm Design and Convergence Analysis

Inexact bilevel stochastic gradient methods for constrained and unconstrained lower-level problems

An adaptively inexact first-order method for bilevel optimization with application to hyperparameter learning

A Single-Loop Algorithm for Decentralized Bilevel Optimization

A Primal-Dual-Assisted Penalty Approach to Bilevel Optimization with Coupled Constraints

A Gradient-based Bilevel Optimization Approach for Tuning Hyperparameters in Machine Learning

On Penalty-based Bilevel Gradient Descent Method

An Inexact Conditional Gradient Method for Constrained Bilevel Optimization

Bilevel learning of regularization models and their discretization for image deblurring and super-resolution