General single-loop methods for bilevel parameter learning

Ensio Suonperä, Tuomo Valkonen
2024-08-15
Abstract: Bilevel optimisation is used in inverse problems for hyperparameter learning and experimental design. For instance, it can be used to find optimal regularisation parameters and forward operators, based on a set of training pairs. However, computationally, the process is costly. To reduce this cost, recently in bilevel optimisation research, especially as applied to machine learning, so-called single-loop approaches have been introduced. On each step of an outer optimisation method, such methods only take a single gradient descent step towards the solution of the inner problem. In this paper, we flexibilise the inner algorithm, to allow for methods more applicable to difficult inverse problems with nonsmooth regularisation, including primal-dual proximal splitting (PDPS). Moreover, as we have recently shown, significant performance improvements can be obtained in PDE-constrained optimisation by interweaving the steps of conventional iterative solvers (Jacobi, Gauss-Seidel, conjugate gradients) for both the PDE and its adjoint, with the steps of the optimisation method. In this paper we demonstrate how the adjoint equation in bilevel problems can also benefit from such interweaving with conventional linear system solvers. We demonstrate the performance of our proposed methods on learning the deconvolution kernel for image deblurring, and the subsampling operator for magnetic resonance imaging (MRI).
Optimization and Control
What problem does this paper attempt to address?
The main problem this paper addresses is the high computational cost of bilevel optimization for inverse problems. Bilevel optimization is used for hyperparameter learning and experimental design, for example to find optimal regularization parameters and forward operators from a set of training pairs. The process is expensive because the inner problem, itself usually a costly inverse problem, must be solved many times while searching for the optimal parameters. To reduce this cost, recent research, especially in machine learning, has introduced so-called "single-loop" methods, which take only a single gradient descent step towards the solution of the inner problem on each step of the outer optimization method. This paper makes the inner algorithm more flexible, allowing methods better suited to difficult inverse problems with nonsmooth regularization, including primal-dual proximal splitting (PDPS). In addition, the authors have recently shown that significant performance improvements can be obtained in PDE-constrained optimization by interweaving the steps of conventional iterative solvers (Jacobi, Gauss-Seidel, conjugate gradients) for both the PDE and its adjoint with the steps of the optimization method. This paper demonstrates how the adjoint equation in bilevel problems can benefit from the same interweaving with conventional linear system solvers, and evaluates the proposed methods on learning the deconvolution kernel for image deblurring and the subsampling operator for magnetic resonance imaging (MRI).
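To make the single-loop idea concrete, the following is a minimal sketch on a toy problem and is not the authors' algorithm. It assumes a smooth quadratic inner problem (Tikhonov-style denoising with a scalar weight alpha) in place of the nonsmooth, PDPS-based setting of the paper, and it uses a single Richardson step on the adjoint equation as a simple stand-in for the Jacobi, Gauss-Seidel, or conjugate gradient steps mentioned above. The function name, the data, and the step sizes tau, theta, and sigma are all hypothetical choices made for illustration.

```python
import numpy as np


def single_loop_bilevel(z, b, alpha0=0.0, tau=0.5, theta=0.5, sigma=0.01, iters=500):
    """Toy single-loop bilevel scheme (illustrative only, not the paper's method).

    Inner problem:  u(alpha) = argmin_u 0.5*||u - z||^2 + 0.5*alpha*||u||^2
    Outer problem:  min over alpha >= 0 of 0.5*||u(alpha) - b||^2
    """
    alpha = float(alpha0)
    u = z.copy()            # inner iterate, never solved to convergence
    p = np.zeros_like(z)    # adjoint iterate, also updated one step at a time
    for _ in range(iters):
        # (1) One gradient descent step on the inner objective.
        u = u - tau * ((u - z) + alpha * u)
        # (2) One Richardson step on the adjoint equation (1 + alpha) * p = u - b.
        p = p - theta * ((1.0 + alpha) * p - (u - b))
        # (3) One projected gradient step on alpha. By the implicit function
        #     theorem, the derivative of the outer objective is -<p, u> when
        #     u and p are exact, so the inexact iterates give an estimate.
        alpha = max(alpha + sigma * float(np.dot(p, u)), 0.0)
    return alpha, u


if __name__ == "__main__":
    z = np.array([3.0, 1.0, 5.0, 3.0])   # hypothetical noisy data
    b = np.array([1.0, 2.0, 3.0, 4.0])   # hypothetical ground truth
    alpha, u = single_loop_bilevel(z, b)
    # For this toy problem the optimal weight has the closed form <z, z>/<z, b> - 1.
    print(alpha, np.dot(z, z) / np.dot(z, b) - 1.0)
```

In this sketch the inner variable u, the adjoint variable p, and the parameter alpha each advance by one cheap step per iteration, so no inner problem or linear system is ever solved exactly. Because the inner problem here has a closed-form solution, the learned alpha can be compared directly against the exact optimum.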