Optimization with First Order Algorithms

Charles Dossal, Samuel Hurault, Nicolas Papadakis
2024-10-25
Abstract: These notes focus on the minimization of convex functionals using first-order optimization methods, which are fundamental in many areas of applied mathematics and engineering. The primary goal of this document is to introduce and analyze the most classical first-order optimization algorithms. We aim to provide readers with both a practical and theoretical understanding of how and why these algorithms converge to minimizers of convex functions. The main algorithms covered in these notes include gradient descent, Forward-Backward splitting, Douglas-Rachford splitting, the Alternating Direction Method of Multipliers (ADMM), and Primal-Dual algorithms. All these algorithms fall into the class of first-order methods, as they only involve gradients and subdifferentials, which are first-order derivatives of the functions to optimize. For each method, we provide convergence theorems, with precise assumptions and conditions under which the convergence holds, accompanied by complete proofs. Beyond convex optimization, the final part of this manuscript extends the analysis to nonconvex problems, where we discuss the convergence behavior of these same first-order methods under broader assumptions. To contextualize the theory, we also include a selection of practical examples illustrating how these algorithms are applied in different image processing problems.
Optimization and Control
What problem does this paper attempt to address?
The paper addresses the minimization of convex functions with first-order optimization methods, and the extension of these methods to nonconvex functions. Such methods are fundamental in many fields of applied mathematics and engineering. The main objective is to introduce and analyze the most classical first-order algorithms, and to give both a practical and a theoretical understanding of how and why they converge to minimizers of convex functions.

### Main Research Contents

1. **Background Knowledge**: The paper first reviews the definitions and properties of convex functions, non-expansive operators, and smooth (differentiable) functions, laying the theoretical foundation for the subsequent chapters.
2. **Optimization of Smooth Convex Functions**: Gradient descent and its variants are discussed in detail, including fixed-step gradient descent, optimal-step gradient descent, Newton's method, gradient descent with backtracking, and implicit gradient descent. The convergence rates and properties of these algorithms are analyzed.
3. **Optimization of Non-Smooth Convex Functions**: The subdifferential and the proximal operator are introduced together with their properties, and several proximal splitting algorithms are discussed: the proximal point algorithm, the forward-backward algorithm, the fast iterative shrinkage-thresholding algorithm (FISTA), the Douglas-Rachford algorithm, the parallel proximal algorithm (PPXA), and the alternating direction method of multipliers (ADMM).
4. **Duality**: The properties of the convex conjugate are explored, along with saddle point problems and primal-dual algorithms such as the Chambolle-Pock algorithm and the Condat algorithm. The equivalence between different proximal splitting algorithms is also discussed.
5. **Extension to Nonconvex Optimization**: The analysis of the above algorithms is extended to nonconvex problems, and their convergence behavior under broader assumptions is discussed, in particular the convergence of the forward-backward iterates to a single point and the Kurdyka-Łojasiewicz (KŁ) property.
6. **Application Examples**: The practical use of these algorithms is demonstrated through specific examples from image processing.

### Formula Summary

- **Gradient Descent Algorithm**:
  \[ x_{n+1} = x_n - \gamma \nabla f(x_n) \]
- **Strong Convexity**:
  \[ f(y) \geq f(x) + \langle \nabla f(x), y - x \rangle + \frac{\alpha}{2} \|y - x\|^2 \]
- **L-Smoothness**:
  \[ f(z) \leq f(y) + \langle \nabla f(y), z - y \rangle + \frac{L}{2} \|z - y\|^2 \]
- **Conjugacy**:
  \[ f^*(y) = \sup_{x \in E} \left( \langle y, x \rangle - f(x) \right) \]
- **Convergence Rate of Gradient Descent**:
  \[ f(x_n) - f(x^*) \leq \frac{L}{2n} \|x_0 - x^*\|^2 \]

### Conclusion

Through detailed theoretical analysis and application examples, the paper gives a systematic introduction to first-order optimization algorithms for convex and nonconvex problems. For each algorithm it states convergence results under explicit assumptions and illustrates its use on image processing problems.
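As a rough illustration of the gradient descent update and its convergence rate from the formula summary, the sketch below runs fixed-step gradient descent on a small quadratic and checks the bound \( f(x_n) - f(x^*) \leq \frac{L}{2n}\|x_0 - x^*\|^2 \) numerically. The test function, the starting point, and the step size \( \gamma = 1/L \) are assumptions chosen for this example (the bound is a standard result for that step size); none of the names below come from the paper.

```python
def gradient_descent(grad, x0, step, n_iter):
    """Fixed-step gradient descent: x_{n+1} = x_n - step * grad(x_n)."""
    x = list(x0)
    iterates = [x]
    for _ in range(n_iter):
        g = grad(x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
        iterates.append(x)
    return iterates


# Hypothetical test problem (not from the paper):
# f(x) = 0.5 * (x_1^2 + 10 * x_2^2), minimized at x* = (0, 0).
CURVATURES = (1.0, 10.0)
L = max(CURVATURES)  # Lipschitz constant of the gradient of f


def f(x):
    return 0.5 * sum(c * xi ** 2 for c, xi in zip(CURVATURES, x))


def grad_f(x):
    return [c * xi for c, xi in zip(CURVATURES, x)]


x_star = [0.0, 0.0]
x0 = [3.0, -2.0]
# Assumed step size gamma = 1/L.
iterates = gradient_descent(grad_f, x0, step=1.0 / L, n_iter=100)


def rate_bound_holds(iterates, tol=1e-12):
    """Check f(x_n) - f(x*) <= L / (2n) * ||x_0 - x*||^2 for every n >= 1."""
    r0 = sum((a - b) ** 2 for a, b in zip(iterates[0], x_star))
    return all(
        f(x) - f(x_star) <= L / (2 * n) * r0 + tol
        for n, x in enumerate(iterates)
        if n >= 1
    )
```

Calling `rate_bound_holds(iterates)` compares each iterate against the bound. On this quadratic the objective decays geometrically, so the O(1/n) bound is loose here; it is a worst-case guarantee over all convex L-smooth functions.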