Abstract: These notes focus on the minimization of convex functionals using first-order optimization methods, which are fundamental in many areas of applied mathematics and engineering. The primary goal of this document is to introduce and analyze the most classical first-order optimization algorithms. We aim to provide readers with both a practical and theoretical understanding of how and why these algorithms converge to minimizers of convex functions. The main algorithms covered in these notes include gradient descent, Forward-Backward splitting, Douglas-Rachford splitting, the Alternating Direction Method of Multipliers (ADMM), and Primal-Dual algorithms. All these algorithms fall into the class of first-order methods, as they only involve gradients and subdifferentials, which are first-order derivatives of the functions being optimized. For each method, we provide convergence theorems, with precise assumptions and conditions under which the convergence holds, accompanied by complete proofs. Beyond convex optimization, the final part of this manuscript extends the analysis to nonconvex problems, where we discuss the convergence behavior of these same first-order methods under broader assumptions. To contextualize the theory, we also include a selection of practical examples illustrating how these algorithms are applied in different image processing problems.

What problem does this paper attempt to address?

The paper addresses the minimization of convex functions with first-order optimization methods, and the extension of these methods to nonconvex functions. Such methods are fundamental in many fields of applied mathematics and engineering. The main objective is to introduce and analyze the most classical first-order optimization algorithms, and to give a practical and theoretical understanding of how and why they converge to minimizers of convex functions.

### Main Research Contents

1. **Background Knowledge**: The paper first reviews the definitions and properties of convex functions, nonexpansive operators, and smooth (differentiable) functions, laying the theoretical foundation for the subsequent chapters.
2. **Optimization of Smooth Convex Functions**: The gradient descent algorithm and its variants are discussed in detail, including gradient descent with fixed step size, gradient descent with optimal step size, Newton's method, gradient descent with backtracking, and implicit gradient descent. The convergence rates and properties of these algorithms are analyzed.
3. **Optimization of Nonsmooth Convex Functions**: The subdifferential and the proximal operator are introduced along with their properties, and several proximal splitting algorithms are discussed: the proximal point algorithm, the forward-backward algorithm, the fast iterative shrinkage-thresholding algorithm (FISTA), the Douglas-Rachford algorithm, the parallel proximal algorithm (PPXA), and the alternating direction method of multipliers (ADMM).
4. **Duality**: The properties of the convex conjugate and primal-dual algorithms are explored, including saddle point problems, the Chambolle-Pock algorithm, and the Condat algorithm. The equivalence between different proximal splitting algorithms is also discussed.
5. **Extension to Nonconvex Optimization**: The above algorithms are applied to nonconvex optimization problems, and their convergence behavior under broader assumptions is discussed, in particular the single-point convergence of the forward-backward algorithm and the Kurdyka-Łojasiewicz (KŁ) property.
6. **Application Examples**: The practical use of these optimization algorithms is demonstrated through specific examples in image processing problems.

### Formula Summary

- **Gradient Descent Algorithm**: \[ x_{n+1} = x_n - \gamma \nabla f(x_n) \]
- **Strong Convexity**: \[ f(y) \geq f(x) + \langle \nabla f(x), y - x \rangle + \frac{\alpha}{2} \|y - x\|^2 \]
- **L-Smoothness**: \[ f(z) \leq f(y) + \langle \nabla f(y), z - y \rangle + \frac{L}{2} \|z - y\|^2 \]
- **Conjugacy**: \[ f^*(y) = \sup_{x \in E} \left( \langle y, x \rangle - f(x) \right) \]
- **Convergence Rate of Gradient Descent**: \[ f(x_n) - f(x^*) \leq \frac{L}{2n} \|x_0 - x^*\|^2 \]

### Conclusion

Through detailed theoretical analysis and practical examples, the paper systematically presents first-order optimization algorithms for convex and nonconvex problems. The algorithms come with convergence guarantees and are illustrated on image processing problems.
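
The gradient descent iteration and the convergence rate listed in the formula summary can be checked numerically on a small example. The sketch below is an illustration rather than code from the paper: the least-squares objective, the random data `A` and `b`, and the helper name `gradient_descent` are all assumptions made for this example. It also assumes the step size \( \gamma = 1/L \), the standard setting for the \( L/(2n) \) bound; the summary itself does not state the step size.

```python
import numpy as np


def gradient_descent(grad, x0, step, n_iters):
    """Run x_{n+1} = x_n - step * grad(x_n) and return all iterates."""
    x = np.asarray(x0, dtype=float)
    iterates = [x.copy()]
    for _ in range(n_iters):
        x = x - step * grad(x)
        iterates.append(x.copy())
    return iterates


# Hypothetical test problem: f(x) = 0.5 * ||A x - b||^2, convex and L-smooth.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)


def f(x):
    r = A @ x - b
    return 0.5 * float(r @ r)


def grad_f(x):
    return A.T @ (A @ x - b)


# L is the largest eigenvalue of A^T A, i.e. the squared spectral norm of A.
L = np.linalg.norm(A, 2) ** 2

# Reference minimizer x^*, computed by a direct least-squares solve.
x_star = np.linalg.lstsq(A, b, rcond=None)[0]

x0 = np.zeros(5)
iterates = gradient_descent(grad_f, x0, step=1.0 / L, n_iters=200)

# Objective gap f(x_n) - f(x^*) for n = 0, ..., 200.
gaps = [f(x) - f(x_star) for x in iterates]

# Right-hand side of the rate, L / (2n) * ||x_0 - x^*||^2, for n = 1, ..., 200.
dist0_sq = float(np.linalg.norm(x0 - x_star) ** 2)
bounds = [L / (2 * n) * dist0_sq for n in range(1, len(iterates))]

# True if the bound holds at every iteration (up to rounding error).
rate_holds = all(gaps[n] <= bounds[n - 1] + 1e-9 for n in range(1, len(iterates)))
print("rate bound holds:", rate_holds)
```

On a least-squares problem with a full-column-rank `A` the objective is also strongly convex, so the observed gap typically decays much faster than the \( 1/n \) bound; the bound is a worst-case guarantee over all L-smooth convex functions.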

Optimization with First Order Algorithms

Complexity of a Class of First-Order Objective-Function-Free Optimization Algorithms

Formalization of Complexity Analysis of the First-order Algorithms for Convex Optimization

Optimal First-Order Algorithms as a Function of Inequalities

Accelerated First-Order Optimization under Nonlinear Constraints

Introduction to Nonsmooth Analysis and Optimization

First-order optimization on stratified sets

First-Order Methods for Nonsmooth Nonconvex Functional Constrained Optimization with or without Slater Points

First-Order Algorithms Without Lipschitz Gradient: A Sequential Local Optimization Approach

Analysis of Optimization Algorithms via Sum-of-Squares

Accelerated First-Order Optimization Algorithms for Machine Learning

First-order Optimality Conditions for Two Classes of Generalized Nonsmooth Semi-Infinite Optimization

KKT Conditions, First-Order and Second-Order Optimization, and Distributed Optimization: Tutorial and Survey

Uniformly Optimal and Parameter-free First-order Methods for Convex and Function-constrained Optimization

On Fundamental Proof Structures in First-Order Optimization

A Smooth Primal-Dual Optimization Framework for Nonsmooth Composite Convex Minimization

First Order Methods beyond Convexity and Lipschitz Gradient Continuity with Applications to Quadratic Inverse Problems

Linearization Algorithms for Fully Composite Optimization

The complexity of first-order optimization methods from a metric perspective

Verification of First-Order Methods for Parametric Quadratic Optimization