Abstract: The linear conjugate gradient method is an efficient iterative method for convex quadratic minimization problems $ \min_{x \in \mathbb{R}^n} f(x) = \dfrac{1}{2}x^TAx+b^Tx $, where $ A \in \mathbb{R}^{n \times n} $ is symmetric positive definite and $ b \in \mathbb{R}^n $. It is generally agreed that the gradients $ g_k $ are not conjugate with respect to $ A $ in the linear conjugate gradient method (see page 111 of Numerical Optimization (2nd ed., Springer, 2006) by Nocedal and Wright). In this paper we prove the conjugacy of the gradients $ g_k $ generated by the linear conjugate gradient method, namely, $$g_k^TAg_i=0, \; i=0,1,\cdots, k-2.$$ In addition, we present a new way to derive the linear conjugate gradient method, based on the conjugacy of the search directions and the orthogonality of the gradients rather than on the conjugacy of the search directions and the exact stepsize.
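
The conjugacy relation stated in the abstract can be motivated with a short calculation. The sketch below is not taken from the paper's proof. It assumes the standard linear CG recurrences with step sizes $ \alpha_i $, search directions $ d_i $ and parameters $ \beta_i $ (notation the abstract does not spell out), together with the well-known orthogonality of the gradients, $ g_k^T g_j = 0 $ for $ j < k $.

```latex
% Assumed standard linear CG recurrences (notation not given in the abstract):
%   d_0 = -g_0,   d_i = -g_i + \beta_{i-1} d_{i-1},   g_{i+1} = g_i + \alpha_i A d_i,
% and gradient orthogonality: g_k^T g_j = 0 for all j < k.
\begin{align*}
A g_i &= -A d_i + \beta_{i-1} A d_{i-1}
       = -\frac{g_{i+1} - g_i}{\alpha_i}
         + \beta_{i-1}\,\frac{g_i - g_{i-1}}{\alpha_{i-1}}, \\
g_k^T A g_i &= -\frac{g_k^T g_{i+1} - g_k^T g_i}{\alpha_i}
         + \beta_{i-1}\,\frac{g_k^T g_i - g_k^T g_{i-1}}{\alpha_{i-1}}
       = 0 \qquad \text{for } 1 \le i \le k-2 .
\end{align*}
% For i = 0 the \beta term is absent because g_0 = -d_0, and the same argument applies.
```

Every inner product on the right involves $ g_k $ and a gradient with index at most $ i+1 \le k-1 $, so each one vanishes. For $ i = k-1 $ the term $ g_k^T g_k $ survives, which is why the relation is stated only up to $ i = k-2 $.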

What problem does this paper attempt to address?

This paper addresses a basic question about the linear conjugate gradient method: whether the gradients it generates are conjugate. Specifically, the paper proves that the gradients $ g_k $ generated by the linear conjugate gradient method are conjugate with respect to the matrix $ A $, that is, they satisfy \[ g_k^T A g_i = 0, \quad i = 0, 1, \cdots, k - 2. \] This conclusion differs from the traditional view, which holds that the gradients $ g_k $ produced by the linear conjugate gradient method are not conjugate with respect to $ A $. The paper gives a rigorous proof that corrects this common misunderstanding.

In addition, the paper proposes a new derivation of the linear conjugate gradient method based on gradient orthogonality and search-direction conjugacy. Unlike the traditional derivation, which relies on exact step sizes, it selects the step size $ \alpha_k $ so that the new gradient $ g_{k + 1} $ is orthogonal to the current gradient $ g_k $: \[ g_{k + 1}^T g_k = 0. \] Since $ g_{k+1} = g_k + \alpha_k A d_k $ for the quadratic objective, this condition gives \[ \alpha_k = -\frac{g_k^T g_k}{g_k^T A d_k}. \] The paper proves that the resulting method is equivalent to the traditional linear conjugate gradient method and has the same convergence property: it reaches the optimal solution $ x^* $ in at most $ n $ steps.

In summary, the main contributions of the paper are correcting the common misunderstanding about gradient conjugacy in the linear conjugate gradient method and providing a new derivation of the method, which may suggest new ideas for the design of optimization algorithms.
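
The claims in this summary can be checked numerically. Below is a minimal sketch in plain Python, assuming the usual CG recurrences $ d_0 = -g_0 $, $ d_{k+1} = -g_{k+1} + \beta_k d_k $ with the standard $ \beta_k = g_{k+1}^T g_{k+1} / g_k^T g_k $ (this formula is not quoted in the text above and is an assumption here). The function names, the test matrix, and the vector `b` are hypothetical choices for illustration, not taken from the paper.

```python
def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def cg_gradients(A, b, x0):
    """Linear CG for f(x) = 0.5 x^T A x + b^T x.

    Returns the final iterate and the list of gradients g_0, g_1, ...
    The step size is chosen from the orthogonality condition g_{k+1}^T g_k = 0.
    """
    x = list(x0)
    g = [ax + bi for ax, bi in zip(matvec(A, x), b)]  # g_0 = A x_0 + b
    d = [-gi for gi in g]                             # d_0 = -g_0
    grads = [g]
    for _ in range(len(b)):                           # at most n steps
        if dot(g, g) < 1e-24:                         # already at the minimizer
            break
        Ad = matvec(A, d)
        alpha = -dot(g, g) / dot(g, Ad)               # alpha_k = -g_k^T g_k / g_k^T A d_k
        x = [xi + alpha * di for xi, di in zip(x, d)]
        g_new = [gi + alpha * adi for gi, adi in zip(g, Ad)]
        beta = dot(g_new, g_new) / dot(g, g)          # assumed standard beta_k
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        g = g_new
        grads.append(g)
    return x, grads

def max_nonadjacent_conjugacy(A, grads):
    """Largest |g_k^T A g_i| over all pairs with i <= k - 2."""
    worst = 0.0
    for k in range(2, len(grads)):
        for i in range(k - 1):
            worst = max(worst, abs(dot(grads[k], matvec(A, grads[i]))))
    return worst

# Hypothetical test problem: 5x5 tridiagonal matrix (2 on the diagonal, -1 off it),
# which is symmetric positive definite.
n = 5
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
b = [1.0, 2.0, 3.0, 4.0, 5.0]

x_star, grads = cg_gradients(A, b, [0.0] * n)
residual = max(abs(r + bi) for r, bi in zip(matvec(A, x_star), b))
print("max |A x + b| at the final iterate:", residual)
print("max |g_k^T A g_i| for i <= k-2:   ", max_nonadjacent_conjugacy(A, grads))
print("g_1^T A g_0 (adjacent pair):       ", dot(grads[1], matvec(A, grads[0])))
```

On this small example the products for $ i \le k-2 $ should vanish up to rounding error, while the adjacent pair $ g_1^T A g_0 $ should not, in line with the index range in the stated result.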

On the properties of the linear conjugate gradient method

The convergence of conjugate gradient method with nonmonotone line search

Convergence Properties of Nonlinear Conjugate Gradient Methods.

Convergence Analysis of Gradient Algorithms on Riemannian Manifolds Without Curvature Constraints and Application to Riemannian Mass

Convergence analysis of the Gauss–Newton method for convex inclusion and convex-composite optimization problems

The conjugate gradient method with various viewpoints

A Revised Conjugate Gradient Projection Algorithm for Inequality Constrained Optimizations

Linear Convergence of Subgradient Algorithm for Convex Feasibility on Riemannian Manifolds

A Modified Nonlinear Conjugate Gradient Algorithm for Large-Scale Nonsmooth Convex Optimization

A Modified Conjugacy Condition And Related Nonlinear Conjugate Gradient Method

Contractivity and linear convergence in bilinear saddle-point problems: An operator-theoretic approach

Modified Conjugate Gradient Method with Global Convergence Property

Barzilai and Borwein conjugate gradient method equipped with a non-monotone line search technique and its application on non-negative matrix factorization

A Cooperative Conjugate Gradient Method for Linear Systems Permitting Multithread Implementation of Low Complexity

A semismooth conjugate gradients method – theoretical analysis

Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition

Convergence of the Descent Nonlinear Conjugate Gradient Methods

A modified inertial three-term conjugate gradient method for nonsmooth convex optimization and its application

On a Family of Relaxed Gradient Descent Methods for Quadratic Minimization

A family of three-term conjugate gradient projection methods with a restart procedure and their relaxed-inertial extensions for the constrained nonlinear pseudo-monotone equations with applications

Linear convergence of forward-backward accelerated algorithms without knowledge of the modulus of strong convexity