Study of the behaviour of Nesterov Accelerated Gradient in a non convex setting: the strongly quasar convex case

J. Hermant, J.-F. Aujol, C. Dossal, A. Rondepierre
2024-05-30
Abstract: We study the convergence of Nesterov Accelerated Gradient (NAG) minimization algorithm applied to a class of non convex functions called strongly quasar convex functions, which can exhibit highly non convex behaviour. We show that in the case of strongly quasar convex functions, NAG can achieve an accelerated convergence speed at the cost of a lower curvature assumption. We provide a continuous analysis through high resolution ODEs, in which negative friction may appear. Finally, we investigate connections with a weaker class of non convex functions (smooth Polyak-Łojasiewicz functions) by characterizing the gap between this class and the one of smooth strongly quasar convex functions.
Optimization and Control
What problem does this paper attempt to address?
This paper studies the behavior of the Nesterov Accelerated Gradient (NAG) algorithm in a non-convex setting, specifically on the class of strongly quasar convex functions. Although these functions can exhibit highly non-convex behavior, they have a unique global minimizer. The paper shows that, for strongly quasar convex functions, NAG can achieve an accelerated convergence rate at the cost of an additional lower curvature assumption. The paper first reviews the advantages of NAG in convex optimization, such as faster convergence to the minimum than gradient descent. It then points out that this acceleration does not always carry over to non-convex optimization, and discusses related work that obtains acceleration for NAG under different assumptions.
The main contributions of the paper are:
1. Showing that NAG achieves accelerated convergence on the class of smooth strongly quasar convex functions once a lower curvature assumption is added.
2. Providing a continuous analysis of NAG in the strongly quasar convex setting through high resolution ODEs, revealing non-convex behaviors (such as negative friction) that may be caused by the gradient correction term.
3. Establishing geometric necessary conditions for accelerated convergence on a weaker class of non-convex functions (smooth Polyak-Łojasiewicz functions), and presenting new properties of strongly quasar convex functions, including convex behavior on average.
The paper proceeds from preliminaries to the convergence of NAG on strongly quasar convex functions, then to the continuous analysis, and then to the new properties of strongly quasar convex functions. Finally, the theoretical results are checked through numerical experiments.
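To make the acceleration claim concrete, here is a minimal Python sketch comparing gradient descent with a textbook form of NAG. Everything in it is an illustrative assumption rather than the paper's setup: the test function is a simple strongly convex quadratic (a special case of strong quasar convexity), the momentum value is the classical choice for strongly convex functions, and the names `grad_descent`, `nag`, `f` and `grad_f` are hypothetical. The paper's parameter choices for the strongly quasar convex case may differ.

```python
import math

def grad_descent(grad, x0, step, n_iters):
    """Plain gradient descent: x <- x - step * grad(x)."""
    x = list(x0)
    for _ in range(n_iters):
        g = grad(x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x

def nag(grad, x0, step, momentum, n_iters):
    """Textbook NAG with constant momentum (assumed form, not the paper's).

    y_k     = x_k + momentum * (x_k - x_{k-1})
    x_{k+1} = y_k - step * grad(y_k)
    """
    x_prev = list(x0)
    x = list(x0)
    for _ in range(n_iters):
        # Extrapolate along the previous displacement, then take a gradient step.
        y = [xi + momentum * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        g = grad(y)
        x_prev, x = x, [yi - step * gi for yi, gi in zip(y, g)]
    return x

# Toy ill-conditioned quadratic f(x) = 0.5 * sum(d_i * x_i^2), minimized at 0.
# Its curvature constants are MU = min(D) and L = max(D).
D = [1.0, 100.0]
MU, L = min(D), max(D)

def f(x):
    return 0.5 * sum(d * xi * xi for d, xi in zip(D, x))

def grad_f(x):
    return [d * xi for d, xi in zip(D, x)]

step = 1.0 / L
# Classical momentum for strongly convex functions (an assumption here).
momentum = (1 - math.sqrt(MU * step)) / (1 + math.sqrt(MU * step))

x0 = [1.0, 1.0]
f_gd = f(grad_descent(grad_f, x0, step, 100))
f_nag = f(nag(grad_f, x0, step, momentum, 100))
print(f"gradient descent: f = {f_gd:.3e}")
print(f"NAG:              f = {f_nag:.3e}")
```

With the same step size and iteration budget, the NAG iterate reaches a much smaller objective value than gradient descent on this problem. This only illustrates the convex baseline that the paper extends; it does not reproduce the non-convex behavior or the negative friction discussed in the paper.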