Convergence Analysis of Fractional Gradient Descent

Ashwani Aggarwal

2024-06-04

Abstract: Fractional derivatives are a well-studied generalization of integer order derivatives. Naturally, for optimization, it is of interest to understand the convergence properties of gradient descent using fractional derivatives. Convergence analysis of fractional gradient descent is currently limited both in the methods analyzed and the settings analyzed. This paper aims to fill in these gaps by analyzing variations of fractional gradient descent in smooth and convex, smooth and strongly convex, and smooth and non-convex settings. First, novel bounds will be established bridging fractional and integer derivatives. Then, these bounds will be applied to the aforementioned settings to prove linear convergence for smooth and strongly convex functions and $O(1/T)$ convergence for smooth and convex functions. Additionally, we prove $O(1/T)$ convergence for smooth and non-convex functions using an extended notion of smoothness (Hölder smoothness) that is more natural for fractional derivatives. Finally, empirical results will be presented on the potential speed up of fractional gradient descent over standard gradient descent as well as some preliminary theoretical results explaining this speed up.
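
For orientation, the notions named in the abstract can be written in their standard textbook forms. This is a sketch under assumptions: the symbols $L$, $\mu$, $L_p$, $p$, $c$, $\alpha$, $C$, and $\rho$ are chosen here for illustration, the Caputo form is only one common definition of a fractional derivative, and the paper's own definitions and constants may differ.

```latex
% Caputo fractional derivative of order \alpha \in (0,1) with lower terminal c < x
{}^{C}\!D^{\alpha}_{c} f(x) = \frac{1}{\Gamma(1-\alpha)} \int_{c}^{x} \frac{f'(t)}{(x-t)^{\alpha}} \, dt

% L-smoothness and \mu-strong convexity
\|\nabla f(x) - \nabla f(y)\| \le L \|x - y\|, \qquad
f(y) \ge f(x) + \langle \nabla f(x), y - x \rangle + \frac{\mu}{2} \|y - x\|^{2}

% Hölder smoothness of order p \in (0,1]; p = 1 recovers L-smoothness
\|\nabla f(x) - \nabla f(y)\| \le L_{p} \|x - y\|^{p}

% Linear convergence versus O(1/T) convergence in function value
f(x_T) - f^{*} \le C \rho^{T} \ \ (0 < \rho < 1), \qquad f(x_T) - f^{*} \le \frac{C}{T}
```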

Optimization and Control, Machine Learning, Numerical Analysis

What problem does this paper attempt to address?

This paper primarily explores the convergence properties of fractional gradient descent in different optimization settings. Specifically:

1. **Relationship between fractional-order and integer-order derivatives**: The paper establishes new inequalities relating fractional-order derivatives to integer-order derivatives. These bounds underpin the convergence proofs that follow.
2. **Linear convergence for smooth and strongly convex functions**: The paper proves that fractional gradient descent achieves linear convergence in this setting and gives a detailed rate analysis. This extends the work of Shin et al. (2021), which was limited to quadratic functions.
3. **O(1/T) convergence for smooth and convex functions**: The paper shows that fractional gradient descent achieves an O(1/T) convergence rate in this setting, matching the rate of standard gradient descent.
4. **O(1/T) convergence for smooth but non-convex functions**: Using Hölder smoothness, an extended notion of smoothness, the paper proves that fractional gradient descent achieves an O(1/T) convergence rate.
5. **Experimental results**: The paper presents experimental evidence that fractional gradient descent converges faster than standard gradient descent in certain cases, together with preliminary theoretical explanations for this speed-up.

Through these results, the paper aims to fill the current gap in the theoretical analysis of fractional gradient descent and to demonstrate its potential in practical applications.
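
To make the idea of a fractional gradient step concrete, here is a minimal one-dimensional Python sketch. It is not the method analyzed in the paper. It assumes a Caputo derivative, a quadratic objective `f(t) = 0.5*a*t**2 + b*t`, and a lower terminal that trails the iterate by a fixed offset `beta`. The function names and all parameters (`alpha`, `beta`, `eta`) are hypothetical choices made for illustration.

```python
import math


def caputo_quadratic(a, b, x, c, alpha):
    """Caputo derivative of order alpha in (0, 1], lower terminal c < x,
    of f(t) = 0.5*a*t**2 + b*t, evaluated at x.

    Uses the exact expansion f(t) = f(c) + f'(c)*(t - c) + 0.5*a*(t - c)**2
    and the power rule
    D^alpha (t - c)^k = Gamma(k + 1) / Gamma(k + 1 - alpha) * (t - c)^(k - alpha).
    """
    grad_c = a * c + b          # ordinary derivative f'(c)
    d = x - c                   # distance from the terminal, must be positive
    return (grad_c * d ** (1 - alpha) / math.gamma(2 - alpha)
            + a * d ** (2 - alpha) / math.gamma(3 - alpha))


def fractional_gd(a, b, x0, alpha, beta, eta, steps):
    """Toy fractional gradient descent (illustrative assumption, not the paper's
    scheme): the terminal is reset to c_t = x_t - beta before every step."""
    x = x0
    for _ in range(steps):
        c = x - beta
        x = x - eta * caputo_quadratic(a, b, x, c, alpha)
    return x


if __name__ == "__main__":
    # f(t) = t**2 - 4*t has its minimizer at t = 2.
    x_frac = fractional_gd(a=2.0, b=-4.0, x0=5.0, alpha=0.8, beta=0.1, eta=0.3, steps=200)
    x_std = fractional_gd(a=2.0, b=-4.0, x0=5.0, alpha=1.0, beta=0.1, eta=0.3, steps=200)
    print("fractional (alpha=0.8):", x_frac)
    print("standard   (alpha=1.0):", x_std)
```

With `alpha = 1.0` the Caputo expression reduces to the ordinary derivative, so the loop becomes standard gradient descent. For `alpha < 1`, this particular toy scheme settles at `-b/a + beta*(1 - alpha)/(2 - alpha)` rather than exactly at the minimizer `-b/a`. This follows from the closed form above and illustrates why the choice of terminal matters when designing fractional variants.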

Generalization of the Gradient Method with Fractional Order Gradient Direction

Design of Generalized Fractional Order Gradient Descent Method

Study on Fractional Order Gradient Methods

Fractional Order Gradient Methods for a General Class of Convex Functions

Applications of fractional calculus in learned optimization

Convergence Analysis of Gradient Algorithms on Riemannian Manifolds Without Curvature Constraints and Application to Riemannian Mass

Concavifiability and convergence: necessary and sufficient conditions for gradient descent analysis

A novel perspective to gradient method: the fractional order approach

Provably Faster Gradient Descent via Long Steps

The Novel Adaptive Fractional Order Gradient Decent Algorithms Design via Robust Control

A Novel Fractional Order Speedest Gradient Descent Method and its application

Convergence Analysis of Novel Fractional-Order Backpropagation Neural Networks With Regularization Terms

On The Unified Design Of Accelerated Gradient Descent

Study on fast speed fractional order gradient descent method and its application in neural networks

Convergence Analysis of Adaptive Gradient Methods under Refined Smoothness and Noise Assumptions

$Γ$-convergence involving nonlocal gradients with varying horizon: Recovery of local and fractional models

A critical analysis of the conformable derivative

Convergence Rate Analysis of Continuous- and Discrete-Time Smoothing Gradient Algorithms

Gradient Descent in the Absence of Global Lipschitz Continuity of the Gradients

Breaking the Convergence Barrier: Optimization via Fixed-Time Convergent Flows