Abstract: This paper presents a variant of gradient descent called "Sample Gradient Descent". The method modifies the conventional batch gradient descent algorithm, which often suffers from high space and time complexity. Instead of training on the full dataset, the proposed approach selects a representative sample of the data and runs batch gradient descent on that sample. Selecting the sample is the crucial step, since it must accurately represent the entire dataset. To achieve this, we apply Principal Component Analysis (PCA) to the training data and retain only those rows and columns that explain 90% of the overall variance. This approach results in a convex loss function, where a global minimum can be readily attained. In our experiments, both approaches were run for 30 epochs, with each epoch taking approximately 3.41 s. "Sample Gradient Descent" converged in 8 epochs, whereas the conventional batch gradient descent algorithm required 20 epochs. This faster convergence, together with the reduced computation time, indicates that the proposed method is more efficient than conventional batch gradient descent. These findings suggest the potential utility of the "Sample Gradient Descent" technique across diverse domains, ranging from machine learning to optimization problems.
The significant improvements in convergence rates and computation times make our algorithm particularly appealing to practitioners and researchers seeking enhanced efficiency in gradient descent optimization.
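
The pipeline described above (PCA-based selection followed by batch gradient descent on the reduced data) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not say how rows are chosen, so the row rule below (keep the highest-energy rows in the retained components until they account for 90% of that energy) is an assumption made purely for illustration. The function names `pca_sample` and `batch_gd`, the linear model with mean squared error, the 1/L step size, and the synthetic data are likewise hypothetical.

```python
import numpy as np


def pca_sample(X, y, var_kept=0.90):
    """Return PCA scores and targets for a reduced sample of rows.

    Columns: keep the leading principal components whose cumulative
    explained variance reaches `var_kept`.
    Rows (ASSUMPTION, not taken from the paper): keep the highest-energy
    rows in the retained components until they account for `var_kept`
    of the retained energy.
    """
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions; s**2 is proportional to
    # the variance explained by each direction.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    ratio = np.cumsum(s**2) / np.sum(s**2)
    k = min(int(np.searchsorted(ratio, var_kept)) + 1, len(s))
    Z = Xc @ Vt[:k].T  # scores, shape (n_rows, k)

    energy = np.sum(Z**2, axis=1)
    order = np.argsort(energy)[::-1]  # highest-energy rows first
    cum = np.cumsum(energy[order]) / energy.sum()
    m = min(int(np.searchsorted(cum, var_kept)) + 1, len(order))
    rows = np.sort(order[:m])
    return Z[rows], y[rows], rows, k


def batch_gd(X, y, epochs=30):
    """Full-batch gradient descent on mean squared error (linear model)."""
    n = X.shape[0]
    Xb = np.column_stack([X, np.ones(n)])  # last weight acts as the bias
    # Step size 1/L, where L is the largest eigenvalue of the MSE Hessian.
    L = 2.0 * np.linalg.eigvalsh(Xb.T @ Xb / n)[-1]
    w = np.zeros(Xb.shape[1])
    losses = []
    for _ in range(epochs):
        residual = Xb @ w - y
        losses.append(float(np.mean(residual**2)))
        w -= (1.0 / L) * (2.0 / n) * (Xb.T @ residual)
    return w, losses


# Synthetic data (hypothetical): 10 features driven by 3 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))
X = latent @ rng.normal(size=(3, 10)) + 0.05 * rng.normal(size=(500, 10))
y = latent @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=500)

X_s, y_s, rows, k = pca_sample(X, y)
_, full_losses = batch_gd(X, y)        # conventional batch gradient descent
_, sample_losses = batch_gd(X_s, y_s)  # gradient descent on the sample
print(f"kept {k} of {X.shape[1]} components and {len(rows)} of {len(X)} rows")
```

Comparing `full_losses` with `sample_losses` epoch by epoch gives a rough feel for the kind of convergence comparison the abstract reports, although results on synthetic data like this say nothing about the paper's own measurements. Keeping high-energy rows is only one possible reading of "rows that explain 90% of the variance"; it favors points far from the mean, so a different rule could behave quite differently.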
