PID Controller-Based Stochastic Optimization Acceleration for Deep Neural Networks

Haoqian Wang, Yi Luo, Wangpeng An, Qingyun Sun, Jun Xu, Lei Zhang

DOI: https://doi.org/10.1109/tnnls.2019.2963066

IF: 14.255

2020-12-01

IEEE Transactions on Neural Networks and Learning Systems

Abstract: Deep neural networks (DNNs) are widely used and have demonstrated their power in many applications, such as computer vision and pattern recognition. However, training these networks can be time consuming, a problem that efficient optimizers can alleviate. One of the most commonly used optimizers, stochastic gradient descent-momentum (SGD-M), uses past and present gradients for parameter updates. During network training, however, SGD-M can suffer from drawbacks such as the overshoot phenomenon, which slows training convergence. To alleviate this problem and accelerate the convergence of DNN optimization, we propose a proportional-integral-derivative (PID) approach. Specifically, we first investigate the intrinsic relationships between the PID-based controller and SGD-M. We then propose a PID-based optimization algorithm that updates the network parameters using the past gradients, the current gradient, and the change of gradients. Consequently, our proposed PID-based optimization alleviates the overshoot problem suffered by SGD-M. When tested on popular DNN architectures, it also obtains up to 50% acceleration with competitive accuracy. Extensive experiments on computer vision and natural language processing tasks demonstrate the effectiveness of our method on benchmark data sets, including CIFAR10, CIFAR100, Tiny-ImageNet, and PTB. We have released the code at https://github.com/tensorboy/PIDOptimizer.
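The idea in the abstract (combine past gradients, the current gradient, and the change of gradients) can be illustrated with a small sketch. This is not the authors' released implementation: the function names `minimize` and `overshoot`, the exact form of the update, the sign convention of the derivative term, and the values `lr=0.1`, `momentum=0.9`, and `kd=2.0` are all assumptions chosen for a toy one-dimensional quadratic. With `kd=0.0` the update reduces to plain SGD with momentum.

```python
def minimize(grad_fn, theta0, steps, lr=0.1, momentum=0.9, kd=0.0):
    """Run a PID-style update on a scalar parameter and return the trajectory.

    Assumed (illustrative) update, not taken from the paper's code:
      v     <- momentum * v - lr * grad                          (past + current gradients)
      d     <- momentum * d + (1 - momentum) * (grad - prev_grad) (smoothed change of gradient)
      theta <- theta + v - kd * d
    """
    theta, v, d = theta0, 0.0, 0.0
    prev_grad = None
    trajectory = [theta]
    for _ in range(steps):
        grad = grad_fn(theta)
        if prev_grad is None:
            prev_grad = grad  # no gradient change is defined on the first step
        v = momentum * v - lr * grad
        d = momentum * d + (1.0 - momentum) * (grad - prev_grad)
        theta = theta + v - kd * d
        prev_grad = grad
        trajectory.append(theta)
    return trajectory


def overshoot(trajectory):
    """Distance travelled past the optimum at 0 when starting from theta0 > 0."""
    return max(0.0, -min(trajectory))


# Toy objective f(theta) = theta**2 / 2, so the gradient is theta itself.
sgdm_path = minimize(lambda t: t, theta0=1.0, steps=300, kd=0.0)
pid_path = minimize(lambda t: t, theta0=1.0, steps=300, kd=2.0)

print("SGD-M overshoot:", overshoot(sgdm_path))
print("PID   overshoot:", overshoot(pid_path))
```

On this toy problem the derivative term acts as damping: it opposes the accumulated velocity while the gradient is still shrinking, so the trajectory passes the optimum by a smaller margin than the momentum-only run. How well this carries over to real networks depends on the hyperparameters and is the subject of the paper's experiments.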

computer science, artificial intelligence, theory & methods, engineering, electrical & electronic, hardware & architecture

What problem does this paper attempt to address?

### Problems the Paper Attempts to Solve

This paper aims to address the slow training convergence of deep neural networks (DNNs) caused by the overshoot phenomenon in stochastic gradient descent-momentum (SGD-M). Specifically:

1. **Problem Background**:
   - Training DNNs can be time consuming, and efficient optimizers are one way to reduce that cost.
   - SGD-M, one of the most commonly used optimizers, updates parameters from past and present gradients and may overshoot, which slows convergence.
2. **Research Objectives**:
   - Investigate the intrinsic relationships between the PID-based controller and SGD-M.
   - Propose a PID-based optimization algorithm that updates the network parameters using the past gradients, the current gradient, and the change of gradients.
3. **Main Contributions**:
   - Introduced a PID-based optimizer that alleviates the overshoot problem suffered by SGD-M.
   - Reported up to 50% acceleration with competitive accuracy on popular DNN architectures, with experiments on CIFAR10, CIFAR100, Tiny-ImageNet, and PTB.
   - Released the code at https://github.com/tensorboy/PIDOptimizer.

Accelerated Optimization in Deep Learning with a Proportional-Integral-derivative Controller

A PID Controller Approach for Stochastic Optimization of Deep Networks

PID controller‐based adaptive gradient optimizer for deep neural networks

DAPID: A Differential-adaptive PID Optimization Strategy for Neural Network Training

Incremental PID Controller-Based Learning Rate Scheduler for Stochastic Gradient Descent

PID Parameters Auto-Tuning Method for Industrial Temperature Adjustment

A Gradient Optimization Based PID Tuning Approach on Quadrotor

A Proposal on Centralised and Distributed Optimisation Via Proportional-Integral-derivative Controllers (PID) Control Perspective

SPI-Optimizer: an integral-Separated PI Controller for Stochastic Optimization.

Woodpecker-DL: Accelerating Deep Neural Networks via Hardware-Aware Multifaceted Optimizations

PID control algorithm based on multistrategy enhanced dung beetle optimizer and back propagation neural network for DC motor control

Accelerated Gradient-free Neural Network Training by Multi-convex Alternating Optimization

Optimal Adaptive and Accelerated Stochastic Gradient Descent

PIDNODEs: Neural ordinary differential equations inspired by a proportional–integral–derivative controller

Direct Heuristic Dynamic Programming Based on an Improved PID Neural Network

Stochastic Gradient Descent with Nonlinear Conjugate Gradient-Style Adaptive Momentum

An automatic learning rate decay strategy for stochastic gradient descent optimization methods in neural networks

Reconstructing Deep Neural Networks: Unleashing the Optimization Potential of Natural Gradient Descent

Interpolatron: Interpolation or Extrapolation Schemes to Accelerate Optimization for Deep Neural Networks.

Perturbated Gradients Updating within Unit Space for Deep Learning