A Theoretical View of Linear Backpropagation and Its Convergence
Ziang Li, Yiwen Guo, Haodi Liu, Changshui Zhang
DOI: https://doi.org/10.1109/tpami.2024.3353919
IF: 23.6
2024-01-01
IEEE Transactions on Pattern Analysis and Machine Intelligence
Abstract: Backpropagation (BP) is widely used for calculating gradients in deep neural networks (DNNs). Often applied together with stochastic gradient descent (SGD) or its variants, BP is considered the de facto choice in a variety of machine learning tasks, including DNN training and adversarial attack/defense. Recently, Guo et al. [1] introduced a linear variant of BP, named LinBP, for generating more transferable adversarial examples for black-box attacks. Although it has been shown to be empirically effective in black-box attacks, theoretical studies and convergence analyses of the method are lacking. This paper complements, and to some extent extends, Guo et al.'s paper by providing theoretical analyses of LinBP in learning tasks that involve neural networks, including adversarial attack and model training. We demonstrate that, somewhat surprisingly, LinBP can lead to faster convergence than BP in these tasks under the same hyper-parameter settings. We confirm our theoretical results with extensive experiments. Code for reproducing our experimental results will be publicly available.
computer science, artificial intelligence; engineering, electrical & electronic
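
The abstract contrasts BP with LinBP without spelling out the difference. As a rough illustration only, the sketch below assumes the usual description of LinBP: the forward pass is unchanged, while the backward pass treats the ReLU derivative as 1. The two-layer network, the squared loss, and the names input_gradients, W1, W2, x, and y are hypothetical choices made for this example and are not taken from the paper.

```python
import numpy as np

def input_gradients(x, W1, W2, y):
    """Gradients of 0.5 * ||W2 relu(W1 x) - y||^2 with respect to x.

    Returns (bp_grad, linbp_grad). The toy network and loss are assumptions
    made for illustration, not the setting analysed in the paper.
    """
    z = W1 @ x                      # pre-activation
    h = np.maximum(z, 0.0)          # ReLU forward pass, identical for both methods
    out = W2 @ h
    g_out = out - y                 # dL/d(out) for the squared loss
    g_h = W2.T @ g_out              # gradient arriving at the ReLU output
    mask = (z > 0).astype(x.dtype)  # ReLU derivative used by standard BP
    bp_grad = W1.T @ (mask * g_h)   # standard BP: gate by the ReLU derivative
    linbp_grad = W1.T @ g_h         # LinBP-style: treat the ReLU derivative as 1
    return bp_grad, linbp_grad

rng = np.random.default_rng(0)
x = rng.normal(size=4)
W1 = rng.normal(size=(5, 4))
W2 = rng.normal(size=(3, 5))
y = rng.normal(size=3)
bp_grad, linbp_grad = input_gradients(x, W1, W2, y)
```

When every hidden unit is active the two gradients coincide. They differ only where the ReLU mask zeroes a path that the linearized backward pass keeps open.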