Scaling Supervised Local Learning with Augmented Auxiliary Networks

Chenxiang Ma,Jibin Wu,Chenyang Si,Kay Chen Tan

2024-02-27

Abstract: Deep neural networks are typically trained using global error signals that backpropagate (BP) end-to-end, which is not only biologically implausible but also suffers from the update locking problem and requires huge memory consumption. Local learning, which updates each layer independently with a gradient-isolated auxiliary network, offers a promising alternative to address the above problems. However, existing local learning methods are confronted with a large accuracy gap with the BP counterpart, particularly for large-scale networks. This is due to the weak coupling between local layers and their subsequent network layers, as there is no gradient communication across layers. To tackle this issue, we put forward an augmented local learning method, dubbed AugLocal. AugLocal constructs each hidden layer's auxiliary network by uniformly selecting a small subset of layers from its subsequent network layers to enhance their synergy. We also propose to linearly reduce the depth of auxiliary networks as the hidden layer goes deeper, ensuring sufficient network capacity while reducing the computational cost of auxiliary networks. Our extensive experiments on four image classification datasets (i.e., CIFAR-10, SVHN, STL-10, and ImageNet) demonstrate that AugLocal can effectively scale up to tens of local layers with a comparable accuracy to BP-trained networks while reducing GPU memory usage by around 40%. The proposed AugLocal method, therefore, opens up a myriad of opportunities for training high-performance deep neural networks on resource-constrained platforms. Code is available at

Neural and Evolutionary Computing,Computer Vision and Pattern Recognition,Machine Learning

What problem does this paper attempt to address?

This paper addresses the accuracy gap that existing supervised local learning methods show on large-scale networks, especially networks with many independently optimized layers. Because no gradients flow between hidden layers, each layer learns representations suited only to its own local objective and cannot benefit from information in subsequent layers the way back-propagation (BP) does, which leaves a large accuracy gap relative to BP. In addition, the auxiliary networks used for local training add their own computational cost, which grows with network depth. To address these problems, the authors propose an augmented local learning method, AugLocal. AugLocal strengthens the coupling between each local layer and its subsequent layers through the way it builds that layer's auxiliary network: it uniformly selects a small subset of the layers that follow the hidden layer, and it uses a pyramidal structure that linearly reduces the auxiliary network's depth as the hidden layer approaches the output layer, which lowers the computational cost. The method aims to improve the accuracy of local learning while reducing GPU memory usage relative to BP, so that high-performance deep neural networks can be trained efficiently on resource-constrained platforms.
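The two construction rules described above (uniform selection of subsequent layers, and linearly shrinking auxiliary depth) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `auxiliary_depths` and `select_layers`, the parameters `max_depth` and `min_depth`, the rounding rule, and the choice to always include the immediate next layer and the final layer are all hypothetical. Layers are numbered 1 to `num_layers`, and the last layer is assumed to be trained directly on the output loss.

```python
import math


def auxiliary_depths(num_layers, max_depth, min_depth=1):
    """Auxiliary depth for each locally trained layer 1..num_layers-1.

    Assumption: depth falls linearly from max_depth (first hidden layer)
    to min_depth (last hidden layer), rounded to the nearest integer and
    capped by the number of layers that actually follow the hidden layer.
    """
    num_local = num_layers - 1  # the final layer needs no auxiliary network
    depths = []
    for i in range(num_local):
        frac = i / (num_local - 1) if num_local > 1 else 0.0
        depth = math.floor(max_depth - frac * (max_depth - min_depth) + 0.5)
        remaining = num_layers - (i + 1)  # layers after hidden layer i+1
        depths.append(max(1, min(depth, remaining)))
    return depths


def select_layers(layer, num_layers, depth):
    """Pick `depth` evenly spaced layer indices from layer+1..num_layers.

    Assumption: the selection always ends at the final layer, and starts
    at the immediate next layer whenever depth > 1.
    """
    remaining = num_layers - layer
    depth = min(depth, remaining)
    if depth == 1:
        return [num_layers]
    step = (remaining - 1) / (depth - 1)
    return [layer + 1 + math.floor(i * step + 0.5) for i in range(depth)]


if __name__ == "__main__":
    L = 10
    for layer, depth in enumerate(auxiliary_depths(L, max_depth=5, min_depth=2), start=1):
        print(f"hidden layer {layer}: auxiliary layers {select_layers(layer, L, depth)}")
```

In this sketch, early hidden layers receive deeper auxiliary networks that span the rest of the backbone, while layers near the output receive shallow ones, which is the pyramidal trade-off between capacity and cost that the summary describes.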

MLAAN: Scaling Supervised Local Learning with Multilaminar Leap Augmented Auxiliary Network

Adaptive Deep Spiking Neural Network with Global-Local Learning Via Balanced Excitatory and Inhibitory Mechanism

Advancing Supervised Local Learning Beyond Classification with Long-term Feature Bank

Momentum Auxiliary Network for Supervised Local Learning

BackLink: Supervised Local Training with Backward Links

Local Augmentation for Graph Neural Networks

Revisiting Locally Supervised Learning: an Alternative to End-to-end Training

Local Unsupervised Learning for Image Analysis

Augmenting Supervised Neural Networks with Unsupervised Objectives for Large-scale Image Classification

Hierarchical Auxiliary Learning

Local Weight Coupled Network: Multi-Modal Unequal Semi-Supervised Domain Adaptation

Local Methods with Adaptivity via Scaling

Local Augmentation Based Consistency Learning for Semi-Supervised Pathology Image Classification

Supervised Deep Learning with Auxiliary Networks

Graph-based Semi-Supervised Learning by Strengthening Local Label Consistency

Towards Interpretable Deep Local Learning with Successive Gradient Reconciliation

Layer-Parallel Training of Residual Networks with Auxiliary-Variable Networks

Interlocking Backpropagation: Improving depthwise model-parallelism

Free Lunches in Auxiliary Learning: Exploiting Auxiliary Labels with Negligibly Extra Inference Cost

Go beyond End-to-End Training: Boosting Greedy Local Learning with Context Supply