Abstract: Incorporating stochasticity into the training process of deep convolutional networks is a widely used technique to reduce overfitting and improve regularization. Existing techniques often require modifying the architecture of the network by adding specialized layers, are effective only for specific network topologies or types of layers (linear or convolutional), and result in a trained model that is different from the deployed one. We present ChannelDropBack, a simple stochastic regularization approach that introduces randomness only into the backward information flow, leaving the forward pass intact. ChannelDropBack randomly selects a subset of channels within the network during the backpropagation step and applies weight updates only to them. As a consequence, it integrates seamlessly into the training process of any model and layer type without changing the architecture, making it applicable to various network topologies, and the exact same network is used during training and inference. Experimental evaluations validate the effectiveness of our approach, demonstrating improved accuracy on popular datasets and models, including ImageNet and ViT. Code is available at https://github.com/neiterman21/ChannelDropBack.git.

What problem does this paper attempt to address?

The paper addresses two problems: overfitting in deep neural networks during training, and the practical limitations of existing stochastic regularization methods. Specifically:

1. **Overfitting**: As deep neural networks become deeper and more complex, overfitting remains a challenge. Researchers have developed various regularization techniques to address it, but these techniques have limitations of their own.
2. **Limitations of existing stochastic regularization methods**:
   - **Architecture modification**: Many existing methods (such as Dropout, DropConnect, and Stochastic Depth) require modifying the network architecture by adding special layers or changing the network structure.
   - **Inconsistency between training and inference**: These methods usually produce a trained model that differs from the deployed one, which introduces potential inconsistencies.
   - **Limited scope of application**: Some methods apply only to specific network topologies or layer types (e.g., linear or convolutional layers), which limits their use.

To solve these problems, the authors propose ChannelDropBack, a new stochastic regularization technique. ChannelDropBack aims to improve the training of deep convolutional networks in the following ways:

- **Keep the forward pass unchanged**: ChannelDropBack introduces randomness only during backpropagation and leaves the forward pass untouched.
- **No need to modify the network architecture**: It can be integrated into any model and layer type without architectural changes.
- **Ensure the consistency of training and inference**: Because the forward pass is unchanged, the same network is used in training and inference, eliminating the gap between the trained and the deployed model.

Through these properties, ChannelDropBack aims to improve the generalization ability of the model while remaining applicable to various network topologies and datasets. The reported experiments show improved accuracy on popular datasets and models, including ImageNet and ViT.
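The core idea, updating only a random subset of channels while leaving the forward pass untouched, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `masked_sgd_step`, the per-channel Bernoulli selection with `keep_prob`, and the plain SGD update are assumptions chosen for illustration, and the actual selection scheme in the paper may differ.

```python
import random


def masked_sgd_step(weights, grads, lr, keep_prob, rng):
    """Apply an SGD update to a random subset of channels only.

    weights, grads: list of channels, each channel a list of floats.
    Returns the boolean mask of channels that were updated.
    Hypothetical helper; selection by independent coin flips is an assumption.
    """
    # Decide, per channel, whether its weights receive this update.
    mask = [rng.random() < keep_prob for _ in weights]
    for channel, grad, keep in zip(weights, grads, mask):
        if keep:
            for i, g in enumerate(grad):
                channel[i] -= lr * g
        # Channels that are not selected keep their current weights.
    return mask


rng = random.Random(0)
weights = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
grads = [[0.5, 0.5] for _ in weights]
before = [list(ch) for ch in weights]

mask = masked_sgd_step(weights, grads, lr=0.1, keep_prob=0.5, rng=rng)
for idx, (old, new, keep) in enumerate(zip(before, weights, mask)):
    print(idx, "updated" if keep else "frozen", old, "->", new)
```

Because the randomness touches only which weights are updated, the network that computes the forward pass is the same one that is later deployed; no mask or extra layer has to be removed or rescaled at inference time.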

ChannelDropBack: Forward-Consistent Stochastic Regularization for Deep Networks

WordReg: Mitigating the Gap Between Training and Inference with Worst-Case Drop Regularization

DCCD: Reducing Neural Network Redundancy Via Distillation

Randomness Regularization with Simple Consistency Training for Neural Networks

R-Drop: Regularized Dropout for Neural Networks

AutoDropout: Learning Dropout Patterns to Regularize Deep Networks

TargetDrop: A Targeted Regularization Method for Convolutional Neural Networks

Continuous Dropout

Subdomain contraction in deep networks for robust representation learning

Rethinking the Usage of Batch Normalization and Dropout in the Training of Deep Neural Networks

PLACE dropout: A Progressive Layer-wise and Channel-wise Dropout for Domain Generalization

Drop-Activation: Implicit Parameter Reduction and Harmonic Regularization

Dropout Reduces Underfitting

Effective and Efficient Dropout for Deep Convolutional Neural Networks

DomainDrop: Suppressing Domain-Sensitive Channels for Domain Generalization

LayerOut: Freezing Layers in Deep Neural Networks

Shakeout: A New Approach to Regularized Deep Neural Network Training

Regularizing neural networks with adaptive local drop

LocalDrop: A Hybrid Regularization for Deep Neural Networks

Beyond Dropout: Feature Map Distortion to Regularize Deep Neural Networks