ChannelDropBack: Forward-Consistent Stochastic Regularization for Deep Networks

Evgeny Hershkovitch Neiterman, Gil Ben-Artzi
2024-11-17
Abstract: Incorporating stochasticity into the training process of deep convolutional networks is a widely used technique to reduce overfitting and improve regularization. Existing techniques often require modifying the architecture of the network by adding specialized layers, are effective only for specific network topologies or layer types (linear or convolutional), and result in a trained model that is different from the deployed one. We present ChannelDropBack, a simple stochastic regularization approach that introduces randomness only into the backward information flow, leaving the forward pass intact. ChannelDropBack randomly selects a subset of channels within the network during the backpropagation step and applies weight updates only to them. As a consequence, it allows for seamless integration into the training process of any model and layers without the need to change its architecture, making it applicable to various network topologies, and the exact same network is used during training and inference. Experimental evaluations validate the effectiveness of our approach, demonstrating improved accuracy on popular datasets and models, including ImageNet and ViT. Code is available at https://github.com/neiterman21/ChannelDropBack.git.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The main problems this paper attempts to solve are the overfitting that deep neural networks face during training and the limitations of existing stochastic regularization methods in practical applications. Specifically:

1. **Overfitting problem**: As deep neural networks become deeper and more complex, overfitting remains a challenge. Researchers have developed various regularization techniques to address it, but these techniques often have limitations.
2. **Limitations of existing stochastic regularization methods**:
   - **Architecture modification**: Many existing stochastic regularization methods (such as Dropout, DropConnect, and Stochastic Depth) require modifying the network architecture by adding special layers or changing the network structure.
   - **Inconsistency between training and inference**: These methods usually result in different models being used during training and inference, which introduces potential inconsistencies.
   - **Limited scope of application**: Some methods apply only to specific network topologies or layer types (e.g., linear layers or convolutional layers), which limits how widely they can be used.

To solve these problems, the authors propose ChannelDropBack, a new stochastic regularization technique. ChannelDropBack aims to improve the training of deep convolutional networks in the following ways:

- **Keep the forward pass unchanged**: ChannelDropBack introduces randomness only during backpropagation and leaves the forward pass untouched.
- **No need to modify the network architecture**: It can be integrated into any model and layer without changing the network architecture.
- **Ensure the consistency of training and inference**: Since the forward pass remains unchanged, the same network is used in the training and inference stages, eliminating the differences between training and deployment.
Through these improvements, ChannelDropBack aims to improve the generalization ability of the model and is applicable to various network topologies and datasets. Experimental results show that ChannelDropBack significantly improves accuracy and robustness on multiple popular datasets and models.
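
The core idea described above, updating only a random subset of channels during backpropagation while leaving the forward pass untouched, can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes a plain SGD update and an independent per-channel keep probability, and the names `sample_channel_mask`, `masked_sgd_step`, and `keep_prob` are hypothetical, chosen only for illustration.

```python
import random


def sample_channel_mask(num_channels, keep_prob, rng):
    """Return one boolean per channel; True means the channel is updated this step.

    Assumption: each channel is kept independently with probability keep_prob.
    """
    return [rng.random() < keep_prob for _ in range(num_channels)]


def masked_sgd_step(weights, grads, mask, lr):
    """Apply a plain SGD step only to channels whose mask entry is True.

    weights and grads are lists of per-channel parameter lists.
    Channels with a False mask entry keep their current weights.
    """
    updated = []
    for w_ch, g_ch, keep in zip(weights, grads, mask):
        if keep:
            updated.append([w - lr * g for w, g in zip(w_ch, g_ch)])
        else:
            updated.append(list(w_ch))  # frozen for this step
    return updated


# Toy layer with three channels and two parameters per channel.
rng = random.Random(0)
weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
grads = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]

mask = sample_channel_mask(len(weights), keep_prob=0.5, rng=rng)
new_weights = masked_sgd_step(weights, grads, mask, lr=0.1)
```

Because the mask touches only the update step, the forward computation uses the full set of weights at every iteration, which is why no architectural change is needed and the trained network is the same one used at inference.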