Abstract: Neural saturation in Deep Neural Networks (DNNs) has been studied extensively, but remains relatively unexplored in Convolutional Neural Networks (CNNs). Understanding and alleviating the effects of convolutional kernel saturation is critical for improving the classification accuracy of CNN models. In this paper, we analyze the effect of convolutional kernel saturation in CNNs and propose a simple data augmentation technique that mitigates saturation and increases classification accuracy by supplementing the training dataset with negative images. We hypothesize that more semantic feature information can be extracted using negative images, since they carry the same structural information as standard images but differ in their data representations. Varied data representations decrease the probability of kernel saturation and thus increase the effectiveness of kernel weight updates. We selected two datasets, CIFAR-10 and STL-10, to evaluate our hypothesis: they have similar image classes but differ in image resolution, which gives a better understanding of the saturation phenomenon. The MNIST dataset was used to highlight the ineffectiveness of the technique for linearly separable data. The ResNet CNN architecture was chosen because the skip connections in the network ensure that the features contributing most to classification accuracy are retained. Our results show that CNNs are indeed susceptible to convolutional kernel saturation and that supplementing the training dataset with negative images can offer a statistically significant increase in classification accuracy compared with models trained on the original datasets. Our results show accuracy increases of 6.98% and 3.16% on the STL-10 and CIFAR-10 datasets, respectively.
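The augmentation described in the abstract can be illustrated with a minimal sketch. It assumes 8-bit images, so the negative of a pixel value p is 255 - p, and it assumes that "supplementing" means appending each negative to the training set under the same label as its source image. The function names `negative_image` and `supplement_with_negatives` are hypothetical and are not taken from the paper; the random array only stands in for a real dataset such as CIFAR-10.

```python
import numpy as np


def negative_image(images: np.ndarray) -> np.ndarray:
    """Return the photographic negative of uint8 images (255 - pixel)."""
    if images.dtype != np.uint8:
        raise TypeError("expected uint8 images with values in [0, 255]")
    # 255 - p stays inside [0, 255], so uint8 arithmetic cannot wrap around.
    return 255 - images


def supplement_with_negatives(images: np.ndarray, labels: np.ndarray):
    """Append the negative of every image, reusing the original labels.

    Assumption: the negatives are added to the training set rather than
    replacing the originals, which doubles the number of samples.
    """
    negatives = negative_image(images)
    aug_images = np.concatenate([images, negatives], axis=0)
    aug_labels = np.concatenate([labels, labels], axis=0)
    return aug_images, aug_labels


# Stand-in data: four random 32x32 RGB images (CIFAR-10-sized) with labels.
rng = np.random.default_rng(0)
x = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)
y = np.array([0, 1, 2, 3])

x_aug, y_aug = supplement_with_negatives(x, y)
print(x_aug.shape, y_aug.shape)
```

Because the negative is a fixed per-pixel transform, edges and shapes are preserved while the intensity values are inverted, which matches the abstract's description of the same structural information under a different data representation.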

Examining and Mitigating Kernel Saturation in Convolutional Neural Networks using Negative Images
