Batch covariance neural network for image recognition
Tianyou Zheng, Qiang Wang, Yue Shen, Xiang Ma, Xiaotian Lin
DOI: https://doi.org/10.1016/j.imavis.2022.104446
IF: 3.86
2022-06-01
Image and Vision Computing
Abstract: Recent work has shown that convolutional neural networks (CNNs) can achieve state-of-the-art results when the datasets are well constructed. However, the conventional convolutional layer is sensitive to local abnormal features that inevitably arise across datasets, namely illumination intensity and feature interaction. This paper replaces the convolutional layer with a batch covariance layer (BCL) that locates the category-related region without being affected by these problems. The BCL can be regarded as a 3D covariance operation that calculates the correlation between the kernels and kernel-sized regions of the feature maps across all channels. The forward propagation, backward propagation, gradient update, and testing procedure of the BCL are described. A comparison between the BCL and the convolutional layer shows that the BCL reduces the influence of illumination intensity and feature interaction in both discriminative and generative tasks. Complexity analysis shows that the BCL improves accuracy with only a slight increase in time consumption. In addition, the batch covariance neural network (BCovNN) is derived from the CNN by replacing the convolutional layer with the BCL. An ablation experiment verifies that the improvement of BCovNN is attributable to the BCL alone. BCovNN is evaluated on several popular datasets (MNIST, STL-10, CIFAR-10, and ImageNet) for image recognition and on the PASCAL VOC 2007 and 2012 datasets for object localization. Experimental results show that BCovNN achieves significant improvements over the corresponding CNNs.
computer science, artificial intelligence, theory & methods, engineering, electrical & electronic, software engineering, optics
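
The abstract describes the BCL as a 3D covariance operation between a kernel and kernel-sized regions of the feature maps across all channels. The sketch below is one possible reading of that idea, not the authors' implementation: it assumes a single square kernel, stride 1, no padding, and per-window statistics, and it omits whatever batch-level statistics the "batch" in the name refers to. The function names `extract_windows`, `conv_layer`, and `covariance_layer` are hypothetical. It illustrates why a covariance response would be insensitive to a uniform illumination offset while a plain convolution response is not.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def extract_windows(x, k):
    """Return all (C, k, k) windows of a (C, H, W) feature map.

    Output shape: (H-k+1, W-k+1, C, k, k). Stride 1, no padding (assumed).
    """
    c = x.shape[0]
    return sliding_window_view(x, (c, k, k))[0]


def conv_layer(x, kernel):
    """Plain cross-correlation: dot product of each window with the kernel."""
    win = extract_windows(x, kernel.shape[-1])
    return np.einsum("hwcij,cij->hw", win, kernel)


def covariance_layer(x, kernel):
    """Covariance between each window and the kernel over all C*k*k entries.

    Both the window and the kernel are mean-centered first (an assumption
    about how the covariance is formed), so a constant added to the input
    cancels out.
    """
    win = extract_windows(x, kernel.shape[-1])
    win_centered = win - win.mean(axis=(2, 3, 4), keepdims=True)
    kernel_centered = kernel - kernel.mean()
    return np.einsum("hwcij,cij->hw", win_centered, kernel_centered) / kernel.size


rng = np.random.default_rng(0)
x = rng.normal(size=(3, 8, 8))        # toy feature map: 3 channels, 8x8
kernel = rng.normal(size=(3, 3, 3))   # toy kernel spanning all 3 channels
brighter = x + 5.0                    # uniform illumination offset

# Largest change in each layer's response caused by the offset.
cov_shift = np.abs(covariance_layer(brighter, kernel) - covariance_layer(x, kernel)).max()
conv_shift = np.abs(conv_layer(brighter, kernel) - conv_layer(x, kernel)).max()
print("covariance response shift:", cov_shift)
print("convolution response shift:", conv_shift)
```

In this toy setting the covariance response changes only by floating-point rounding error, whereas the convolution response shifts by the offset multiplied by the sum of the kernel weights. This covers additive offsets only; multiplicative brightness changes would need normalization by the standard deviations, which the sketch does not attempt.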