Gabor capsule network with preprocessing blocks for the recognition of complex images

Mighty Abra Ayidzoe, Yongbin Yu, Patrick Kwabena Mensah, Jingye Cai, Kwabena Adu, Yifan Tang
DOI: https://doi.org/10.1007/s00138-021-01221-6
IF: 2.983
2021-06-09
Machine Vision and Applications
Abstract: Capsule network (CapsNet) is a novel concept demonstrating the importance of learning spatial hierarchical relationships between features for the effective recognition of images. However, the baseline capsule network is not suitable for recognizing complex images and performs poorly on them. This limitation can be partially attributed to the inability of CapsNets to extract important features from the input images, as well as to their attempt to account for every object in the image, including background objects. To address these problems, we propose a variant of the capsule network that is less complex yet robust, with strong feature extraction capabilities. The model uses the advantages of Gabor filters and a custom preprocessing block to learn the structural and semantic information in the image. This enhances the extraction of only important features, resulting in improved activation diagrams that enable meaningful hierarchical information to be learned. Experimental results show that the proposed model can achieve test accuracies of 85.24%, 68.17%, 94.78% and 91.50% on complex image datasets, namely CIFAR-10, CIFAR-100, Fashion-MNIST and kvasir-dataset-v2, respectively. The performance of the proposed model is comparable to that of state-of-the-art models on the five datasets, with a relatively small number of parameters.
computer science, cybernetics, artificial intelligence, engineering, electrical & electronic