An efficient medical image classification network based on multi-branch CNN, token grouping Transformer and mixer MLP

Shiwei Liu, Liejun Wang, Wenwen Yue
DOI: https://doi.org/10.1016/j.asoc.2024.111323
2024-02-03
Applied Soft Computing Journal
Abstract: In recent years, deep-learning-based medical image classification has made remarkable progress, but most current models sacrifice efficiency for performance gains, which poses a great challenge in practical clinical applications. Meanwhile, Convolutional Neural Network (CNN)-based, Vision Transformer (ViT)-based and Multi-layer Perceptron (MLP)-based methods each have their own advantages and disadvantages in capturing the local and global features of medical images, and there is no good method for combining the three to achieve a better trade-off between model scale and performance. To address these problems, we propose Eff-CTM, a hybrid efficient medical image classification network based on multi-branch CNN, token grouping Transformer and mixer MLP. It combines the advantages of all three and uses a small number of parameters to classify pneumonia, colon cancer histopathology and dermatology images quickly and accurately. Eff-CTM uses an efficient CNN module with a multi-branch structure to learn local detail information in the shallow stage of the network, and an efficient CNN and Transformer (ECT) module together with an efficient MLP (EM) module in the middle stage to extract local and global features. An efficient Transformer (ET) module is used in the final stage to fuse the rich feature information. We have conducted extensive experiments on three publicly available medical image classification datasets, and the experimental results show that Eff-CTM achieves a better trade-off between efficiency and performance than methods based on CNN, Transformer and MLP.
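The stage layout described in the abstract can be summarized as a small data structure. This is only an illustrative sketch: the names `Stage`, `EFF_CTM_STAGES` and `describe` are hypothetical, and the abstract does not give the internals of any module, so nothing here should be read as the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One stage of the network as summarized in the abstract (hypothetical type)."""
    position: str   # where the stage sits in the network
    modules: tuple  # module names as given in the abstract
    role: str       # what the abstract says the stage does


# Stage order taken from the abstract; module internals are not specified there.
EFF_CTM_STAGES = (
    Stage("shallow", ("multi-branch efficient CNN",),
          "learn local detail information"),
    Stage("middle", ("ECT (efficient CNN and Transformer)", "EM (efficient MLP)"),
          "extract local and global features"),
    Stage("final", ("ET (efficient Transformer)",),
          "fuse the rich feature information"),
)


def describe(stages):
    """Return one human-readable line per stage, in network order."""
    return [f"{s.position}: {' + '.join(s.modules)} -> {s.role}" for s in stages]


if __name__ == "__main__":
    for line in describe(EFF_CTM_STAGES):
        print(line)
```

Running the script prints one line per stage, which makes the shallow, middle and final ordering explicit without assuming anything about layer counts or tensor shapes.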