CViTS-Net: A CNN-ViT Network With Skip Connections for Histopathology Image Classification

Anusree Kanadath, J. Angel Arul Jothi, Siddhaling Urolagin
DOI: https://doi.org/10.1109/access.2024.3448302
IF: 3.9
2024-08-31
IEEE Access
Abstract: Histopathological image classification is a cornerstone of the pathological diagnosis workflow, yet it remains challenging due to the inherent complexity of histopathological images. Recently, transformer- and convolutional neural network (CNN)-based deep models have shown promising results in automatic histopathology image classification. Transformers excel at capturing global dependencies within the image content, while CNNs effectively extract local features. In this study, we introduce the CViTS-Net model, a novel deep learning architecture that combines CNNs with a vision transformer (ViT), enhanced by skip connections. This fusion allows our model to capture both local and global dependencies within histopathological images. Extensive experiments, including holdout validation and cross-validation, compared CViTS-Net with several state-of-the-art CNN, ViT, and attention-based methods on the Chaoyang histopathology dataset. Furthermore, we evaluated the model's generalizability and robustness by testing it on large and diverse datasets such as the lymphoma dataset and the invasive ductal carcinoma (IDC) dataset. Our model achieves a classification accuracy of 96.06% on the Chaoyang dataset, 99.61% on the lymphoma dataset, and 95% on the IDC dataset, surpassing state-of-the-art deep learning models while maintaining superior efficiency. These results underscore the potential of CViTS-Net to aid pathologists in histopathological diagnosis.
computer science, information systems; telecommunications; engineering, electrical & electronic