EsccNet: A Hybrid CNN and Transformers Model for the Classification of Whole Slide Images of Esophageal Squamous Cell Carcinoma

Zhaoxin Kang, Hejun Zhang, Mingqiu Chen, Xiangwen Liao
DOI: https://doi.org/10.1109/iccea62105.2024.10604222
2024-01-01
Abstract: This study presents a novel deep learning approach for classifying esophageal squamous cell carcinoma (ESCC) from whole-slide images (WSIs). Our method combines a Convolutional Neural Network (CNN) with a Transformer, leveraging the strengths of both architectures to improve the accuracy and efficiency of cancer detection and classification in histopathological images. We first preprocess a substantial dataset of WSI samples, annotated by expert pathologists, to train and validate our model. The CNN component extracts detailed local features from the high-resolution images, while the Transformer, which is well suited to modeling sequential data, captures the global context, addressing the challenges posed by the complex and heterogeneous nature of WSIs. On the dataset provided by Fujian Cancer Hospital, our proposed model achieves an accuracy of 94.71%, an F1 score of 94.32%, a recall of 94.68%, and a precision of 94.08%, which are significantly better than those of the other models compared. This study not only assists pathologists in analyzing ESCC WSIs but also paves the way for further research into the combined application of CNNs and Transformers in the diagnosis of other types of cancer.
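The four metrics reported above can be computed from predicted and true labels as in the following minimal sketch. The abstract does not say how the metrics are averaged across classes, so macro-averaging is an assumption here. The function name `macro_metrics`, the class names `"tumor"` and `"normal"`, and the toy labels are all hypothetical and are not taken from the paper or its dataset.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro-averaging is an assumption; the abstract does not specify it.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for c in labels:
        p = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        r = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels for illustration only (not the paper's data).
y_true = ["tumor", "tumor", "tumor", "tumor", "normal", "normal"]
y_pred = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]
metrics = macro_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With macro-averaging, the F1 score is the mean of the per-class F1 values, so it need not equal the harmonic mean of the averaged precision and recall.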