Multi-class Cancer Classification of Whole Slide Images Through Transformer and Multiple Instance Learning.

Haijing Luan, Taiyuan Hu, Jifang Hu, Ruilin Li, Detao Ji, Jiayin He, Xiaohong Duan, Chunyan Yang, Yajun Gao, Fan Chen, Beifang Niu
DOI: https://doi.org/10.1007/978-981-99-7074-2_12
2023-01-01
Abstract: Whole slide images (WSIs) are high-resolution and typically lack localized annotations, so when only slide-level labels are available their classification can be treated as a multiple instance learning (MIL) problem. We introduce an approach for WSI classification that combines MIL with a Transformer, eliminating the requirement for localized annotations. Our method consists of three key components. First, we use ResNet50 pre-trained on ImageNet as an instance feature extractor. Second, we present a Transformer-based MIL aggregator that captures contextual information within individual regions and correlation information among diverse regions of the WSI. Third, we introduce a global average pooling (GAP) layer to strengthen the mapping between WSI features and category features. To evaluate our model, we conducted experiments on The Cancer Imaging Archive (TCIA) Clinical Proteomic Tumor Analysis Consortium (CPTAC) dataset. Our proposed method achieves a top-1 accuracy of 94.8% and an area under the curve (AUC) exceeding 0.996, establishing state-of-the-art performance in WSI classification without reliance on localized annotations. The results demonstrate the superiority of our approach compared to previous MIL-based methods.
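The three components described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes the patch features have already been extracted (random vectors stand in for ResNet50 outputs), uses a single untrained self-attention layer in place of the full Transformer aggregator, and assumes the GAP step averages over patch tokens. All names, dimensions, and the class count are hypothetical, and plain Python lists are used only to keep the example free of dependencies.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(v):
    """Numerically stable softmax of a list of floats."""
    m = max(v)
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Small random matrix standing in for learned weights or features."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention over the bag of patches."""
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]           # transpose of k
    scores = matmul(q, k_t)                        # patch-to-patch similarities
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v)                      # context-aware patch tokens


def global_average_pool(tokens):
    """Average the patch tokens into one slide-level feature vector."""
    n = len(tokens)
    return [sum(col) / n for col in zip(*tokens)]


def classify_slide(bag, params):
    """Bag of patch features -> slide-level class probabilities."""
    tokens = self_attention(bag, params["wq"], params["wk"], params["wv"])
    slide_feature = global_average_pool(tokens)
    logits = matmul([slide_feature], params["wc"])[0]
    return softmax(logits)


# Hypothetical toy sizes; real ResNet50 pooled features are 2048-dimensional.
rng = random.Random(0)
n_patches, feat_dim, attn_dim, n_classes = 6, 8, 4, 3

bag = random_matrix(n_patches, feat_dim, rng)      # stand-in for patch features
params = {
    "wq": random_matrix(feat_dim, attn_dim, rng),
    "wk": random_matrix(feat_dim, attn_dim, rng),
    "wv": random_matrix(feat_dim, attn_dim, rng),
    "wc": random_matrix(attn_dim, n_classes, rng),  # linear classifier head
}

probs = classify_slide(bag, params)
print(probs)
```

Only a slide-level label would be needed to train such a model, since the loss is computed on `probs` for the whole bag and no patch is labeled individually. Because this sketch omits positional encodings, its prediction does not depend on the order of the patches; whether the paper's aggregator encodes patch position is not stated in the abstract.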