Snuffy: Efficient Whole Slide Image Classifier

Hossein Jafarinia, Alireza Alipanah, Danial Hamdi, Saeed Razavi, Nahal Mirzaie, Mohammad Hossein Rohban
2024-08-20
Abstract: Whole Slide Image (WSI) classification with multiple instance learning (MIL) in digital pathology faces significant computational challenges. Current methods mostly rely on extensive self-supervised learning (SSL) for satisfactory performance, requiring long training periods and considerable computational resources. At the same time, skipping pre-training hurts performance because of the domain shift from natural images to WSIs. We introduce the Snuffy architecture, a novel MIL-pooling method based on sparse transformers that mitigates performance loss with limited pre-training and enables continual few-shot pre-training as a competitive option. Our sparsity pattern is tailored for pathology and is theoretically proven to be a universal approximator with the tightest probabilistic sharp bound on the number of layers for sparse transformers to date. We demonstrate Snuffy's effectiveness on the CAMELYON16 and TCGA Lung cancer datasets, achieving superior WSI and patch-level accuracies. The code is available at https://github.com/jafarinia/snuffy.
Computer Vision and Pattern Recognition, Artificial Intelligence, Machine Learning, Neural and Evolutionary Computing, Image and Video Processing
What problem does this paper attempt to address?
The paper addresses the significant computational challenges of classifying whole-slide images (WSIs) with multiple-instance learning (MIL) in digital pathology. Specifically:

1. **High demand for computational resources**: Most current methods rely on extensive self-supervised learning (SSL), which requires long training periods and considerable computational resources.
2. **Performance degradation**: Absent or insufficient pre-training degrades performance because of the domain shift between natural image datasets (such as ImageNet-1K) and WSIs.

To solve these problems, the authors propose the Snuffy architecture, a new MIL-pooling method based on sparse transformers. Its main features are:

- **Reducing computational requirements**: Sparse transformers combined with continual few-shot self-supervised pre-training greatly reduce the computational resources required to train embeddings.
- **Enhancing expressive ability**: A new bio-driven sparsity pattern is introduced. It is theoretically proven to be a universal approximator, with the tightest probabilistic bound to date on the number of layers for sparse transformers.
- **Supporting continual few-shot pre-training**: Continual few-shot pre-training becomes a viable and competitive option that balances efficiency and performance.

Snuffy achieves superior WSI-level and patch-level accuracy on the CAMELYON16 and TCGA lung cancer datasets and reaches a new state of the art in MIL tasks.

### Summary of main contributions:

1. **Continual self-supervised pre-training**: SSL pre-training continues from an ImageNet-1K pre-trained model on the pathology dataset, using adapters to significantly reduce pre-training computation time.
2. **New bio-driven sparsity pattern**: The pattern comes with a tight probabilistic bound on the number of layers and is proven to preserve universal approximation.
3. **Significantly improved WSI classification metrics**: New state-of-the-art results in WSI classification (AUC 0.987) and ROI detection (FROC 0.675).
4. **Extensive verification**: Validated on multiple established benchmark datasets, with consistently superior performance.

Together, these improvements make the Snuffy architecture effective for WSI classification and give it potential for clinical applications.
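To make the idea of sparse-attention MIL pooling more concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: this summary does not specify Snuffy's sparsity pattern, so the pattern used here (each patch attends to itself plus a small set of "global" patches, chosen as the top-scoring patches and a few random ones) is an assumption made for illustration. The function name `sparse_attention_pool`, its parameters, and the per-patch `scores` input are all hypothetical.

```python
import math
import random


def sparse_attention_pool(patches, scores, num_global=2, num_random=2, seed=0):
    """Pool patch embeddings into one slide embedding using masked attention.

    patches: list of equal-length float lists, one embedding per patch.
    scores:  one relevance score per patch (assumed to come from some
             upstream instance scorer; hypothetical in this sketch).
    """
    rng = random.Random(seed)
    n = len(patches)
    dim = len(patches[0])

    # Assumed pattern: "global" patches are the top-scoring ones plus a few
    # randomly sampled ones. Every other patch only sees itself and the globals.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    global_ids = set(ranked[:num_global])
    rest = [i for i in range(n) if i not in global_ids]
    global_ids.update(rng.sample(rest, min(num_random, len(rest))))

    def allowed(i, j):
        # Diagonal entries, plus any pair that involves a global patch.
        return i == j or i in global_ids or j in global_ids

    updated = []
    for i in range(n):
        keys = [j for j in range(n) if allowed(i, j)]
        # Scaled dot-product logits over the allowed keys only.
        logits = [
            sum(a * b for a, b in zip(patches[i], patches[j])) / math.sqrt(dim)
            for j in keys
        ]
        peak = max(logits)  # subtract the max for a numerically stable softmax
        weights = [math.exp(value - peak) for value in logits]
        total = sum(weights)
        updated.append([
            sum(w * patches[j][d] for w, j in zip(weights, keys)) / total
            for d in range(dim)
        ])

    # Mean pooling stands in for a class-token readout in this sketch.
    slide = [sum(row[d] for row in updated) / n for d in range(dim)]
    return slide, sorted(global_ids)


if __name__ == "__main__":
    toy_patches = [[float(i), 1.0] for i in range(6)]
    toy_scores = [0.1, 0.9, 0.2, 0.8, 0.3, 0.4]
    embedding, chosen = sparse_attention_pool(toy_patches, toy_scores)
    print("global patches:", chosen)
    print("slide embedding:", embedding)
```

With `g` global patches out of `n`, the number of attended pairs grows roughly as `n * g` instead of `n * n`, which is the kind of saving a sparse pattern is meant to provide when a slide contains many thousands of patches.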