Abstract: Fine-grained classification of whole slide images (WSIs) is essential in precision oncology, enabling precise cancer diagnosis and personalized treatment strategies. The core of this task is distinguishing subtle morphological variations within the same broad category of gigapixel-resolution images, which presents a significant challenge. While the multi-instance learning (MIL) paradigm alleviates the computational burden of WSIs, existing MIL methods often overlook hierarchical label correlations and treat fine-grained classification as a flat multi-class classification task. To overcome this limitation, we introduce a novel hierarchical multi-instance learning (HMIL) framework. By aligning the inherent relationships between labels at different levels of the hierarchy, at both the instance and bag levels, our approach provides a more structured and informative learning process. Specifically, HMIL incorporates a class-wise attention mechanism that aligns hierarchical information at both the instance and bag levels. Furthermore, we introduce supervised contrastive learning to enhance the discriminative capability for fine-grained classification, and a curriculum-based dynamic weighting module to adaptively balance the hierarchical features during training. Extensive experiments on our large-scale cytology cervical cancer (CCC) dataset and two public histology datasets, BRACS and PANDA, demonstrate the state-of-the-art class-wise and overall performance of our HMIL framework. Our source code is available at https://github.com/ChengJin-git/HMIL.

What problem does this paper attempt to address?

This paper addresses the failure of existing methods to fully exploit label hierarchy information in the fine-grained classification of whole slide images (WSIs). Specifically:

1. **Fine-grained Classification Challenges**: In precision oncology, fine-grained classification of WSIs is crucial for accurate cancer diagnosis and personalized treatment strategies. It requires distinguishing subtle morphological changes within the same broad category, and the images are usually at gigapixel resolution, which makes the task extremely challenging.
2. **Limitations of Multi-Instance Learning (MIL)**: Although MIL methods reduce the computational burden of WSIs, existing MIL methods usually ignore the correlations between hierarchical labels. They treat fine-grained classification as a flat multi-class classification task and therefore cannot make effective use of the hierarchical structure of the labels.

To solve these problems, the authors propose a novel hierarchical multi-instance learning (HMIL) framework. HMIL provides a more structured and informative learning process by hierarchically aligning the inherent relationships between labels of different levels, at both the instance and bag levels. Specific contributions include:

- **Hierarchical Attention Mechanism**: Class-wise attention at the instance and bag levels aligns hierarchical information.
- **Supervised Contrastive Learning**: Enhances the discriminative ability for fine-grained classification.
- **Curriculum-based Dynamic Weighting Module**: Adaptively balances the influence of hierarchical features during training.

Through these improvements, the HMIL framework handles the fine-grained classification of WSIs more effectively and improves the model's ability to distinguish between subtly different cancer subtypes.

### Formula Summary

1. **Cross-Entropy Loss Function**: \[ L_{ce}^{(c,f)}=-\sum_{i = 1}^{K_{c,f}}Y_i\log(\hat{Y}_i) \] where \(Y\) is the true label, \(\hat{Y}\) is the predicted probability distribution, and \(K_{c,f}\) is the number of classes at the coarse (\(c\)) or fine (\(f\)) level.
2. **Instance-level Hierarchical Alignment Loss**: \[ L_{ia}=\frac{1}{N_i}(1 - \cos(A_{i,c},MA_{i,f})) \] where \(\cos\) denotes cosine similarity, and \(M\) is a mapping matrix that converts fine-grained attention scores to the coarse-grained hierarchy.
3. **Bag-level Hierarchical Alignment Loss**: \[ L_{ba}=-\sum_{i = 1}^{K_c}Y_i^{(c)}\log(\tilde{Y}_i^{(c)}) \] where \(Y_i^{(c)}\) is the true label of the coarse-grained category, and \(\tilde{Y}^{(c)} = Mp_f\) is the predicted coarse-grained probability, derived from the fine-grained probability \(p_f\) through the mapping matrix.
4. **Supervised Contrastive Loss**: \[ L_{reg}=\sum_{i = 1}^b\frac{- 1}{|P_i|}\sum_{B_{p,f}\in P_i}\log\frac{\exp(B_{i,f}\cdot B_{p,f}^\top/\tau)}{\sum_{B_{o,f}\in V_i}\exp(B_{i,f}\cdot B_{o,f}^\top/\tau)} \] where \(V_i\) is the set of features in the current batch except \(B_{i,f}\), \(P_i\) is the set of features with the same fine-grained label, and the temperature hyperparameter \(\tau\) is set to 0.1.
5. **Total Loss Function**: \[ L=\beta\cdot(L_{ce}^{(c)}+L_{ia}+L_{ba})+(1 -
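
The instance-level and bag-level alignment losses listed in the formula summary can be made concrete with a small sketch. It is not the authors' implementation. It assumes, since the summary does not spell this out, that \(M\) is a binary \(K_c \times K_f\) matrix whose entry is 1 when a fine class belongs to a coarse class, and that the cosine term in \(L_{ia}\) is computed per instance and averaged over the bag. The hierarchy, the numbers, and all function names are hypothetical, and plain Python stands in for a tensor library.

```python
import math

def matvec(M, v):
    """Multiply a K_c x K_f matrix (list of rows) by a length-K_f vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def cosine(u, v, eps=1e-12):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / max(norm, eps)

def bag_alignment_loss(M, p_fine, y_coarse, eps=1e-12):
    """L_ba: cross-entropy between the coarse label and M @ p_fine."""
    p_coarse = matvec(M, p_fine)
    return -sum(y * math.log(max(p, eps)) for y, p in zip(y_coarse, p_coarse))

def instance_alignment_loss(M, attn_coarse, attn_fine):
    """L_ia (assumed form): mean over instances of 1 - cos(A_c, M @ A_f)."""
    pairs = list(zip(attn_coarse, attn_fine))
    return sum(1.0 - cosine(a_c, matvec(M, a_f)) for a_c, a_f in pairs) / len(pairs)

# Hypothetical hierarchy: fine classes 0-1 belong to coarse class 0,
# fine classes 2-3 belong to coarse class 1.
M = [[1, 1, 0, 0],
     [0, 0, 1, 1]]

# Bag level: fine-grained probabilities are summed into coarse probabilities.
p_fine = [0.5, 0.25, 0.125, 0.125]
y_coarse = [1, 0]                       # one-hot coarse label
bag_loss = bag_alignment_loss(M, p_fine, y_coarse)   # equals -log(0.75)

# Instance level: class-wise attention scores for two instances.
attn_coarse = [[0.6, 0.4], [0.2, 0.8]]
attn_fine = [[0.3, 0.3, 0.2, 0.2], [0.1, 0.1, 0.4, 0.4]]
inst_loss = instance_alignment_loss(M, attn_coarse, attn_fine)
```

In this toy case the fine attention maps exactly onto the coarse attention, so `inst_loss` is zero up to rounding; any disagreement between the two levels raises it towards 1 per instance.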

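The supervised contrastive loss \(L_{reg}\) from the formula summary can be sketched in plain Python. This is an illustrative version, not the authors' implementation. It assumes that bag features are L2-normalised before the dot product (common for this loss, but not stated in the summary) and that an anchor with no same-label partner in the batch is skipped. The names `supcon_loss` and `l2_normalize` are hypothetical; the default temperature of 0.1 is the value given in the summary.

```python
import math

def l2_normalize(v, eps=1e-12):
    """Scale a vector to unit length (assumed preprocessing step)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / max(norm, eps) for x in v]

def supcon_loss(features, labels, tau=0.1):
    """Supervised contrastive loss over a batch of bag-level features.

    features: list of b feature vectors (B_{i,f} in the summary)
    labels:   list of b fine-grained labels
    tau:      temperature hyperparameter
    """
    feats = [l2_normalize(f) for f in features]
    b = len(feats)
    total = 0.0
    for i in range(b):
        others = [o for o in range(b) if o != i]                 # V_i
        positives = [p for p in others if labels[p] == labels[i]]  # P_i
        if not positives:
            continue  # assumption: anchors without positives contribute nothing
        logits = {
            o: sum(a * c for a, c in zip(feats[i], feats[o])) / tau
            for o in others
        }
        # Log of the denominator, computed with the max subtracted for stability.
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        total += -sum(logits[p] - log_denom for p in positives) / len(positives)
    return total

# Hypothetical batch of four bags with 2-D features.
batch = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

# Labels that agree with the feature clusters give a small loss.
separated = supcon_loss(batch, [0, 0, 1, 1])

# Labels that cut across the clusters give a much larger loss.
mixed = supcon_loss(batch, [0, 1, 0, 1])
```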
HMIL: Hierarchical Multi-Instance Learning for Fine-Grained Whole Slide Image Classification

Iterative multiple instance learning for weakly annotated whole slide image classification

Multiple Instance Learning Framework with Masked Hard Instance Mining for Whole Slide Image Classification

Long-MIL: Scaling Long Contextual Multiple Instance Learning for Histopathology Whole Slide Image Analysis

Iteratively Coupled Multiple Instance Learning from Instance to Bag Classifier for Whole Slide Image Classification

Dual-Attention Multiple Instance Learning Framework for Pathology Whole-Slide Image Classification

DTFD-MIL: Double-Tier Feature Distillation Multiple Instance Learning for Histopathology Whole Slide Image Classification

The Whole Pathological Slide Classification via Weakly Supervised Learning

Exploring Visual Prompts for Whole Slide Image Classification with Multiple Instance Learning

CaMIL: Causal Multiple Instance Learning for Whole Slide Image Classification

Attention-Challenging Multiple Instance Learning for Whole Slide Image Classification

Dynamic Policy-Driven Adaptive Multi-Instance Learning for Whole Slide Image Classification

Weakly supervised histopathology cancer image segmentation and classification

IIB-MIL: Integrated Instance-Level and Bag-Level Multiple Instances Learning with Label Disambiguation for Pathological Image Analysis

Bayesian Collaborative Learning for Whole-Slide Image Classification

SAM-MIL: A Spatial Contextual Aware Multiple Instance Learning Approach for Whole Slide Image Classification

Cross-scale Multi-instance Learning for Pathological Image Diagnosis

RetMIL: Retentive Multiple Instance Learning for Histopathological Whole Slide Image Classification

MamMIL: Multiple Instance Learning for Whole Slide Images with State Space Models

TransMIL: Transformer based Correlated Multiple Instance Learning for Whole Slide Image Classification

Interventional Bag Multi-Instance Learning on Whole-Slide Pathological Images