Free Lunch in Pathology Foundation Model: Task-specific Model Adaptation with Concept-Guided Feature Enhancement

Yanyan Huang,Weiqin Zhao,Yihang Chen,Yu Fu,Lequan Yu
2024-11-15
Abstract: Whole slide image (WSI) analysis is gaining prominence within the medical imaging field. Recent advances in pathology foundation models have shown the potential to extract powerful feature representations from WSIs for downstream tasks. However, these foundation models are usually designed for general-purpose pathology image analysis and may not be optimal for specific downstream tasks or cancer types. In this work, we present Concept Anchor-guided Task-specific Feature Enhancement (CATE), an adaptable paradigm that can boost the expressivity and discriminativeness of pathology foundation models for specific downstream tasks. Based on a set of task-specific concepts derived from the pathology vision-language model with expert-designed prompts, we introduce two interconnected modules to dynamically calibrate the generic image features extracted by foundation models for certain tasks or cancer types. Specifically, we design a Concept-guided Information Bottleneck module to enhance task-relevant characteristics by maximizing the mutual information between image features and concept anchors while suppressing superfluous information. Moreover, a Concept-Feature Interference module is proposed to utilize the similarity between calibrated features and concept anchors to further generate discriminative task-specific features. Extensive experiments on public WSI datasets demonstrate that CATE significantly enhances the performance and generalizability of MIL models. Additionally, heatmap and UMAP visualization results reveal the effectiveness and interpretability of CATE. The source code is available at https://github.com/HKU-MedAI/CATE.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the suboptimal performance of pathology foundation models on specific downstream tasks and cancer types. Although existing pathology foundation models can extract powerful feature representations from whole slide images (WSIs), they are usually designed for general-purpose pathology image analysis and are not optimized for a particular task or cancer type. The features they extract may therefore contain task-irrelevant information, which can impair performance on specific downstream tasks.

To overcome this challenge, the authors propose **Concept Anchor-guided Task-specific Feature Enhancement (CATE)**. CATE enhances a pathology foundation model with two modules:

1. **Concept-guided Information Bottleneck (CIB)**: This module enhances task-relevant features by maximizing the mutual information between image features and concept anchors while suppressing superfluous information.
2. **Concept-Feature Interference (CFI)**: This module uses the similarity between the calibrated image features and the concept anchors to generate more discriminative task-specific features.

With these two modules, CATE improves the expressive and discriminative power of existing pathology foundation models on specific tasks without additional supervision or significant computational cost, thereby improving the performance and generalization of multiple instance learning (MIL) models.

### Formula summary

- **Overall objective function of CATE**:

  \[ L = L_{CE} + \lambda_P L_{PIM} + \lambda_S L_{SIM} \]

  where:
  - \(L_{CE}\) is the cross-entropy loss for the downstream task.
  - \(L_{PIM}\) is the predictive information maximization loss.
  - \(L_{SIM}\) is the superfluous information minimization loss.
  - \(\lambda_P\) and \(\lambda_S\) are hyperparameters.

- **Predictive information maximization (PIM) loss**:

  \[ L_{PIM} = E_{\hat{\alpha}, c_{cs}^{pos}}\left[-\sum_{i=1}^{k}\frac{\hat{\alpha}_i^T c_{cs}^{pos}}{\tau}\right] + E_{\hat{\alpha}, c_{cs}^{pos}}\left[\sum_{i=1}^{k}\log\left(\sum_{j=1}^{m}\exp\left(\frac{\hat{\alpha}_i^T c_{cs}^{j}}{\tau}\right) + \sum_{j=1}^{n}\exp\left(\frac{\hat{\alpha}_i^T c_{ca}^{j}}{\tau}\right)\right)\right] \]

- **Superfluous information minimization (SIM) loss**:

  \[ L_{SIM} = E\left[\sum_{i=1}^{k} D_{KL}\left(q_\theta(\alpha_i \mid x_i) \,\|\, r(\alpha_i)\right)\right] \]

Through these losses and modules, CATE enhances the discriminative ability of the original features and aligns them with task-specific concept anchors to improve prediction performance.
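
The three loss terms above can be sketched numerically as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function names, the temperature `tau=0.1`, and the default weights are hypothetical; the expectation is approximated by a single bag of \(k\) instances; the two anchor sets \(c_{cs}\) and \(c_{ca}\) are simply stacked into one matrix; the positive anchor is assumed to be one row of the \(c_{cs}\) set; and \(q_\theta\) is assumed to be a diagonal Gaussian with a standard normal prior \(r\), which gives the KL term a closed form.

```python
import numpy as np


def logsumexp(x, axis):
    """Numerically stable log(sum(exp(x))) along one axis."""
    m = np.max(x, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(x - m), axis=axis))


def pim_loss(alpha_hat, cs_anchors, ca_anchors, pos_index, tau=0.1):
    """PIM term for one bag (hypothetical helper).

    alpha_hat:  (k, d) calibrated instance features
    cs_anchors: (m, d) anchors written c_cs in the formula
    ca_anchors: (n, d) anchors written c_ca in the formula
    pos_index:  row of cs_anchors used as the positive anchor (assumption)
    """
    anchors = np.concatenate([cs_anchors, ca_anchors], axis=0)  # (m + n, d)
    logits = alpha_hat @ anchors.T / tau                        # (k, m + n)
    positive = logits[:, pos_index]                             # (k,)
    # -sum_i positive_i + sum_i log(sum over all anchors of exp(logit))
    return float(np.sum(-positive + logsumexp(logits, axis=1)))


def sim_loss(mu, log_var):
    """SIM term, assuming q = N(mu, diag(exp(log_var))) and r = N(0, I)."""
    kl_per_instance = 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=1)
    return float(np.sum(kl_per_instance))


def cate_loss(ce, pim, sim, lambda_p=1.0, lambda_s=1.0):
    """L = L_CE + lambda_P * L_PIM + lambda_S * L_SIM (weights are placeholders)."""
    return ce + lambda_p * pim + lambda_s * sim


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    k, d, m, n = 5, 8, 3, 2
    alpha_hat = rng.normal(size=(k, d))
    cs, ca = rng.normal(size=(m, d)), rng.normal(size=(n, d))
    pim = pim_loss(alpha_hat, cs, ca, pos_index=0)
    sim = sim_loss(rng.normal(size=(k, d)), np.zeros((k, d)))
    print(cate_loss(ce=0.7, pim=pim, sim=sim, lambda_p=0.1, lambda_s=0.01))
```

Because the positive anchor also appears inside the log-sum-exp, each per-instance PIM term is non-negative and shrinks as the calibrated feature aligns with its positive anchor relative to all others. The SIM term is zero exactly when the assumed Gaussian posterior matches the prior.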