CoD-MIL: Chain-of-Diagnosis Prompting Multiple Instance Learning for Whole Slide Image Classification.

Chen Li, Jiangbo Shi, Huazhu Fu, Chunbao Wang, Tieliang Gong
DOI: https://doi.org/10.1109/TMI.2024.3485120
IF: 10.6
2024-10-23
IEEE Transactions on Medical Imaging
Abstract: Multiple instance learning (MIL) has emerged as a prominent paradigm for processing whole slide images (WSIs), which have a pyramid structure and gigapixel size, in digital pathology. However, existing attention-based MIL methods are primarily trained on the image modality and a pre-defined label set, leading to limited generalization and interpretability. Recently, vision-language models (VLMs) have achieved promising performance and transferability, offering potential solutions to the limitations of MIL-based methods. Pathological diagnosis is an intricate process that requires pathologists to examine the WSI step by step. In the field of natural language processing, the chain-of-thought (CoT) prompting method is widely used to imitate the human reasoning process. Inspired by the CoT prompt and pathologists' clinical knowledge, we propose a chain-of-diagnosis prompting multiple instance learning (CoD-MIL) framework for whole slide image classification. Specifically, the chain-of-diagnosis text prompt decomposes the complex diagnostic process for a WSI into progressive sub-processes from low to high magnification. Additionally, we propose a text-guided contrastive masking module that localizes the tumor region accurately by masking the most discriminative instances and introducing the guidance of normal tissue texts in a contrastive way. Extensive experiments conducted on three real-world subtyping datasets demonstrate the effectiveness and superiority of CoD-MIL.
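The abstract mentions attention-based MIL and the masking of the most discriminative instances. The following is a minimal, illustrative sketch of those two generic ideas only; it is not the authors' implementation. The text guidance and the contrastive use of normal tissue texts are omitted, and all names (`attention_pool`, `mask_top_k`), the toy features, and the attention scores are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(instances, scores):
    """Aggregate instance features into one bag embedding.

    The bag embedding is the attention-weighted average of the
    instance feature vectors (generic attention-based MIL pooling).
    """
    weights = softmax(scores)
    dim = len(instances[0])
    bag = [sum(w * x[d] for w, x in zip(weights, instances)) for d in range(dim)]
    return bag, weights

def mask_top_k(scores, k):
    """Return indices kept after dropping the k highest-scoring instances."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    dropped = set(ranked[:k])
    return [i for i in range(len(scores)) if i not in dropped]

# Toy bag (assumed data): 4 patches with 2-dimensional features
# and hypothetical attention scores.
patches = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.2, 0.8]]
scores = [2.0, 0.1, 0.5, 0.3]

# Pool the full bag, then pool again with the top-scoring patch masked,
# so the remaining patches must carry the bag-level evidence.
bag, weights = attention_pool(patches, scores)
kept = mask_top_k(scores, k=1)
masked_bag, masked_weights = attention_pool(
    [patches[i] for i in kept], [scores[i] for i in kept]
)
```

In this sketch, masking forces the pooled representation to rely on instances other than the single highest-scoring one, which is the general motivation for masking discriminative instances in MIL; how CoD-MIL selects and masks instances under text guidance is described in the paper itself.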
Medicine, Computer Science