Contrastive Language Prompting to Ease False Positives in Medical Anomaly Detection

YeongHyeon Park, Myung Jin Kim, Hyeong Seok Kim
2024-11-12
Abstract: A pre-trained visual-language model, contrastive language-image pre-training (CLIP), successfully accomplishes various downstream tasks with text prompts, such as finding images or localizing regions within an image. Despite CLIP's strong multi-modal capabilities, it remains limited in specialized environments, such as medical applications. To address this, many CLIP variants (such as BioMedCLIP and MedCLIP-SAMv2) have emerged, but false positives related to normal regions persist. Thus, we pursue a simple yet important goal: reducing false positives in medical anomaly detection. We introduce a Contrastive LAnguage Prompting (CLAP) method that leverages both positive and negative text prompts. This straightforward approach identifies potential lesion regions through visual attention to the positive prompts in the given image. To reduce false positives, we attenuate attention on normal regions using negative prompts. Extensive experiments with the BMAD dataset, including six biomedical benchmarks, demonstrate that the CLAP method enhances anomaly detection performance. Our future plans include developing an automated fine prompting method for more practical usage.
Computer Vision and Pattern Recognition, Artificial Intelligence, Computation and Language
What problem does this paper attempt to address?
This paper addresses the false-positive problem that arises when vision-language models such as CLIP are used for medical anomaly detection. Although pre-trained vision-language models handle multi-modal data well, they remain limited in specialized fields such as medical imaging. In particular, they tend to misjudge normal regions as abnormal, which produces false positives.

### Main objectives of the paper:

1. **Reduce false positives**: Introduce the Contrastive LAnguage Prompting (CLAP) method, which uses both positive and negative text prompts, to improve the accuracy of medical anomaly detection.
2. **Improve anomaly detection performance**: Evaluate CLAP through extensive experiments on the BMAD dataset, across multiple biomedical benchmarks, to show that it reduces false positives and improves anomaly detection.

### Specific problem description:

- **Limitations of existing methods**: Existing CLIP-based variants (such as BioMedCLIP and MedCLIP-SAMv2) improve medical image processing to some extent, but they still produce false positives, that is, they wrongly identify normal regions as abnormal.
- **Impact of false positives**: False positives may lead to unnecessary medical procedures, increase the burden on the medical system, and cause harm to patients.

### Solutions:

- **CLAP method**: By combining positive and negative text prompts, CLAP identifies potential lesion regions in a given image while reducing attention on normal regions, which lowers the number of false positives.
- **Application of attention mechanism**: Positive prompts guide the model's visual attention toward potential lesion regions, while negative prompts attenuate attention on normal regions, so lesions are localized more accurately.
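The interplay between positive and negative prompts described under Solutions can be illustrated with a minimal sketch. It assumes the two attention maps have already been obtained from a vision-language model, and it combines them by weighted subtraction clipped at zero. The source does not state the exact combination rule, so the subtraction, the `alpha` weight, and the function names `normalize` and `contrastive_attention` are hypothetical choices for illustration only.

```python
import numpy as np


def normalize(attn):
    """Min-max scale an attention map to the range [0, 1]."""
    attn = np.asarray(attn, dtype=np.float64)
    span = attn.max() - attn.min()
    if span == 0:
        # A constant map carries no spatial information.
        return np.zeros_like(attn)
    return (attn - attn.min()) / span


def contrastive_attention(pos_attn, neg_attn, alpha=1.0):
    """Attenuate positive-prompt attention where negative-prompt attention is high.

    Assumption: weighted subtraction clipped at zero. This is an
    illustrative rule, not necessarily the one used in the paper.
    """
    pos = normalize(pos_attn)
    neg = normalize(neg_attn)
    return np.clip(pos - alpha * neg, 0.0, None)


# Toy 2x2 maps: the top-right pixel responds to both prompts,
# so it is treated as a likely false positive and suppressed.
pos = np.array([[0.9, 0.8], [0.1, 0.0]])  # attention to a "lesion" prompt
neg = np.array([[0.0, 0.9], [0.1, 0.0]])  # attention to a "normal tissue" prompt
score = contrastive_attention(pos, neg)
```

In this toy case only the top-left pixel keeps a high anomaly score, because it attracts attention from the positive prompt but not from the negative one.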

### Experimental verification:

- **Dataset**: The BMAD dataset, which contains six biomedical benchmarks and covers five anatomical structures.
- **Evaluation metric**: AUROC (Area Under the Receiver Operating Characteristic Curve), used to evaluate anomaly detection performance.
- **Experimental results**: CLAP outperforms the compared baselines (DINO and PLP), which do not use negative prompts, on multiple subsets, especially for images with small and irregular patterns.

### Summary:

The paper proposes the CLAP method, which aims to reduce false positives in medical anomaly detection by combining positive and negative text prompts. The experimental results show that CLAP improves anomaly detection accuracy across multiple types of medical images and outperforms the compared methods that lack negative prompts. Future work will develop an automated method for fine prompting to make the approach more practical.
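
For readers unfamiliar with the AUROC metric named under Experimental verification, the following is a minimal sketch of how it can be computed from anomaly scores and binary labels. It uses the pairwise-ranking definition (the probability that a randomly chosen abnormal sample scores higher than a randomly chosen normal one, with ties counted as one half). The function name `auroc` and the toy data are assumptions for illustration and do not come from the paper.

```python
import numpy as np


def auroc(labels, scores):
    """Compute AUROC as the fraction of (abnormal, normal) pairs ranked correctly.

    labels: 1 for abnormal, 0 for normal.
    scores: higher values mean "more anomalous".
    """
    labels = np.asarray(labels).astype(bool)
    scores = np.asarray(scores, dtype=np.float64)
    pos = scores[labels]
    neg = scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("AUROC needs at least one abnormal and one normal sample")
    # Compare every abnormal score against every normal score.
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum()
    ties = (diff == 0).sum()
    return float((wins + 0.5 * ties) / (pos.size * neg.size))


# Toy example: three of the four abnormal/normal pairs are ordered correctly.
result = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(result)  # 0.75
```

The pairwise form is quadratic in the number of samples, so it suits small illustrations; a rank-based or library implementation would be preferable for large evaluations.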