Language Models Meet Anomaly Detection for Better Interpretability and Generalizability

Jun Li, Su Hwan Kim, Philip Müller, Lina Felsner, Daniel Rueckert, Benedikt Wiestler, Julia A. Schnabel, Cosmin I. Bercea
2024-07-23
Abstract: This research explores the integration of language models and unsupervised anomaly detection in medical imaging, addressing two key questions: (1) Can language models enhance the interpretability of anomaly detection maps? and (2) Can anomaly maps improve the generalizability of language models in open-set anomaly detection tasks? To investigate these questions, we introduce a new dataset for multi-image visual question-answering on brain magnetic resonance images encompassing multiple conditions. We propose KQ-Former (Knowledge Querying Transformer), which is designed to optimally align visual and textual information in limited-sample contexts. Our model achieves 60.81% accuracy on closed questions, covering disease classification and severity across 15 different classes. For open questions, KQ-Former demonstrates a 70% improvement over the baseline with a BLEU-4 score of 0.41, and achieves the highest entailment ratios (up to 71.9%) and lowest contradiction ratios (down to 10.0%) among various natural language inference models. Furthermore, integrating anomaly maps results in an 18% accuracy increase in detecting open-set anomalies, thereby enhancing the language model's generalizability to previously unseen medical conditions. The code and dataset are available at https://github.com/compai-lab/miccai-2024-junli?tab=readme-ov-file.
Computer Vision and Pattern Recognition, Computation and Language