Robust COVID-19 Detection in CT Images with CLIP

Li Lin, Yamini Sri Krubha, Zhenhuan Yang, Cheng Ren, Thuc Duy Le, Irene Amerini, Xin Wang, Shu Hu
2024-03-15
Abstract: In the realm of medical imaging, particularly for COVID-19 detection, deep learning models face substantial challenges such as the necessity for extensive computational resources, the paucity of well-annotated datasets, and a significant amount of unlabeled data. In this work, we introduce the first lightweight detector designed to overcome these obstacles, leveraging a frozen CLIP image encoder and a trainable multilayer perceptron (MLP). Enhanced with Conditional Value at Risk (CVaR) for robustness and a loss landscape flattening strategy for improved generalization, our model is tailored for high efficacy in COVID-19 detection. Furthermore, we integrate a teacher-student framework to capitalize on the vast amounts of unlabeled data, enabling our model to achieve superior performance despite the inherent data limitations. Experimental results on the COV19-CT-DB dataset demonstrate the effectiveness of our approach, surpassing the baseline by up to 10.6% in 'macro' F1 score in supervised learning. The code is available at
Image and Video Processing, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the challenges that deep learning models face in medical image analysis for COVID-19 detection, specifically detection from CT scans. These challenges are the need for extensive computing resources, the scarcity of well-annotated datasets, and the large amount of data that remains unlabeled. The paper proposes a lightweight detector built from a frozen CLIP image encoder and a trainable multilayer perceptron (MLP), trained with a Conditional Value at Risk (CVaR) loss for robustness and a loss landscape flattening strategy for better generalization. The authors also adopt a teacher-student framework to exploit unlabeled data, so that the model performs well even when labeled data is limited.
Specifically, the main contributions of the paper are as follows:
1. The design of the first lightweight COVID-19 detector based on annotated 3-D CT scans.
2. The proposal of a teacher-student framework that uses unlabeled data to improve COVID-19 detection performance.
3. Experimental results showing that the proposed method outperforms the baseline on the COV19-CT-DB dataset, with an improvement of up to 10.6% in macro F1 score in supervised learning.
The paper also discusses the limitations of existing deep learning methods, namely data scarcity, unused unlabeled data, and weak generalization, and addresses them with the CVaR loss, loss landscape flattening, and the teacher-student framework.
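Two of the ingredients named above, the CVaR loss and the use of a teacher model on unlabeled scans, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation. The function names `cvar_loss` and `select_pseudo_labels` are hypothetical. CVaR is written in one common empirical form, the mean of the worst `alpha` fraction of per-sample losses. The confidence threshold used to filter teacher predictions is an assumption, since the summary does not say how the teacher's outputs are selected.

```python
import math


def cvar_loss(losses, alpha):
    """Empirical CVaR: mean of the worst ceil(alpha * n) per-sample losses.

    Assumption: this is one common empirical form of CVaR; the paper's exact
    formulation is not given in the summary above.
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    k = math.ceil(alpha * len(losses))
    worst = sorted(losses, reverse=True)[:k]  # the k hardest samples
    return sum(worst) / k


def select_pseudo_labels(teacher_probs, threshold=0.9):
    """Keep unlabeled samples on which the teacher is confident.

    teacher_probs: teacher-predicted probability of the positive (COVID) class
    for each unlabeled scan. Returns a list of (index, pseudo_label) pairs.
    Assumption: a fixed confidence threshold; it is a hypothetical choice.
    """
    selected = []
    for index, prob in enumerate(teacher_probs):
        confidence = max(prob, 1.0 - prob)
        if confidence >= threshold:
            selected.append((index, 1 if prob >= 0.5 else 0))
    return selected


per_sample_losses = [0.1, 0.2, 0.3, 2.0]
print(cvar_loss(per_sample_losses, alpha=1.0))  # ordinary mean loss
print(cvar_loss(per_sample_losses, alpha=0.5))  # mean of the two hardest samples
print(select_pseudo_labels([0.97, 0.55, 0.03, 0.80]))  # [(0, 1), (2, 0)]
```

With `alpha = 1.0` the CVaR loss reduces to the ordinary average loss, while smaller values of `alpha` concentrate training on the hardest samples. In a teacher-student setup of this kind, the confidently pseudo-labeled scans would be added to the labeled set when training the student.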