Unpaired Cross-Modal Interaction Learning for COVID-19 Segmentation on Limited CT Images

Qingbiao Guan,Yutong Xie,Bing Yang,Jianpeng Zhang,Zhibin Liao,Qi Wu,Yong Xia
DOI: https://doi.org/10.1007/978-3-031-43898-1_58
2023-01-01
Abstract:Accurate automated segmentation of infected regions in CT images is crucial for predicting COVID-19's pathological stage and treatment response. Although deep learning has shown promise in medical image segmentation, the scarcity of pixel-level annotations due to their expense and time-consuming nature limits its application in COVID-19 segmentation. In this paper, we propose utilizing large-scale unpaired chest X-rays with classification labels as a means of compensating for the limited availability of densely annotated CT scans, aiming to learn robust representations for accurate COVID-19 segmentation. To achieve this, we design an Unpaired Cross-modal Interaction (UCI) learning framework. It comprises a multi-modal encoder, a knowledge condensation (KC) and knowledge-guided interaction (KI) module, and task-specific networks for final predictions. The encoder is built to capture optimal feature representations for bothCTandX-ray images. To facilitate information interaction between unpaired cross-modal data, we propose the KC that introduces a momentum-updated prototype learning strategy to condense modality-specific knowledge. The condensed knowledge is fed into the KI module for interaction learning, enabling the UCI to capture critical features and relationships across modalities and enhance its representation ability for COVID-19 segmentation. The results on the public COVID-19 segmentation benchmark show that our UCI with the inclusion of chest X-rays can significantly improve segmentation performance, outperforming advanced segmentation approaches including nnUNet, CoTr, nnFormer, and Swin UNETR. Code is available at: https://github.com/GQBBBB/UCI.
What problem does this paper attempt to address?