BCSwinReg: A cross-modal attention network for CBCT-to-CT multimodal image registration

Jieming Zhang, Chang Qing, Yu Li, Yaqi Wang
DOI: https://doi.org/10.1016/j.compbiomed.2024.107990
Abstract: Registration of computed tomography (CT) and cone beam computed tomography (CBCT) images plays an important role in radiotherapy. However, the poor image quality of CBCT makes CBCT-CT multimodal registration challenging. In multimodal registration, effective feature fusion and mapping often lead to better results. We therefore propose a new backbone network, BCSwinReg, and a cross-modal attention module, CrossSwin. Specifically, CrossSwin is designed to promote multimodal feature fusion and to map the multimodal domains to a common domain, which helps the network better learn the correspondence between images. BCSwinReg discovers correspondences by exchanging information through cross-attention, obtains multi-level semantic information through a multi-resolution strategy, and finally integrates the deformations from the different resolutions using a divide-and-conquer cascade method. We performed experiments on the publicly available 4D-Lung dataset to demonstrate the effectiveness of CrossSwin and BCSwinReg. Compared with VoxelMorph, BCSwinReg achieves improvements of 3.3% in Dice Similarity Coefficient (DSC) and 0.19 in the average 95% Hausdorff distance (HD95).
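For readers unfamiliar with the DSC metric reported above, the following is a minimal sketch of the standard definition, DSC = 2|A ∩ B| / (|A| + |B|), applied to two binary segmentations. It is not the authors' evaluation code: the function name `dice_coefficient`, the representation of masks as sets of voxel coordinates, the toy masks, and the convention of returning 1.0 for two empty masks are all assumptions made for illustration.

```python
def dice_coefficient(seg_a, seg_b):
    """Dice Similarity Coefficient between two binary segmentations.

    Each segmentation is an iterable of voxel coordinates, for example
    (z, y, x) tuples, that belong to the segmented structure.
    """
    a, b = set(seg_a), set(seg_b)
    if not a and not b:
        # Assumed convention: two empty masks are treated as a perfect match.
        return 1.0
    # Twice the overlap divided by the total size of both masks.
    return 2.0 * len(a & b) / (len(a) + len(b))


# Hypothetical toy masks: a structure in the fixed image and the same
# structure in the warped moving image after registration.
fixed_mask = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
warped_mask = {(0, 0, 1), (0, 1, 1), (0, 1, 2)}

print(dice_coefficient(fixed_mask, warped_mask))
```

A DSC of 1 indicates perfect overlap and 0 indicates no overlap, so a higher DSC after warping suggests that the registration aligned the labeled structures more closely.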