Abstract:
Background: With the rapid advancement of medical imaging technologies, precise image analysis and diagnosis play a crucial role in improving treatment outcomes and patient care. Computed tomography (CT) and magnetic resonance imaging (MRI) are pivotal imaging technologies with complementary strengths: CT excels at bone imaging, and MRI at soft-tissue contrast. However, cross-domain medical image registration remains challenging because contrast, texture, and noise levels differ substantially between imaging modalities.
Purpose: This study addresses the major challenges of cross-domain medical image registration by proposing a spatial-aware contrastive learning approach that integrates the information shared by CT and MRI images. Our objective is to optimize the feature space representation with dedicated reconstruction and contrastive loss functions, overcoming the limitations of traditional registration methods when handling different imaging modalities. In doing so, we aim to enhance the model's ability to learn structural similarities across image domains, improve registration accuracy, and provide more precise imaging analysis tools for clinical diagnosis and treatment planning.
Methods: Based on the prior knowledge that images from different domains (CT and MRI) share the same content-style information, we extract equivalent feature spaces from both images, enabling accurate cross-domain point matching. We employ an autoencoder-like structure, augmented with purpose-designed reconstruction and contrastive losses. We also propose a region mask that resolves the conflict between spatial correlation and distinctiveness, yielding a better representation space.
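To make the role of the region mask concrete, the following is a minimal sketch of a contrastive (InfoNCE-style) loss over corresponding CT and MRI feature points. It is an illustration under stated assumptions, not the authors' implementation: the function names `cosine` and `masked_info_nce`, the parameters `radius` and `tau`, and the choice to implement the mask by dropping negatives that lie within `radius` of the anchor are all hypothetical. The intuition it tries to capture is that spatially adjacent points have correlated features, so forcing them apart as negatives would conflict with that correlation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def masked_info_nce(feat_ct, feat_mr, coords, radius=1.5, tau=0.1):
    """InfoNCE-style loss with a spatial region mask (hypothetical sketch).

    feat_ct[i] and feat_mr[i] are features of the same anatomical point,
    located at coords[i]. For anchor i, MRI points within `radius` of
    coords[i] are excluded from the negatives (assumed form of the mask).
    """
    n = len(coords)
    total = 0.0
    for i in range(n):
        positive = math.exp(cosine(feat_ct[i], feat_mr[i]) / tau)
        denominator = positive
        for j in range(n):
            if j == i:
                continue
            if math.dist(coords[i], coords[j]) <= radius:
                continue  # region mask: a nearby point is not used as a negative
            denominator += math.exp(cosine(feat_ct[i], feat_mr[j]) / tau)
        total += -math.log(positive / denominator)
    return total / n


# Toy data: points 0 and 1 are neighbours with similar features; point 2 is far away.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
coords = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
loss_unmasked = masked_info_nce(features, features, coords, radius=0.0)
loss_masked = masked_info_nce(features, features, coords, radius=1.5)
```

In this toy setting the masked loss is lower than the unmasked one, because the two neighbouring points with near-identical features are no longer penalized for being similar.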
Results: The proposed spatial-aware contrastive learning approach outperforms current state-of-the-art methods in cross-domain medical image registration. Quantitatively, it achieves an average Dice similarity coefficient (DSC) of 85.68%, a target registration error (TRE) of 1.92 mm, and a mean Hausdorff distance (MHD) of 1.26 mm. Registration time is reduced to 2.67 s on a GPU, which highlights the efficiency of the approach. These results validate the method's effectiveness in improving cross-domain registration accuracy and indicate its adaptability across different medical image analysis scenarios, supporting improved diagnostic precision and patient treatment outcomes.
Conclusions: The spatial-aware contrastive learning approach proposed in this paper offers a new perspective on cross-domain medical image registration. By optimizing the feature space representation through carefully designed reconstruction and contrastive loss functions, the method significantly improves the accuracy and stability of registration between CT and MRI images, with clear application value for precise diagnosis and personalized treatment planning. In future work, we plan to apply the method to a broader range of medical imaging datasets and to explore its integration with other advanced technologies.
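For readers unfamiliar with the reported metrics, the sketch below shows one common way to compute the Dice similarity coefficient and the target registration error. It is an illustrative sketch only: the function names `dice` and `tre`, the flat binary-mask representation, and the landmark-based TRE definition are assumptions, and the paper's exact evaluation protocol may differ.

```python
import math


def dice(mask_a, mask_b):
    """Dice similarity coefficient between two flat binary masks (0/1 values)."""
    a = list(mask_a)
    b = list(mask_b)
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    size = sum(1 for x in a if x) + sum(1 for y in b if y)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if size == 0 else 2.0 * intersection / size


def tre(fixed_landmarks, warped_landmarks):
    """Mean Euclidean distance between paired landmarks (same units as input, e.g. mm)."""
    distances = [math.dist(p, q) for p, q in zip(fixed_landmarks, warped_landmarks)]
    return sum(distances) / len(distances)


# Toy example: one overlapping voxel out of 2 + 1 labelled voxels.
example_dsc = dice([1, 1, 0, 0], [1, 0, 0, 0])
# Toy example: one landmark is off by 2 mm, the other is exact.
example_tre = tre([(0, 0, 0), (1, 1, 1)], [(0, 0, 2), (1, 1, 1)])
```

A DSC is usually reported as a percentage by multiplying the returned value by 100.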

BCSwinReg: A cross-modal attention network for CBCT-to-CT multimodal image registration

NCNet: deformable medical image registration network based on neighborhood cross-attention combined with multi-resolution constraints

A joint learning framework for multisite CBCT-to-CT translation using a hybrid CNN-transformer synthesizer and a registration network

ACSwinNet: A Deep Learning-Based Rigid Registration Method for Head-Neck CT-CBCT Images in Image-Guided Radiotherapy

Cross-modal Attention for MRI and Ultrasound Volume Registration

ACSGRegNet: A Deep Learning-based Framework for Unsupervised Joint Affine and Diffeomorphic Registration of Lumbar Spine CT via Cross- and Self-Attention Fusion

Unsupervised learning for deformable registration of thoracic CT and cone‐beam CT based on multiscale features matching with spatially adaptive weighting

Multimodal Medical Image Registration Via Common Representations Learning and Differentiable Geometric Constraints

Multimodal registration network with multi-scale feature-crossing

CCMNet: Cross-scale correlation-aware mapping network for 3D lung CT image registration

MvMM-RegNet: A new image registration framework based on multivariate mixture model and neural network estimation

Cross-Modal Information-Guided Network using Contrastive Learning for Point Cloud Registration

MFCTrans: Multi-scale Feature Connection Transformer for Deformable Medical Image Registration

Spatial-aware contrastive learning for cross-domain medical image registration

Cross-Modality Image Registration Using a Training-Time Privileged Third Modality

Deep learning-based 3D brain multimodal medical image registration

Stepwise Corrected Attention Registration Network for Preoperative and Follow-Up Magnetic Resonance Imaging of Glioma Patients

SwinCross: Cross-modal Swin transformer for head-and-neck tumor segmentation in PET/CT images

Weakly-supervised convolutional neural networks for multimodal image registration

Anatomically Constrained and Attention-Guided Deep Feature Fusion for Joint Segmentation and Deformable Medical Image Registration