Abstract: The heart is a relatively complex organ that undergoes non-rigid motion. Quantitative analysis of cardiac motion is critical for helping doctors diagnose and treat accurately. Cardiovascular magnetic resonance imaging (CMRI) enables a more detailed quantitative evaluation for cardiac diagnosis. Deformable image registration (DIR) has become a vital task in biomedical image analysis because tissue structures vary across medical images. Models based on the masked autoencoder (MAE) have recently been shown to be effective in computer vision tasks. The Vision Transformer has the context aggregation ability to restore the semantic information of the original image regions by using a low proportion of visible image patches to predict the masked image patches. This study proposes a novel Transformer-ConvNet architecture based on MAE for medical image registration. The core of the Transformer is designed as a masked autoencoder (MAE) with a lightweight decoder, and feature extraction before the downstream registration task is cast as a self-supervised learning task. This study also rethinks how multi-head self-attention is computed in the Transformer encoder. We improve the query-key-value dot-product attention by introducing both depthwise separable convolution (DWSC) and squeeze-and-excitation (SE) modules into the self-attention module, reducing the amount of parameter computation while highlighting image details and maintaining high-spatial-resolution image features. In addition, a concurrent spatial and channel squeeze-and-excitation (scSE) module is embedded into the CNN structure, which also proves effective for extracting robust feature representations. The proposed method, called MAE-TransRNet, has better generalization.
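The masking step described above (keeping a low proportion of visible patches and predicting the rest) can be sketched in plain Python. This is only an illustrative sketch, not the authors' implementation: the function names `patchify` and `random_mask`, the 2x2 patch size, the toy 8x8 image, and the 0.75 mask ratio are all assumptions chosen for the example; the abstract does not state the ratio or patch size used in MAE-TransRNet.

```python
import random

def patchify(image, patch):
    """Split a 2D image (list of rows) into non-overlapping patch x patch tiles."""
    h, w = len(image), len(image[0])
    assert h % patch == 0 and w % patch == 0, "image size must be divisible by patch size"
    tiles = []
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            tiles.append([row[left:left + patch] for row in image[top:top + patch]])
    return tiles

def random_mask(num_patches, mask_ratio, seed=None):
    """Randomly split patch indices into (visible_ids, masked_ids)."""
    rng = random.Random(seed)
    ids = list(range(num_patches))
    rng.shuffle(ids)
    num_masked = int(round(num_patches * mask_ratio))
    masked = sorted(ids[:num_masked])
    visible = sorted(ids[num_masked:])
    return visible, masked

# Toy 8x8 "image" whose pixel value encodes its position (hypothetical data).
image = [[r * 8 + c for c in range(8)] for r in range(8)]
tiles = patchify(image, patch=2)  # 16 tiles of size 2x2

# Assumed mask ratio of 0.75: the encoder would see only the visible tiles,
# and a decoder would be trained to reconstruct the masked ones.
visible_ids, masked_ids = random_mask(len(tiles), mask_ratio=0.75, seed=0)
visible_tiles = [tiles[i] for i in visible_ids]
```

In an actual MAE-style pipeline the visible tiles would be embedded and passed through the Transformer encoder, with the reconstruction loss computed only on the masked tiles; those parts are omitted here.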
The proposed model is evaluated on the public cardiac short-axis dataset (with images and labels) from the 2017 Automated Cardiac Diagnosis Challenge (ACDC). The qualitative and quantitative results (e.g., Dice score and Hausdorff distance) suggest that the proposed model achieves superior results to state-of-the-art methods, indicating that MAE and the improved self-attention are more effective and promising for medical image registration tasks. Code and models are available at https://github.com/XinXiao101/MAE-TransRNet .
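For readers unfamiliar with the two metrics named above, a minimal sketch of their textbook definitions follows. It assumes each segmentation label is given as a set of foreground pixel coordinates; the helper names and the toy masks are hypothetical, and the paper may use a variant (for example a percentile-based Hausdorff distance) that the abstract does not specify.

```python
import math

def dice(a, b):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two sets of foreground coordinates."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention: two empty masks agree perfectly
    return 2.0 * len(a & b) / (len(a) + len(b))

def directed_hausdorff(a, b):
    """Largest distance from a point in a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance; both sets must be non-empty."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Toy masks: a 2x2 square and the same square shifted one pixel to the right.
fixed_label = {(0, 0), (0, 1), (1, 0), (1, 1)}
warped_label = {(0, 1), (1, 1), (0, 2), (1, 2)}

dice_score = dice(fixed_label, warped_label)      # 0.5
hd = hausdorff(fixed_label, warped_label)         # 1.0
```

A higher Dice score and a lower Hausdorff distance between the warped moving label and the fixed label both indicate better alignment after registration.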

Advancing Deformable Medical Image Registration with Multi-axis Cross-covariance Attention

Deformable Cross-Attention Transformer for Medical Image Registration

[Cascaded Multi-Level Medical Image Registration Method Based on Transformer].

An Adaptive Region-Based Transformer for Nonrigid Medical Image Registration with a Self-Constructing Latent Graph

XMorpher: Full Transformer for Deformable Medical Image Registration Via Cross Attention

MACG-Net: Multi-axis cross gating network for deformable medical image registration

MD-SGT: Multi-dilation Spherical Graph Transformer for Unsupervised Medical Image Registration.

MAE-TransRNet: An improved transformer-ConvNet architecture with masked autoencoder for cardiac MRI registration

TransMorph: Transformer for unsupervised medical image registration

Multimodal Medical Image Registration Via Common Representations Learning and Differentiable Geometric Constraints

HMDA: A Hybrid Model with Multi-scale Deformable Attention for Medical Image Segmentation

HCS-Net: Multi-level deformation strategy combined with quadruple attention for image registration

NCNet: deformable medical image registration network based on neighborhood cross-attention combined with multi-resolution constraints.

MDH-Net: Advancing 3D Brain MRI Registration with Multi-Stage Transformer and Dual-Stream Feature Refinement Hybrid Network

A Transformer-based Network for Deformable Medical Image Registration

Pulmonary CT Registration Network Based on Deformable Cross Attention

3D Medical Image Registration Based on Simplified Transformer Block and Multi-Scale Iterative Structure

RegMamba: An Improved Mamba for Medical Image Registration

Circularly Deformable Medical Image Registration Based on Transformer-CNN with Prompt

Anatomically Constrained and Attention-Guided Deep Feature Fusion for Joint Segmentation and Deformable Medical Image Registration.

Multi-Resolution Diffeomorphic Image Registration with Convolutional Vision Transformer Network.