Deformable Image Registration with Multi-scale Feature Fusion from Shared Encoder, Auxiliary and Pyramid Decoders

Hongchao Zhou, Shunbo Hu
2024-08-11
Abstract: In this work, we propose a novel deformable convolutional pyramid network for unsupervised image registration. Specifically, the proposed network enhances the traditional pyramid network by adding an additional shared auxiliary decoder for image pairs. This decoder provides multi-scale high-level feature information from unblended image pairs for the registration task. During the registration process, we also design a multi-scale feature fusion block to extract the most beneficial features for the registration task from both global and local contexts. Validation results indicate that this method can capture complex deformations while achieving higher registration accuracy and maintaining smooth and plausible deformations.
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
The problem this paper addresses is complex deformation in **Deformable Image Registration (DIR)**, especially in medical image processing. Specifically, the authors propose a new method based on a convolutional pyramid network, aiming to improve the accuracy and smoothness of unsupervised image registration.

### Problem Background

Deformable image registration is a fundamental computer vision task, widely used in medical diagnosis, surgical guidance, and disease detection. Its goal is to determine an optimal spatial transformation that aligns the moving image to the fixed image. Traditional methods usually treat DIR as an optimization problem and minimize an energy function iteratively, which requires heavy computation and long processing times.

In recent years, deep-learning-based DIR methods, especially unsupervised ones, have received extensive attention because they do not require labeled data. Pyramid networks perform well among these methods and achieve registration from coarse to fine. However, existing methods still struggle with complex deformations: they capture too little detail information and have an insufficient understanding of the global structure.

### Solutions Proposed in the Paper

To solve these problems, the authors propose the following innovations:

1. **Shared Auxiliary Decoder**:
   - A shared auxiliary decoder is added to the traditional pyramid network to provide high-level feature information at multiple scales.
   - These features come from unmerged image pairs. They help the network understand both image details and global structure, thereby improving registration accuracy.
2. **Multi-scale Feature Fusion Block (MSFB)**:
   - An MSFB is designed to receive information from the encoder, the auxiliary decoder, and the coarse deformation field obtained from the previous registration scale.
   - The MSFB selects the most beneficial features through global and local attention mechanisms, filtering out redundant information while retaining key features, which further improves registration performance.
3. **Improved Loss Function**:
   - Normalized Cross-Correlation (Lncc) and Deformation Regularization (Lreg) are used to train the network.
   - The total loss function is expressed as:
     \[ L = -(\alpha \cdot \text{Lncc}(I_f, I_m \circ \phi) + \beta \cdot \text{Lncc}(I_f, I_m \circ \hat{\phi})) + \lambda \cdot \text{Lreg}(\phi) \]
     where \(\hat{\phi}\) is the DDF sampled at the \(80\times112\times96\) scale, and \(\alpha\), \(\beta\) and \(\lambda\) are hyperparameters.

### Experimental Results

The authors verified the effectiveness of this method on the dataset of the Learn2Reg 2024 challenge. The experimental results show that, compared with several learning-based DIR methods, the proposed method performs better in terms of Dice coefficient, target registration error (TRE) and 95% Hausdorff distance (HdDist95), and that the generated deformations are smoother and more plausible.

### Conclusion

By introducing the shared auxiliary decoder and the multi-scale feature fusion block, this method improves registration accuracy while keeping the deformations smooth and plausible, and it is especially suited to complex medical image registration tasks.
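
The structure of the total loss described under the loss function item above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Several choices here are assumptions made for illustration: the function names `ncc`, `smoothness`, and `total_loss` are hypothetical, the similarity term is a global NCC (registration networks often use a windowed, local variant), the regularizer is a mean squared forward difference of the displacement field, the fixed image is passed separately at the resolution of the lower-scale term, and the default weights are placeholders rather than values from the paper. The images are assumed to be already warped by the respective deformation fields.

```python
import numpy as np


def ncc(fixed, warped, eps=1e-8):
    """Global normalized cross-correlation in [-1, 1]; 1 means perfectly correlated."""
    f = fixed - fixed.mean()
    w = warped - warped.mean()
    denom = np.sqrt((f ** 2).sum() * (w ** 2).sum()) + eps
    return float((f * w).sum() / denom)


def smoothness(ddf):
    """Mean squared forward difference of a displacement field of shape (3, D, H, W).

    This diffusion-style penalty is an assumed stand-in for Lreg.
    """
    total = 0.0
    for axis in (1, 2, 3):  # the three spatial axes
        total += float((np.diff(ddf, axis=axis) ** 2).mean())
    return total / 3.0


def total_loss(fixed, warped_full, fixed_low, warped_low, ddf,
               alpha=1.0, beta=1.0, lam=1.0):
    """L = -(alpha * NCC_full + beta * NCC_low) + lam * Lreg (placeholder weights)."""
    similarity = alpha * ncc(fixed, warped_full) + beta * ncc(fixed_low, warped_low)
    return -similarity + lam * smoothness(ddf)


# Toy check: identical images and a zero displacement field.
rng = np.random.default_rng(0)
fixed = rng.random((8, 8, 8))
fixed_low = fixed[::2, ::2, ::2]  # crude subsampling, for illustration only
ddf = np.zeros((3, 8, 8, 8))
loss = total_loss(fixed, fixed, fixed_low, fixed_low, ddf)
print(loss)  # close to -2 with the placeholder weights
```

Because the similarity terms enter with a negative sign, minimizing the loss maximizes correlation at both scales, while the \(\lambda\)-weighted term penalizes non-smooth displacement fields.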