Bailiang Jian, Jiazhen Pan, Morteza Ghahremani, Daniel Rueckert, Christian Wachinger, Benedikt Wiestler
Abstract: Our findings indicate that adopting "advanced" computational elements fails to significantly improve registration accuracy. Instead, well-established registration-specific designs offer fair improvements, enhancing results by a marginal 1.5% over the baseline. Our findings emphasize the importance of rigorous, unbiased evaluation and contribution disentanglement of all low- and high-level registration components, rather than simply following the computer vision trends with "more advanced" computational blocks. We advocate for simpler yet effective solutions and novel evaluation metrics that go beyond conventional registration accuracy, warranting further research across diverse organs and modalities. The code is available at https://github.com/BailiangJ/rethink-reg.
What problem does this paper attempt to address?
The paper asks whether adopting the latest computational modules (such as Mamba or Transformer blocks) significantly improves the accuracy of medical image registration, or whether it is merely "bandwagon" behavior. It also re-examines how much classic high-level, registration-specific designs (such as pyramid mechanisms, correlation calculations, and iterative optimization) contribute to registration performance. Specifically, the paper aims to answer two key questions through rigorous experimental evaluation:
1. Do more advanced low-level computational modules (such as Transformer or Mamba) actually improve medical image registration?
2. If following the latest trend does not bring substantial improvement, which factors actually do improve medical image registration?
### Research Background and Motivation
In recent years, many new deep-learning-based methods have emerged in medical image registration. For example, VoxelMorph uses convolutional neural networks (CNNs) for registration, and TransMorph introduces attention mechanisms. More recently, Mamba-based models built on selective state-space models (SSMs) have been proposed with the aim of improving efficiency. However, whether these newer methods deliver a significant performance improvement, or are simply a product of the technological trend, still needs to be verified.
### Core Contributions of the Paper
1. **Modular Component Analysis**: The researchers decomposed the registration networks into individual computational modules and ran fair comparison experiments to isolate the contribution of each component to the final registration result.
2. **Extensive Experimental Evaluation**: Experiments were carried out on five open-source brain MRI datasets under a consistent experimental setup, which supports the reliability of the results.
3. **Recommended Directions**: The research shows that "advanced" low-level computational modules (such as Transformer and Mamba) do not outperform, and in some cases underperform, traditional CNN methods in brain MRI registration. In contrast, high-level registration-specific designs (such as correlation layers and iterative deformation pyramids) are more effective at improving registration performance. The researchers therefore suggest that the community pay more attention to these registration-specific designs instead of simply transplanting "advanced" computational modules.
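To make the term "correlation layer" concrete, here is a minimal sketch of the general idea on a 1D toy signal. It is not the paper's implementation: the function names `local_correlation` and `best_displacement`, the zero padding, and the normalization by channel count are all assumptions made for illustration. For every position in the fixed feature map, the layer scores candidate displacements by the similarity between the fixed feature and the moving feature at the displaced position.

```python
def local_correlation(fixed_feat, moving_feat, radius):
    """Toy 1D correlation (cost-volume) layer; hypothetical helper.

    fixed_feat, moving_feat: lists of per-position channel vectors.
    Returns cost[i][k], the mean channel-wise product between the fixed
    feature at position i and the moving feature at i + d, d = k - radius.
    """
    n = len(fixed_feat)
    channels = len(fixed_feat[0])
    cost = []
    for i in range(n):
        row = []
        for d in range(-radius, radius + 1):
            j = i + d
            if 0 <= j < n:
                dot = sum(f * m for f, m in zip(fixed_feat[i], moving_feat[j]))
                row.append(dot / channels)
            else:
                row.append(0.0)  # assumed zero padding outside the signal
        cost.append(row)
    return cost


def best_displacement(cost, radius):
    """Pick the highest-scoring displacement per position (first one on ties)."""
    return [max(range(len(row)), key=row.__getitem__) - radius for row in cost]


# Fixed features are distinct one-hot vectors; the moving signal is the same
# content shifted right by one position.
fixed = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
moving = [[0, 0, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]

cost = local_correlation(fixed, moving, radius=1)
shifts = best_displacement(cost, radius=1)
print(shifts[:3])  # the last position has no match inside the moving signal
```

In a real registration network the same computation runs on learned 3D feature maps, and the resulting cost volume is passed to further layers that predict the deformation, rather than being reduced with a hard argmax as in this toy.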
### Main Findings
- Using more advanced low-level computational modules (such as Mamba or Transformer) does not significantly improve registration accuracy, and may even degrade performance in some cases.
- In contrast, combining high-level registration-specific designs (such as coarse-to-fine pyramid mechanisms, correlation layers, and iterative optimization) slightly improves registration performance, with an improvement of approximately 1.5% in Dice coefficient over the baseline.
- The basic VoxelMorph method already produces competitive registration results, almost indistinguishable from those of the best model.
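For readers unfamiliar with the Dice coefficient mentioned above, it measures the overlap of two segmentations of the same structure as 2|A ∩ B| / (|A| + |B|). The sketch below is a generic illustration, not the paper's evaluation code: the helper names, the flat-list representation of label maps, and the convention of scoring a label absent from both maps as 1.0 are assumptions.

```python
def dice_score(seg_a, seg_b, label):
    """Dice overlap for one label between two equally sized flat label maps."""
    if len(seg_a) != len(seg_b):
        raise ValueError("segmentations must have the same number of voxels")
    in_a = [v == label for v in seg_a]
    in_b = [v == label for v in seg_b]
    intersection = sum(a and b for a, b in zip(in_a, in_b))
    total = sum(in_a) + sum(in_b)
    # Assumed convention: a label missing from both maps counts as agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


def mean_dice(seg_a, seg_b, labels):
    """Average Dice over a list of anatomical labels."""
    return sum(dice_score(seg_a, seg_b, lab) for lab in labels) / len(labels)


# Toy label maps: the fixed image's segmentation and the moving image's
# segmentation after warping with a predicted deformation.
fixed_seg = [0, 1, 1, 2, 2, 2, 0, 0]
warped_seg = [0, 1, 2, 2, 2, 0, 0, 0]

print(dice_score(fixed_seg, warped_seg, label=1))
print(mean_dice(fixed_seg, warped_seg, labels=[1, 2]))
```

Because the score is bounded by 1 and averaged over many structures, a gap of about 1.5% between methods is small, which is why the summary describes the top models as nearly indistinguishable.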
### Conclusions
The paper emphasizes the value of simple designs and urges the community to be cautious about developing complex, trend-following registration architectures, since simpler designs can often produce comparable or even better results. The researchers also advocate developing new evaluation metrics, beyond traditional registration accuracy scores, that better capture subtle differences between models. In addition, they recommend focusing on more substantial and impactful research areas such as real-time performance, data efficiency, robustness, generalization, patient-specific adaptation, and the interpretability of deep-learning models.
Through these findings, the researchers hope to steer future research toward practical effectiveness and well-justified innovation, rather than blindly following technological trends.