Abstract: Image registration is an essential process for aligning features of interest from multiple images. With the recent development of deep learning techniques, image registration approaches have advanced to a new level. In this work, we present 'Rotation-Equivariant network and Transformers for Image Registration' (RoTIR), a deep-learning-based method for the alignment of fish scale images captured by light microscopy. This approach overcomes the challenge of arbitrary rotation and translation detection, as well as the absence of ground truth data. We employ feature-matching approaches based on Transformers and general E(2)-equivariant steerable CNNs for model creation. Besides, an artificial training dataset is employed for semi-supervised learning. Results show RoTIR successfully achieves the goal of fish scale image registration.

What problem does this paper attempt to address?

The paper addresses the problem of aligning fish scale images (i.e., image registration). Specifically, the researchers developed a deep learning method called "Rotation-Equivariant network and Transformers for Image Registration" (RoTIR) for registering fish scale images captured by light microscopy. The method aims to detect arbitrary rotations and translations while coping with the absence of ground truth data.

### Background and Motivation

1. **Importance of Image Registration**: Image registration is the process of aligning features of interest across multiple images, which is crucial in fields such as medical image analysis and bone regeneration research.
2. **Existing Challenges**:
   - Fish scales shift position during cultivation because culture medium is routinely added, which makes observation over time difficult.
   - The lack of ground truth labels makes supervised learning difficult.
   - The main transformations are translation and rotation; scaling and distortion have minimal impact.
3. **Research Objective**: Develop a method that aligns fish scale images effectively, improving the accuracy of data collection and supporting research such as bone healing monitoring.

### Method Overview

1. **Network Architecture**:
   - **Backbone Network**: Based on the classic YOLO model, it converts input images into 16×16 feature maps. E(2)-equivariant convolutional neural networks (CNNs) are used for feature extraction.
   - **Matching Module**: A local feature transformer module based on the LoFTR model performs feature point matching. It includes self-attention and cross-attention layers and outputs the rotation angle, scaling factor, and coordinate refinement parameters.
2. **Training Data Synthesis**:
   - Synthetic datasets are constructed by cropping original images, generating image pairs, and producing the corresponding ground truth labels.
3. **Loss Function**:
   - The total loss is a weighted sum of independent terms, mainly the confidence map loss, rotation angle loss, coordinate refinement loss, and scaling factor loss.

### Experimental Results

1. **Evaluation Metrics**:
   - The DICE similarity coefficient and the Complex Wavelet Structural Similarity Index (CW-SSIM).
2. **Experimental Setup**:
   - 45 pairs of phase contrast images and their corresponding fish scale masks.
3. **Performance**:
   - All versions of the RoTIR model achieved high DICE indices (>0.9) and CW-SSIM indices (>0.4).
   - Coordinate refinement significantly improved registration results.
   - Rotation detection is robust, giving consistent results across different rotation angles.

### Conclusion

The RoTIR model effectively addresses the fish scale image registration problem and performs particularly well in rotation and translation detection. Although its scaling detection is limited, the model handles fish scale images well. Future research can explore applying RoTIR to broader image registration tasks.
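To make the DICE metric from the evaluation more concrete, here is a minimal sketch of how a Dice score between two binary masks could be computed, and how it responds to misalignment. It is not the paper's evaluation code: the function names `dice` and `rotate90`, the toy 4×4 mask, and the convention of returning 1.0 for two empty masks are all assumptions made for illustration. A quarter-turn is used as the simulated misalignment only because it can be applied and undone exactly, without interpolation.

```python
def dice(mask_a, mask_b):
    """Dice similarity coefficient between two same-shaped binary masks.

    Masks are lists of rows; any truthy value counts as foreground.
    """
    if len(mask_a) != len(mask_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(mask_a, mask_b)
    ):
        raise ValueError("masks must have the same shape")

    intersection = size_a = size_b = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            a, b = bool(a), bool(b)
            size_a += a
            size_b += b
            intersection += a and b

    if size_a + size_b == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / (size_a + size_b)


def rotate90(mask):
    """Rotate a mask 90 degrees clockwise (exact, no interpolation)."""
    return [list(row) for row in zip(*mask[::-1])]


# Hypothetical toy "scale" mask: an L-shaped blob in a 4x4 grid.
fixed = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]

# Simulate a misaligned acquisition, then undo it with three more quarter-turns.
moving = rotate90(fixed)
registered = rotate90(rotate90(rotate90(moving)))

print(dice(fixed, moving))      # 0.2 (only one pixel overlaps)
print(dice(fixed, registered))  # 1.0 (perfect overlap after alignment)
```

A score above 0.9, as reported for RoTIR, therefore indicates that the registered scale mask overlaps the reference mask almost completely. Dice is computed on masks only, which is presumably why it is paired with an image-content measure such as CW-SSIM.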

RoTIR: Rotation-Equivariant Network and Transformers for Fish Scale Image Registration
