RoTIR: Rotation-Equivariant Network and Transformers for Fish Scale Image Registration

Ruixiong Wang, Alin Achim, Renata Raele-Rolfe, Qiao Tong, Dylan Bergen, Chrissy Hammond, Stephen Cross
2024-07-27
Abstract: Image registration is an essential process for aligning features of interest from multiple images. With the recent development of deep learning techniques, image registration approaches have advanced to a new level. In this work, we present 'Rotation-Equivariant network and Transformers for Image Registration' (RoTIR), a deep-learning-based method for the alignment of fish scale images captured by light microscopy. This approach overcomes the challenge of arbitrary rotation and translation detection, as well as the absence of ground truth data. We employ feature-matching approaches based on Transformers and general E(2)-equivariant steerable CNNs for model creation. Besides, an artificial training dataset is employed for semi-supervised learning. Results show RoTIR successfully achieves the goal of fish scale image registration.
Image and Video Processing
What problem does this paper attempt to address?
The paper addresses the problem of fish scale image alignment (i.e., image registration). Specifically, the researchers developed a deep learning method called "Rotation-Equivariant network and Transformers for Image Registration" (RoTIR) for registering fish scale images captured by light microscopy. The method aims to detect arbitrary rotations and translations despite the lack of ground-truth labeled data.

### Background and Motivation

1. **Importance of Image Registration**: Image registration is the process of aligning features of interest across multiple images, which is crucial in fields such as medical image analysis and bone regeneration research.
2. **Existing Challenges**:
   - Fish scales shift position during cultivation because culture medium is routinely added, which makes them harder to observe over time.
   - The lack of ground-truth labeled data makes supervised learning difficult.
   - The main transformations are translation and rotation; scaling and distortion have minimal impact.
3. **Research Objective**: Develop a method that aligns fish scale images effectively, improving the accuracy of data collection and supporting research such as bone healing monitoring.

### Method Overview

1. **Network Architecture**:
   - **Backbone Network**: Based on the classic YOLO model, it converts input images into 16×16 feature maps. E(2)-equivariant convolutional neural networks (CNNs) are used for feature extraction.
   - **Matching Module**: A local feature transformer module based on the LoFTR model is used for feature point matching. It includes self-attention and cross-attention layers and outputs the rotation angle, scaling factor, and coordinate refinement parameters.
2. **Training Data Synthesis**:
   - Synthetic datasets are constructed by cropping original images, generating image pairs, and producing the corresponding ground-truth labels.
3. **Loss Function**:
   - The total loss is a weighted sum of independent terms, mainly the confidence map loss, rotation angle loss, coordinate refinement loss, and scaling factor loss.

### Experimental Results

1. **Evaluation Metrics**:
   - The DICE similarity coefficient and the Complex Wavelet Structural Similarity Index (CW-SSIM).
2. **Experimental Setup**:
   - 45 pairs of phase contrast images and their corresponding fish scale masks.
3. **Performance**:
   - All versions of the RoTIR model achieved high DICE indices (>0.9) and CW-SSIM indices (>0.4).
   - Coordinate refinement significantly improved registration results.
   - Rotation detection is robust, giving consistent results across different rotation angles.

### Conclusion

The RoTIR model effectively addresses the fish scale image registration problem and performs particularly well in rotation and translation detection. Its performance in scaling detection is limited, but the model handles fish scale images well overall. Future research can explore applying RoTIR to broader image registration tasks.
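
The summary names two ideas that are easy to illustrate on a toy example: building a synthetic image pair whose transformation is known by construction, and scoring alignment with the DICE coefficient on binary masks. The sketch below is not the authors' pipeline. The helper names `make_synthetic_pair` and `undo` are hypothetical, the mask is a plain rectangle rather than a fish scale, and rotation is restricted to multiples of 90° so that no interpolation is needed (the paper handles arbitrary angles).

```python
import numpy as np


def dice(mask_a, mask_b):
    """Dice similarity coefficient of two binary masks: 2|A∩B| / (|A| + |B|)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # convention chosen here: two empty masks count as identical
    return 2.0 * np.logical_and(a, b).sum() / total


def make_synthetic_pair(mask, quarter_turns, shift):
    """Hypothetical helper: rotate then translate a mask, returning the label.

    Assumption: np.roll wraps around the border, so the object must stay
    away from the image edge for the shift to act as a true translation.
    """
    moved = np.rot90(mask, k=quarter_turns)
    moved = np.roll(moved, shift, axis=(0, 1))
    return moved, {"quarter_turns": quarter_turns, "shift": shift}


def undo(moved, label):
    """Hypothetical helper: invert the labelled transform (shift first, then rotation)."""
    dy, dx = label["shift"]
    restored = np.roll(moved, (-dy, -dx), axis=(0, 1))
    return np.rot90(restored, k=-label["quarter_turns"])


# Toy "scale" mask: an off-centre rectangle in a 64x64 image.
mask = np.zeros((64, 64), dtype=bool)
mask[20:40, 10:30] = True

moved, label = make_synthetic_pair(mask, quarter_turns=1, shift=(3, -2))

dice_before = dice(mask, moved)              # misaligned pair
dice_after = dice(mask, undo(moved, label))  # aligned with the known label

print(f"DICE before registration: {dice_before:.3f}")
print(f"DICE after registration:  {dice_after:.3f}")
```

Because the label is produced together with the pair, a model's predicted rotation and translation can be compared against it directly, which is the point of training on synthetic data when no real ground truth exists. In a real evaluation, `undo` would be replaced by the transformation the model predicts.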