VisIRNet: Deep Image Alignment for UAV-Taken Visible and Infrared Image Pairs

Sedat Özer, Alain P. Ndigande
DOI: https://doi.org/10.1109/tgrs.2024.3367986
IF: 8.2
2024-03-09
IEEE Transactions on Geoscience and Remote Sensing
Abstract: This article proposes a deep-learning-based solution for multimodal image alignment regarding unmanned aerial vehicle (UAV)-taken images. Many recently proposed state-of-the-art alignment techniques rely on using Lucas–Kanade (LK)-based solutions for a successful alignment. However, we show that we can achieve state-of-the-art results without using LK-based methods. Our approach carefully utilizes a two-branch-based convolutional neural network (CNN) based on feature embedding blocks. We propose two variants of our approach, where in the first variant (Model A), we directly predict the new coordinates of only the four corners of the image to be aligned; and in the second one (Model B), we predict the homography matrix directly. Applying alignment on the image corners forces the algorithm to match only those four corners as opposed to computing and matching many (key) points, since the latter may cause many outliers, yielding less accurate alignment. We test our proposed approach on four aerial datasets and obtain state-of-the-art results when compared to the existing recent deep LK-based architectures.
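The abstract describes two output parameterizations: four corner coordinates (Model A) or a homography matrix (Model B). The two are interchangeable, because four non-degenerate corner correspondences determine a homography up to scale. The sketch below illustrates that relationship with a standard direct linear solve in NumPy. It is not the authors' implementation; the function names, the 128×128 patch size, and the corner offsets are hypothetical values chosen only for illustration.

```python
import numpy as np


def homography_from_corners(src, dst):
    """Solve for the 3x3 homography H (with H[2, 2] fixed to 1) that maps
    four source corners onto four destination corners."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11*x + h12*y + h13) / (h31*x + h32*y + 1), rearranged to be linear in h
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        # v = (h21*x + h22*y + h23) / (h31*x + h32*y + 1), rearranged the same way
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.extend([u, v])
    h = np.linalg.solve(np.array(A, dtype=float), np.array(b, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)


def apply_homography(H, pts):
    """Map an (N, 2) array of points through H using homogeneous coordinates."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]


# Hypothetical example: corners of a 128x128 patch and made-up predicted offsets.
src = np.array([[0, 0], [127, 0], [127, 127], [0, 127]], dtype=float)
offsets = np.array([[3, -2], [-4, 5], [6, 1], [-1, -3]], dtype=float)
dst = src + offsets  # stand-in for the corner locations a Model A-style network would output

H = homography_from_corners(src, dst)
recovered = apply_homography(H, src)  # should reproduce dst up to floating-point error
```

Under these assumptions, a corner-based prediction can be converted to a homography after the fact, so the choice between the two variants concerns what the network regresses rather than what transformation is ultimately applied.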
imaging science & photographic technology; remote sensing; engineering, electrical & electronic; geochemistry & geophysics