Deep Cross-View Reconstruction GAN Based on Correlated Subspace for Multi-View Transformation

Jian-Xun Mi, Junchang He, Weisheng Li
DOI: https://doi.org/10.1109/TIP.2024.3442610
Abstract: In scenarios where identifying face information in the visible spectrum (VIS) is difficult because of poor lighting, near-infrared (NIR) and thermal (TH) cameras offer viable alternatives. However, images captured by these cameras follow a different data distribution from VIS images, which makes matching face identities across domains challenging. To address this, we propose a novel image transformation framework. The framework extracts features from the input image and passes them to a transformation network that generates target-domain images with perceptual fidelity. In addition, a reconstruction network preserves the original information by reconstructing the original-domain image from the extracted features. The framework uses paired data obtained from the same individual to exploit the correlation between features from the two domains. We apply this framework to two well-established image-to-image transformation models, pix2pix and CycleGAN, and call the resulting models CRC-pix2pix and CRC-CycleGAN, respectively. The approach can also be extended to other models based on the pix2pix or CycleGAN architectures. Our models generate high-quality images while preserving the identity information of the original face. Evaluation on the TFW and BUAA NIR-VIS datasets demonstrates the superiority of our models both in face matching on the generated images and in image-quality metrics such as SSIM, MSE, PSNR, and LPIPS. Moreover, we introduce the CQUPT-VIS-TH dataset, which enriches the available paired data with thermal-visible face images captured at various angles and with various expressions.
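Two of the metrics named in the abstract, MSE and PSNR, have simple closed-form definitions, and a small example may help make them concrete. The following is a minimal sketch rather than the authors' evaluation code: it assumes 8-bit grayscale images stored as nested Python lists, and the function names `mse` and `psnr` and the sample pixel values are illustrative only.

```python
import math


def mse(target, generated):
    """Mean squared error between two equally sized images (nested lists)."""
    flat_t = [p for row in target for p in row]
    flat_g = [p for row in generated for p in row]
    if len(flat_t) != len(flat_g) or not flat_t:
        raise ValueError("images must be non-empty and the same size")
    return sum((t - g) ** 2 for t, g in zip(flat_t, flat_g)) / len(flat_t)


def psnr(target, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB; max_value=255 assumes 8-bit pixels."""
    err = mse(target, generated)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / err)


# Hypothetical 2x2 images: a real VIS target and a generated VIS image.
target = [[0, 255], [128, 64]]
generated = [[0, 250], [130, 64]]

print(mse(target, generated))
print(psnr(target, generated))
```

Lower MSE and higher PSNR indicate that a generated image is closer to its paired target at the pixel level. SSIM and LPIPS measure structural and perceptual similarity and are usually computed with dedicated libraries.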