Light Field Reconstruction via Deep Adaptive Fusion of Hybrid Lenses

Jing Jin, Mantang Guo, Junhui Hou, Hui Liu, Hongkai Xiong
2023-06-18
Abstract: This paper explores the problem of reconstructing high-resolution light field (LF) images from hybrid lenses, including a high-resolution camera surrounded by multiple low-resolution cameras. The performance of existing methods is still limited, as they produce either blurry results on plain textured areas or distortions around depth discontinuous boundaries. To tackle this challenge, we propose a novel end-to-end learning-based approach, which can comprehensively utilize the specific characteristics of the input from two complementary and parallel perspectives. Specifically, one module regresses a spatially consistent intermediate estimation by learning a deep multidimensional and cross-domain feature representation, while the other module warps another intermediate estimation, which maintains the high-frequency textures, by propagating the information of the high-resolution view. We finally leverage the advantages of the two intermediate estimations adaptively via the learned attention maps, leading to the final high-resolution LF image with satisfactory results on both plain textured areas and depth discontinuous boundaries. Besides, to promote the effectiveness of our method trained with simulated hybrid data on real hybrid data captured by a hybrid LF imaging system, we carefully design the network architecture and the training strategy. Extensive experiments on both real and simulated hybrid data demonstrate the significant superiority of our approach over state-of-the-art ones. To the best of our knowledge, this is the first end-to-end deep learning method for LF reconstruction from a real hybrid input. We believe our framework could potentially decrease the cost of high-resolution LF data acquisition and benefit LF data storage and transmission.
Image and Video Processing, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses how to reconstruct high-resolution light field (LF) images from a hybrid-lens system consisting of one high-resolution camera surrounded by multiple low-resolution cameras. Existing methods either produce blurry results on plain textured areas or introduce distortions around depth-discontinuous boundaries. To address this, the authors propose an end-to-end learning-based method that exploits the specific characteristics of the input from two complementary and parallel perspectives:

1. **SR-Net**: regresses a spatially consistent intermediate estimate by learning a deep multidimensional and cross-domain feature representation.
2. **Warp-Net**: warps another intermediate estimate, which retains high-frequency textures, by propagating information from the high-resolution view.

The two intermediate estimates are then combined adaptively through learned confidence (attention) maps to produce the final high-resolution LF image, giving satisfactory results on both plain textured areas and depth-discontinuous boundaries. Because the method is trained on simulated hybrid data but is meant to work on real hybrid data captured by a hybrid LF imaging system, the authors design the network architecture and the training strategy with this gap in mind. Experiments on both real and simulated hybrid data show that the method outperforms state-of-the-art approaches. To the authors' knowledge, this is the first end-to-end deep learning method for LF reconstruction from a real hybrid input; they expect it could reduce the cost of high-resolution LF data acquisition and benefit LF data storage and transmission.
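The adaptive fusion step can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `fuse_estimates`, the use of a per-pixel softmax over two logit maps, and the toy 4x4 arrays are all assumptions made for illustration. In the actual method the maps would be predicted by the network; here they are supplied by hand.

```python
import numpy as np


def fuse_estimates(sr_estimate, warp_estimate, sr_logits, warp_logits):
    """Blend two intermediate estimates with per-pixel softmax weights.

    Hypothetical helper: the logit maps stand in for the learned
    attention maps, which a network would normally predict.
    """
    logits = np.stack([sr_logits, warp_logits], axis=0)
    # Subtract the per-pixel maximum for numerical stability.
    logits = logits - logits.max(axis=0, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=0, keepdims=True)  # weights sum to 1 per pixel
    return weights[0] * sr_estimate + weights[1] * warp_estimate


# Toy stand-ins for the two intermediate estimates of one view.
rng = np.random.default_rng(0)
sr_estimate = rng.random((4, 4))    # spatially consistent but smoother
warp_estimate = rng.random((4, 4))  # sharper but may distort at occlusions

# Equal logits give an even blend of the two estimates.
even = fuse_estimates(sr_estimate, warp_estimate,
                      np.zeros((4, 4)), np.zeros((4, 4)))

# Strongly favouring the warped estimate reproduces it almost exactly.
favour_warp = fuse_estimates(sr_estimate, warp_estimate,
                             np.full((4, 4), -50.0), np.full((4, 4), 50.0))
```

Because the weights are normalised per pixel, the fused result can lean on the warped estimate in one region and on the regressed estimate in another, which matches the complementary behaviour described for the two branches.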