End-to-End Semantic Segmentation Utilizing Multi-scale Baseline Light Field

Ruixuan Cong, Hao Sheng, Dazhi Yang, Da Yang, Rongshan Chen, Sizhe Wang, Zhenglong Cui
DOI: https://doi.org/10.1109/tcsvt.2024.3367370
IF: 5.859
2024-01-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: Semantic segmentation based on 4D light field (LF) images exhibits superior performance by exploiting rich spatial and angular information. However, current methods focus only on narrow-baseline cases, ignoring the feasibility and capability of large-disparity scenes for segmentation. Motivated by this, we propose a novel network called LF-IENet++, suitable for both narrow-baseline and wide-baseline LF, which fully mines complementary information across views via implicit feature integration and explicit feature propagation. To concentrate on inconsistent context between view images during feature integration, we shield small-disparity regions, which manifest as repeated content, to avoid redundant attention. In addition, a two-stage operation consisting of image-level warping and feature-level warping is introduced to mitigate propagation distortion. Since both feature integration and feature propagation require exact guidance from prior disparity, we design a semantic-aware disparity estimator that leverages semantic cues to optimize disparity generation while ensuring that our network can perform semantic segmentation end to end. To validate the effectiveness of the proposed method, we present the first multi-scale baseline dataset for LF semantic segmentation. Compared to state-of-the-art methods, our LF-IENet++ achieves outstanding performance and shows high robustness under different disparity situations. Moreover, our method obtains higher accuracy on wide-baseline cases, demonstrating the significance of introducing large-disparity LF for semantic segmentation.
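The abstract names two disparity-guided operations: shielding small-disparity regions and warping a view toward the reference view. The sketch below is only a toy illustration of those two ideas on a single image row, not the LF-IENet++ implementation. The function names, the threshold value, the sign convention of the disparity, and the nearest-neighbor sampling are all assumptions made for this example.

```python
from typing import List

Grid = List[List[float]]


def small_disparity_mask(disparity: Grid, threshold: float = 0.5) -> List[List[bool]]:
    """Mark pixels whose absolute disparity is below `threshold`.

    Hypothetical helper: such pixels look nearly identical across views,
    so an attention step could skip (shield) them.
    """
    return [[abs(d) < threshold for d in row] for row in disparity]


def warp_to_center(side_view: Grid, disparity: Grid, du: int = 1) -> Grid:
    """Backward-warp a horizontally displaced view onto the center view.

    Assumed convention: pixel (y, x) of the center view samples the side
    view at column x + du * disparity[y][x]. Sampling is nearest-neighbor
    and out-of-range columns are clamped to the image border.
    """
    height, width = len(side_view), len(side_view[0])
    warped = []
    for y in range(height):
        row = []
        for x in range(width):
            src_x = int(round(x + du * disparity[y][x]))
            src_x = min(max(src_x, 0), width - 1)  # clamp at the border
            row.append(side_view[y][src_x])
        warped.append(row)
    return warped


# Toy data: one image row, with disparity given in pixels per view step.
side = [[10.0, 20.0, 30.0, 40.0]]
disp = [[0.0, 1.0, 1.0, 0.0]]

mask = small_disparity_mask(disp)    # [[True, False, False, True]]
warped = warp_to_center(side, disp)  # [[10.0, 30.0, 40.0, 40.0]]
```

In a wide-baseline LF, `du * disparity` grows large, so more samples fall outside the image or onto occluded content. That is one plausible reading of why the abstract pairs image-level warping with a second, feature-level stage.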
engineering, electrical & electronic