Focal stack based light field salient object detection via 3D–2D convolution hybrid network
Xin Wang,Gaomin Xiong,Yong Zhang
DOI: https://doi.org/10.1007/s11760-023-02700-1
2023-08-17
Abstract:Due to the remarkable ability to capture both spatial and angular information of the scene, light field imaging provides abundant cues and information. Over the last decade, various forms of data, such as the focal stack, all-in-focus image, depth map, sub-aperture image, center-view image, and micro-lens image array, have been exploited by different methods of light field salient object detection (SOD). In this study, we introduce a novel 3D–2D convolution hybrid network called HFSNet, which utilizes the focal stack as the only input to achieve SOD. The encoder network is constructed based on 3D convolution to extract and preserve the continuously changing focus cues within the focal stack. In order to reduce the computational burden of 3D convolution, we incorporate 3D max-pooling layers, channel reduction modules, and focal stack feature fusing modules to reduce the data dimension. The decoder network, on the other hand, is built on 2D convolution to generate coarse saliency maps, which are then refined using the refine module to obtain the final saliency map. We conduct experiments on five benchmark light field SOD datasets, and the results demonstrate that our method outperforms other models on DUTLF-V2 and DUTLF-FS, and achieves competitive outcomes on Lytro Illum, HFUT-Lytro, and LFSD.
engineering, electrical & electronic,imaging science & photographic technology