Gated multi-modal edge refinement network for light field salient object detection

Yefan Li, Fuqing Duan, Ke Lu
DOI: https://doi.org/10.1145/3674836
2024-06-28
Abstract: A light field can be decoded into multiple representations and provides valuable focus and depth information. This richer input helps overcome the limitations of traditional 2D and 3D saliency detection methods, opening up new possibilities for more accurate and comprehensive analysis of visual scenes. To tackle inaccurate edge prediction and to leverage the rich multi-modal light field information effectively, we propose a gated multi-modal edge refinement network (GMERNet). It first obtains the preliminary position and structure information of the salient object, and then gradually refines the object edge. The network comprises two modules: a gated multi-modal feature complement (GMFC) module and a progressive edge refinement (PER) module. The GMFC module captures dependencies across the all-in-focus image and its corresponding focal stack and depth map, aggregating the features of these modalities through gate mechanisms. The PER module progressively refines edges by combining salient object features with edge features through a cascaded structure. Experimental results demonstrate that GMERNet achieves state-of-the-art performance on five benchmark datasets and shows significant advantages in extracting salient objects with complex edges.
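The abstract does not specify how the gates in the GMFC module are computed. As a rough illustration of the general idea of gate-based aggregation only, the sketch below fuses three equal-length feature vectors, one per modality. The function name `gated_fusion`, the weights `w_focal` and `w_depth`, and the particular gate formula (a sigmoid of the product of the all-in-focus feature and the auxiliary feature) are all assumptions made for this example and are not taken from the paper.

```python
import math


def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(rgb, focal, depth, w_focal=1.0, w_depth=1.0):
    """Hypothetical element-wise gated fusion of three feature vectors.

    rgb   : features from the all-in-focus image
    focal : features from the focal stack
    depth : features from the depth map

    Each auxiliary feature is scaled by a gate in (0, 1) before it is
    added to the all-in-focus feature. The gate formula is an assumption
    for illustration, not the GMFC definition.
    """
    if not (len(rgb) == len(focal) == len(depth)):
        raise ValueError("all feature vectors must have the same length")

    fused = []
    for r, f, d in zip(rgb, focal, depth):
        gate_focal = sigmoid(w_focal * r * f)  # how much focal-stack signal passes
        gate_depth = sigmoid(w_depth * r * d)  # how much depth signal passes
        fused.append(r + gate_focal * f + gate_depth * d)
    return fused
```

With this gate, an auxiliary feature that agrees in sign with the all-in-focus feature passes through more strongly than one that disagrees, which is one simple way a gate can suppress unreliable modalities. A learned network would replace the fixed weights with trained parameters.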
computer science, information systems, theory & methods, software engineering