Cross-modal and multi-level feature refinement network for RGB-D salient object detection

Yue Gao, Meng Dai, Qing Zhang
DOI: https://doi.org/10.1007/s00371-022-02543-w
IF: 2.835
2022-06-30
The Visual Computer
Abstract: RGB-D salient object detection (SOD) methods adopt depth maps as important supplementary information in order to identify salient objects more accurately. However, existing RGB-D SOD methods still face two main challenges. The first is how to obtain effective cross-modal features, and the second is how to optimize the integration of multi-level features. To tackle these two issues, we propose a novel cross-modal and multi-level feature refinement network equipped with a cross-modal feature interaction module and a multi-level feature fusion module. Specifically, the cross-modal feature interaction module is designed to enhance depth features from both channel and spatial perspectives and then effectively integrate cross-modal features. Moreover, considering the characteristics of features at different levels, we propose a multi-level feature fusion module that combines contextual information from multi-level features by means of skip connections. Extensive experiments on five benchmark datasets demonstrate that our proposed model outperforms 17 other state-of-the-art RGB-D SOD methods.
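The abstract gives no equations for the cross-modal feature interaction module, so the following is only a rough, parameter-free sketch of the general idea it describes: reweight depth features per channel, then per pixel, then merge them with RGB features. Every function name here (`channel_attention`, `spatial_attention`, `refine_depth`, `fuse`) is hypothetical, the use of average pooling with a sigmoid is an assumption, and element-wise addition stands in for whatever integration the authors actually use. A real implementation would use learned convolutional layers.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat):
    """Return one weight per channel from global average pooling.

    feat is a nested list shaped [C][H][W]. Assumption: no learned layers.
    """
    weights = []
    for channel in feat:
        values = [v for row in channel for v in row]
        weights.append(sigmoid(sum(values) / len(values)))
    return weights


def spatial_attention(feat):
    """Return one weight per pixel from the mean across channels."""
    c, h, w = len(feat), len(feat[0]), len(feat[0][0])
    return [
        [sigmoid(sum(feat[k][i][j] for k in range(c)) / c) for j in range(w)]
        for i in range(h)
    ]


def refine_depth(depth):
    """Enhance depth features along the channel axis, then spatially."""
    c, h, w = len(depth), len(depth[0]), len(depth[0][0])
    cw = channel_attention(depth)
    reweighted = [
        [[depth[k][i][j] * cw[k] for j in range(w)] for i in range(h)]
        for k in range(c)
    ]
    sw = spatial_attention(reweighted)
    return [
        [[reweighted[k][i][j] * sw[i][j] for j in range(w)] for i in range(h)]
        for k in range(c)
    ]


def fuse(rgb, depth):
    """Merge RGB and refined depth features (assumed: element-wise sum)."""
    refined = refine_depth(depth)
    c, h, w = len(rgb), len(rgb[0]), len(rgb[0][0])
    return [
        [[rgb[k][i][j] + refined[k][i][j] for j in range(w)] for i in range(h)]
        for k in range(c)
    ]
```

In this sketch an all-zero depth map contributes nothing, so `fuse` returns the RGB features unchanged; that is a property of the illustrative choices above, not a claim about the paper's network.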
computer science, software engineering