RGB-D Salient Object Detection Based on Cross-Modal and Cross-Level Feature Fusion

Yanbin Peng, Zhinian Zhai, Mingkun Feng
DOI: https://doi.org/10.1109/access.2024.3381524
IF: 3.9
2024-04-02
IEEE Access
Abstract: Existing RGB-D saliency detection models do not fully account for the differences between features at different levels and lack an effective mechanism for cross-level feature fusion. This article proposes a novel cross-modality, cross-level fusion learning framework. The framework contains three main modules: an Attention Enhancement Module (AEM), a Modality Feature Fusion Module (MFM), and a Graph Reasoning Module (GRM). AEM enhances the features of the two modalities. MFM integrates the features of the two modalities to achieve cross-modality feature fusion. The fused features are then divided into high-level and low-level features: the high-level features carry the semantic localization information of salient objects, and the low-level features carry their detail information. GRM extends the semantic localization information in the high-level features from individual pixel features to the entire salient object area, thereby achieving cross-level feature fusion. The framework effectively suppresses background noise and enhances the model's expressiveness. Extensive experiments on seven widely used datasets show that the new method outperforms nine current state-of-the-art RGB-D salient object detection (SOD) methods.
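The data flow described in the abstract (enhance each modality, fuse the modalities, split into low and high levels, then spread the high-level localization over the object region) can be illustrated with a minimal NumPy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the function names are hypothetical, AEM is approximated by a simple channel gate, MFM by an element-wise sum plus product, and GRM by one step of propagation over a pixel-affinity graph. All levels are assumed to share one spatial size.

```python
import numpy as np

def enhance(feat):
    """Hypothetical stand-in for AEM: gate each channel by a sigmoid of its mean."""
    gate = 1.0 / (1.0 + np.exp(-feat.mean(axis=(1, 2))))   # shape (C,)
    return feat * gate[:, None, None]

def fuse_modalities(rgb_feat, depth_feat):
    """Hypothetical stand-in for MFM: element-wise sum plus product."""
    return rgb_feat + depth_feat + rgb_feat * depth_feat

def graph_reasoning(high, low):
    """Hypothetical stand-in for GRM: treat pixels as graph nodes and
    propagate the coarse high-level cue along low-level feature affinity."""
    c, h, w = low.shape
    nodes = low.reshape(c, h * w).T                        # (N, C), one node per pixel
    affinity = nodes @ nodes.T                             # (N, N) pairwise similarity
    affinity = np.exp(affinity - affinity.max(axis=1, keepdims=True))
    affinity /= affinity.sum(axis=1, keepdims=True)        # row-wise softmax
    seed = high.mean(axis=0).reshape(h * w)                # coarse localization per pixel
    return (affinity @ seed).reshape(h, w)                 # one propagation step

def predict_saliency(rgb_levels, depth_levels):
    """Cross-modal fusion per level, then cross-level fusion of the two halves."""
    fused = [fuse_modalities(enhance(r), enhance(d))
             for r, d in zip(rgb_levels, depth_levels)]
    split = len(fused) // 2
    low = np.mean(fused[:split], axis=0)                   # detail-oriented levels
    high = np.mean(fused[split:], axis=0)                  # semantic levels
    return graph_reasoning(high, low)

# Usage with random stand-in features: 4 levels, 4 channels, 8x8 resolution.
rng = np.random.default_rng(0)
rgb_levels = [rng.standard_normal((4, 8, 8)) for _ in range(4)]
depth_levels = [rng.standard_normal((4, 8, 8)) for _ in range(4)]
saliency = predict_saliency(rgb_levels, depth_levels)
print(saliency.shape)  # (8, 8)
```

In a real network the levels would come from backbone encoders at different resolutions and every module would have learned parameters. The sketch only shows the order of operations named in the abstract.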
Computer Science, Information Systems; Telecommunications; Engineering, Electrical & Electronic