A Dual-Stream Cross-Domain Integration Network for RGB-T Salient Object Detection

Xiaosheng Yu, Xiufei Cheng, Yixiu Liu, Zhigao Zheng
DOI: https://doi.org/10.1109/tce.2024.3502692
2024-01-01
IEEE Transactions on Consumer Electronics
Abstract: RGB-T salient object detection improves detection performance in complex scenes by integrating RGB and thermal data, but effective fusion remains challenging. To this end, we propose a dual-stream cross-domain integration network (DSCDNet), which explores the effective integration of spatial and frequency domain features and demonstrates remarkable accuracy and stability. Specifically, in the bimodal integration stage, to extract high-level fusion information from both modalities, we introduce a spatial domain bimodal integration module and a frequency domain bimodal integration module. This parallel integration strategy facilitates the deep integration of RGB and thermal images across multiple dimensions. Next, our proposed feature decomposition strategy decomposes the spatially fused features into two streams, a region perception stream and a detail-aware stream, which interpret spatial features from the perspectives of regional and detail understanding, making the model more flexible and efficient in complex scenes. Further, to maximize the complementary advantages of the spatial and frequency domains, we design a cross-domain feature alignment module that facilitates interactive learning between the two domains, thereby providing the model with a more comprehensive and enriched representation. Experimental results demonstrate that the proposed DSCDNet outperforms 11 state-of-the-art (SOTA) methods.
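The parallel spatial/frequency integration idea can be illustrated with a minimal sketch. This is not the DSCDNet implementation: the abstract does not specify the internals of either module, so the function names (`spatial_fusion`, `frequency_fusion`) and both fusion rules (sum-plus-product in the spatial domain, larger-magnitude coefficient selection in the Fourier domain) are assumptions chosen only to show what fusing two modalities in each domain can look like.

```python
import numpy as np


def spatial_fusion(rgb: np.ndarray, thermal: np.ndarray) -> np.ndarray:
    """Hypothetical spatial-domain fusion (an assumption, not the paper's module).

    The sum keeps information present in either modality; the product
    emphasizes locations where both modalities respond.
    """
    return rgb + thermal + rgb * thermal


def frequency_fusion(rgb: np.ndarray, thermal: np.ndarray) -> np.ndarray:
    """Hypothetical frequency-domain fusion (an assumption, not the paper's module).

    For each frequency, keep the Fourier coefficient with the larger
    magnitude, then transform back to the spatial domain.
    """
    rgb_spec = np.fft.rfft2(rgb)
    thermal_spec = np.fft.rfft2(thermal)
    fused_spec = np.where(
        np.abs(rgb_spec) >= np.abs(thermal_spec), rgb_spec, thermal_spec
    )
    # irfft2 needs the original shape to invert the real-input FFT exactly.
    return np.fft.irfft2(fused_spec, s=rgb.shape)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rgb_feat = rng.random((8, 8))      # stand-in for a single RGB feature map
    thermal_feat = rng.random((8, 8))  # stand-in for a single thermal feature map

    fused_spatial = spatial_fusion(rgb_feat, thermal_feat)
    fused_frequency = frequency_fusion(rgb_feat, thermal_feat)
    print(fused_spatial.shape, fused_frequency.shape)
```

In a learned network both rules would presumably be replaced by trainable layers; the sketch only shows that the two branches operate on the same pair of inputs in parallel and each return a feature map of the original size.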