CSA-Net: Deep Cross-Complementary Self Attention and Modality-Specific Preservation for Saliency Detection

Surya Kant Singh, Rajeev Srivastava
DOI: https://doi.org/10.1007/s11063-022-10875-w
IF: 2.565
2022-11-01
Neural Processing Letters
Abstract: Multi-modality or multi-stream convolutional neural networks are a recent trend in saliency computation and are attracting considerable research interest. Previous models used either modality-based independent fusion or cross-modality complementary fusion to find saliency, which leads to inconsistency or distribution loss of salient points and regions. Most existing models also did not effectively exploit accurate localization of high-level semantic and contextual features. The proposed model combines the two fusion methods with a precise deep localization model to address these challenges. Specifically, CSA-Net comprises four essential features: non-complementarity, cross-complementary, intra-complementary, and deep localized improved high-level features. The designed encoder and decoder streams produce these features and ensure modality-specific saliency preservation. The cross- and intra-complementary fusion is guided by a proposed novel cross-complementary self-attention to produce the fused saliency. The attention map is computed by a two-stage additive fusion based on a Non-Local network. A novel Optimal Selective Saliency method is proposed to find two similar saliencies among the three stream-wise saliencies. The experimental analysis demonstrates the effectiveness of the proposed stream network and attention map, and the results show better performance than fourteen closely related state-of-the-art methods.
computer science, artificial intelligence
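The abstract describes Optimal Selective Saliency as finding two similar saliencies among three stream-wise saliencies, but does not state how similarity is measured. The following is a minimal sketch of that selection idea only, not the authors' method: it assumes mean absolute difference as the similarity measure, and the stream names, the helper functions, and the toy 2x2 maps are all hypothetical.

```python
from itertools import combinations


def mean_abs_diff(map_a, map_b):
    """Mean absolute difference between two equally sized 2-D maps.

    Assumption: the abstract does not name a similarity measure;
    this one is chosen purely for illustration.
    """
    total = 0.0
    count = 0
    for row_a, row_b in zip(map_a, map_b):
        for x, y in zip(row_a, row_b):
            total += abs(x - y)
            count += 1
    return total / count


def select_two_most_similar(maps):
    """Return the names of the closest pair of maps and their distance.

    `maps` is a dict of name -> 2-D list of saliency values in [0, 1].
    """
    best_pair, best_dist = None, float("inf")
    for (name_a, map_a), (name_b, map_b) in combinations(maps.items(), 2):
        dist = mean_abs_diff(map_a, map_b)
        if dist < best_dist:
            best_pair, best_dist = (name_a, name_b), dist
    return best_pair, best_dist


# Hypothetical stream-wise saliency maps (toy 2x2 values).
streams = {
    "stream_a": [[0.9, 0.1], [0.8, 0.2]],
    "stream_b": [[0.2, 0.7], [0.1, 0.9]],
    "stream_c": [[0.8, 0.2], [0.9, 0.1]],
}

pair, dist = select_two_most_similar(streams)
print(pair, round(dist, 3))
```

With three streams there are only three candidate pairs, so an exhaustive comparison is sufficient; a different similarity measure could be swapped in by replacing mean_abs_diff.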