Abstract: Despite significant progress, image saliency detection remains a challenging task in complex scenes and environments. Integrating different but complementary cues, such as RGB and thermal (RGB-T) data, may be an effective way to boost saliency detection performance. Current research in this direction, however, is limited by the lack of a comprehensive benchmark. This work contributes such an RGB-T image dataset, which includes 821 spatially aligned RGB-T image pairs and their ground truth annotations for saliency detection. The image pairs are highly diverse, recorded under different scenes and environmental conditions, and we annotate 11 challenges on them to support challenge-sensitive analysis of different saliency detection algorithms. We also implement three kinds of baseline methods with different modality inputs to provide a comprehensive comparison platform. With this benchmark, we propose a novel approach, multi-task manifold ranking with cross-modality consistency, for RGB-T saliency detection. In particular, we introduce a weight for each modality to describe its reliability, and integrate these weights into the graph-based manifold ranking algorithm to achieve adaptive fusion of the different source data. Moreover, we incorporate cross-modality consistency constraints to integrate the different modalities collaboratively. For the optimization, we design an efficient algorithm that iteratively solves several subproblems with closed-form solutions. Extensive experiments against other baseline methods on the newly created benchmark demonstrate the effectiveness of the proposed approach, and we also provide basic insights and potential future research directions for RGB-T saliency detection.
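
To make the idea of weighting modalities inside graph-based manifold ranking more concrete, the following is a minimal sketch, not the method proposed in the paper. It assumes the classical closed-form manifold ranking solution f = (I - alpha * S)^(-1) y on a normalized affinity graph, and it fuses the modalities with fixed, user-supplied weights, whereas the abstract describes weights that are learned jointly and an additional cross-modality consistency constraint that is omitted here. The function names (`affinity`, `manifold_ranking`, `fused_ranking`), the Gaussian affinity, and the parameters `sigma` and `alpha` are all hypothetical choices made for illustration.

```python
import numpy as np


def affinity(features, sigma=0.5):
    """Gaussian affinity between all node pairs, with a zero diagonal.

    `features` has shape (n_nodes, n_dims). Assumed for illustration only.
    """
    diff = features[:, None, :] - features[None, :, :]
    dist2 = np.sum(diff ** 2, axis=-1)
    w = np.exp(-dist2 / (2.0 * sigma ** 2))
    np.fill_diagonal(w, 0.0)
    return w


def manifold_ranking(w, y, alpha=0.99):
    """Classical closed-form manifold ranking on affinity matrix `w`.

    Solves (I - alpha * S) f = y with S = D^(-1/2) W D^(-1/2).
    """
    d = w.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    s = w * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    n = w.shape[0]
    return np.linalg.solve(np.eye(n) - alpha * s, y)


def fused_ranking(feature_list, weights, y, alpha=0.99):
    """Rank nodes on a weighted sum of per-modality affinity graphs.

    The weights are fixed inputs here (an assumption); they are
    normalized to sum to one before fusion.
    """
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    w = sum(r * affinity(f) for r, f in zip(weights, feature_list))
    return manifold_ranking(w, y, alpha)


if __name__ == "__main__":
    # Toy data: six graph nodes (e.g. superpixels) forming two groups.
    rgb = np.array([[0.0], [0.0], [0.0], [1.0], [1.0], [1.0]])
    thermal = np.array([[0.0], [0.0], [0.0], [2.0], [2.0], [2.0]])
    query = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])  # node 0 is the query
    print(fused_ranking([rgb, thermal], [0.5, 0.5], query))
```

In this toy setting the nodes that share a group with the query receive higher ranking scores than the nodes in the other group. Raising the weight of a modality makes its affinity graph dominate the fused ranking, which is the intuition behind describing each modality's reliability with a weight.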

Multi-Task Rank Learning for Visual Saliency Estimation

Probabilistic Multi-Task Learning for Visual Saliency Estimation in Video

Removing Label Ambiguity in Learning-Based Visual Saliency Estimation

A Unified RGB-T Saliency Detection Benchmark: Dataset, Baselines, Analysis and A Novel Approach

Inferring Attention Shifts for Salient Instance Ranking

Cost-Sensitive Rank Learning From Positive and Unlabeled Data for Visual Saliency Estimation

Multi-Task Joint Learning of 3D Keypoint Saliency and Correspondence Estimation

Salient object detection with low-rank approximation and ℓ2,1-norm minimization

A joint similarity matrix learning of multi-view data for RGB-T saliency detection

Instance-Level Relative Saliency Ranking with Graph Reasoning

Learning Discriminative Subspaces on Random Contrasts for Image Saliency Analysis

Learning to Model Task-Oriented Attention

Learning to Predict Salient Faces: A Novel Visual-Audio Saliency Model

Bi-directional Object-Context Prioritization Learning for Saliency Ranking

Multi-Camera Saliency

Joint Learning of Visual-Audio Saliency Prediction and Sound Source Localization on Multi-face Videos

Visual Saliency Detection Via Rank-Sparsity Decomposition

Saliency-based Sequential Image Attention with Multiset Prediction

Manifold Ranking-Based Matrix Factorization for Saliency Detection

Multi-Stream Refining Network for Person Re-Identification

Saliency Detection Using Two-Stage Scoring