Abstract: Visual attention selects a salient subset of the visual input for further processing while ignoring redundant data. The dominant view of how visual attention is computed assumes that bottom-up visual saliency, such as local contrast and interest points, drives the allocation of attention in scene viewing. In this paper, however, we argue that the deployment of attention is primarily and directly guided by objects, and we propose a novel framework that explores image visual attention by learning object attributes from eye-tracking data. We address three problems: (1) pixel-level visual attention computation (the saliency map); (2) image-level visual attention computation; and (3) application of the computational model to image categorization. We first adopt the Object Bank algorithm to obtain the responses of a number of object detectors at each location in an image, forming a feature descriptor that indicates the occurrence of various objects at a pixel or in an image. To solve the first problem, we integrate the inference of interesting objects from fixations in eye-tracking data with the competition among surrounding objects. To solve the second problem, we propose a computational model that estimates the interestingness of each image through a mapping between object attributes and the inter-observer visual congruency obtained from eye-tracking data. Finally, we apply the proposed pixel-level visual attention model to the image categorization task. Comprehensive evaluations on publicly available benchmarks and comparisons with state-of-the-art methods demonstrate the effectiveness of the proposed models.
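To make the descriptor idea in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes each object detector has already produced a response map of the same size as the image. The function names `pixel_descriptors` and `image_descriptor` are hypothetical, and max pooling for the image-level descriptor is an illustrative choice that the abstract does not specify.

```python
from typing import List

ResponseMap = List[List[float]]  # one detector's response at each (row, col)


def pixel_descriptors(response_maps: List[ResponseMap]) -> List[List[List[float]]]:
    """Stack K detector response maps (each H x W) into an H x W grid
    whose entries are K-dimensional per-pixel descriptors."""
    if not response_maps:
        raise ValueError("need at least one response map")
    height, width = len(response_maps[0]), len(response_maps[0][0])
    for m in response_maps:
        if len(m) != height or any(len(row) != width for row in m):
            raise ValueError("all response maps must share the same shape")
    return [
        [[m[r][c] for m in response_maps] for c in range(width)]
        for r in range(height)
    ]


def image_descriptor(response_maps: List[ResponseMap]) -> List[float]:
    """Summarize each detector by its maximum response over the image
    (assumed pooling rule), giving a K-dimensional image-level descriptor."""
    return [max(max(row) for row in m) for m in response_maps]


# Toy example: two hypothetical detectors on a 2 x 2 image.
person = [[0.1, 0.9], [0.2, 0.3]]
car = [[0.7, 0.0], [0.4, 0.5]]

per_pixel = pixel_descriptors([person, car])
per_image = image_descriptor([person, car])
```

In this toy example, `per_pixel[0][1]` holds the responses of both detectors at the top-right pixel, and `per_image` holds one pooled value per detector. A saliency model or interestingness regressor would then be learned on top of such descriptors using eye-tracking data, which this sketch does not attempt.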

Combined Segmentation And Visual Attention For Object Categorization And Video Semantic Concepts Detection

Using Segmentation And Visual Attention For Semantic Object Model

Visual Attention Based Video Object Segmentation in MPEG Compressed Domain

Interesting moving object segmentation based on selective visual attention and Markov random field

Learning Spatial-Semantic Features for Robust Video Object Segmentation

Exploring Part-Aware Segmentation for Fine-Grained Visual Categorization

A Cyclic Information–Interaction Model for Remote Sensing Image Segmentation

Accurate Object Segmentation for Video Sequences Via Temporal-Spatial-Frequency Saliency Model

Dual Cross-Attention for Video Object Segmentation Via Uncertainty Refinement

Semantic Aware Attention Based Deep Object Co-segmentation

Modeling Objects with Local Descriptors of Biologically Motivated Selective Attention

Visual Attention Guided Video Object Segmentation

Image Visual Attention Computation and Application Via the Learning of Object Attributes

Bio-inspired Visual Attention Model and Saliency Guided Object Segmentation

Shape-guided Segmentation for Fine-Grained Visual Categorization

Learning to Segment Unseen Category Objects Using Gradient Gaussian Attention

CAA: Channelized Axial Attention for Semantic Segmentation

Channelized Axial Attention for Semantic Segmentation -- Considering Channel Relation within Spatial Attention for Semantic Segmentation

Target-Aware Object Discovery and Association for Unsupervised Video Multi-Object Segmentation

Channel and spatial attention based deep object co-segmentation

Object Segmentation from Consumer Videos: a Unified Framework Based on Visual Attention