Abstract: Computational photo quality evaluation is a useful technique in many computer vision and graphics tasks, for example, photo retargeting, 3-D rendering, and fashion recommendation. Conventional photo quality models are designed by characterizing the pictures from all communities (e.g., “architecture” and “colorful”) indiscriminately, so community-specific features are not exploited explicitly. In this article, we develop a new community-aware photo quality evaluation framework. It uncovers the latent community-specific topics by a regularized latent topic model (LTM) and captures human visual quality perception by exploring multiple attributes. More specifically, given massive-scale online photographs from multiple communities, a novel ranking algorithm is proposed to measure the visual/semantic attractiveness of regions inside each photograph. Meanwhile, three attributes, namely: 1) photo quality scores; 2) weak semantic tags; and 3) inter-region correlations, are seamlessly and collaboratively incorporated during ranking. Subsequently, we construct the gaze shifting path (GSP) for each photograph by sequentially linking its top-ranking regions, and an aggregation-based CNN calculates the deep representation for each GSP. Based on this, an LTM is proposed to model the GSP distribution from multiple communities in the latent space. To mitigate the overfitting problem caused by communities with very few photographs, a regularizer is incorporated into our LTM. Finally, given a test photograph, we obtain its deep GSP representation, and its quality score is determined by the posterior probability of the regularized LTM. Comparative studies on four image sets have shown the competitiveness of our method. In addition, eye-tracking experiments have demonstrated that our ranking-based GSPs are highly consistent with real human gaze movements.
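
The GSP construction step described in the abstract (sequentially linking the top-ranking regions of a photograph) can be illustrated with a minimal sketch. The abstract does not specify the ranking algorithm or the number of regions linked, so the per-region scores are assumed to be given, and the function name `build_gsp` and the parameter `k` are hypothetical names introduced only for illustration.

```python
from typing import List, Sequence


def build_gsp(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring regions, best first.

    `scores` holds one attractiveness score per region (assumed to come
    from some ranking step that is not reproduced here). The returned
    index sequence is the order in which regions are linked into a path.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    # Sort region indices by descending score; Python's sort is stable,
    # so tied regions keep their original relative order.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Hypothetical scores for four regions of one photograph.
region_scores = [0.2, 0.9, 0.5, 0.7]
path = build_gsp(region_scores, k=3)
print(path)  # [1, 3, 2]
```

Here the path visits region 1 first, then region 3, then region 2. In the framework described above, such a path would subsequently be passed to the aggregation-based CNN to obtain its deep representation; that step is not sketched here.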

Natural scene recognition using weighted histograms of gradient orientation descriptor

Scene Recognition Combining Structural and Textural Features

Scene Classification Using Multi-Resolution Low-Level Feature Combination

The Bag-of-visual-words Scene Classifier Combining Local and Global Features for High Spatial Resolution Imagery

Boosting Classifiers for Scene Category Recognition

Scene classification using a multi-resolution bag-of-features model

Natural Scene Category Recognition Based on Multiple Channels of PHOW

Bag-of-Visual-Words Scene Classifier with Local and Global Features for High Spatial Resolution Remote Sensing Imagery

Natural Scene Character Recognition Using Robust PCA and Sparse Representation

Community-Aware Photo Quality Evaluation by Deeply Encoding Human Perception

Self-Selection Salient Region-Based Scene Recognition Using Slight-Weight Convolutional Neural Network

Character recognition in natural scene images using local description

Scene Categorization by Deeply Learning Gaze Behavior in a Semisupervised Context

Local Salient Regions Based Natural Scene Recognition

Bag of Spatial Visual Words Model for Scene Classification

Natural Scene Recognition Based on Superpixels and Deep Boltzmann Machines

Evaluating Bag-of-visual-words Representations in Scene Classification

Learning Discriminative Visual Dictionary for Natural Scene Categorization

Natural Scene Digit Classification Using Convolutional Neural Networks

Scene classification using a hybrid generative/discriminative approach

Feature significance-based multibag-of-visual-words model for remote sensing image scene classification