Community-Aware Photo Quality Evaluation by Deeply Encoding Human Perception
Luming Zhang,Yongheng Shang,Ping Li,Hao Luo,Ling Shao
DOI: https://doi.org/10.1109/TCYB.2019.2937319
IF: 11.8
2022-01-01
IEEE Transactions on Cybernetics
Abstract:Computational photo quality evaluation is a useful technique in many tasks of computer vision and graphics, for example, photo retaregeting, 3-D rendering, and fashion recommendation. The conventional photo quality models are designed by characterizing the pictures from all communities (e.g., “architecture” and “colorful”) indiscriminately, wherein community-specific features are not exploited explicitly. In this article, we develop a new community-aware photo quality evaluation framework. It uncovers the latent community-specific topics by a regularized latent topic model (LTM) and captures human visual quality perception by exploring multiple attributes. More specifically, given massive-scale online photographs from multiple communities, a novel ranking algorithm is proposed to measure the visual/semantic attractiveness of regions inside each photograph. Meanwhile, three attributes, namely: 1) photo quality scores; weak semantic tags; and inter-region correlations, are seamlessly and collaboratively incorporated during ranking. Subsequently, we construct the gaze shifting path (GSP) for each photograph by sequentially linking the top-ranking regions from each photograph, and an aggregation-based CNN calculates the deep representation for each GSP. Based on this, an LTM is proposed to model the GSP distribution from multiple communities in the latent space. To mitigate the overfitting problem caused by communities with very few photographs, a regularizer is incorporated into our LTM. Finally, given a test photograph, we obtain its deep GSP representation and its quality score is determined by the posterior probability of the regularized LTM. Comparative studies on four image sets have shown the competitiveness of our method. Besides, the eye-tracking experiments have demonstrated that our ranking-based GSPs are highly consistent with real human gaze movements.