Flickr Image Community Analytics by Deep Noise-Refined Matrix Factorization.
Luming Zhang, Jianwei Yin, Ping Li, Yongheng Shang, Roger Zimmermann, Ling Shao
DOI: https://doi.org/10.1109/tmm.2019.2938664
IF: 7.3
2019-01-01
IEEE Transactions on Multimedia
Abstract: Accurately categorizing Flickr images into multiple pre-defined communities (e.g., "architecture" and "peaceful") is an indispensable technique in multimedia analysis, graphic design, fashion recommendation, etc. In practice, these communities are constructed and updated manually, which is subjective and prohibitively time-consuming. To alleviate these shortcomings, a noise-refined deep matrix factorization (MF) framework is proposed to automatically discover communities from million-scale Flickr users, wherein the semantic tag correlations and community correlations are simultaneously encoded. More specifically, Flickr communities are assumed to be high-level cues grounded in human visual semantic perception. Accordingly, an MF algorithm is employed to approximate the community label matrix by the product of two factor matrices: the latent representation of user-provided tags and the corresponding basis matrix. Subsequently, an end-to-end deep model is formulated to hierarchically derive the latent deep representation, mapping raw image pixels to semantic tags. To robustly handle contaminated image semantic tags and community labels, an $\ell_1$-norm constraint is incorporated into the MF. Meanwhile, to fully exploit the rich context information of Flickr images, the intrinsic structures among image semantic tags and among communities are jointly captured. Finally, the enhanced MF and the deep model are combined into a unified framework, which is solved by an iterative algorithm. Experiments on 2 million Flickr images demonstrate the superiority of the approach. In addition, the discovered Flickr communities significantly improve photo retargeting and visual aesthetics assessment.
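The core idea in the abstract, approximating a community label matrix by the product of two factor matrices under an l1-type robustness term, can be illustrated with a small sketch. This is not the authors' algorithm: the deep model and the tag and community correlation terms are omitted, the l1 term is read here as an l1 penalty on the residual (one plausible interpretation), and iteratively reweighted least squares (IRLS) is swapped in as a simple solver. The function name `l1_matrix_factorization` and all parameters (`rank`, `n_iters`, `eps`, `reg`) are hypothetical choices made for illustration.

```python
import numpy as np

def l1_matrix_factorization(Y, rank, n_iters=30, eps=1e-6, reg=1e-3, seed=0):
    """Approximate Y (n x c) by U @ V under an l1 residual loss, using IRLS.

    Assumption: this is a generic robust MF sketch, not the paper's model.
    """
    rng = np.random.default_rng(seed)
    n, c = Y.shape
    U = rng.standard_normal((n, rank))   # latent representations (one row per image)
    V = rng.standard_normal((rank, c))   # basis matrix (one column per community)
    ridge = reg * np.eye(rank)           # small ridge term keeps the solves well-posed

    for _ in range(n_iters):
        # IRLS weights: |r| is replaced by r^2 / |r_old|, so large residuals
        # (likely noisy labels) receive small weights.
        W = 1.0 / np.maximum(np.abs(Y - U @ V), eps)
        for i in range(n):
            Vw = V * W[i]                # scale column j of V by w_ij
            U[i] = np.linalg.solve(Vw @ V.T + ridge, Vw @ Y[i])

        W = 1.0 / np.maximum(np.abs(Y - U @ V), eps)
        for j in range(c):
            Uw = U * W[:, j][:, None]    # scale row i of U by w_ij
            V[:, j] = np.linalg.solve(Uw.T @ U + ridge, Uw.T @ Y[:, j])
    return U, V

# Toy usage: a rank-2 "label" matrix with a few grossly corrupted entries.
rng = np.random.default_rng(1)
Y_clean = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 20))
Y_noisy = Y_clean.copy()
mask = rng.random(Y_noisy.shape) < 0.05  # corrupt roughly 5% of the entries
Y_noisy[mask] += 10.0
U, V = l1_matrix_factorization(Y_noisy, rank=2)
l1_error = np.abs(Y_noisy - U @ V).sum()
```

Because the weights shrink for entries with large residuals, corrupted labels pull the factors less than they would under a squared loss, which is the general motivation for an l1-type term when tags and labels are noisy.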