Multimodal graph convolutional networks for high quality content recognition

Jinguang Wang, Jun Hu, Shengsheng Qian, Quan Fang, Changsheng Xu
DOI: https://doi.org/10.1016/j.neucom.2020.04.145
IF: 6
2020-10-01
Neurocomputing
Abstract: As the Internet has grown, more and more creators publish articles on social media. Automatically filtering high quality content from a large number of multimedia articles is a core function of information recommendation systems, search engines, and similar systems. However, existing approaches typically suffer from two limitations: (1) They usually model content as word sequences, which ignores the semantics provided by non-consecutive phrases, long-distance word dependencies, and visual information. (2) They rely on a large amount of manually annotated data to train a quality assessment model, whereas in practice users may only provide labels for a single class of interest, and only for a small number of samples. To address these limitations, we propose Multimodal Graph Convolutional Networks (MGCN), which model semantic representations in a unified framework for high quality content recognition. Instead of viewing text content as word sequences, we convert it into graphs, which can model non-consecutive phrases and long-distance word dependencies and thus better capture how semantics are composed. Visual content is also modeled in the graphs to provide complementary semantics. A graph convolutional network is proposed to capture the semantic representations based on these graphs. Furthermore, we employ a non-negative risk estimator for high quality content recognition, and its loss is back-propagated for model learning. Experiments on real datasets validate the effectiveness of our approach.
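The abstract names a non-negative risk estimator but does not give its formula. The sketch below illustrates one common form of such an estimator from positive-unlabeled learning, under stated assumptions: the sigmoid surrogate loss, the class prior `prior`, and the function names are all hypothetical choices for illustration and are not taken from the paper. It shows how the estimated risk on the negative class is clamped at zero so that the total risk cannot become negative.

```python
import math


def sigmoid_loss(score, label):
    """Sigmoid surrogate loss l(z, y) = 1 / (1 + exp(y * z)); an assumed choice."""
    return 1.0 / (1.0 + math.exp(label * score))


def mean(values):
    return sum(values) / len(values)


def nn_pu_risk(pos_scores, unl_scores, prior):
    """Non-negative risk from labeled-positive and unlabeled classifier scores.

    prior is the assumed fraction of positive items among the unlabeled ones.
    """
    # Risk of labeled positives when treated as positive and as negative.
    risk_pos_as_pos = mean([sigmoid_loss(s, +1) for s in pos_scores])
    risk_pos_as_neg = mean([sigmoid_loss(s, -1) for s in pos_scores])
    # Risk of unlabeled items when treated as negative.
    risk_unl_as_neg = mean([sigmoid_loss(s, -1) for s in unl_scores])

    # Estimated negative-class risk; it can dip below zero on finite samples,
    # so it is clamped at zero (the "non-negative" part).
    negative_part = risk_unl_as_neg - prior * risk_pos_as_neg
    return prior * risk_pos_as_pos + max(0.0, negative_part)
```

In a training loop, the scores would come from the model's output for each article, and the returned value would serve as the loss to back-propagate. A real implementation would use a tensor library with automatic differentiation rather than plain Python lists.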
computer science, artificial intelligence