High-Resolution Image Classification with Rich Text Information Based on Graph Convolution Neural Network

Siyi Han, Jing Zhou, Xuening Zhu, Zhe Li, Jie Liu, Hansheng Wang, Yibing Gong
DOI: https://doi.org/10.2139/ssrn.4155316
2022-01-01
Abstract: High-resolution image classification with rich text information is becoming increasingly common in real life. For example, in a live stream, viewers may receive a high-resolution image with rich text that contains product information sent by a commercial broadcaster. In this setting, image classification is more challenging because it must take both the high resolution and the text information into account. In this study, we propose a novel graph neural network-based method for high-resolution image classification with rich text information. The proposed method consists of three steps. First, we adopt an optical character recognition model to extract rich text information from a target image. Second, using location information, we construct a graph to represent the relationships among the text in the image. Third, a graph neural network is used to capture the semantic features of the original image. To validate the proposed methodology, we design two models: a node-based model and a graph-based model. Experimental evaluations show that both models take better advantage of high resolution and text content and outperform conventional image classification methods based on convolutional neural networks. Finally, to promote more efficient academic communication, the code and dataset are publicly available at https://github.com/kann-tsukasa/High-Resolution-Image-Classification-with-Rich-Text-Information.
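The three steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The OCR output is hard-coded as hypothetical `(text, x, y)` tuples, and the k-nearest-neighbour rule for linking text boxes is an assumption, since the abstract says only that location information is used. The toy node features (text length plus a constant) and the mean-pooling readout are placeholders as well. The propagation step uses the standard symmetric GCN normalisation, without the learned weight matrix or nonlinearity.

```python
import math

# Hypothetical OCR output: (text, x_center, y_center). Values are illustrative only.
boxes = [
    ("Brand", 10.0, 10.0),
    ("Price", 12.0, 11.0),
    ("Discount", 11.0, 30.0),
    ("Contact", 50.0, 50.0),
]


def build_knn_graph(boxes, k=2):
    """Link each text box to its k nearest boxes by centre distance (undirected).

    The k-NN rule is an assumed way of turning locations into edges.
    """
    n = len(boxes)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        dists = sorted(
            (math.dist(boxes[i][1:], boxes[j][1:]), j) for j in range(n) if j != i
        )
        for _, j in dists[:k]:
            adj[i][j] = adj[j][i] = 1.0
    return adj


def gcn_propagate(adj, feats):
    """One parameter-free propagation step: D^{-1/2} (A + I) D^{-1/2} X."""
    n = len(adj)
    # Add self-loops so each node keeps part of its own feature vector.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    dim = len(feats[0])
    out = [[0.0] * dim for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if a_hat[i][j]:
                w = a_hat[i][j] / math.sqrt(deg[i] * deg[j])
                for d in range(dim):
                    out[i][d] += w * feats[j][d]
    return out


adj = build_knn_graph(boxes, k=2)
# Toy node features; a real system would use text embeddings instead.
feats = [[float(len(text)), 1.0] for text, _, _ in boxes]
hidden = gcn_propagate(adj, feats)
# Graph-level readout by mean pooling over nodes (one possible choice).
graph_vec = [sum(col) / len(hidden) for col in zip(*hidden)]
```

In this sketch, `hidden` holds one vector per text box and `graph_vec` holds one vector for the whole image. These loosely correspond to the node-level and graph-level views named in the abstract, although how the paper's two models actually use them is not specified there.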