Characters as Graphs: Interpretable Handwritten Chinese Character Recognition via Pyramid Graph Transformer
Ji Gan,Yuyan Chen,Bo Hu,Jiaxu Leng,Weiqiang Wang,Xinbo Gao
DOI: https://doi.org/10.1016/j.patcog.2023.109317
IF: 8
2023-01-12
Pattern Recognition
Abstract:It is meaningful but challenging to teach machines to recognize handwritten Chinese characters. However, conventional approaches typically view handwritten Chinese characters as either static images or temporal trajectories, which may ignore the inherent geometric semantics of characters. Instead, here we first propose to represent handwritten characters as skeleton graphs, explicitly considering the natural characteristics of characters (i.e., characters as graphs). Furthermore, we propose a novel Pyramid Graph Transformer (PyGT) to specifically process the graph-structured characters, which fully integrates the advantages of Transformers and graph convolutional networks. Specifically, our PyGT can learn better graph features through (i) capturing the global information from all nodes with graph attention mechanism and (ii) modelling the explicit local adjacency structures of nodes with graph convolutions. Furthermore, the PyGT learns the multi-resolution features by constructing a progressive shrinking pyramid. Compared with existing approaches, it is more interpretable to recognize characters as geometric graphs. Moreover, the proposed method is generic for both online and offline handwritten Chinese character recognition (HCCR), and it also can be feasibly extended to handwritten text recognition. Extensive experiments empirically demonstrate the superiority of PyGT over the prevalent approaches including 2D-CNN, RNN/1D-CNN, and Vision Transformer (ViT) for HCCR. The code is available at https://github.com/ganji15/PyGT-HCCR.
computer science, artificial intelligence,engineering, electrical & electronic