Joint stroke classification and text line grouping in online handwritten documents with edge pooling attention networks

Jun-Yu Ye, Yan-Ming Zhang, Qing Yang, Cheng-Lin Liu
DOI: https://doi.org/10.1016/j.patcog.2021.107859
IF: 8
2021-06-01
Pattern Recognition
Abstract: Stroke classification and text line grouping are important tasks in online handwritten document segmentation. In the past, the two tasks were usually performed by different models that were trained independently and applied sequentially. Such a pipeline cannot optimize the integration of contextual information, and the system may suffer from error accumulation in stroke classification. In this paper, we propose a method for joint text/non-text stroke classification and text line grouping in online handwritten documents using an attention-based graph neural network. In our framework, the stroke classification and text line grouping problems are formulated as node classification and node clustering problems in a relational graph, which is constructed based on the temporal and spatial relationships between strokes. We propose a new graph network architecture, called edge pooling attention network (EPAT), to efficiently aggregate information between the features of neighboring nodes and edges. The proposed model is trained by multi-task learning with cross entropy loss for node classification and distance metric loss for node clustering. In experiments on two online handwritten document datasets, IAMOnDo and Kondate, the proposed method is shown to be effective, yielding superior performance in both stroke classification and text line grouping.
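The multi-task objective described in the abstract can be illustrated with a minimal sketch. The abstract does not say which distance metric loss is used, so the contrastive pairwise loss below is an assumption, as are the function names, the `margin` and `weight` parameters, and the choice to average the metric term over graph edges. It is meant only to show how a node classification term and a node clustering term might be combined, not to reproduce the authors' implementation.

```python
import math

def cross_entropy(logits, label):
    """Softmax cross entropy for one node (stroke), computed stably."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def contrastive_loss(emb_a, emb_b, same_line, margin=1.0):
    """Assumed metric loss: pull same-line strokes together, push others apart."""
    d = math.dist(emb_a, emb_b)
    return d ** 2 if same_line else max(0.0, margin - d) ** 2

def joint_loss(logits, labels, embeddings, line_ids, edges, weight=1.0):
    """Mean classification loss plus weighted mean metric loss over edges."""
    cls = sum(cross_entropy(l, y) for l, y in zip(logits, labels)) / len(labels)
    metric = sum(
        contrastive_loss(embeddings[i], embeddings[j], line_ids[i] == line_ids[j])
        for i, j in edges
    ) / len(edges)
    return cls + weight * metric
```

Here `logits` and `embeddings` stand in for the per-stroke outputs of a graph network, `labels` holds text/non-text classes, `line_ids` holds ground-truth text line assignments, and `edges` lists the stroke pairs connected in the relational graph.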
computer science, artificial intelligence; engineering, electrical & electronic