End-to-end attention convolutional recurrent network for online handwritten Chinese text recognition
Xiwen Qu, Zhihong Wu, Jun Huang
DOI: https://doi.org/10.1007/s11042-023-17987-y
IF: 2.577
2024-01-07
Multimedia Tools and Applications
Abstract: Online handwritten Chinese text recognition (OHCTR) is a challenging problem because of the large character set, diverse writing styles, and variable text line length. Existing convolutional recurrent network (CNN) architectures have achieved great success in OHCTR, but they need to convert chronological coordinate sequences into image-like representations or vectors, which loses information and increases processing time. To avoid this conversion, we propose a novel end-to-end attention convolutional recurrent network (EACRN) for OHCTR. Specifically, the EACRN directly extracts local contextual features from the raw chronological coordinate sequences using an end-to-end CNN. Bidirectional long short-term memory (BiLSTM) is then employed to capture long-term dependencies among the local contextual features, and multi-head attention is used to weight those features. Finally, we introduce a focal connectionist temporal classification (CTC) objective function into OHCTR to increase attention to low-frequency characters in the text and to make predictions. Experiments on two public datasets, the standard benchmark CASIA-OLHWDB2.0-2.2 and the in-air handwritten Chinese text dataset IAHCT-UCAS2018, demonstrate that our method obtains higher recognition accuracy with faster computation and a more compact model than previous CNN architectures.
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering
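
The focal CTC objective named in the abstract can be illustrated with a small sketch. The abstract does not give the exact formulation, so the code below assumes the commonly used form, loss = -alpha * (1 - p)^gamma * log(p), where p is the CTC probability of the target sequence. The function names `ctc_probability` and `focal_ctc_loss`, the parameters `alpha_weight` and `gamma`, and their default values are hypothetical choices for illustration, not taken from the paper. Pure Python is used to keep the example self-contained; a real model would use a deep learning framework and log-space arithmetic.

```python
import math


def ctc_probability(probs, target, blank=0):
    """Return p(target | input) using the CTC forward algorithm.

    probs:  list of T per-frame probability distributions over the labels.
    target: list of label indices, without blanks.
    """
    # Extended target with blanks interleaved: [b, y1, b, y2, ..., b]
    ext = [blank]
    for label in target:
        ext += [label, blank]
    num_states, num_frames = len(ext), len(probs)

    # Initialise: a path may start with a blank or with the first label.
    fwd = [0.0] * num_states
    fwd[0] = probs[0][ext[0]]
    if num_states > 1:
        fwd[1] = probs[0][ext[1]]

    for t in range(1, num_frames):
        new = [0.0] * num_states
        for s in range(num_states):
            total = fwd[s]
            if s >= 1:
                total += fwd[s - 1]
            # Skipping the blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += fwd[s - 2]
            new[s] = total * probs[t][ext[s]]
        fwd = new

    # Valid paths end in the last label or in the trailing blank.
    p = fwd[-1]
    if num_states > 1:
        p += fwd[-2]
    return p


def focal_ctc_loss(probs, target, alpha_weight=1.0, gamma=2.0, blank=0):
    """Assumed focal CTC form: -alpha * (1 - p)^gamma * log(p)."""
    p = max(ctc_probability(probs, target, blank), 1e-12)  # avoid log(0)
    return -alpha_weight * (1.0 - p) ** gamma * math.log(p)


# Two frames, two labels (0 = blank, 1 = a character), uniform predictions.
frames = [[0.5, 0.5], [0.5, 0.5]]
plain = focal_ctc_loss(frames, [1], gamma=0.0)   # gamma = 0 gives ordinary CTC
focal = focal_ctc_loss(frames, [1], gamma=2.0)
```

Under this assumed form, the factor (1 - p)^gamma shrinks the loss of sequences the model already predicts confidently, so harder sequences, such as those containing low-frequency characters, contribute relatively more to training.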