Tamil OCR Conversion from Digital Writing Pad Recognition Accuracy Improves through Modified Deep Learning Architectures
V. Jayanthi, S. Thenmalar
DOI: https://doi.org/10.1155/2023/6897719
IF: 2.336
2023-08-06
Journal of Sensors
Abstract: Digital handwriting recognition is an emerging field in optical character recognition (OCR). A digital writing pad replaces pen-and-paper writing. In digital writing, characters vary in font and shape. During OCR, errors occur in the converted text file because of the pressure and position of the digital pen on the pad as the writer writes. These changes in character shape lead to errors when the OCR output is converted to text. The problem arises in scripts such as Tamil, Chinese, Arabic, and Telugu, whose characters consist of bends, curves, and rings. OCR-to-text conversion for the Tamil language produces more word errors because of the angles and curves in its characters, which need to be converted accurately. This paper proposes a ResNet two-stage bottleneck architecture (RTSBA) for recognizing Tamil text written on a digital writing pad. In the proposed RTSBA, two separate neural network stages reduce the complexity of the Tamil character recognition problem. In the initial stage, the number of inputs and variables is reduced. In the final stage, time and computation complexity are reduced. The proposed algorithm is compared with traditional algorithms such as long short-term memory, Inception-v3, recurrent neural networks, convolutional neural networks, and a two-channel, two-stream transformer. Applied to the digital writing pad handwritten dataset and the HP Labs dataset, RTSBA achieved accuracies of 98.7% and 97.1%, respectively.
engineering, electrical & electronic; instruments & instrumentation
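The abstract states that the bottleneck stage reduces the number of variables, but it does not give RTSBA's layer sizes. As a rough illustration only, the sketch below counts convolution weights in a standard ResNet basic block (two 3x3 convolutions) and in a standard ResNet bottleneck block (1x1 reduce, 3x3, 1x1 expand). The channel width of 256, the reduction factor of 4, and all function names are assumptions chosen for the example; they are not taken from the paper.

```python
def conv_params(in_ch: int, out_ch: int, kernel: int) -> int:
    """Number of weights in a 2D convolution with a square kernel (bias omitted)."""
    return in_ch * out_ch * kernel * kernel


def basic_block_params(channels: int) -> int:
    """Standard ResNet basic block: two 3x3 convolutions at full width."""
    return 2 * conv_params(channels, channels, 3)


def bottleneck_block_params(channels: int, reduction: int = 4) -> int:
    """Standard ResNet bottleneck block: 1x1 reduce, 3x3 at reduced width, 1x1 expand.

    `reduction` is a hypothetical setting; the paper's value is not stated in the abstract.
    """
    mid = channels // reduction
    return (
        conv_params(channels, mid, 1)    # 1x1 convolution shrinks the channel count
        + conv_params(mid, mid, 3)       # 3x3 convolution runs on the narrow tensor
        + conv_params(mid, channels, 1)  # 1x1 convolution restores the channel count
    )


if __name__ == "__main__":
    channels = 256  # assumed width, for illustration only
    basic = basic_block_params(channels)
    bottleneck = bottleneck_block_params(channels)
    print(f"basic block weights:      {basic}")
    print(f"bottleneck block weights: {bottleneck}")
    print(f"reduction factor:         {basic / bottleneck:.1f}x")
```

Under these assumed sizes, the bottleneck block needs far fewer weights than the basic block because the expensive 3x3 convolution operates on a quarter of the channels. This is the general mechanism behind bottleneck designs, not a reproduction of the authors' reported configuration.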