A two stage recognition scheme for handwritten Tamil characters

Ujjwal Bhattacharya, SK Ghosh, S Parui
2007-09-23
Abstract: India is a multilingual, multiscript country with more than 18 languages and 10 major scripts. Little research has been done on recognizing handwritten characters in these Indian scripts. Tamil, an official and widely used script in southern India, Singapore, Malaysia, and Sri Lanka, has a large character set that includes many compound characters. Only a few studies on handwriting recognition for this large character set have been reported in the literature. Recently, HP Labs India developed a database of handwritten Tamil characters. In the present paper, we describe an off-line recognition approach based on this database. The proposed method consists of two stages. In the first stage, we apply an unsupervised clustering method to create a smaller number of groups of handwritten Tamil character classes. In the second stage, we consider a supervised classification …
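
The two-stage idea described in the abstract can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm, and the description of the second stage is truncated, so everything specific here is an assumption made for illustration: plain k-means over per-class mean feature vectors stands in for the unsupervised grouping, a nearest-mean rule stands in for the supervised classifier, and the names `train`, `predict`, and `kmeans` are hypothetical. It is not the authors' implementation.

```python
import random
from collections import defaultdict


def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means (assumed stand-in for the unspecified clustering step)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        assign = [nearest(p, centers) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = mean_vector(members)
    return [nearest(p, centers) for p in points], centers


def train(samples, labels, n_groups):
    """Stage 1: cluster the class means into a smaller number of groups."""
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    names = sorted(by_class)
    means = [mean_vector(by_class[c]) for c in names]
    assign, group_centers = kmeans(means, n_groups)
    groups = defaultdict(dict)
    for name, mean, g in zip(names, means, assign):
        groups[g][name] = mean
    return group_centers, dict(groups)


def predict(x, group_centers, groups):
    """Route x to a group, then classify among that group's classes only.

    Stage 2 here is a nearest-mean rule, an assumed placeholder for the
    supervised classifier that the truncated abstract does not describe.
    """
    g = min(groups, key=lambda i: sq_dist(x, group_centers[i]))
    members = groups[g]
    return min(members, key=lambda name: sq_dist(x, members[name]))
```

The point of the design is that the second-stage decision only has to separate the classes inside one group, which is a smaller problem than discriminating among the full character set at once.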