Novel Geometrical Shape Feature Extraction Techniques for Multilingual Character Recognition

Narasimha Reddy Soora, Parag S. Deshpande
DOI: https://doi.org/10.1080/02564602.2016.1229583
2016-10-21
IETE Technical Review
Abstract: Multilingual character recognition from images of aged Indian documents is challenging because of the complex graphemes of Indian language scripts. Feature extraction plays the most important role in recognizing such images. In this paper, we propose a set of feature vectors (FVs) based on shape geometry (SG) decoding of the input character. The first FV is based on SG decoding of the input character using triangular area (TA) calculation. The second FV, SG using perpendicular distance, is extracted by dividing the input image into individual components and decoding the shape of each component into shape symbols; the decoding compares the normalized perpendicular distances from the component's pixels to the line joining the component's end points. In addition to the proposed FVs, we use crossing count features. Because these FVs are represented as strings of shape operators, we use a minimum edit distance classifier to recognize the input character. The proposed technique is evaluated on characters extracted from printed aged multilingual Indian documents containing English, Devanagari, and Marathi scripts, and achieves encouraging results. To further assess the system, we evaluate it on the publicly available media-lab license plate benchmark database and achieve significant performance.
telecommunications, engineering, electrical & electronic
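
The perpendicular-distance decoding and minimum edit distance classification described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's symbol alphabet, thresholds, or template format, so everything named here is an assumption made for illustration: the symbols "L", "C", and "D", the threshold of 0.1, the normalization by chord length, and the helper names `perpendicular_distances`, `shape_symbol`, `edit_distance`, and `classify`. The edit distance itself is the standard Levenshtein distance, not necessarily the exact variant the authors use.

```python
import math


def perpendicular_distances(points):
    """Signed perpendicular distance of each point to the chord joining the
    first and last points, divided by the chord length (assumed normalization)."""
    (x0, y0), (x1, y1) = points[0], points[-1]
    dx, dy = x1 - x0, y1 - y0
    chord = math.hypot(dx, dy)
    if chord == 0:
        # Degenerate component: end points coincide, so no chord is defined.
        return [0.0 for _ in points]
    # Cross product / chord gives the signed distance; dividing by the chord
    # length once more makes the value scale-independent.
    return [(dx * (py - y0) - dy * (px - x0)) / (chord * chord)
            for px, py in points]


def shape_symbol(points, threshold=0.1):
    """Map one component to a hypothetical shape symbol:
    'L' = roughly straight, 'C' / 'D' = bulging to one side or the other."""
    peak = max(perpendicular_distances(points), key=abs)
    if abs(peak) < threshold:
        return "L"
    return "C" if peak > 0 else "D"


def edit_distance(a, b):
    """Standard Levenshtein distance between two symbol strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def classify(query, templates):
    """Return the label whose template string is closest to the query.
    `templates` is a hypothetical dict mapping label -> symbol string."""
    return min(templates, key=lambda label: edit_distance(query, templates[label]))
```

In this sketch a character would be represented by concatenating the symbols of its components, for example `"".join(shape_symbol(c) for c in components)`, and then passed to `classify` together with one stored symbol string per character class. A practical system would likely keep several templates per class and combine this feature with the triangular-area and crossing-count features mentioned in the abstract.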