Improving Offline Gurmukhi Character Recognition: A Comparative Study of Feature Selection Techniques
Kanta Prasad Sharma, Rashmi Agrawal, Nora Rashid Najem, Muhammad Irsyad Abdullah, Raman Kumar, Ahmed Alkhayyat, Devendra Singh
DOI: https://doi.org/10.1007/s40009-024-01532-y
2024-10-31
National Academy Science Letters
Abstract: In this study, we introduce and assess a novel feature extraction technique that analyzes the extent of character image boundaries to enhance recognition accuracy. The method is evaluated with Nearest Neighbors (NN) and Support Vector Machine (SVM) classifiers and compared across several feature selection methods: Consistency Based Analysis (CBA), Correlation Feature Set (CFS), Chi-Squared Attribute (CSA), Independent Component Analysis (ICA), Latent Semantic Analysis (LSA), Principal Component Analysis (PCA), and Random Projection (RP). We selected the NN and SVM classifiers because they are widely used and effective in character recognition tasks: NN for its simplicity and intuitive approach, and SVM for its robust performance in high-dimensional spaces and its ability to find an optimal decision boundary. The feature selection methods were chosen for their established relevance and effectiveness in previous research. Our extensive experiments demonstrate that CSA consistently outperforms the other techniques, achieving recognition rates of 90.4%, 96.1%, and 92.9% for lower zone, middle zone, and upper zone characters, respectively, with the NN classifier. These results highlight CSA's superiority and suggest that expanding the dataset could further enhance recognition accuracy, contributing to the robustness of future OCR applications.
multidisciplinary sciences
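
The pairing of chi-squared feature selection with a nearest-neighbor classifier described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the feature vectors, labels, and function names (`chi2_scores`, `select_top_k`, `project`, `nn_predict`) are hypothetical, and the scoring uses one common chi-squared formulation for non-negative features (per-class feature sums compared with the sums expected if feature and class were independent), which may differ from the CSA variant used in the paper.

```python
from collections import defaultdict


def chi2_scores(X, y):
    """Chi-squared score per feature, for non-negative feature values.

    Observed = sum of each feature within each class.
    Expected = feature total * class frequency (independence assumption).
    """
    n_samples = len(X)
    n_features = len(X[0])
    class_counts = defaultdict(int)
    observed = defaultdict(lambda: [0.0] * n_features)
    for row, label in zip(X, y):
        class_counts[label] += 1
        for j, value in enumerate(row):
            observed[label][j] += value
    feature_totals = [sum(row[j] for row in X) for j in range(n_features)]

    scores = []
    for j in range(n_features):
        score = 0.0
        for label, count in class_counts.items():
            expected = feature_totals[j] * count / n_samples
            if expected > 0:
                score += (observed[label][j] - expected) ** 2 / expected
        scores.append(score)
    return scores


def select_top_k(scores, k):
    """Return the indices of the k highest-scoring features, in column order."""
    ranked = sorted(range(len(scores)), key=lambda j: -scores[j])
    return sorted(ranked[:k])


def project(X, cols):
    """Keep only the selected feature columns."""
    return [[row[j] for j in cols] for row in X]


def nn_predict(train_X, train_y, x):
    """1-nearest-neighbor prediction using squared Euclidean distance."""
    def dist(i):
        return sum((a - b) ** 2 for a, b in zip(train_X[i], x))
    return train_y[min(range(len(train_X)), key=dist)]


# Hypothetical toy data: features 0 and 2 vary with the class,
# features 1 and 3 are constant and carry no class information.
X = [
    [5, 1, 0, 2],
    [4, 1, 1, 2],
    [5, 1, 0, 2],
    [0, 1, 5, 2],
    [1, 1, 4, 2],
    [0, 1, 5, 2],
]
y = ["a", "a", "a", "b", "b", "b"]

scores = chi2_scores(X, y)
selected = select_top_k(scores, k=2)
X_reduced = project(X, selected)

query = [4, 1, 0, 2]
prediction = nn_predict(X_reduced, y, project([query], selected)[0])
```

On this toy data the constant features score zero and are discarded, so the nearest-neighbor search runs only on the two class-dependent columns. In a real recognizer, the rows would be feature vectors extracted from character images and the labels would be character classes.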