Abstract: This paper describes the development of an Optical Character Recognition (OCR) system for recognizing Ashokan Brahmi characters using Convolutional Neural Networks (CNNs). The models are trained on a dataset of character images, with data augmentation used to enlarge and diversify the training set. The pipeline also includes image preprocessing to remove noise and image segmentation to separate lines and characters. The study compares the accuracy of three pre-trained CNNs: LeNet, VGG-16, and MobileNet. Transfer learning was employed to adapt the pre-trained models to the Ashokan Brahmi character dataset. The findings reveal that MobileNet outperforms the other two models in terms of accuracy, achieving a validation accuracy of 95.94% and a validation loss of 0.129. The paper provides an in-depth analysis of the implementation process using MobileNet and discusses the implications of the findings. OCR for character recognition is of significant importance in epigraphy, specifically for the preservation and digitization of ancient scripts. The results demonstrate the effectiveness of pre-trained CNNs for recognizing Ashokan Brahmi characters.

What problem does this paper attempt to address?

The paper aims to develop an Optical Character Recognition (OCR) system that recognizes Brahmi script from the Ashokan period. Specifically, the research uses Convolutional Neural Networks (CNNs) to process and recognize these ancient characters, supporting the digitization and preservation of ancient inscriptions.

### Problem Background

The Brahmi script is an ancient pan-Indian script that emerged in the 3rd century BC and is considered the origin of many modern Indian scripts. Over time, knowledge of how to read and write the script was lost, and it was not rediscovered until the 19th century. To protect and study these historical documents, researchers need an effective method to convert the characters in the inscriptions into a readable digital format. Traditional OCR techniques mainly target modern, widely used scripts and have made little progress on historical or extinct scripts, mainly because of the lack of sufficient training data.

### Research Objectives

1. **Develop OCR System**: Use Convolutional Neural Networks (CNNs) to build an OCR system capable of recognizing Ashokan Brahmi characters.
2. **Optimize Training Process**: Increase the quantity and diversity of training data through data augmentation to improve the robustness and generalization ability of the model.
3. **Image Preprocessing and Segmentation**: Remove noise and segment the image so that the model receives clean, individual character images.
4. **Model Selection and Evaluation**: Compare the performance of three pre-trained CNN models (LeNet, VGG-16, and MobileNet) on the Ashokan Brahmi character recognition task and select the best one.

### Main Contributions

- **Dataset Creation**: A manually created dataset of 3,500 high-quality Ashokan Brahmi character images, expanded to approximately 227,000 images through data augmentation.
- **Image Preprocessing**: Median blurring and Otsu thresholding are used to remove noise and improve the accuracy of character recognition.
- **Image Segmentation**: The projection profile method is used for line segmentation and character segmentation to extract individual characters more precisely.
- **Model Performance**: Experiments show that MobileNet performs best with average pooling, achieving a validation accuracy of 95.94% and a validation loss of 0.129.

### Conclusion

This research developed an OCR system that recognizes Ashokan Brahmi characters, improving the ability to digitize and preserve ancient inscriptions. Future work will continue to improve the OCR system, integrate more advanced deep-learning techniques, and expand the dataset to cover more types of ancient scripts.
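
The thresholding and segmentation steps named in the summary (Otsu thresholding, then projection profiles for lines and characters) can be illustrated with a small sketch. This is not the authors' implementation, which the summary does not describe in detail. It is a dependency-free Python version that assumes a grayscale image stored as a list of rows with pixel values 0 to 255, and assumes dark ink on a light background. All function names are hypothetical, and median blurring is omitted for brevity.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale rows, pixel values 0..255 (assumed layout)


def otsu_threshold(image: Image) -> int:
    """Return the threshold t that maximizes between-class variance.

    Pixels with value <= t form one class, pixels > t the other.
    """
    hist = [0] * 256
    for row in image:
        for px in row:
            hist[px] += 1
    total = sum(hist)
    sum_all = sum(v * c for v, c in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # pixel count of the low-intensity class
    sum0 = 0  # intensity sum of the low-intensity class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # low class still empty
        w1 = total - w0
        if w1 == 0:
            break  # high class empty from here on
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(image: Image, t: int) -> Image:
    """Map dark pixels (<= t) to 1 (ink) and light pixels to 0 (assumption)."""
    return [[1 if px <= t else 0 for px in row] for row in image]


def runs(profile: List[int]) -> List[Tuple[int, int]]:
    """Return half-open (start, end) ranges where the profile is non-zero."""
    spans, start = [], None
    for i, v in enumerate(profile):
        if v > 0 and start is None:
            start = i
        elif v == 0 and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


def segment_lines(binary: Image) -> List[Tuple[int, int]]:
    """Horizontal projection: ink count per row; empty rows separate lines."""
    return runs([sum(row) for row in binary])


def segment_characters(binary: Image, line: Tuple[int, int]) -> List[Tuple[int, int]]:
    """Vertical projection within one line: ink count per column."""
    top, bottom = line
    width = len(binary[0])
    profile = [sum(binary[y][x] for y in range(top, bottom)) for x in range(width)]
    return runs(profile)


# Toy page: B = background intensity, I = ink intensity (made-up values).
B, I = 200, 20
page = [
    [B, B, B, B, B, B, B],
    [B, I, I, B, I, B, B],
    [B, I, I, B, I, B, B],
    [B, B, B, B, B, B, B],
    [B, I, B, B, I, I, B],
    [B, B, B, B, B, B, B],
]
t = otsu_threshold(page)
binary = binarize(page, t)
lines = segment_lines(binary)
chars = [segment_characters(binary, ln) for ln in lines]
print(lines)  # row ranges of the two text lines
print(chars)  # column ranges of the characters in each line
```

On real inscription images the projection profiles are rarely exactly zero between lines or characters, so a practical version would likely compare the profile against a small tolerance instead of zero. That tolerance is a tuning choice the summary does not specify.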

Optical Character Recognition using Convolutional Neural Networks for Ashokan Brahmi Inscriptions

Handwritten Text Recognition Using Convolutional Neural Network

OCR using CRNN: A Deep Learning Approach for Text Recognition

Handwritten Vedic Sanskrit Text Recognition Using Deep Learning and Convolutional Neural Networks

CNN-Bidirectional LSTM Based Optical Character Recognition of Sanskrit Manuscripts : A Comprehensive Systematic Literature Review

DeepNetDevanagari: a deep learning model for Devanagari ancient character recognition

A Deep Learning-Based Pre-Trained VGG19 Model for Optical Character Recognition

A Novel Approach to OCR using Image Recognition based Classification for Ancient Tamil Inscriptions in Temples

Implementation of OCR using Convolutional Neural Network (CNN): A Survey

Improved Handwritten Digit Recognition Using Convolutional Neural Networks (CNN)

Handwritten optical character recognition using TransRNN trained with self improved flower pollination algorithm (SI-FPA)

Handwritten OCR for Indic Scripts: A Comprehensive Overview of Machine Learning and Deep Learning Techniques

End-to-End Optical Character Recognition for Bengali Handwritten Words

A semi-self-supervised learning model to recognize handwritten characters in ancient documents in Indian scripts

Image Based Character Recognition, Documentation System To Decode Inscription From Temple

Convolutional-Neural-Network-Based Handwritten Character Recognition: An Approach with Massive Multisource Data

Cross Lingual Handwritten Character Recognition Using Long Short Term Memory Network with aid of Elephant Herding Optimization Algorithm

A recurrent neural network based deep learning model for text and non-text stroke classification in online handwritten Devanagari document

Bangla-Meitei Mayek scripts handwritten character recognition using Convolutional Neural Network

Manuscripts Character Recognition Using Machine Learning and Deep Learning

Optical Character Recognition System for Digit Recognition Using Deep Learning