Exploring OCR for Historical Document Preservation (Indus Script)
Gunjan Kothari, Bobby Jatav, Om Bhimani, Prof. Sunita Bangal
DOI: https://doi.org/10.55041/ijsrem25807
2023-09-01
INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT
Abstract: This research paper explores the role of Optical Character Recognition (OCR) in the preservation and analysis of historical documents, focusing on the script of the Indus Valley Civilization. The civilization, which thrived from 2600 to 1900 BCE, left behind intricately carved seals bearing symbols known as the Indus script. The script remains undeciphered, which challenges researchers seeking to understand this ancient civilization. OCR technology shows promise as a systematic approach to analysing and digitizing these symbols. However, the unique complexities of the Indus script, such as its undeciphered nature, the lack of reference points, variations in syntax and grammar, and differences in carving styles, pose significant obstacles. This project is important on two fronts. First, it enables the creation of a dataset of symbols from Indus seals, providing a resource for data scientists to develop OCR algorithms and advance research in this field. Second, it necessitates the development of OCR algorithms tailored specifically to deciphering the Indus script, pushing the boundaries of pattern recognition and natural language processing techniques. Once digitized, the script opens up possibilities for text mining and linguistic analysis. Studying patterns and relationships between symbols and linguistic features within the script could yield insights into events and cultural aspects, and could potentially establish connections with known linguistic families. Moreover, the project encourages collaboration across fields including data science, archaeology, linguistics, and history. This interdisciplinary collaboration fosters problem-solving and a comprehensive understanding of the subject matter.
Additionally, this project provides outreach opportunities to showcase the impact of data science in deciphering ancient writings while promoting the preservation and research of our rich cultural heritage. The study emphasizes the potential for OCR to unlock historical mysteries and highlights the interdisciplinary efforts required to advance the field of historical document analysis and preservation.