Handwritten text recognition and information extraction from ancient manuscripts using deep convolutional and recurrent neural network

El Bahi, Hassan
DOI: https://doi.org/10.1007/s00500-024-09930-6
IF: 3.732
2024-09-13
Soft Computing
Abstract: Digitizing ancient manuscripts and making them accessible to a broader audience is a crucial step in unlocking the wealth of information they hold. However, automatic recognition of handwritten text and extraction of relevant information, such as named entities, from these manuscripts remain among the most difficult research problems, owing to factors such as poor manuscript quality, complex backgrounds, ink stains, and cursive handwriting. To meet these challenges, we propose two systems. The first performs handwritten text recognition (HTR) on ancient manuscripts: it starts with a preprocessing operation; then a convolutional neural network (CNN) extracts features from each input image; finally, a recurrent neural network (RNN) built from Long Short-Term Memory (LSTM) blocks, followed by a Connectionist Temporal Classification (CTC) layer, predicts the text contained in the image. The second system recognizes named entities and the relationships among words directly from images of old manuscripts, bypassing the intermediate text transcription step. Like the first, it starts with a preprocessing step. Data augmentation is then used to enlarge the training dataset, after which a CNN model automatically extracts the most relevant features. Finally, a bidirectional LSTM recognizes the named entities and the relationships between word images. Extensive experiments on the ESPOSALLES dataset demonstrate that the proposed systems achieve state-of-the-art performance, exceeding existing systems.
computer science, artificial intelligence, interdisciplinary applications
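
Illustrative sketch (not from the paper): the abstract says a CTC layer turns the per-frame outputs of the LSTM into text, but it does not say how decoding is done. The Python snippet below shows one common option, greedy (best-path) CTC decoding, under stated assumptions: the function name ctc_greedy_decode, the choice of index 0 as the blank symbol, and the toy two-letter alphabet are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence

BLANK = 0  # assumption: class index 0 is the CTC blank symbol


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]], alphabet: str) -> str:
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks.

    frame_scores holds one score vector per time step (as an LSTM+CTC model
    would emit); alphabet[i] is the character for class index i + 1.
    """
    # Pick the highest-scoring class index at every time step.
    best_path: List[int] = [
        max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores
    ]

    chars: List[str] = []
    previous = BLANK
    for label in best_path:
        # Emit a character only when it is not blank and differs from the
        # label of the previous frame (consecutive repeats are merged).
        if label != BLANK and label != previous:
            chars.append(alphabet[label - 1])
        previous = label
    return "".join(chars)


# Toy input: 5 frames over the classes (blank, 'a', 'b').
example_frames = [
    [0.1, 0.8, 0.1],    # 'a'
    [0.1, 0.7, 0.2],    # 'a' again, merged with the previous frame
    [0.9, 0.05, 0.05],  # blank, separates the two 'a' characters
    [0.2, 0.7, 0.1],    # 'a'
    [0.1, 0.2, 0.7],    # 'b'
]
decoded = ctc_greedy_decode(example_frames, "ab")  # "aab"
```

The blank frame is what lets a doubled letter survive: without it, the three 'a' frames would collapse into a single character. A beam search over the same scores is a frequent alternative when a language model or lexicon is available.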