Impact of Pre-Processing on Recognition of Cursive Video Text

Ali Mirza, Imran Siddiqi, Syed Ghulam Mustufa, Mazahir Hussain
DOI: https://doi.org/10.1007/978-3-030-31332-6_49
2019-01-01
Pattern Recognition and Image Analysis
Abstract: Recognition of text appearing in videos enables a number of applications, including retrieval systems, user alerts triggered by keywords, and news summarization. Thanks to advances in deep learning, high text recognition rates have been reported in recent years. An important step in training such systems is pre-processing the images for effective feature learning and classification. This study investigates the impact of pre-processing on the recognition of cursive video text, using Urdu as a case study. The recognition engine combines convolutional and long short-term memory networks, followed by a connectionist temporal classification layer for sequence alignment. The system is fed grayscale text line images directly, as well as images in which the text has been segmented from the background using various thresholding techniques. Experiments on a dataset of 12,000 cursive Urdu text lines show that appropriately pre-processing the text line images significantly improves the recognition rates.
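To make the thresholding step concrete, the following is a minimal sketch of how a grayscale text line could be separated from its background before being passed to a recognizer. The abstract does not name the thresholding techniques that were compared, so Otsu's global threshold is used here purely as an assumed, commonly used example. The function names `otsu_threshold` and `binarize`, the list-of-rows image format, and the bright-text-on-dark-background polarity are illustrative assumptions, not details from the paper.

```python
def otsu_threshold(image):
    """Return an Otsu threshold for an 8-bit grayscale image.

    `image` is a list of rows, each a list of ints in 0..255
    (an assumed toy format, not the paper's data pipeline).
    """
    # Build the 256-bin intensity histogram.
    hist = [0] * 256
    for row in image:
        for value in row:
            hist[value] += 1

    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_threshold, best_variance = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for level in range(256):
        # Pixels with intensity <= level form the "dark" class.
        weight_bg += hist[level]
        sum_bg += level * hist[level]
        if weight_bg == 0:
            continue
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        # Between-class variance (up to a constant factor).
        variance = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, level
    return best_threshold


def binarize(image, threshold):
    """Map pixels above the threshold to 255 and the rest to 0.

    Assumes bright text on a dark background; invert the comparison
    for dark text on a bright background.
    """
    return [[255 if value > threshold else 0 for value in row] for row in image]


if __name__ == "__main__":
    # Tiny synthetic "text line": dark background, bright strokes.
    line = [
        [10, 12, 200, 210],
        [11, 13, 205, 220],
    ]
    t = otsu_threshold(line)
    for row in binarize(line, t):
        print(row)
```

In a comparison like the one the abstract describes, the same recognizer would be trained once on the raw grayscale lines and once on the binarized output, so that any difference in recognition rate can be attributed to the pre-processing step.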