Soft set-based MSER end-to-end system for occluded scene text detection, recognition and prediction

Alloy Das,Shivakumara Palaiahnakote,Ayan Banerjee,Apostolos Antonacopoulos,Umapada Pal
DOI: https://doi.org/10.1016/j.knosys.2024.112593
IF: 8.139
2024-10-28
Knowledge-Based Systems
Abstract:The presence of unpredictable occlusions on natural scene text is a significant challenge, exacerbating the difficulties already posed on text detection and recognition by the variability of such images. Addressing the need for a robust, consistently performing approach that can effectively address the above challenges, this paper presents a new Soft Set-based end-to-end system for text detection, recognition and prediction in occluded natural scene images. This is the first approach to integrate text detection, recognition and prediction , unlike existing systems developed for end-to-end text spotting (text detection and recognition) only. For candidate text components detection, the proposed combination of Soft Sets with Maximally Stable Extremal Regions (SS-MSER) improves text detection and spotting in natural scene images, irrespectively of the presence of arbitrarily orientated and shaped text, complex backgrounds and occlusion. Furthermore, a Graph Recurrent Neural Network is proposed for grouping candidate text components into text lines and for fitting accurate bounding boxes to each word. Finally, a Convolutional Recurrent Neural Network (CRNN) is proposed for the recognition of text and for predicting missing characters due to occlusion. Experimental results on a new occluded scene text dataset (OSTD) and on the most relevant benchmark natural scene text datasets demonstrate that the proposed system outperforms the state-of-the-art in text detection, recognition and prediction. The code and dataset are available at https://github.com/alloydas/Softset-MSER-Based-Occluded-Scene-Text-Spotting/blob/master/Soft_set_MSER.ipynb
computer science, artificial intelligence
What problem does this paper attempt to address?