A proposal for touching component segmentation in Arabic manuscripts

Nabil Aouadi, Afef Kacem
DOI: https://doi.org/10.1007/s10044-016-0543-1
IF: 2.307
2016-03-15
Pattern Analysis and Applications
Abstract: Text-line segmentation is one of the key factors affecting the performance of handwriting recognition systems. To make recognition systems more effective and accurate, segmenting touching text-lines is therefore an important task. One of the problems that make this task crucial is the presence of touching components (TCs), which are connections between letters of words on consecutive text-lines or of words on the same text-line. The proposed method aims to segment TCs. It is based on two main steps: (1) finding, for a localized TC, a similar model stored in a dictionary together with its correct segmentation, using the shape context descriptor and an interpolation function, the thin plate spline transformation; (2) segmenting the TC based on the central point of the parts of the similar model found. TCs are assumed to have already been extracted from Arabic manuscript images. Experiments are carried out on a common TC database using two metrics: Manhattan and Euclidean distances. The results obtained outperform the state of the art, considering the different types, variability and complexity of the TC data set, and show the effectiveness of the proposed TC segmentation method.
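The dictionary lookup in step (1) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each TC is given as a set of 2-D contour points, it collapses the per-point shape context histograms into one global histogram instead of matching points one by one, and the names shape_context, global_descriptor and nearest_model are hypothetical. The thin plate spline alignment and the central-point segmentation of step (2) are not covered here.

```python
import numpy as np

def shape_context(points, n_r=5, n_theta=12):
    """Log-polar histogram of relative positions, one row per point."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    diff = pts[None, :, :] - pts[:, None, :]          # diff[i, j] = pts[j] - pts[i]
    dist = np.hypot(diff[..., 0], diff[..., 1])
    off_diag = ~np.eye(n, dtype=bool)
    dist = dist / dist[off_diag].mean()               # scale normalisation
    theta = np.arctan2(diff[..., 1], diff[..., 0]) % (2 * np.pi)
    # Radial bin edges on a log scale (assumed range 0.125 to 2 mean distances).
    r_edges = np.logspace(np.log10(0.125), np.log10(2.0), n_r + 1)
    r_bin = np.digitize(dist, r_edges) - 1            # -1 or n_r means out of range
    t_bin = np.minimum((theta / (2 * np.pi) * n_theta).astype(int), n_theta - 1)
    hist = np.zeros((n, n_r * n_theta))
    for i in range(n):
        for j in range(n):
            if i != j and 0 <= r_bin[i, j] < n_r:
                hist[i, r_bin[i, j] * n_theta + t_bin[i, j]] += 1
    return hist

def global_descriptor(points):
    """Simplification (assumption): sum all per-point histograms and normalise."""
    h = shape_context(points).sum(axis=0)
    total = h.sum()
    return h / total if total else h

def nearest_model(query, dictionary, metric="manhattan"):
    """Return the name of the closest dictionary model and all distances."""
    order = 1 if metric == "manhattan" else 2         # L1 or L2 norm
    q = global_descriptor(query)
    scores = {name: float(np.linalg.norm(q - global_descriptor(model), ord=order))
              for name, model in dictionary.items()}
    return min(scores, key=scores.get), scores
```

Calling nearest_model(query_points, models, metric="euclidean") returns the best-matching model name together with the distance to every model. Because distances are divided by the mean pairwise distance, the descriptor is insensitive to translation and uniform scaling of the point set.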
computer science, artificial intelligence