Rejection Algorithm for Mis-segmented Characters in Multilingual Document Recognition

ZG Chen, XQ Ding
DOI: https://doi.org/10.1109/icdar.2003.1227761
2003-01-01
Abstract: In OCR systems the character segmentation algorithm may generate mis-segmented blocks. Feedback information from the character classifier is indispensable to achieve higher character segmentation accuracy. In this paper a novel rejection algorithm is proposed to identify these mis-segmented characters more accurately. First, based on confidence evaluation of distance-based classifiers, the usual generalized confidence mapping function is modified to fit this specific purpose. Second, a novel adaptive thresholding rejection rule is proposed, which is more accurate and flexible. Experiments on Chinese, Japanese and Korean document recognition showed that the new rejection algorithm evidently improved the system performance, especially for low-quality printed document recognition.
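
The two-step idea in the abstract (map classifier distances to a confidence, then reject blocks below an adaptive threshold) can be illustrated with a minimal sketch. The abstract does not give the paper's modified mapping function or its thresholding rule, so everything here is an assumption: the confidence is taken as 1 - d1/d2 over the two smallest distances, and the hypothetical threshold is the mean confidence of the text line minus k standard deviations. The names `generalized_confidence`, `reject_mis_segmented`, `k`, and `floor` are illustrative only.

```python
from statistics import mean, pstdev


def generalized_confidence(d1: float, d2: float) -> float:
    """Assumed confidence from the two smallest classifier distances (d1 <= d2).

    A value near 1 means the best candidate is far closer than the runner-up;
    a value near 0 means the classifier cannot separate them.
    """
    if d2 <= 0:
        return 0.0
    return 1.0 - d1 / d2


def reject_mis_segmented(distances, k=1.0, floor=0.05):
    """Return indices of blocks flagged as likely mis-segmented.

    distances: list of (d1, d2) pairs, one per segmented block on a text line.
    The threshold adapts to the line (hypothetical rule, not the paper's):
    mean confidence minus k standard deviations, never below `floor`.
    """
    conf = [generalized_confidence(d1, d2) for d1, d2 in distances]
    threshold = max(floor, mean(conf) - k * pstdev(conf))
    return [i for i, c in enumerate(conf) if c < threshold]


# Four blocks on one line; the third has nearly equal top-two distances.
blocks = [(10.0, 40.0), (12.0, 48.0), (30.0, 33.0), (9.0, 45.0)]
flagged = reject_mis_segmented(blocks)
print(flagged)
```

In a segmentation loop, the flagged indices would be the blocks sent back for re-segmentation (merging or splitting), which is the feedback role the abstract assigns to the classifier.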