Discriminative Training of MQDF Classifier on Synthetic Chinese String Samples

Xia Chen, Tong-Hua Su, Tian-Wen Zhang, Yu Li
DOI: https://doi.org/10.1109/ccpr.2010.5659250
2010-01-01
Abstract: Reliable recognition of realistic Chinese handwriting is of great interest yet remains challenging. Among many factors, sufficient training samples and an advanced learning method are critical for identifying the underlying symbols of a string image. This paper presents an embedding training method for the MQDF classifier, aided by synthetic string samples, within the segmentation-recognition integration framework. First, the input string images are over-segmented into primitive segments. Then a separate MQDF classifier, discriminatively retrained on string samples, is used to measure the confidence of each segmentation hypothesis. Finally, the optimal path, comprising both the segmentation and the recognition results, is identified using beam search. When only natural string samples are used, the shortage of string samples is a serious problem. To expand the training data, a perturbation model is used to synthesize string samples. Experiments are conducted on the standard subset of the HIT-MW database. Both the embedding training method and the distortion model yield promising results.
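The path search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `beam_search_segmentation`, the parameters `max_merge` and `beam_width`, the additive log-confidence path score, and the toy scorer standing in for the MQDF classifier are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

Candidate = Tuple[int, int, str]  # (start segment, end segment, character label)


def beam_search_segmentation(
    num_segments: int,
    score_fn: Callable[[int, int], Tuple[str, float]],
    max_merge: int = 3,
    beam_width: int = 5,
) -> Tuple[float, List[Candidate]]:
    """Group primitive segments into characters, keeping the best partial paths.

    score_fn(start, end) is a hypothetical stand-in for the classifier: it
    returns (label, log_confidence) for the candidate character obtained by
    merging primitive segments start..end-1.
    """
    # beams[i] holds up to beam_width partial paths covering segments 0..i-1.
    beams: List[List[Tuple[float, List[Candidate]]]] = [
        [] for _ in range(num_segments + 1)
    ]
    beams[0] = [(0.0, [])]
    for end in range(1, num_segments + 1):
        extended = []
        # A candidate character may span at most max_merge primitive segments.
        for start in range(max(0, end - max_merge), end):
            label, log_conf = score_fn(start, end)
            for score, path in beams[start]:
                extended.append((score + log_conf, path + [(start, end, label)]))
        # Prune: keep only the highest-scoring partial paths ending here.
        extended.sort(key=lambda item: item[0], reverse=True)
        beams[end] = extended[:beam_width]
    return beams[num_segments][0]


# Toy confidences (assumed values) for a string of three primitive segments.
TOY_SCORES = {
    (0, 1): ("A", -2.0),
    (1, 2): ("B", -2.0),
    (0, 2): ("C", -0.5),
    (2, 3): ("D", -0.3),
    (1, 3): ("E", -3.0),
    (0, 3): ("F", -5.0),
}


def toy_score(start: int, end: int) -> Tuple[str, float]:
    return TOY_SCORES[(start, end)]


best_score, best_path = beam_search_segmentation(3, toy_score)
```

With these toy values, the selected path merges the first two primitive segments into one character and keeps the third as a separate character. In a real system the scorer would be the retrained MQDF classifier, and the path score might also include geometric or language-model terms, which this sketch omits.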