The Recognition of Chinese Caption Text in News Video Using Convolutional Neural Network

Dixiu Zhong, Ping Shi, Da Pan, Yuan Sha
DOI: https://doi.org/10.1109/imcec.2016.7867291
2016-01-01
Abstract: News video captions, which carry the main content of the related news story, play an important role in content-based video analysis and retrieval systems. In this paper, a convolutional neural network (CNN) is applied to the recognition of Chinese caption text in news video. First, color and edge features are used to locate the captions. Then, a segmentation method that combines Otsu thresholding with K-means clustering is applied to the caption images before they are passed to the CNN. We also present a method for generating and labeling training images automatically, which avoids complex and time-consuming data collection. Finally, two CNN models trained on different datasets are evaluated in our experiments. The baseline model achieves a recognition accuracy of 93.3% top-1 and 98.58% top-5 on Chinese caption text collected from news video. Averaging the two CNN models improves top-1 accuracy to 95%. The experimental results suggest that CNNs are well suited to the challenging task of Chinese character recognition.
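The abstract reports top-1 and top-5 accuracy and a gain from averaging two CNN models, but does not say which quantity is averaged. The sketch below assumes the per-class probability outputs of the two models are averaged element-wise; that is an assumption for illustration, not the authors' confirmed procedure. All function names and the toy probabilities are hypothetical, and a three-class problem stands in for the much larger Chinese character set.

```python
from typing import List, Sequence


def average_probabilities(p_a: Sequence[float], p_b: Sequence[float]) -> List[float]:
    """Element-wise mean of two models' class-probability vectors (assumed ensemble rule)."""
    if len(p_a) != len(p_b):
        raise ValueError("both models must score the same set of classes")
    return [(a + b) / 2.0 for a, b in zip(p_a, p_b)]


def top_k_classes(probs: Sequence[float], k: int) -> List[int]:
    """Indices of the k highest-scoring classes, best first."""
    return sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]


def top_k_accuracy(batch_probs: Sequence[Sequence[float]], labels: Sequence[int], k: int) -> float:
    """Fraction of samples whose true label is among the k best-scoring classes."""
    hits = sum(
        1 for probs, label in zip(batch_probs, labels)
        if label in top_k_classes(probs, k)
    )
    return hits / len(labels)


# Toy data (hypothetical): two samples, three classes, each model wrong on one sample.
model_a = [[0.5, 0.4, 0.1], [0.1, 0.2, 0.7]]
model_b = [[0.2, 0.7, 0.1], [0.1, 0.5, 0.4]]
labels = [1, 2]

ensemble = [average_probabilities(a, b) for a, b in zip(model_a, model_b)]

print("model A top-1:", top_k_accuracy(model_a, labels, 1))
print("model B top-1:", top_k_accuracy(model_b, labels, 1))
print("ensemble top-1:", top_k_accuracy(ensemble, labels, 1))
```

In this toy case each model misclassifies a different sample, so the averaged scores recover both labels. This is the kind of complementary error pattern under which averaging models trained on different datasets can raise top-1 accuracy.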