New Gradient-Spatial-Structural Features for Video Script Identification

Palaiahnakote Shivakumara,Zehuan Yuan,Danni Zhao,Tong Lu,Chew Lim Tan
DOI: https://doi.org/10.1016/j.cviu.2014.09.003
IF: 4.886
2014-01-01
Computer Vision and Image Understanding
Abstract:In this paper, we present new features based on Spatial-Gradient-Features (SGF) at block level for identifying six video scripts namely, Arabic, Chinese, English, Japanese, Korean and Tamil. This works helps in enhancing the capability of the current OCR on video text recognition by choosing an appropriate OCR engine when video contains multi-script frames. The input for script identification is the text blocks obtained by our text frame classification method. For each text block, we obtain horizontal and vertical gradient information to enhance the contrast of the text pixels. We divide the horizontal gradient block into two equal parts as upper and lower at the centroid in the horizontal direction. Histogram on the horizontal gradient values of the upper and the lower part is performed to select dominant text pixels. In the same way, the method selects dominant pixels from the right and the left parts obtained by dividing the vertical gradient block vertically. The method combines the horizontal and the vertical dominant pixels to obtain text components. Skeleton concept is used to reduce pixel width to a single pixel to extract spatial features. We extract four features based on proximity between end points, junction points, intersection points and pixels. The method is evaluated on 770 frames of six scripts in terms of classification rate and is compared with an existing method. We have achieved 82.1% average classification rate.
What problem does this paper attempt to address?