Multi-Strategy Tracking Based Text Detection in Scene Videos

Ze-Yu Zuo, Shu Tian, Wei-yi Pei, Xu-Cheng Yin
DOI: https://doi.org/10.1109/icdar.2015.7333727
2015-01-01
Abstract: Text detection and tracking in scene videos are important prerequisites for content-based video analysis and retrieval, wearable camera systems, and augmented reality translators on mobile devices. We present a novel multi-strategy tracking based text detection approach for scene videos. A state-of-the-art scene text detection module [1] is first used to detect text in each video frame. A multi-strategy text tracking technique is then proposed: it uses tracking by detection, spatio-temporal context learning, and linear prediction to predict candidate text locations sequentially, and a rule-based method adaptively integrates the candidate blocks and selects the best-matching text block. This technique combines the advantages of the three tracking techniques while compensating for their individual weaknesses. Experiments on a variety of scene videos show that the proposed approach is effective and robust, reducing false alarms and improving detection accuracy.
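The candidate-selection idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the actual rules, so everything below is an assumption made for illustration: the `Box` type, the constant-velocity form of `linear_predict`, the IoU threshold, and the priority order in `select_block` are hypothetical and are not the authors' method.

```python
from typing import NamedTuple, Optional


class Box(NamedTuple):
    """Axis-aligned text block: top-left corner plus width and height."""
    x: float
    y: float
    w: float
    h: float


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes (0.0 when they do not overlap)."""
    ix = max(0.0, min(a.x + a.w, b.x + b.w) - max(a.x, b.x))
    iy = max(0.0, min(a.y + a.h, b.y + b.h) - max(a.y, b.y))
    inter = ix * iy
    union = a.w * a.h + b.w * b.h - inter
    return inter / union if union > 0 else 0.0


def linear_predict(prev: Box, curr: Box) -> Box:
    """Assumed constant-velocity extrapolation from the last two frames."""
    return Box(2 * curr.x - prev.x, 2 * curr.y - prev.y, curr.w, curr.h)


def select_block(detection: Optional[Box],
                 context: Optional[Box],
                 prediction: Box,
                 min_iou: float = 0.3) -> Box:
    """Hypothetical rule: trust the detector first, then the context
    tracker, provided the candidate agrees with the linear prediction;
    otherwise fall back to the prediction itself."""
    for candidate in (detection, context):
        if candidate is not None and iou(candidate, prediction) >= min_iou:
            return candidate
    return prediction


# Usage: a block moving 2 px to the right per frame.
prev, curr = Box(0, 0, 10, 5), Box(2, 0, 10, 5)
pred = linear_predict(prev, curr)
chosen = select_block(Box(5, 0, 10, 5), None, pred)
```

In this sketch the linear prediction acts as a consistency check on the other two candidates, so a spurious detection far from the expected location is rejected. That is one plausible way a rule-based selection could reduce false alarms, offered only as an example.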