Abstract: Retrieving desired information from databases containing video, natural scene, and license plate images through keyword spotting is a major challenge for expert systems because the background and foreground of text vary widely in real-time environments. To reduce the background complexity of input images, we introduce a new model based on fractional means that considers the neighboring information of pixels to widen the gap between text and background. To do so, the process obtains text candidates with the help of k-means clustering. The proposed approach explores the combination of Radon and Fourier coefficients to define context features based on the regular patterns given by the coefficient distributions for the foreground and background of text candidates. This process eliminates non-text candidates regardless of font type and size, color, orientation, and script, and yields representatives of texts. The proposed approach then exploits the fact that text pixels share almost the same values to restore missing text components from the Canny edge image through a new minimum-cost-path-based ring-growing technique, and then outputs keywords. Furthermore, the proposed approach extracts the same features locally and globally for spotting words in images. Experimental results on different benchmark databases, namely, ICDAR 2013, ICDAR 2015, YVT, NUS video data, ICDAR 2013, ICDAR 2015, SVT, MSRA, UCSC, Medialab and Uninusubria license plate data, show that the proposed method is effective and useful compared to the existing methods. (C) 2018 Elsevier Ltd. All rights reserved.
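
The text-candidate step can be illustrated with a minimal sketch. The abstract does not state the number of clusters, the features being clustered, or which cluster is treated as text, so the sketch assumes two clusters over grayscale intensities and takes the brighter cluster as the candidate set. The fractional-means enhancement is omitted because its formula is not given here, and the names `kmeans_1d` and `text_candidate_mask` are hypothetical.

```python
def kmeans_1d(values, k=2, iters=20):
    """Plain k-means on scalar values (k >= 2); returns (labels, centers)."""
    lo, hi = min(values), max(values)
    # Assumption: spread the initial centers evenly over the value range.
    centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value goes to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def text_candidate_mask(img):
    """Binary mask marking pixels in the brighter of two intensity clusters.

    Assumption: text is brighter than background; the abstract does not say
    how the text cluster is chosen.
    """
    width = len(img[0])
    flat = [p for row in img for p in row]
    labels, centers = kmeans_1d(flat, k=2)
    text_cluster = max(range(2), key=lambda c: centers[c])
    return [[1 if labels[y * width + x] == text_cluster else 0
             for x in range(width)]
            for y in range(len(img))]


# Toy 3x3 grayscale patch with three bright pixels.
patch = [[10, 12, 200],
         [11, 210, 205],
         [9, 13, 11]]
mask = text_candidate_mask(patch)
```

In a fuller pipeline the resulting mask would still contain non-text regions, which is where the Radon and Fourier context features described above would be applied as a filter.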

New Texture-Spatial Features for Keyword Spotting in Video Images

A New Video Text Detection Method

A Novel Approach to Text Detection and Extraction from Videos by Discriminative Features and Density

Sharpness and Contrast Based Features for Word-Wise Video Type Classification

Video Identification Using Spatio-temporal Salient Points

Fractional Means Based Method for Multi-Oriented Keyword Spotting in Video/Scene/License Plate Images

A New Technique for Multi-Oriented Scene Text Line Detection and Tracking in Video

Keyword Spotting Simplified: A Segmentation-Free Approach using Character Counting and CTC re-scoring

A New Method For Spatiotemporal Textual Saliency Detection In Video

Multi-Script-Oriented Text Detection and Recognition in Video/Scene/Born Digital Images

Real-time End-to-End Video Text Spotter with Contrastive Representation Learning

Recognition of Video Text Through Temporal Integration

Multi-Spectral Fusion Based Approach for Arbitrarily Oriented Scene Text Detection in Video Images

New Tampered Features for Scene and Caption Text Classification in Video Frame

A Robust Color-Independent Text Detection Method from Complex Videos

New Sharpness Features for Image Type Classification Based on Textual Information

Audio-visual Keyword Spotting for Mandarin Based on Discriminative Local Spatial-Temporal Descriptors

You Only Recognize Once: Towards Fast Video Text Spotting

Audio-visual Keyword Spotting Based on Adaptive Decision Fusion under Noisy Conditions for Human-Robot Interaction

Contour Restoration of Text Components for Recognition in Video/Scene Images

Towards Accurate Video Text Spotting with Text-wise Semantic Reasoning