Abstract: Mobile video is quickly becoming a mass consumer phenomenon. More and more people use their smartphones to search and browse video content while on the move. In this paper, we present an instant mobile video search system through which users can discover videos by simply pointing their phones at a screen and capturing just a few seconds of what they are watching. The system indexes large-scale video data in the cloud using a new layered audio-video indexing approach, extracts light-weight joint audio-video signatures in real time, and performs progressive search on mobile devices. Unlike most existing mobile video search applications, which simply send the original video query to the cloud, the proposed system is one of the first attempts at instant and progressive video search that leverages the light-weight computing capacity of mobile devices. The system is characterized by four properties: 1) a joint audio-video signature to deal with the large aural and visual variances associated with a query video captured by a mobile phone, 2) layered audio-video indexing to holistically exploit the complementary nature of audio and video signals, 3) light-weight fingerprinting to comply with mobile processing capacity, and 4) a progressive query process that significantly reduces computational costs and improves the user experience (the search can stop at any time once a confident result is achieved). We collected 1,400 query videos captured by 25 mobile users from a dataset of 600 hours of video. The experiments show that our system outperforms state-of-the-art methods, achieving 90.79% precision when the query video is shorter than 10 seconds and 70.07% even when the query video is shorter than 5 seconds.
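
The progressive query process mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the signature format, the index structure, or the stopping rule, so everything below is an assumption made for illustration: each captured query segment is represented as a list of integer hashes, the index is a plain dictionary mapping a hash to video IDs, and the hypothetical parameters `threshold` and `min_votes` stand in for whatever confidence test the system actually uses.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, List, Optional, Tuple


def progressive_search(
    segments: Iterable[List[int]],
    index: Dict[int, List[Hashable]],
    threshold: float = 0.6,   # hypothetical confidence threshold
    min_votes: int = 3,       # hypothetical minimum evidence before stopping
) -> Tuple[Optional[Hashable], int]:
    """Consume query segments one at a time and stop once one video dominates.

    Returns (best_video_id, number_of_segments_used).
    """
    votes: Counter = Counter()
    used = 0
    for used, signature_hashes in enumerate(segments, start=1):
        # Each hash of the new segment votes for every video it is indexed under.
        for h in signature_hashes:
            for video_id in index.get(h, ()):
                votes[video_id] += 1

        total = sum(votes.values())
        if total >= min_votes:
            best, count = votes.most_common(1)[0]
            # Stop early when the top candidate holds a large enough vote share.
            if count / total >= threshold:
                return best, used

    # Query exhausted without a confident result: return the best guess, if any.
    if votes:
        return votes.most_common(1)[0][0], used
    return None, used


if __name__ == "__main__":
    toy_index = {1: ["a"], 2: ["a", "b"], 3: ["a"], 4: ["b"]}
    print(progressive_search([[2], [1, 3], [4]], toy_index))
```

In this toy run the search stops after the second segment, so the third segment is never processed; this is the kind of saving the progressive design aims for, although the real system's signatures and confidence test are likely more elaborate.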

A Cross-media Retrieval System for Lecture Videos

Content Based Lecture Video Retrieval Using Speech and Video Text Information

The Research of Multimedia Cross Reference Retrieval System

Cross-modal Embeddings for Video and Audio Retrieval

Multimedia Analysis and Retrieval System

A Scalable Video Search Engine Based on Audio Content Indexing and Topic Segmentation

Multimodal Fusion of Speech and Text using Semi-supervised LDA for Indexing Lecture Videos

UATVR: Uncertainty-Adaptive Text-Video Retrieval

The VISIONE Video Search System: Exploiting Off-the-Shelf Text Search Engines for Large-Scale Video Retrieval

Cross-media retrieval using query dependent search methods

An Interactive Video Search Platform for Multi-modal Retrieval with Advanced Concepts

A Multi-channel Retrieval System for Multimedia Documents

ChatVideo: A Tracklet-centric Multimodal and Versatile Video Understanding System

Introducing Auxiliary Text Query-modifier to Content-based Audio Retrieval

An Automated End-to-end Lecture Capture and Broadcasting System

Cross-Media Retrieval: Concepts, Advances And Challenges

ECLIPSE: Efficient Long-range Video Retrieval using Sight and Sound

Instant Mobile Video Search with Layered Audio-Video Indexing and Progressive Transmission

The Interactive Video Retrieval System in SMARTV 2009

Listen, look, and gotcha: instant video search with mobile phones by layered audio-video indexing

Video and Audio are Images: A Cross-Modal Mixer for Original Data on Video-Audio Retrieval