MMH: Multi-Modal Hash for Instant Mobile Video Search

Wenhui Gao, Xinchen Liu, Huadong Ma, Yanan Li, Liang Liu
DOI: https://doi.org/10.1109/MIPR.2018.00018
2018-01-01
Abstract: Mobile devices have become an indispensable part of human life, enabling people to search and browse what they want on the move. Mobile video search, one of the most important services for users, still faces great challenges in the mobile internet scenario, such as limited computation ability, memory, and bandwidth. Therefore, this paper proposes a multi-modal hash based framework for instant mobile video search. In particular, we adopt an efficient deep convolutional neural network, MobileNet, with a hash layer to learn discriminative and compact visual features from videos. Moreover, we also consider a hand-crafted local visual descriptor and an audio fingerprint to build a multi-modal hash representation of videos. With the multi-modal hash codes, two types of hash indexes are built on the server to achieve efficient video search. Finally, the multi-modal hash codes are extracted on the mobile device and transferred in a three-step progressive procedure during the online search stage. Experiments on a real-world dataset show that the proposed framework not only achieves state-of-the-art accuracy but also obtains excellent efficiency.
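The abstract does not give implementation details, so the following is only a minimal sketch of the general idea behind searching with compact binary hash codes: real-valued hash-layer activations are thresholded into bits, and the server ranks stored codes by Hamming distance to the query code. The function names (`binarize`, `hamming_distance`, `search`), the 0.5 threshold, the 8-bit code length, and the brute-force scan are all assumptions made for illustration. They are not the paper's hash indexes, and the sketch does not cover the local descriptor, the audio fingerprint, or the three-step progressive transfer.

```python
import numpy as np


def binarize(activations, threshold=0.5):
    """Threshold activations (assumed to lie in [0, 1]) and pack the bits into bytes."""
    bits = (np.asarray(activations) >= threshold).astype(np.uint8)
    return np.packbits(bits, axis=-1)


def hamming_distance(query_code, db_codes):
    """Count differing bits between one packed query code and each packed database code."""
    xor = np.bitwise_xor(db_codes, query_code)
    return np.unpackbits(xor, axis=-1).sum(axis=-1)


def search(query_code, db_codes, top_k=3):
    """Return (index, distance) pairs for the top_k nearest codes (brute-force scan)."""
    distances = hamming_distance(query_code, db_codes)
    order = np.argsort(distances, kind="stable")[:top_k]
    return [(int(i), int(distances[i])) for i in order]


# Hypothetical 8-bit activations for three database videos and one query.
db_activations = np.array([
    [0.9, 0.1, 0.8, 0.2, 0.7, 0.3, 0.6, 0.4],
    [0.1, 0.9, 0.2, 0.8, 0.3, 0.7, 0.4, 0.6],
    [0.9, 0.1, 0.8, 0.2, 0.7, 0.3, 0.4, 0.6],
])
query_activations = [0.8, 0.2, 0.9, 0.1, 0.6, 0.4, 0.7, 0.3]

db_codes = binarize(db_activations)
query_code = binarize(query_activations)
print(search(query_code, db_codes, top_k=2))  # [(0, 0), (2, 2)]
```

Packing bits into bytes keeps each code small, which is the property that matters when codes are sent over a constrained mobile link. A production system would typically replace the linear scan with an index structure, as the abstract indicates.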