Abstract: Hashing is a common solution for content-based multimedia retrieval: it encodes high-dimensional feature vectors into short binary codes. Previous work has focused mainly on image hashing. These methods cannot be applied directly to video hashing, because videos contain not only spatial structure within each frame but also temporal correlation between successive frames. Some researchers handle this by encoding extracted key frames, but such frame-based methods are time-consuming in real applications. Others characterize a video by averaging the spatial features of its frames, so that existing hashing methods can be adopted. Unfortunately, this kind of “video” feature does not take the correlation between frames into account and may lose temporal information. In this paper, we therefore propose a novel unsupervised video hashing framework based on a deep neural network, which performs video hashing by incorporating temporal structure as well as the conventional spatial structure. Specifically, the spatial features of videos are obtained with a convolutional neural network, and the temporal features are established via long short-term memory (LSTM). A time series pooling strategy is then employed to obtain a single feature vector for each video. The resulting spatio-temporal feature can be applied to many existing unsupervised hashing methods. Experimental results on two real datasets indicate that, by employing the spatio-temporal features, our hashing method significantly improves the performance of existing methods that use only spatial features, and also obtains higher mean average precision than state-of-the-art video hashing methods.
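The pipeline described in the abstract (per-frame spatial features, an LSTM over the frame sequence, time series pooling, then an existing unsupervised hashing method) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the frame features are random stand-ins for CNN outputs, the LSTM weights are random and untrained, the pooling is a plain mean over hidden states, and the hashing step is a sign-of-random-projection stand-in for whichever unsupervised hashing method would actually be used. The sketch also shows why averaging frame features alone discards temporal order: the average is the same when the frames are reversed, while the pooled LSTM feature is not.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def random_matrix(rows, cols, rng, scale=0.5):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def lstm_hidden_states(frames, W, hidden):
    """Run a single-layer LSTM (biases omitted) over per-frame features.

    W maps the concatenation [x_t, h_{t-1}] to 4 * hidden gate
    pre-activations, ordered as input, forget, output, candidate.
    Returns one hidden state per frame.
    """
    h = [0.0] * hidden
    c = [0.0] * hidden
    states = []
    for x in frames:
        z = matvec(W, x + h)
        i = [sigmoid(v) for v in z[0:hidden]]
        f = [sigmoid(v) for v in z[hidden:2 * hidden]]
        o = [sigmoid(v) for v in z[2 * hidden:3 * hidden]]
        g = [math.tanh(v) for v in z[3 * hidden:4 * hidden]]
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        states.append(h)
    return states


def mean_pool(vectors):
    """Average a sequence of vectors into one vector (assumed pooling choice)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sign_hash(feature, P):
    """Binarize a feature by the sign of random projections (stand-in hashing)."""
    return [1 if v >= 0 else 0 for v in matvec(P, feature)]


rng = random.Random(0)
feat_dim, hidden, bits, n_frames = 8, 6, 16, 5

# Hypothetical per-frame CNN features for one video.
frames = [[rng.gauss(0, 1) for _ in range(feat_dim)] for _ in range(n_frames)]
W = random_matrix(4 * hidden, feat_dim + hidden, rng)  # untrained LSTM weights
P = random_matrix(bits, hidden, rng)                   # random projection matrix

video_feature = mean_pool(lstm_hidden_states(frames, W, hidden))
code = sign_hash(video_feature, P)

# Same frames in reverse order: a different video with identical frame content.
reversed_frames = frames[::-1]
spatial_only = mean_pool(frames)
spatial_only_rev = mean_pool(reversed_frames)
video_feature_rev = mean_pool(lstm_hidden_states(reversed_frames, W, hidden))

print("binary code:", "".join(str(b) for b in code))
```

In a real system the weights would be learned and the projection would be replaced by a trained hashing method; the sketch only shows how the stages connect and that the recurrent stage makes the video-level feature depend on frame order.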

Short Video Fingerprint Extraction: from Audio–visual Fingerprint Fusion to Multi-Index Hashing

Video Forensics Research Based on Authenticity and Integrity.

New Fusional Framework Combining Sparse Selection and Clustering for Key Frame Extraction.

Heterogeneous Hashing Network for Face Retrieval Across Image and Video Domains

A robust and fast video fingerprinting based on 3D-DCT and LSH

An Improved Video Identification Scheme Based on Video Tomography.

Learning Hierarchical Fingerprints via Multi-Level Fusion for Video Integrity and Source Analysis

A Novel Feature Fusion Based Framework for Efficient Shot Indexing to Massive Web Videos

A method for video authenticity based on the fingerprint of scene frame

Efficient Video Hashing Based on Low‐rank Frames

Video abstraction based on the visual attention model and online clustering

Unsupervised Video Hashing via Deep Neural Network

A novel video abstraction method based on fast clustering of the regions of interest in key frames

Vision Transformer Based Video Hashing Retrieval for Tracing the Source of Fake Videos

Compact and Robust Video Fingerprinting Using Sparse Represented Features.

Video fingerprint detecting and video sequence matching method and system based on visual features

A Supervised Video Hashing Method Based on a Deep 3D Convolutional Neural Network for Large-Scale Video Retrieval

Multi-granularity Geometrically Robust Video Hashing for Tampering Detection

Video Data Hierarchical Retrieval Via Deep Hash Method

Submodular video hashing: a unified framework towards video pooling and indexing.

Unsupervised Video Hashing by Exploiting Spatio-Temporal Feature