Robust video hashing via multilinear subspace projections

Mu Li, Vishal Monga
DOI: https://doi.org/10.1109/TIP.2012.2206036
Abstract: The goal of video hashing is to design hash functions that summarize videos by short fingerprints or hashes. While traditional applications of video hashing lie in database searches and content authentication, the emergence of websites such as YouTube and DailyMotion poses a challenging problem of anti-piracy video search. That is, hashes or fingerprints of an original video (provided to YouTube by the content owner) must be matched against those of videos uploaded to YouTube by users to identify instances of "illegal" or undesirable uploads. Because the uploaded videos invariably differ from the original in their digital representation (owing to incidental or malicious distortions), robust video hashes are desired. We model videos as order-3 tensors and use multilinear subspace projections, such as a reduced-rank parallel factor analysis (PARAFAC), to construct video hashes. We observe that, unlike most standard descriptors of video content, tensor-based subspace projections can offer excellent robustness while effectively capturing the spatio-temporal essence of the video for discriminability. We introduce randomization in the hash function by dividing the video into (secret-key-based) pseudo-randomly selected overlapping sub-cubes to guard against intentional guessing and forgery. A detection-theoretic analysis of the proposed hash-based video identification is presented, in which we derive analytical approximations for the error probabilities. Remarkably, these theoretical error estimates closely match the empirically observed error probabilities for our hash algorithm. Furthermore, experimental receiver operating characteristic (ROC) curves reveal that the proposed tensor-based video hash exhibits enhanced robustness against both spatial and temporal video distortions over state-of-the-art video hashing techniques.
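The pipeline described in the abstract (order-3 tensor, key-driven overlapping sub-cubes, low-rank PARAFAC) can be illustrated with a minimal sketch. It is not the authors' implementation. The following are assumptions made only for illustration: the function names, the sub-cube size and count, the use of a rank-1 decomposition fitted by alternating least squares, and the median-threshold binarization of the factor vectors.

```python
import numpy as np

def rank1_parafac(cube, iters=25):
    """Rank-1 PARAFAC of an order-3 tensor by alternating least squares.
    Assumption: rank 1 and a fixed iteration count, chosen for brevity."""
    _, w, t = cube.shape
    # Deterministic start so the same input always yields the same factors.
    b = np.ones(w) / np.sqrt(w)
    c = np.ones(t) / np.sqrt(t)
    for _ in range(iters):
        # Update each mode's factor with the other two held fixed.
        a = np.einsum('ijk,j,k->i', cube, b, c)
        a /= np.linalg.norm(a) + 1e-12
        b = np.einsum('ijk,i,k->j', cube, a, c)
        b /= np.linalg.norm(b) + 1e-12
        c = np.einsum('ijk,i,j->k', cube, a, b)
        c /= np.linalg.norm(c) + 1e-12
    return a, b, c

def video_hash(video, key, n_cubes=8, cube_shape=(16, 16, 8)):
    """Binary hash of a (height, width, frames) array.
    Assumption: `key` seeds the generator that places the sub-cubes."""
    rng = np.random.default_rng(key)
    ch, cw, ct = cube_shape
    H, W, T = video.shape
    bits = []
    for _ in range(n_cubes):
        # Independently drawn origins, so sub-cubes are free to overlap.
        y = rng.integers(0, H - ch + 1)
        x = rng.integers(0, W - cw + 1)
        z = rng.integers(0, T - ct + 1)
        cube = video[y:y + ch, x:x + cw, z:z + ct].astype(float)
        for f in rank1_parafac(cube):
            # Assumed binarization: threshold each factor at its median.
            bits.append(f > np.median(f))
    return np.concatenate(bits).astype(np.uint8)

def hamming(h1, h2):
    """Normalized Hamming distance between two equal-length binary hashes."""
    return float(np.mean(h1 != h2))
```

In use, the content owner and the search service would share `key`, hash both the reference and the uploaded video, and compare `hamming(h_ref, h_upload)` against a threshold. Without the key, the sub-cube locations are unknown to an attacker, which is the role the abstract assigns to randomization.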