Abstract: Detecting double compression is a primary step in analyzing the integrity of digital videos and is therefore of great importance in video forensics. However, current methods are vulnerable to the severe lossy quantization introduced during recompression, which makes it challenging to obtain reliable frame-wise detection results, especially for the High Efficiency Video Coding (HEVC) standard. To address these issues, this paper proposes a hybrid neural network that reveals abnormal frames in double-compressed HEVC videos by learning robust spatio-temporal representations from coding information in the compression domain. A statistical analysis of Coding Units (CUs) shows that HEVC video streams contain "rich" coding information that can be leveraged to identify abnormal traces caused by double compression. Two types of coding information maps are exploited: the CU Size Map (CSM) and the CU Prediction mode Map (CPM). In contrast with the conventional paradigm, which relies on pixel-level representations of decoded frames, the CSMs and CPMs of a short video clip are taken as the input, with the aim of achieving high robustness against low-quality recompression. In our hybrid neural network, an attention-based two-stream residual network learns hierarchical representations from the CSM and CPM, which are then jointly optimized by an attention-based fusion module. Finally, the temporal variation is modeled by a Long Short-Term Memory (LSTM) network to obtain frame-wise detection results. We have conducted extensive experiments covering various video content and coding parameters, such as bitrate and Group of Pictures (GOP) size. Experimental results show that our approach achieves state-of-the-art performance compared with conventional methods, especially when videos are recompressed at low bitrates.
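To make the two input maps concrete, the following is a minimal sketch of how per-frame CU information might be rasterized into a CSM and a CPM. It is an illustration under stated assumptions, not the authors' implementation: the `CodingUnit` record, the `build_maps` helper, the integer label tables, and the 8x8 cell resolution are all hypothetical, and the abstract does not say how the maps are actually encoded or how the CU data is extracted from the bitstream.

```python
from dataclasses import dataclass
from typing import List, Tuple

CELL = 8  # assumed map resolution: one cell per 8x8 block (smallest HEVC CU)

# Assumed integer labels; the encoding used in the paper is not given in the abstract.
SIZE_LABEL = {8: 1, 16: 2, 32: 3, 64: 4}
MODE_LABEL = {"intra": 1, "inter": 2, "skip": 3}


@dataclass
class CodingUnit:
    """Hypothetical record for one decoded CU (field names are illustrative)."""
    x: int      # left edge in pixels
    y: int      # top edge in pixels
    size: int   # CU side length in pixels (8, 16, 32, or 64)
    mode: str   # "intra", "inter", or "skip"


def build_maps(cus: List[CodingUnit], width: int, height: int
               ) -> Tuple[List[List[int]], List[List[int]]]:
    """Rasterize a frame's CUs into a CU Size Map and a CU Prediction mode Map."""
    cols = (width + CELL - 1) // CELL
    rows = (height + CELL - 1) // CELL
    csm = [[0] * cols for _ in range(rows)]  # 0 marks cells not covered by any CU
    cpm = [[0] * cols for _ in range(rows)]
    for cu in cus:
        row_end = min(rows, (cu.y + cu.size) // CELL)
        col_end = min(cols, (cu.x + cu.size) // CELL)
        for r in range(cu.y // CELL, row_end):
            for c in range(cu.x // CELL, col_end):
                csm[r][c] = SIZE_LABEL[cu.size]
                cpm[r][c] = MODE_LABEL[cu.mode]
    return csm, cpm
```

In such a setup, the maps of consecutive frames in a short clip would be stacked and fed to the two network streams, one stream per map type. The choice of labels and resolution here is only for illustration.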

Frame-Wise Detection of Double HEVC Compression by Learning Deep Spatio-Temporal Representations in Compression Domain