Abstract: For promising vision-based video coding on low-quality data, this paper proposes a sparse spatio-temporal representation with adaptive regularized dictionary learning and develops a low bit-rate video coding scheme. In the manner of reversed-complexity Wyner-Ziv coding, the scheme selects a subset of key frames to code at the original resolution, while the remaining frames are down-sampled and reconstructed by a sparse spatio-temporal approximation that uses the key frames as a training dataset. Since primitive patches (geometry) are of low dimensionality and can be learned well from the primitive patches across frames in a scale space, each video frame is divided into three layers: a primitive layer, a nonprimitive coarse layer, and a nonprimitive smooth layer. The multiscale differential feature representations are invertible, which facilitates reconstruction with dictionary learning. The target is formulated as an optimization problem: construct a sparse representation of 2-D patches and 3-D volumes over adaptive regularized dictionaries, namely a set of 2-D subdictionary pairs trained from primitive patches and a 3-D dictionary trained from nonprimitive volumes. Specifically, the nonprimitive layer is constructed as volumes in order to keep it consistent along the motion trajectory, which enables sparse representations over a learned 3-D spatio-temporal dictionary. Through hierarchical bidirectional motion estimation and adaptive overlapped block motion compensation, the 3-D low-frequency and high-frequency dictionary pair is designed by the K-SVD algorithm, which updates the atoms for optimal sparse representation and convergence. In reconstruction, the high-frequency information lost in the down-sampled frames can be synthesized from the sparse spatio-temporal representation over the adaptive regularized dictionaries. Extensive experiments validate the compression efficiency of the proposed scheme versus H.264/AVC in terms of both objective and subjective comparisons.
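The reconstruction step described in the abstract (synthesizing lost high-frequency content from a sparse code computed over a low-frequency dictionary) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `omp` and `synthesize_high_freq`, the patch size, the sparsity level `k`, and the random dictionaries are all assumptions made for illustration. In the paper's scheme the dictionary pair would be trained with K-SVD from key-frame data; here a plain orthogonal matching pursuit stands in for the sparse coding stage, and the dictionaries are random placeholders.

```python
import numpy as np


def omp(D, y, k):
    """Greedy orthogonal matching pursuit (illustrative).

    Approximates y using at most k columns (atoms) of D and returns the
    coefficient vector, which has at most k nonzero entries.
    """
    y = np.asarray(y, dtype=float)
    residual = y.copy()
    support = []                      # indices of the selected atoms
    sol = np.zeros(0)
    for _ in range(k):
        # Select the atom most correlated with the current residual.
        corr = np.abs(D.T @ residual)
        if support:
            corr[support] = 0.0       # never pick the same atom twice
        support.append(int(np.argmax(corr)))
        # Re-fit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
        if np.linalg.norm(residual) < 1e-10:
            break
    coef = np.zeros(D.shape[1])
    coef[support] = sol
    return coef


def synthesize_high_freq(D_low, D_high, y_low, k=3):
    """Sparse-code a low-frequency patch over D_low, then reuse the same
    code with the paired dictionary D_high to estimate the missing
    high-frequency patch (coupled-dictionary assumption)."""
    alpha = omp(D_low, y_low, k)
    return D_high @ alpha, alpha


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m = 64, 128                    # hypothetical 8x8 patches, 128 atoms
    # Placeholder dictionaries with unit-norm atoms (NOT learned by K-SVD).
    D_low = rng.standard_normal((n, m))
    D_low /= np.linalg.norm(D_low, axis=0)
    D_high = rng.standard_normal((n, m))
    y_low = rng.standard_normal(n)    # stand-in for a low-frequency patch
    x_high, alpha = synthesize_high_freq(D_low, D_high, y_low, k=5)
    print("nonzero coefficients:", np.count_nonzero(alpha))
    print("high-frequency estimate shape:", x_high.shape)
```

The same idea would extend to 3-D volumes by vectorizing each spatio-temporal block before coding; how the paper regularizes and selects subdictionaries is not specified in the abstract and is not modeled here.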

Progressive Dictionary Learning with Hierarchical Structure for Scalable Video Coding

Multiscale Online Dictionary Learning for Quality Scalable Video Coding

Deep Hierarchical Video Compression

Sparse Spatio-Temporal Representation with Adaptive Regularized Dictionary Learning for Low Bit-Rate Video Coding

Hierarchical Piece-Wise Linear Projections for Efficient Intra-Prediction Coding

DCT-Prediction Based Progressive Fine Granularity Scalable Coding

Sparse Spatio-Temporal Representation with Adaptive Regularized Dictionaries for Super-Resolution Based Video Coding

Sparse Representation with Spatio-Temporal Online Dictionary Learning for Promising Video Coding

High-Efficiency Neural Video Compression via Hierarchical Predictive Learning

Online Dictionary Learning Based Intra-Frame Video Coding

Generalized In-Scale Motion Compensation Framework for Spatial Scalable Video Coding

Multi-view Video Coding Based on View Prediction

Adaptive Prediction Structure for Learned Video Compression

Constant Quality Control for Hierarchical-B Structure in Scalable Video Coding

In-Scale Motion Compensation for Spatially Scalable Video Coding

Scalable Structured Compressive Video Sampling with Hierarchical Subspace Learning

Deep Reference Generation with Multi-Domain Hierarchical Constraints for Inter Prediction

End-to-end Neural Video Coding Using a Compound Spatiotemporal Representation

Deep Video Compression with Scaled Hierarchical Bi-directional Motion Model

Temporal Scalable Video Transmission Using Multi-Reference Prediction Chain Coding

Learned Hierarchical B-frame Coding with Adaptive Feature Modulation for YUV 4:2:0 Content