Generative Visual Compression: A Review

Bolin Chen, Shanzhi Yin, Peilin Chen, Shiqi Wang, Yan Ye

2024-02-03

Abstract: Artificial Intelligence Generated Content (AIGC) is leading a new technical revolution in the acquisition of digital content and driving visual compression towards competitive performance gains and more diverse functionality than traditional codecs offer. This paper provides a thorough review of recent advances in generative visual compression, illustrating its great potential and promising applications in ultra-low bitrate communication, user-specified reconstruction/filtering, and intelligent machine analysis. In particular, we review visual data compression methodologies built on deep generative models, and summarize how compact representation and high-fidelity reconstruction can be achieved via generative techniques. In addition, we survey related generative compression technologies for machine vision and intelligent analytics. Finally, we discuss the fundamental challenges of generative visual compression techniques and envision future research directions.

Computer Vision and Pattern Recognition, Image and Video Processing

What problem does this paper attempt to address?

The core problem this paper addresses is **how to use generative models (such as the Variational Auto-Encoder (VAE), Generative Adversarial Network (GAN) and Diffusion Model (DM)) to achieve efficient visual data compression**, that is, to obtain high-quality visual reconstruction at the minimum encoding cost. Specifically, the paper focuses on the following aspects:

1. **Improving compression efficiency**: Traditional image and video codecs (such as H.264/AVC, H.265/HEVC and H.266/VVC) hit bottlenecks when dealing with large-scale visual data. Generative models can learn more compact feature representations and thus achieve a higher compression ratio.
2. **Enhancing reconstruction quality**: Generative models not only compress data, but also reconstruct high-quality images or videos from these compact features through powerful inference abilities. They show particular potential in ultra-low-bitrate communication, user-specified reconstruction/filtering and intelligent machine analysis.
3. **Diversified functions**: Generative models can support multiple advanced functions, such as cross-modal encoding (encoding images into text), conceptual encoding (decomposing images into structural information and texture codes), temporal evolution encoding (using inter-frame motion information for compression) and full-dimensional data encoding (such as 3D point clouds and panoramas).
4. **Applications in machine vision**: Beyond human viewing, generative compression can also serve machine vision tasks, so that compressed visual data still supports high task performance. This covers two approaches, pixel-domain analysis and feature-domain analysis, each optimized for different types of machine tasks.

In summary, the paper surveys recent progress of generative models in visual compression and looks ahead to future research directions, in particular how to overcome current challenges such as the selection of evaluation metrics, robustness and generalization, task-independent compression and communication design, and standardization and deployment.
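The goal stated above, the best reconstruction at the lowest encoding cost, is commonly expressed in compression work as a rate-distortion trade-off. The following is a minimal sketch of that idea, not a method taken from the paper: the function names, the weight `lam` and the toy pixel values are all hypothetical, and mean squared error stands in for whatever quality measure a real system would use.

```python
def bits_per_pixel(num_bytes: int, width: int, height: int) -> float:
    """Bitrate of a compressed image, in bits per pixel (bpp)."""
    return num_bytes * 8 / (width * height)


def mse(reference, reconstruction):
    """Mean squared error between two equally sized pixel sequences."""
    if len(reference) != len(reconstruction):
        raise ValueError("inputs must have the same length")
    return sum((r - x) ** 2 for r, x in zip(reference, reconstruction)) / len(reference)


def rd_cost(rate_bpp: float, distortion: float, lam: float) -> float:
    """Lagrangian rate-distortion cost: rate + lam * distortion.

    A larger lam (assumed, illustrative) penalizes distortion more,
    i.e. it favours quality over bitrate.
    """
    return rate_bpp + lam * distortion


# Toy example with made-up numbers: a 256x256 image stored in 1024 bytes.
rate = bits_per_pixel(1024, 256, 256)      # 8192 bits / 65536 pixels = 0.125 bpp
original = [0.0, 0.5, 1.0, 0.5]            # hypothetical pixel intensities
reconstructed = [0.1, 0.4, 0.9, 0.6]       # hypothetical decoder output
cost = rd_cost(rate, mse(original, reconstructed), lam=10.0)
print(f"rate={rate} bpp, cost={cost:.3f}")
```

Comparing `rd_cost` for two candidate encodings at the same `lam` is one simple way to decide which gives the better balance of size and fidelity. Generative codecs often add a perceptual or realism term to the distortion, which this sketch leaves out.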
