Consistency Guided Diffusion Model with Neural Syntax for Perceptual Image Compression

Haowei Kuang, Yiyang Ma, Wenhan Yang, Zongming Guo, Jiaying Liu
DOI: https://doi.org/10.1145/3664647.3681336
2024-01-01
Abstract: Diffusion models show impressive performance in image generation with excellent perceptual quality. However, their tendency to introduce additional distortion prevents their direct application in image compression. To address this issue, this paper introduces a Consistency Guided Diffusion Model (CGDM) tailored for perceptual image compression, which integrates an end-to-end image compression model with a diffusion-based post-processing network, aiming to learn richer detail representations with less fidelity loss. Specifically, the compression and post-processing networks are cascaded, and a branch of consistency guided features is added to constrain the deviation in the diffusion process for better reconstruction quality. Furthermore, a Syntax driven Feature Fusion (SFF) module is constructed to take an extra ultra-low bitstream from the encoding end as input, guiding the adaptive fusion of information from the two branches. In addition, we design a globally uniform boundary control strategy with overlapped patches and adopt a continuous online optimization mode to improve both coding efficiency and global consistency. Extensive experiments validate the superiority of our method over existing perceptual compression techniques. Our project is publicly available at: https://ellisonkuang.github.io/CGDM.github.io/.