PKU-AIGI-500K: A Neural Compression Benchmark and Model for AI-Generated Images

Xunxu Duan,Siwei Ma,Hongbin Liu,Chuanmin Jia
DOI: https://doi.org/10.1109/jetcas.2024.3385629
IF: 5.877
2024-01-01
IEEE Journal on Emerging and Selected Topics in Circuits and Systems
Abstract:In recent years, artificial intelligence-generated content (AIGC) enabled by foundation models has received increasing attention and is undergoing remarkable development. Text prompts can be elegantly translated/converted into high-quality, photo-realistic images. This remarkable feature, however, has introduced extremely high bandwidth requirements for compressing and transmitting the vast number of AI-generated images (AIGI) for such AIGC services. Despite this challenge, research on compression methods for AIGI is conspicuously lacking but undeniably necessary. This research addresses this critical gap by introducing the pioneering AIGI dataset, PKU-AIGI-500K, encompassing over 105k+ diverse prompts and 528k+ images derived from five major foundation models. Through this dataset, we delve into exploring and analyzing the essential characteristics of AIGC images and empirically prove that existing data-driven lossy compression methods achieve sub-optimal or less efficient rate-distortion performance without fine-tuning, primarily due to a domain shift between AIGIs and natural images. We comprehensively benchmark the rate-distortion performance and runtime complexity analysis of conventional and learned image coding solutions that are openly available, uncovering new insights for emerging studies in AIGI compression. Moreover, to harness the full potential of redundant information in AIGI and its corresponding text, we propose an AIGI compression model (Cross-Attention Transformer Codec, CATC) trained on this dataset as a strong baseline. Subsequent experimental results demonstrate that our proposed model achieves up to 30.09% bitrate reduction compared to the state-of-the-art (SOTA) H.266/VVC codec and outperforms the SOTA learned codec, paving the way for future research in AIGI compression.
What problem does this paper attempt to address?