GaussianImage: 1000 FPS Image Representation and Compression by 2D Gaussian Splatting

Xinjie Zhang, Xingtong Ge, Tongda Xu, Dailan He, Yan Wang, Hongwei Qin, Guo Lu, Jing Geng, Jun Zhang
2024-07-09
Abstract: Implicit neural representations (INRs) have recently achieved great success in image representation and compression, offering high visual quality and fast rendering speeds of 10-1000 FPS, assuming sufficient GPU resources are available. However, this requirement often hinders their use on low-end devices with limited memory. In response, we propose a groundbreaking paradigm of image representation and compression by 2D Gaussian Splatting, named GaussianImage. We first introduce 2D Gaussians to represent the image, where each Gaussian has 8 parameters including position, covariance and color. Subsequently, we unveil a novel rendering algorithm based on accumulated summation. Remarkably, our method, with a minimum of 3$\times$ lower GPU memory usage and 5$\times$ faster fitting time, not only rivals INRs (e.g., WIRE, I-NGP) in representation performance, but also delivers a faster rendering speed of 1500-2000 FPS regardless of parameter size. Furthermore, we integrate an existing vector quantization technique to build an image codec. Experimental results demonstrate that our codec attains rate-distortion performance comparable to compression-based INRs such as COIN and COIN++, while facilitating decoding speeds of approximately 2000 FPS. Additionally, a preliminary proof of concept shows that our codec surpasses COIN and COIN++ in performance when using partial bits-back coding. Code is available at https://github.com/Xinjie-Q/GaussianImage.
Image and Video Processing, Artificial Intelligence, Computer Vision and Pattern Recognition, Multimedia
What problem does this paper attempt to address?
The paper addresses efficient image representation and compression, especially on low-resource devices. Implicit Neural Representations (INRs) have been successful at both tasks, but they rely on large multilayer perceptron networks, which leads to long training times and high GPU memory requirements, and their fast rendering depends on sufficient GPU resources being available. To tackle these issues, the paper introduces an image representation and compression method based on 2D Gaussian Splatting, named GaussianImage. GaussianImage represents an image as a set of 2D Gaussians, each described by 8 parameters covering position, covariance, and color. Compared to 3D Gaussian Splatting, this significantly reduces the number of parameters per Gaussian, lowering storage requirements and GPU memory usage. The paper also proposes a rendering algorithm based on accumulated summation, which uses the information of every Gaussian covering the current pixel to improve fitting performance and avoids the transparency accumulation required by traditional α-blending, accelerating both training and inference. To turn the 2D Gaussian representation into a practical image codec, the paper applies quantization-aware fine-tuning together with an encoding strategy that combines floating-point quantization, integer quantization, and residual vector quantization, yielding what it describes as the first image codec based on 2D Gaussian Splatting. Experimental results show that, compared to existing INR methods, GaussianImage achieves faster training and inference while reducing GPU memory usage and maintaining similar visual quality.
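To make the accumulated-summation rendering described above more concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: the class name `Gaussian2D`, the function names, the use of integer pixel coordinates, and the choice to store the covariance as a lower-triangular Cholesky factor are all assumptions made for illustration. The only details taken from the text are that each Gaussian has 8 parameters (position, covariance, color) and that a pixel's color is accumulated by summing the contributions of the Gaussians covering it, without α-blending.

```python
import math
from dataclasses import dataclass


@dataclass
class Gaussian2D:
    # 8 parameters in total: 2 for position, 3 for covariance, 3 for color.
    x: float
    y: float
    # Assumption: covariance is stored as a lower-triangular Cholesky factor
    # L = [[l1, 0], [l2, l3]] with Sigma = L L^T, which keeps Sigma valid.
    l1: float
    l2: float
    l3: float
    red: float
    green: float
    blue: float


def gaussian_weight(g, px, py):
    """Return exp(-0.5 * d^T Sigma^{-1} d) for the offset d from g's center."""
    dx, dy = px - g.x, py - g.y
    # Solve L z = d by forward substitution; then d^T Sigma^{-1} d = |z|^2.
    z1 = dx / g.l1
    z2 = (dy - g.l2 * z1) / g.l3
    return math.exp(-0.5 * (z1 * z1 + z2 * z2))


def render(gaussians, width, height):
    """Accumulated summation: each pixel is a plain weighted sum of colors."""
    image = []
    for py in range(height):
        row = []
        for px in range(width):
            r = gr = b = 0.0
            for g in gaussians:
                # No depth sorting and no transmittance term are needed,
                # so the result does not depend on the order of Gaussians.
                w = gaussian_weight(g, px, py)
                r += w * g.red
                gr += w * g.green
                b += w * g.blue
            row.append((r, gr, b))
        image.append(row)
    return image
```

For example, `render([Gaussian2D(1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 0.5, 0.25)], 3, 3)` produces a 3×3 image whose center pixel equals the Gaussian's color. A practical renderer would restrict each pixel to nearby Gaussians and run on the GPU; this sketch loops over all of them for clarity.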
As an efficient image codec, GaussianImage provides compression performance comparable to COIN and COIN++, demonstrating its potential in the field of image representation and compression.
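Of the quantization techniques mentioned above, residual vector quantization is perhaps the least self-explanatory, so a small sketch may help. The code below is a generic illustration, not the paper's codec: the function names and the tiny hand-written codebooks are hypothetical, a real codec would learn its codebooks, and the summary does not state which Gaussian attribute each technique is applied to. The idea shown is that each stage quantizes whatever residual the previous stage left behind, so only one index per stage has to be stored.

```python
def nearest_index(codebook, vector):
    """Index of the codeword with the smallest squared distance to vector."""
    def sq_dist(codeword):
        return sum((c - v) ** 2 for c, v in zip(codeword, vector))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def rvq_encode(vector, codebooks):
    """Quantize vector stage by stage, returning one index per codebook."""
    residual = list(vector)
    indices = []
    for codebook in codebooks:
        i = nearest_index(codebook, residual)
        indices.append(i)
        # The next stage only sees what this stage failed to represent.
        residual = [r - c for r, c in zip(residual, codebook[i])]
    return indices


def rvq_decode(indices, codebooks):
    """Reconstruct the vector by summing the selected codewords."""
    out = [0.0] * len(codebooks[0][0])
    for i, codebook in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, codebook[i])]
    return out


# Hypothetical two-stage codebooks for a 3-component vector.
codebooks = [
    [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]],                    # coarse stage
    [[0.0, 0.0, 0.0], [0.25, 0.0, 0.0], [0.0, 0.25, 0.0]],  # refinement stage
]
indices = rvq_encode([1.2, 1.0, 1.0], codebooks)
reconstruction = rvq_decode(indices, codebooks)
```

Adding stages reduces the reconstruction error at the cost of one more stored index per vector, which is the rate-distortion trade-off a codec of this kind has to balance.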