Machine Perceptual Quality: Evaluating the Impact of Severe Lossy Compression on Audio and Image Models

Dan Jacobellis, Daniel Cummings, Neeraja J. Yadwadkar

2024-01-16

Abstract: In the field of neural data compression, the prevailing focus has been on optimizing algorithms for either classical distortion metrics, such as PSNR or SSIM, or human perceptual quality. With increasing amounts of data consumed by machines rather than humans, a new paradigm of machine-oriented compression, which prioritizes the retention of features salient for machine perception over traditional human-centric criteria, has emerged, creating several new challenges to the development, evaluation, and deployment of systems utilizing lossy compression. In particular, it is unclear how different approaches to lossy compression will affect the performance of downstream machine perception tasks. To address this under-explored area, we evaluate various perception models (including image classification, image segmentation, speech recognition, and music source separation) under severe lossy compression. We utilize several popular codecs spanning conventional, neural, and generative compression architectures. Our results indicate three key findings: (1) using generative compression, it is feasible to leverage highly compressed data while incurring a negligible impact on machine perceptual quality; (2) machine perceptual quality correlates strongly with deep similarity metrics, indicating a crucial role of these metrics in the development of machine-oriented codecs; and (3) using lossy compressed datasets (e.g., ImageNet) for pre-training can lead to counter-intuitive scenarios where lossy compression increases machine perceptual quality rather than degrading it. To encourage engagement on this growing area of research, our code and experiments are available at:

Image and Video Processing, Computer Vision and Pattern Recognition, Machine Learning, Sound, Audio and Speech Processing

What problem does this paper attempt to address?

The paper asks how severe lossy compression affects the performance of audio and image models on machine perception tasks. As more data is consumed by machines rather than humans, a machine-oriented compression paradigm is emerging that prioritizes preserving features important for machine perception over traditional human-centric criteria. It remains unclear, however, how different approaches to lossy compression affect downstream machine perception tasks. To address this, the paper evaluates several perception models (image classification, image segmentation, speech recognition, and music source separation) under severe lossy compression, using popular codecs that span conventional, neural, and generative architectures. By systematically measuring these effects across audio and visual tasks, the authors hope to bridge the gap between the potential advantages of advanced lossy compression techniques and their practical application in machine learning pipelines.
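The distinction between a classical distortion metric and task performance can be illustrated with a small sketch. This is not the paper's code or evaluation protocol: the "codec" here is plain bit-depth quantization standing in for a real compressor, and `toy_classifier` is a hypothetical stand-in for a perception model. All function names are assumptions made for illustration. Only the PSNR formula, 10 log10(peak^2 / MSE), is standard.

```python
import math


def psnr(reference, degraded, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sequences."""
    if len(reference) != len(degraded):
        raise ValueError("sequences must have the same length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, degraded)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: no distortion
    return 10 * math.log10(peak ** 2 / mse)


def quantize(samples, bits):
    """Toy lossy 'codec' (assumption, not a real codec): keep `bits` bits
    of each 8-bit sample and reconstruct at the midpoint of each bin."""
    if not 1 <= bits <= 8:
        raise ValueError("bits must be between 1 and 8")
    step = 1 << (8 - bits)
    return [(s // step) * step + step // 2 for s in samples]


def toy_classifier(sample):
    """Hypothetical downstream 'perception model': bright (1) or dark (0)."""
    return 1 if sample >= 128 else 0


def accuracy(inputs, labels):
    """Fraction of inputs that the toy classifier labels correctly."""
    hits = sum(toy_classifier(x) == y for x, y in zip(inputs, labels))
    return hits / len(inputs)


# Deterministic test signal: a permutation of all 8-bit values.
samples = [(37 * i) % 256 for i in range(256)]
# Ground-truth labels come from the uncompressed data.
labels = [toy_classifier(s) for s in samples]

for bits in (8, 4, 2, 1):
    decoded = quantize(samples, bits)
    print(bits, "bits/sample:",
          "PSNR =", round(psnr(samples, decoded), 2), "dB,",
          "task accuracy =", accuracy(decoded, labels))
```

In this toy setup PSNR falls as the bit depth shrinks, while the classifier's accuracy is unchanged, because the quantizer happens to preserve the one feature the task depends on. Real codecs and models will behave differently; the sketch only shows why a distortion metric and a task metric need to be measured separately.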

On the Impact of Perceptual Compression on Deep Learning

Machine Perception-Driven Image Compression: A Layered Generative Approach

Perceptual impact of the loss function on deep-learning image coding performance

Learned Image Compression for Machine Perception

Perceptually Optimizing Deep Image Compression

The Helmholtz Method: Using Perceptual Compression to Reduce Machine Learning Complexity

Understanding The Effectiveness of Lossy Compression in Machine Learning Training Sets

On the Impact of Lossy Image and Video Compression on the Performance of Deep Convolutional Neural Network Architectures

Deep Perceptual Compression

Towards improved lossy image compression: Human image reconstruction with public-domain images

On Perceptual Lossy Compression: The Cost of Perceptual Reconstruction and An Optimal Training Framework

Exploring Compressed Image Representation as a Perceptual Proxy: A Study

Rethinking Lossy Compression: The Rate-Distortion-Perception Tradeoff

A New Image Codec Paradigm for Human and Machine Uses

Machine vision-aware quality metrics for compressed image and video assessment

Research on Application of Perceptual Model Based Image Compression

Perceptual Quality Study on Deep Learning based Image Compression

Lossy Compression with Data, Perception, and Classification Constraints

Quality Assessment of End-to-End Learned Image Compression

Analyzing and Mitigating JPEG Compression Defects in Deep Learning