Abstract: The deep learning revolution sparked by the 2012 AlexNet paper has been transformative for the field of computer vision. Many problems that were severely limited under classical solutions are now seeing unprecedented success. The rapid proliferation of deep learning methods has led to a sharp increase in their use in consumer and embedded applications. One consequence of deployment in consumer and embedded applications is the need for lossy multimedia compression, which is required to store and transmit data efficiently in these real-world scenarios. As such, there has been increased interest in a deep learning solution for multimedia compression that would allow for higher compression ratios and increased visual quality. The deep learning approach to multimedia compression, so-called Learned Multimedia Compression, involves computing a compressed representation of an image or video using a deep network for the encoder and the decoder. While these techniques have enjoyed impressive academic success, their industry adoption has been essentially non-existent. Classical compression techniques like JPEG and MPEG are too entrenched in modern computing to be easily replaced. This dissertation takes an orthogonal approach and leverages deep learning to improve the compression fidelity of these classical algorithms. This allows the incredible advances in deep learning to be used for multimedia compression without threatening the ubiquity of the classical methods. The key insight of this work is that methods motivated by first principles, i.e., the underlying engineering decisions that were made when the compression algorithms were developed, are more effective than general methods. By encoding prior knowledge into the design of the algorithm, the flexibility, performance, and/or accuracy are improved at the cost of generality...

On the Impact of Perceptual Compression on Deep Learning

The Helmholtz Method: Using Perceptual Compression to Reduce Machine Learning Complexity

Machine Perceptual Quality: Evaluating the Impact of Severe Lossy Compression on Audio and Image Models

Learned Image Compression for Machine Perception

Analyzing and Mitigating JPEG Compression Defects in Deep Learning

On the Impact of Lossy Image and Video Compression on the Performance of Deep Convolutional Neural Network Architectures

Perceptual impact of the loss function on deep-learning image coding performance

Perceptually Optimizing Deep Image Compression

Deep Perceptual Compression

The First Principles of Deep Learning and Compression

EARN: Toward Efficient and Robust JPEG Compression Artifact Reduction

CPIPS: Learning to Preserve Perceptual Distances in End-to-End Image Compression

Better Compression With Deep Pre-Editing

First Gradually, Then Suddenly: Understanding the Impact of Image Compression on Object Detection Using Deep Learning

Perceptual Quality Study on Deep Learning based Image Compression

Exploring Compressed Image Representation as a Perceptual Proxy: A Study

Machine Perception-Driven Image Compression: A Layered Generative Approach

Deep Learning-based Compressed Domain Multimedia for Man and Machine: A Taxonomy and Application to Point Cloud Classification

Good, Cheap, and Fast: Overfitted Image Compression with Wasserstein Distortion

Learning-based Compression for Noisy Images in the Wild

On Perceptual Lossy Compression: The Cost of Perceptual Reconstruction and An Optimal Training Framework