Abstract: We present novel reductions from sample compression schemes in multiclass classification, regression, and adversarially robust learning settings to binary sample compression schemes. Assuming a compression scheme of size $f(d_\mathrm{VC})$ for binary classes, where $d_\mathrm{VC}$ is the VC dimension, we obtain the following results: (1) If the binary compression scheme is a majority-vote or a stable compression scheme, then there exists a multiclass compression scheme of size $O(f(d_\mathrm{G}))$, where $d_\mathrm{G}$ is the graph dimension. Moreover, for general binary compression schemes, we obtain a compression scheme of size $O(f(d_\mathrm{G})\log|Y|)$, where $Y$ is the label space. (2) If the binary compression scheme is a majority-vote or a stable compression scheme, then there exists an $\epsilon$-approximate compression scheme for regression over $[0,1]$-valued functions of size $O(f(d_\mathrm{P}))$, where $d_\mathrm{P}$ is the pseudo-dimension. For general binary compression schemes, we obtain a compression scheme of size $O(f(d_\mathrm{P})\log(1/\epsilon))$. These results would have significant implications if the sample compression conjecture, which posits that any binary concept class with finite VC dimension admits a binary compression scheme of size $O(d_\mathrm{VC})$, is resolved (Littlestone and Warmuth, 1986; Floyd and Warmuth, 1995; Warmuth, 2003): our results would then immediately extend the proof of the conjecture to these other settings. We establish similar results for adversarially robust learning. We also provide an example of a concept class that is robustly learnable but has no bounded-size compression scheme. This demonstrates that, in the robust setting, learnability is not equivalent to having a compression scheme whose size is independent of the sample size, unlike in binary classification, where compression of size $2^{O(d_\mathrm{VC})}$ is attainable (Moran and Yehudayoff, 2016).
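To make the notion of a binary sample compression scheme concrete, here is a minimal sketch for one-dimensional thresholds $h_t(x) = 1[x \ge t]$, a class of VC dimension 1. It is an illustration only and is not taken from the paper: the names `compress` and `reconstruct` and the example sample are hypothetical, and the sample is assumed to be realizable by some threshold. The scheme keeps at most one labeled example, and the reconstructed hypothesis is consistent with the whole sample.

```python
from typing import Callable, List, Tuple

# A labeled sample: (point on the real line, binary label).
Sample = List[Tuple[float, int]]


def compress(sample: Sample) -> Sample:
    """Keep only the smallest positively labeled point (size <= 1)."""
    positives = [(x, y) for x, y in sample if y == 1]
    if not positives:
        return []  # no positives: the empty set encodes "all zeros"
    return [min(positives)]


def reconstruct(kept: Sample) -> Callable[[float], int]:
    """Rebuild a threshold hypothesis from the compressed sample."""
    if not kept:
        return lambda x: 0
    threshold = kept[0][0]
    return lambda x: 1 if x >= threshold else 0


# Hypothetical realizable sample (any threshold in (0.4, 0.55] fits it).
sample = [(0.1, 0), (0.4, 0), (0.55, 1), (0.9, 1), (0.7, 1)]
kept = compress(sample)
h = reconstruct(kept)

# The defining property: the reconstruction labels every sample point correctly.
assert all(h(x) == y for x, y in sample)
```

In the abstract's notation this corresponds to $f(d_\mathrm{VC}) = 1$ for this particular class. The reductions described above take such a binary scheme as a black box.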

Classification Via Minimum Incremental Coding Length

A Rate-Distortion-Classification Approach for Lossy Image Compression

Lossless Coding with Generalised Criteria

An Interpretable Compression and Classification System: Theory and Applications

Neural Normalized Compression Distance and the Disconnect Between Compression and Classification

Learning Binary Codes and Binary Weights for Efficient Classification

Segmentation of Multivariate Mixed Data Via Lossy Coding and Compression

Large-Margin Learning of Compact Binary Image Encodings

Linear Distance Coding for Image Classification

Design of a brief perceptual loss function with Hadamard codes

Sample Compression Scheme Reductions

Minimum Entropy Coupling with Bottleneck

Probing Image Compression For Class-Incremental Learning

Binary Linear Compression for Multi-label Classification

Optimal Compression of Unit Norm Vectors in the High Distortion Regime

Importance Matching Lemma for Lossy Compression with Side Information

Optimally Controllable Perceptual Lossy Compression

Lossy Compression Via Sparse Regression Codes: an Approximate Message Passing Approach

Efficient Maximal Coding Rate Reduction by Variational Forms

Restricted Minimum Error Entropy Criterion for Robust Classification

An efficient, provably exact, practical algorithm for the 0-1 loss linear classification problem