New Equivalences Between Interpolation and SVMs: Kernels and Structured Features

Chiraag Kaushik, Andrew D. McRae, Mark A. Davenport, Vidya Muthukumar
2023-05-04
Abstract: The support vector machine (SVM) is a supervised learning algorithm that finds a maximum-margin linear classifier, often after mapping the data to a high-dimensional feature space via the kernel trick. Recent work has demonstrated that in certain sufficiently overparameterized settings, the SVM decision function coincides exactly with the minimum-norm label interpolant. This phenomenon of support vector proliferation (SVP) is especially interesting because it allows us to understand SVM performance by leveraging recent analyses of harmless interpolation in linear and kernel models. However, previous work on SVP has made restrictive assumptions on the data/feature distribution and spectrum. In this paper, we present a new and flexible analysis framework for proving SVP in an arbitrary reproducing kernel Hilbert space with a flexible class of generative models for the labels. We present conditions for SVP for features in the families of general bounded orthonormal systems (e.g., Fourier features) and independent sub-Gaussian features. In both cases, we show that SVP occurs in many interesting settings not covered by prior work, and we leverage these results to prove novel generalization results for kernel SVM classification.
Machine Learning
What problem does this paper attempt to address?
### Problems Addressed by the Paper

The paper studies the behavior of support vector machines (SVMs) in overparameterized settings, particularly with kernel methods. Specifically:

1. **Support Vector Proliferation (SVP) Phenomenon**:
   - In certain sufficiently overparameterized settings, the SVM decision function coincides exactly with the minimum-norm label interpolant. This phenomenon, known as support vector proliferation (SVP), allows SVM performance to be understood by studying minimum-norm interpolation.
2. **Theoretical Framework**:
   - A new analysis framework is proposed for proving SVP in an arbitrary reproducing kernel Hilbert space (RKHS) under a flexible class of generative models for the labels. Conditions for SVP are given for general bounded orthonormal systems (such as Fourier features) and for independent sub-Gaussian features.
3. **Generalization Performance**:
   - Using these results, new generalization results for kernel SVM classification are proven, especially in heavily overparameterized scenarios where traditional generalization bounds do not apply.
4. **Extension of Existing Work**:
   - Compared to previous work, the paper relaxes assumptions on data distribution and dimensionality, allowing more structured feature maps that need not be independent or sub-Gaussian.
5. **Experimental Validation**:
   - Through specific examples (such as the double-layer ensemble model), the paper demonstrates the conditions under which SVP occurs in particular settings, including settings with highly structured features.

In summary, the paper aims to deepen the understanding of SVM behavior in overparameterized settings and provides new theoretical tools for analyzing SVM performance in complex feature spaces.
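To make the SVP phenomenon concrete, the following is a minimal sketch, not the paper's own code. It relies on a standard fact about the hard-margin SVM without a bias term: if the minimum-norm interpolant's coefficients $\alpha = K^{-1} y$ satisfy $y_i \alpha_i > 0$ for every training point, then every point is a support vector and the SVM decision function equals the minimum-norm interpolant. The function names, the Gaussian feature model, and the sizes `n` and `d` are illustrative assumptions.

```python
import numpy as np


def min_norm_coeffs(K, y):
    """Coefficients alpha of the minimum-norm interpolant f(x) = sum_i alpha_i k(x_i, x)."""
    return np.linalg.solve(K, y)


def svp_holds(K, y):
    """True when y_i * (K^{-1} y)_i > 0 for all i, i.e. every point is a support vector."""
    return bool(np.all(y * min_norm_coeffs(K, y) > 0))


# Hypothetical overparameterized setup: n points, d >> n i.i.d. Gaussian features.
rng = np.random.default_rng(0)
n, d = 20, 5000
X = rng.standard_normal((n, d))
y = rng.choice([-1.0, 1.0], size=n)
K = X @ X.T  # linear-kernel Gram matrix, close to d * identity when d >> n

print("SVP (d >> n):", svp_holds(K, y))

# A small hand-built positive definite Gram matrix where the condition fails.
K_bad = np.array([[1.0, 2.0], [2.0, 5.0]])
y_bad = np.array([1.0, 1.0])
print("SVP (hand-built):", svp_holds(K_bad, y_bad))
```

In the first case the Gram matrix is nearly a multiple of the identity, so $K^{-1} y$ is nearly proportional to $y$ and the sign condition is expected to hold. In the second case $K^{-1} y = (3, -1)$, so the second point violates the condition and the SVM solution differs from the interpolant.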