Abstract: Quadratic programming is a ubiquitous prototype in convex programming. Many combinatorial optimization problems on graphs and many machine learning problems can be formulated as quadratic programs; Support Vector Machines (SVMs) are a prominent example. Linear and kernel SVMs have been among the most popular models in machine learning over the past three decades, prior to the deep learning era. Generally, a quadratic program has an input size of $\Theta(n^2)$, where $n$ is the number of variables. Assuming the Strong Exponential Time Hypothesis ($\textsf{SETH}$), it is known that no $O(n^{2-o(1)})$ algorithm exists (Backurs, Indyk, and Schmidt, NIPS'17). However, problems such as SVMs usually feature much smaller input sizes: one is given $n$ data points, each of dimension $d$, with $d \ll n$. Furthermore, SVMs are variants with only $O(1)$ linear constraints. This suggests that faster algorithms are feasible, provided the program exhibits certain underlying structures. In this work, we design the first nearly-linear time algorithm for solving quadratic programs whenever the quadratic objective has small treewidth or admits a low-rank factorization, and the number of linear constraints is small. Consequently, we obtain a variety of results for SVMs:

* For linear SVM, where the quadratic constraint matrix has treewidth $\tau$, we can solve the corresponding program in time $\widetilde O(n\tau^{(\omega+1)/2}\log(1/\epsilon))$;
* For linear SVM, where the quadratic constraint matrix admits a low-rank factorization of rank $k$, we can solve the corresponding program in time $\widetilde O(nk^{(\omega+1)/2}\log(1/\epsilon))$;
* For Gaussian kernel SVM, where the data dimension $d = \Theta(\log n)$ and the squared dataset radius is small, we can solve it in time $O(n^{1+o(1)}\log(1/\epsilon))$. We also prove that when the squared dataset radius is large, $\Omega(n^{2-o(1)})$ time is required.
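To make the low-rank structure concrete, recall the textbook soft-margin dual of a linear SVM: its quadratic matrix has entries $Q_{ij} = y_i y_j \langle x_i, x_j \rangle$, so $Q = UU^\top$ with $U = \mathrm{diag}(y)X \in \mathbb{R}^{n \times d}$, and the only coupling linear constraint is $y^\top \alpha = 0$. The sketch below is not the algorithm from this work. It only illustrates, on synthetic data, why such a factorization avoids the $\Theta(n^2)$ input size: a product $Q\alpha$ can be evaluated in $O(nd)$ time without forming $Q$. The function names and the random data are assumptions made for illustration.

```python
import random


def low_rank_matvec(U, v):
    """Compute (U U^T) v in O(n d) time without forming the n x n matrix."""
    n, d = len(U), len(U[0])
    # w = U^T v is a length-d vector.
    w = [sum(U[i][j] * v[i] for i in range(n)) for j in range(d)]
    # U w is the length-n result.
    return [sum(row[j] * w[j] for j in range(d)) for row in U]


def dense_matvec(U, v):
    """Reference: build Q = U U^T explicitly (n^2 entries), then multiply."""
    n = len(U)
    Q = [[sum(a * b for a, b in zip(U[i], U[k])) for k in range(n)]
         for i in range(n)]
    return [sum(Q[i][k] * v[k] for k in range(n)) for i in range(n)]


random.seed(0)
n, d = 50, 3  # hypothetical sizes with d much smaller than n
X = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]  # data points
y = [random.choice([-1, 1]) for _ in range(n)]                  # labels
U = [[y[i] * X[i][j] for j in range(d)] for i in range(n)]      # rows y_i * x_i
alpha = [random.random() for _ in range(n)]                     # dual variables

fast = low_rank_matvec(U, alpha)
slow = dense_matvec(U, alpha)
max_diff = max(abs(a - b) for a, b in zip(fast, slow))
print("largest entrywise difference:", max_diff)
```

Both routines return the same vector up to floating-point rounding; only the dense one touches all $n^2$ entries of $Q$.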

Polynomial-time computing over quadratic maps I: sampling in real algebraic sets

Exact Parameterized Multilinear Monomial Counting via k-Layer Subset Convolution and k-Disjoint Sum.

Almost-Uniform Sampling of Points on High-Dimensional Algebraic Varieties

Almost Every Real Quadratic Polynomial has a Poly-time Computable Julia Set

Quadratic maps between non-abelian groups

The Complexity of Computing KKT Solutions of Quadratic Programs

Faster Algorithms for Structured Linear and Kernel Support Vector Machines

Quadratic-time computations for pseudo-Anosov mapping classes

Convergence of a Series Whose Terms Are Iterates of Quadratic Maps

A stratified polyhedral homotopy method for sampling positive dimensional zero sets of polynomial systems

Random sampling and polynomial-free interpolation by Generalized MultiQuadrics

Polynomial correspondences expressible as maps of $d$-tuples

Comparative Analysis of Polynomials with Their Computational Costs

Randomized matrix-free quadrature: unified and uniform bounds for stochastic Lanczos quadrature and the kernel polynomial method

On tractable exponential sums

On converses to the polynomial method

Quadratic Advantage with Quantum Randomized Smoothing Applied to Time-Series Analysis

Polynomially Solvable Cases Of Binary Quadratic Programs

Solving Polynomial Equations Over Finite Fields

A method for command identification, using modified collision free hashing with addition & rotation iterative hash functions (part 1)

Improved Algebraic Degeneracy Testing