Abstract: Support vector machines (SVMs) are successful supervised learning models that analyze data for classification and regression. Previous work has demonstrated the superiority of SVMs on high-dimensional, low-sample-size problems. However, as the sample size grows, solving large-scale SVMs accurately and efficiently becomes challenging, especially for nonlinear kernel SVMs, which can incur huge computational costs and an unaffordable storage burden. In this paper, we propose a highly efficient sparse semismooth Newton (SsN) based augmented Lagrangian (AL) method for solving a class of large-scale SVMs that can be formulated as a convex quadratic programming problem with a linear equality constraint and a simple box constraint. The asymptotic superlinear convergence rate of both the primal and the dual iteration sequences generated by the AL method is guaranteed by the piecewise linear-quadratic structure of the problem. Furthermore, we reveal the close connection between the number of support vectors and the sparse structure of the generalized Jacobian for the inner subproblem of the AL method. By exploiting this hidden sparsity, the inner subproblem can be solved efficiently and accurately by the SsN method, which greatly reduces the storage burden and computational costs. In particular, for nonlinear kernel SVMs, since the sparse structure may not manifest in the early iterations of the AL method, we first solve a linear kernel SVM approximated by the random Fourier features method to produce a good initial point, and then switch to solving the original problem. Numerical experiments demonstrate that the proposed algorithm outperforms current state-of-the-art solvers for large-scale SVMs.
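
The random Fourier features step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a Gaussian (RBF) kernel exp(-gamma * ||x - y||^2); the function name `rff_features`, the value of `gamma`, the number of features, and the synthetic data are illustrative assumptions and are not taken from the paper. The sketch only shows that inner products of the mapped features approximate the kernel matrix, which is what allows a linear kernel SVM on the mapped data to stand in for the nonlinear one when producing an initial point.

```python
import numpy as np

def rff_features(X, n_features, gamma, rng):
    """Map X of shape (n, d) to random Fourier features whose inner
    products approximate the RBF kernel exp(-gamma * ||x - y||^2)."""
    d = X.shape[1]
    # Frequencies are sampled from the kernel's spectral density, N(0, 2*gamma*I).
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    # Random phases, uniform on [0, 2*pi).
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

# Synthetic data and parameters (assumptions for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
gamma = 0.1

Z = rff_features(X, n_features=4000, gamma=gamma, rng=rng)
K_approx = Z @ Z.T  # Gram matrix of the linear kernel on the mapped data

# Exact RBF kernel matrix for comparison.
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K_exact = np.exp(-gamma * sq_dists)

max_err = np.abs(K_approx - K_exact).max()
print(f"max abs kernel approximation error: {max_err:.4f}")
```

Increasing `n_features` tightens the approximation at the cost of a wider feature matrix, so in practice the choice trades initial-point quality against the cost of the linear solve.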

Approximate Approach to Train SVM on Very Large Data Sets

A Parallel and Scalable Digital Architecture for Training Support Vector Machines

A Scalable Hardware Implementation of the Support Vector Machine

A Novel Approach to Incremental SVM Learning Algorithm

A Geometric Approach to Train SVM on Very Large Data Sets

An Incremental Updating Method for Support Vector Machines

Incremental batch learning with support vector machines

Support Vector Pursuit Learning

Incremental Learning Algorithm Based on Support Vector Machine

An Online Incremental Learning Support Vector Machine for Large-Scale Data

New method of SVM learning with large scale training data

A hybrid method for speeding SVM training

Using Support Vector Machines for Mining Regression Classes in Large Data Sets

An Efficient Algorithm for a Class of Large-Scale Support Vector Machines Exploiting Hidden Sparsity

On-line Support Vector Machine Training Algorithm and Its Application

A sparse semismooth Newton based augmented Lagrangian method for large-scale support vector machines

One-class support vector machines for large-scale data sets

Mini-batch Quasi-Newton Optimization for Large Scale Linear Support Vector Regression

Fast SVM training using edge detection on very large datasets

An improved training algorithm for support vector machines