Abstract: Support vector machines (SVMs) are successful supervised learning models that analyze data for classification and regression. Previous work has demonstrated the superiority of SVMs on high-dimensional, low-sample-size problems. However, as the sample size grows, solving large-scale SVMs accurately and efficiently becomes very challenging, especially for nonlinear kernel SVMs, which can incur huge computational costs and an unaffordable storage burden. In this paper, we propose a highly efficient sparse semismooth Newton (SsN) based augmented Lagrangian (AL) method for solving a class of large-scale SVMs that can be formulated as a convex quadratic programming problem with a linear equality constraint and a simple box constraint. The asymptotic superlinear convergence rate of both the primal and the dual iteration sequences generated by the AL method is guaranteed by the piecewise linear-quadratic structure of the problem. Furthermore, we reveal a close connection between the number of support vectors and the sparse structure of the generalized Jacobian for the inner subproblem of the AL method. By exploiting this hidden sparsity, the inner subproblem can be solved efficiently and accurately by the SsN method, which greatly reduces the storage burden and computational costs. For nonlinear kernel SVMs in particular, the sparse structure may not manifest in the early iterations of the AL method, so we first solve a linear kernel SVM approximated by the random Fourier features method to produce a good initial point, and then switch to solving the original problem. Numerical experiments demonstrate that the proposed algorithm outperforms current state-of-the-art solvers for large-scale SVMs.
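The random Fourier features step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2), which the abstract does not specify, and uses NumPy; the function name `rff_features` and its parameters are hypothetical and not taken from the paper. The sketch covers only the feature map whose inner products approximate the kernel matrix, not the AL or SsN solver itself.

```python
import numpy as np

def rff_features(X, n_features=200, gamma=1.0, seed=0):
    """Map X (n_samples, d) to random Fourier features whose inner products
    approximate the Gaussian kernel exp(-gamma * ||x - y||^2) (assumed kernel)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Frequencies are drawn from the kernel's spectral density, N(0, 2*gamma*I).
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    # Random phases, uniform on [0, 2*pi).
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

# Compare the approximate kernel matrix with the exact one on toy data.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 5))
gamma = 0.1
Z = rff_features(X, n_features=5000, gamma=gamma)
K_approx = Z @ Z.T
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K_exact = np.exp(-gamma * sq_dists)
max_err = np.abs(K_approx - K_exact).max()
```

With the features `Z` in hand, a linear SVM trained on `Z` stands in for the kernel SVM; its solution is the kind of approximate starting point the abstract describes. How that point is mapped back to the original problem is not stated in the abstract and is left out here.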

On Parallel Learning Based on Support Vector Machines

A Parallel and Scalable Digital Architecture for Training Support Vector Machines

A Novel SVM Modeling Approach For Highly Imbalanced And Overlapping Classification

Support Vector Machines Ensemble With Optimizing Weights By Genetic Algorithm

Fast Training Support Vector Machines Using Parallel Sequential Minimal Optimization

Parallelizing Support Vector Machines on Distributed Computers

Parallel Computing of Support Vector Machines

A Fast Training Method for OC-SVM Based on the Random Sampling Lemma

Parallel Proximal Support Vector Machine for High-Dimensional Pattern Classification

Fast Parallel SVM using Data Augmentation

Parallel and Distributed Structured SVM Training

Parallel Implementation of Gradient-Based Neural Networks for SVM Training

Parallel and Sequential Support Vector Machines for Multi-label Classification

Fast multi-view twin hypersphere support vector machine with consensus and complementary principles

A Multi-Core Parallel Fusion Algorithm for Remote-Sensing Image

Distributed Online Semi-Supervised Support Vector Machine

An Efficient Algorithm for a Class of Large-Scale Support Vector Machines Exploiting Hidden Sparsity

Parallel Multiclass Support Vector Machine for Remote Sensing Data Classification on Multicore and Many-Core Architectures

A sparse semismooth Newton based augmented Lagrangian method for large-scale support vector machines

Multi-View Scaling Support Vector Machines for Classification and Feature Selection

Improvement of Support Vector Machine Algorithm in Big Data Background