Abstract:The support vector machines (SVM) is a powerful classifier used for binary classification to improve the prediction accuracy. However, the non-differentiability of the SVM hinge loss function can lead to computational difficulties in high dimensional settings. To overcome this problem, we rely on Bernstein polynomial and propose a new smoothed version of the SVM hinge loss called the Bernstein support vector machine (BernSVM), which is suitable for the high dimension $p >> n$ regime. As the BernSVM objective loss function is of the class $C^2$, we propose two efficient algorithms for computing the solution of the penalized BernSVM. The first algorithm is based on coordinate descent with maximization-majorization (MM) principle and the second one is IRLS-type algorithm (iterative re-weighted least squares). Under standard assumptions, we derive a cone condition and a restricted strong convexity to establish an upper bound for the weighted Lasso BernSVM estimator. Using a local linear approximation, we extend the latter result to penalized BernSVM with non convex penalties SCAD and MCP. Our bound holds with high probability and achieves a rate of order $\sqrt{s\log(p)/n}$, where $s$ is the number of active features. Simulation studies are considered to illustrate the prediction accuracy of BernSVM to its competitors and also to compare the performance of the two algorithms in terms of computational timing and error estimation. The use of the proposed method is illustrated through analysis of three large-scale real data examples.

Estimating the Support of a High-Dimensional Distribution

A Support Detection and Root Finding Approach for Learning High-dimensional Generalized Linear Models

Fuzzy Support Vector Machine For Regression Estimation

Efficient Approximate Algorithms for the Closest Pair Problem in High Dimensional Spaces.

[Contusive ocular trauma--case report].

Estimation of multiple mean vectors in high dimension

On the proliferation of support vectors in high dimensions

Confidence Intervals and Hypothesis Testing for High-Dimensional Regression

Signed Support Recovery for Single Index Models in High-Dimensions

Discovering Support and Affiliated Features from Very High Dimensions

The statistical rate for support matrix machines under low rankness and row (column) sparsity

Estimation and Inference in Ultrahigh Dimensional Partially Linear Single-Index Models

A Constructive Approach to High-dimensional Regression

Nonparametric Estimation via Partial Derivatives

High-dimensional subset recovery in noise: Sparsified measurements without loss of statistical efficiency

Efficient Estimation of the Central Mean Subspace via Smoothed Gradient Outer Products

Bounded Support In Linear Random Coefficient Models: Identification And Variable Selection

Support and distribution inference from noisy data

Doubly Robust Semiparametric Inference Using Regularized Calibrated Estimation with High-dimensional Data

High-Dimensional Penalized Bernstein Support Vector Machines

A High-dimensional Convergence Theorem for U-statistics with Applications to Kernel-based Testing