Support vector machine in big data: smoothing strategy and adaptive distributed inference

Kangning Wang, Jin Liu, Xiaofei Sun
DOI: https://doi.org/10.1007/s11222-024-10506-5
IF: 2.3241
2024-09-28
Statistics and Computing
Abstract: Support vector machine (SVM) is a powerful binary classification tool, but the growing size of modern data poses challenges for it. First, the non-smoothness of the hinge loss makes large-scale computation difficult. Second, existing large-scale distributed algorithms rely heavily on uniformity and randomness conditions, which are frequently violated in practice. To address these issues, we first construct a convolution smoothing SVM, which enjoys a smooth and convex objective function. We then develop a distributed SVM in which the estimator can be computed conveniently by minimizing a pilot sample-based distributed surrogate loss. In particular, it remains adaptive when the uniformity or randomness condition is violated. The established theoretical results and numerical experiments on both synthetic and real data support the proposed methods.
statistics & probability; computer science, theory & methods
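
The abstract does not specify the kernel or bandwidth used in the convolution smoothing step. As a rough illustration only, the sketch below assumes a standard Gaussian kernel with a hypothetical bandwidth `h`; this may differ from the paper's choice. Under that assumption, smoothing the hinge loss max(1 - u, 0) gives the closed form (1 - u) * Phi((1 - u) / h) + h * phi((1 - u) / h), where Phi and phi are the standard normal CDF and density. The function names here are illustrative and not taken from the paper.

```python
import math


def hinge(u):
    """Plain hinge loss at margin u = y * f(x); not differentiable at u = 1."""
    return max(1.0 - u, 0.0)


def _norm_pdf(t):
    # Standard normal density.
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def _norm_cdf(t):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def smoothed_hinge(u, h):
    """Hinge loss convolved with a Gaussian kernel of bandwidth h > 0.

    Equals E[max(1 - u + h * Z, 0)] for Z ~ N(0, 1). The Gaussian kernel
    is an assumption made for this sketch.
    """
    t = (1.0 - u) / h
    return (1.0 - u) * _norm_cdf(t) + h * _norm_pdf(t)


def smoothed_hinge_grad(u, h):
    """Derivative of smoothed_hinge in u: -Phi((1 - u) / h), defined everywhere."""
    return -_norm_cdf((1.0 - u) / h)


if __name__ == "__main__":
    # Compare the two losses near the kink at u = 1.
    for u in (0.0, 0.9, 1.0, 1.1, 2.0):
        print(u, hinge(u), smoothed_hinge(u, h=0.25), smoothed_hinge_grad(u, h=0.25))
```

The smoothed loss stays convex and bounds the hinge loss from above, and it approaches the hinge loss as `h` shrinks. Its gradient is continuous, so gradient-based solvers can be applied directly.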