An Improved Parallel SVM Algorithm on Distributed System

Zelu Kang, Nong Xiao, Zhiguang Chen, Yang Ou, Xinming Li
DOI: https://doi.org/10.1109/cyberc49757.2020.00040
2020-01-01
Abstract: Support vector machine (SVM), a popular machine learning method, generalizes well on classification problems and has been applied to supervised learning in a wide range of domains such as medicine, finance, and biotechnology. To handle large-scale problems, various parallel SVM algorithms have been developed. Existing parallel algorithms usually train the SVM on the entire dataset, even though very few samples contribute to the final model. This paper designs an improved parallel SVM algorithm that reduces the size of the dataset before training in order to accelerate SVM. To maintain the classification accuracy of the final model, we also propose a method to recover effective data from the eliminated samples. We implement the proposed algorithm using MPI and compare it with LIBSVM (a popular sequential SVM package) and parallel sequential minimal optimization (PSMO). Experiments show that our approach significantly reduces the computation overhead while maintaining classification accuracy.
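The reduce-then-recover idea can be illustrated with a toy sketch. The abstract does not state the paper's reduction or recovery rules, so everything below is an assumption made for illustration: a 1D hard-margin SVM on separable data (where the optimal boundary is the midpoint between the closest opposite-class points), a naive stride-based reduction, and a recovery rule that brings back eliminated samples lying inside the margin of the reduced model. The function names `train`, `margin_violators`, and `train_with_reduction` are hypothetical, and no MPI parallelism is shown.

```python
def train(samples):
    """Toy 1D hard-margin SVM. samples: list of (x, y), y in {-1, +1}.

    Assumes every -1 point lies below every +1 point.
    Returns (threshold, half_margin).
    """
    neg_max = max(x for x, y in samples if y == -1)
    pos_min = min(x for x, y in samples if y == 1)
    if neg_max >= pos_min:
        raise ValueError("classes are not separable")
    return (neg_max + pos_min) / 2.0, (pos_min - neg_max) / 2.0


def margin_violators(model, samples):
    """Samples that fall strictly inside the margin or on the wrong side."""
    threshold, half_margin = model
    return [(x, y) for x, y in samples if y * (x - threshold) < half_margin]


def train_with_reduction(samples, stride=3):
    """Assumed scheme: train on a subsample, then recover useful samples."""
    # Reduction (illustrative only): keep every `stride`-th sample.
    kept = samples[::stride]
    eliminated = [s for i, s in enumerate(samples) if i % stride != 0]
    model = train(kept)
    # Recovery: eliminated samples that violate the reduced model's margin
    # could change the boundary, so add them back and retrain.
    recovered = margin_violators(model, eliminated)
    if recovered:
        model = train(kept + recovered)
    return model, len(kept) + len(recovered)


data = [(-5.0, -1), (-1.0, -1), (-3.0, -1), (-4.0, -1), (-2.0, -1),
        (4.0, 1), (1.0, 1), (3.0, 1), (5.0, 1), (2.0, 1)]

full_model = train(data)
reduced_model, used = train_with_reduction(data)
print(full_model, reduced_model, used)  # (0.0, 1.0) (0.0, 1.0) 7
```

In this toy setting the model trained on 7 of the 10 samples matches the model trained on all of them, because the two samples that define the boundary are recovered. Real kernel SVMs on non-separable data need a more careful selection rule, which is what the paper addresses.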