Approximate Approach to Train SVM on Very Large Data Sets

Zeng Zhiqiang (曾志强), Liao Beishui (廖备水), Gao Ji (高济)
DOI: https://doi.org/10.3969/j.issn.1002-137x.2009.11.051
2009-01-01
Computer Science
Abstract: Standard Support Vector Machine (SVM) training has O(l^3) time and O(l^2) space complexity, where l is the training set size, so it is computationally infeasible on very large data sets. This paper presents a novel SVM training method, the Approximate Vector Machine (AVM), which is based on an approximate solution and is intended to scale kernel methods up to very large data sets. The approach obtains only an approximately optimal hyperplane through incremental learning, and it uses probabilistic speedup and hot-start tricks to accelerate training at each iteration. Theoretical analysis indicates that the time and space complexities of AVM are independent of the training set size. Experiments on very large data sets show that the proposed method not only preserves the generalization performance of the original SVM classifiers but also outperforms existing scale-up methods in training time and number of support vectors.
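The abstract does not say how the probabilistic speedup is realized. One widely used form of the trick, shown here only as a sketch under that assumption and not as the authors' implementation, replaces a full scan over all l candidates with a scan over a small random sample: with 59 draws taken uniformly with replacement, the best sampled candidate lies in the top 5% of all candidates with probability of about 0.95, regardless of l. The function names below are hypothetical.

```python
import random


def prob_top_fraction(sample_size, fraction=0.05):
    """Probability that the best of `sample_size` uniform draws (with
    replacement) falls in the top `fraction` of all candidates."""
    return 1.0 - (1.0 - fraction) ** sample_size


def sampled_argmax(scores, sample_size=59, rng=None):
    """Return the index of the highest score among a random sample.

    Assumption: `scores` stands in for whatever per-point criterion the
    trainer maximizes (for example, a constraint violation). The cost
    depends on `sample_size`, not on len(scores).
    """
    rng = rng or random.Random(0)
    n = len(scores)
    # Sample indices with replacement so the probability bound above applies.
    candidates = [rng.randrange(n) for _ in range(sample_size)]
    return max(candidates, key=lambda i: scores[i])
```

Because the sample size is fixed, the per-iteration search cost stays constant as the training set grows, which is consistent with the abstract's claim that the complexity does not depend on the training set size.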