Fast Support Vector Data Description Training Using Edge Detection on Large Datasets.

Chenlong Hu, Bo Zhou, Jinglu Hu
DOI: https://doi.org/10.1109/ijcnn.2014.6889718
2014-01-01
Abstract: Support Vector Data Description (SVDD) inherits the properties of Support Vector Machines (SVM) and has become a prominent one-class classifier (OCC). As with the standard SVM, its O(n^3) time and O(n^2) space complexities, where n is the number of training samples, become major limitations on large training datasets. A simple and effective way to overcome these limitations is to reduce the size of the training dataset by retaining only the samples most relevant to the learned classifier. The closed decision boundary of a trained SVDD always lies in the edge area of the data distribution and is determined by a small subset of support vectors (SVs). In this paper, we therefore present a method based on edge detection that preserves the edge samples most relevant to the decision boundary. Clustering techniques are also applied to keep centroids that represent the global properties of the distribution, so as to avoid a decision boundary that lies too far outside the data. To limit the influence of noise, each training pattern is assigned a weight. Experiments on real and artificial datasets show that a classifier trained on the reconstructed training set, consisting of edge points and centroids, preserves performance while training much faster.
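The reduction step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which edge detection criterion or clustering algorithm the authors use, so everything below is an assumption made for illustration: the edge score (length of the mean unit vector from a point to its k nearest neighbours), the plain k-means loop, and the names `edge_score`, `kmeans`, `reduce_training_set`, `k`, `edge_fraction` and `n_clusters` are all hypothetical. The per-sample weighting and the SVDD training itself are omitted.

```python
import math
import random


def knn_indices(points, i, k):
    """Indices of the k nearest neighbours of points[i] (brute force)."""
    others = (j for j in range(len(points)) if j != i)
    return sorted(others, key=lambda j: math.dist(points[i], points[j]))[:k]


def edge_score(points, i, k):
    """Length of the mean unit vector from points[i] to its k neighbours.

    Assumed criterion, not taken from the paper: a score near 0 means the
    neighbours surround the point (interior); a score near 1 means they
    all lie on one side (edge of the distribution).
    """
    mean = [0.0] * len(points[i])
    for j in knn_indices(points, i, k):
        d = math.dist(points[i], points[j])
        if d == 0.0:
            continue  # duplicate point: no direction to contribute
        for a in range(len(mean)):
            mean[a] += (points[j][a] - points[i][a]) / (d * k)
    return math.hypot(*mean)


def kmeans(points, n_clusters, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns the centroids as tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, n_clusters)
    for _ in range(n_iter):
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(n_clusters),
                          key=lambda c: math.dist(p, centroids[c]))
            groups[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[c]
            for c, g in enumerate(groups)
        ]
    return centroids


def reduce_training_set(points, k=8, edge_fraction=0.2, n_clusters=3):
    """Keep the highest-scoring edge points plus the cluster centroids."""
    scores = [edge_score(points, i, k) for i in range(len(points))]
    n_edge = max(1, int(edge_fraction * len(points)))
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    edge_points = [points[i] for i in ranked[:n_edge]]
    return edge_points, kmeans(points, n_clusters)
```

The reduced set (edge points plus centroids) would then be passed to an SVDD trainer in place of the full dataset. The brute-force neighbour search here is quadratic in n and is only meant to show the idea; a spatial index would be needed on genuinely large datasets.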