Variance Clustering Based Outlier Identification Algorithm for Time Series Data

SHI Yi, ZHAO Jing, BAO Jun-peng, QI Yong, LIN Qin-ying
2012-01-01
Journal of Computer Applications
Abstract: Outliers in time series data directly affect data mining results and can even render an algorithm ineffective. The traditional Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm can be used for outlier identification; however, it has several deficiencies, such as sensitivity to parameters, high time complexity, and low accuracy. Considering the characteristics of time series data, an outlier identification algorithm based on variance clustering is proposed. The algorithm converts neighborhood density into variance and mean value, and converts the density threshold into the variance and threshold of a time window. Based on the definitions of outlier data, outlier cluster data, and abnormal data, the outlier identification rules are given. Because applying the algorithm once will probably not eliminate all the outliers, it is extended to a multi-pass identification algorithm by defining a termination condition. Applied to a space data mining system, the algorithm was shown to be general, with lower time complexity and higher accuracy.
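The abstract does not state the exact identification rules or the termination condition, so the following is only a minimal sketch of the general idea: compare each point with the mean and variance of its time window, and repeat the pass until no further outliers are found. The function names, the parameters `half_window`, `k`, `min_std`, and `max_passes`, and the rule "deviation greater than `k` standard deviations" are all assumptions made for illustration, not the authors' definitions.

```python
from statistics import mean, pstdev


def find_outliers_once(values, active, half_window=3, k=3.0, min_std=1e-9):
    """One pass: flag active points far from the mean of their time window.

    Assumed rule (not from the paper): a point is an outlier when it lies
    more than k standard deviations from the mean of its active neighbours.
    """
    flagged = []
    n = len(values)
    for i in range(n):
        if not active[i]:
            continue
        lo, hi = max(0, i - half_window), min(n, i + half_window + 1)
        # Window statistics exclude the point itself and earlier outliers.
        neighbours = [values[j] for j in range(lo, hi) if j != i and active[j]]
        if len(neighbours) < 2:
            continue
        window_mean = mean(neighbours)
        window_std = max(pstdev(neighbours), min_std)  # guard flat windows
        if abs(values[i] - window_mean) > k * window_std:
            flagged.append(i)
    return flagged


def find_outliers(values, half_window=3, k=3.0, min_std=1e-9, max_passes=10):
    """Repeat the single pass until it flags nothing (assumed termination)."""
    active = [True] * len(values)
    outliers = []
    for _ in range(max_passes):
        flagged = find_outliers_once(values, active, half_window, k, min_std)
        if not flagged:
            break
        for i in flagged:
            active[i] = False
        outliers.extend(flagged)
    return sorted(outliers)


series = [10.0, 10.1, 9.9, 10.2, 50.0, 10.0, 9.8, 10.1, 10.0, 9.9]
print(find_outliers(series))
```

Removing flagged points before the next pass keeps a large outlier from inflating the window variance and masking smaller ones nearby, which is one plausible reason a single pass may not be enough.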