Random Undersampling on Imbalance Time Series Data for Anomaly Detection

Mulyana Saripuddin, Azizah Suliman, Sera Syarmila Sameon, Bo Norregaard Jorgensen
DOI: https://doi.org/10.1145/3490725.3490748
2021-09-17
Abstract: Random Undersampling (RUS) is a resampling approach that addresses imbalanced data by randomly removing instances from the majority class. Anomalies are rare by nature, so the anomaly class usually contains far fewer instances than the other classes. In anomaly detection on time series data, an anomaly is identified when an unusual pattern occurs. Duplicating such unusual patterns may lead to overfitting, which is why this study chose undersampling over oversampling. The study applied RUS to the data with several algorithms to observe its effectiveness across different types of classifiers. To examine overfitting and underfitting, different training-to-testing split ratios were used. Five evaluation metrics were used to assess the performance of the approach. RUS improved the classification performance of every classifier, and the best result was obtained when RUS was applied with a deep learning algorithm.
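To make the resampling step concrete, here is a minimal sketch of random undersampling in plain Python. It is an illustration, not the authors' implementation: the function name `random_undersample`, the 1:1 target balance, the fixed seed, and the toy data are all assumptions, since the abstract does not state the sampling ratio or the dataset used.

```python
import random
from collections import defaultdict


def random_undersample(X, y, seed=0):
    """Randomly drop rows from larger classes until every class has as
    many rows as the smallest class (assumed 1:1 target balance)."""
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(y):
        by_class[label].append(idx)

    # The smallest class sets the number of rows kept per class.
    n_min = min(len(idxs) for idxs in by_class.values())

    # Sample n_min indices from each class without replacement.
    keep = []
    for idxs in by_class.values():
        keep.extend(rng.sample(idxs, n_min))

    # Sort so the kept rows stay in their original (temporal) order.
    keep.sort()
    return [X[i] for i in keep], [y[i] for i in keep]


# Hypothetical toy data: 95 normal rows (label 0), 5 anomalous rows (label 1).
X = [[float(i)] for i in range(100)]
y = [1 if i % 20 == 0 else 0 for i in range(100)]

X_res, y_res = random_undersample(X, y, seed=42)
print(len(X_res), sum(y_res))  # prints "10 5"
```

All five anomalous rows survive, while the normal class is cut from 95 rows to 5. In practice, resampling like this is typically applied to the training split only, so that the test split keeps the original class distribution.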