A PCA-Based Traffic Monitoring Approach for Distributed Computing Systems

Li Zhao, Ge Fu, Qian Liu, Xinran Liu, Wei Cao
DOI: https://doi.org/10.1109/sose.2014.38
2014-01-01
Abstract: Monitoring the traffic between applications deployed in a distributed computing system (DCS) can help analysts perceive the dynamic load of each application and detect anomalies in the running processes. However, because the traffic data is high-dimensional and strongly periodic, it is difficult to visualize and interpret. In this paper, we propose a traffic monitoring approach based on Principal Component Analysis (PCA), a classical dimension-reduction tool. We find that the first principal component (PC) represents the overall scale of the traffic, while the second PC reflects all nontrivial variations caused by different applications. We then locate the exact alteration time and identify exactly which applications changed by applying a semi-Bayes algorithm to the second PC. We further perform online anomaly detection on new traffic using the previously classified data. Experiments on datasets collected from several distributed computing systems including 44 applications show that the proposed approach can effectively facilitate DCS traffic monitoring and outperforms k-means and DBSCAN in identifying different system states.
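The PCA step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the traffic matrix is synthetic, all names (`BASE_LOAD`, `CHANGE_AT`, `power_iteration`, and so on) are hypothetical, and the eigenvectors are obtained by plain power iteration with deflation so that the example needs only the standard library. The semi-Bayes change-point algorithm is not reproduced; the sketch only projects the data onto the first two PCs and compares the mean second-PC score before and after an artificial load change in one application.

```python
import math
import random

def center(rows):
    """Subtract each column's mean (one column per application)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def covariance(rows):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]

def power_iteration(mat, iters=300, seed=0):
    """Dominant eigenpair of a symmetric positive semi-definite matrix."""
    rng = random.Random(seed)
    d = len(mat)
    v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
    for _ in range(iters):
        w = [sum(mat[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    mv = [sum(mat[i][j] * v[j] for j in range(d)) for i in range(d)]
    lam = sum(a * b for a, b in zip(v, mv))  # Rayleigh quotient
    return lam, v

# Synthetic traffic (assumption): 4 applications, 240 time steps, a shared
# 24-step periodic load, and a step change in application 1 at CHANGE_AT.
BASE_LOAD = [10.0, 20.0, 30.0, 40.0]
STEPS, PERIOD, CHANGE_AT, SHIFT = 240, 24, 144, 8.0
traffic = []
for t in range(STEPS):
    scale = 1.0 + 0.5 * math.sin(2.0 * math.pi * t / PERIOD)
    row = [b * scale for b in BASE_LOAD]
    if t >= CHANGE_AT:
        row[1] += SHIFT
    traffic.append(row)

centered = center(traffic)
cov = covariance(centered)

# First PC, then deflate the covariance matrix to obtain the second PC.
lam1, pc1 = power_iteration(cov)
d = len(cov)
deflated = [[cov[i][j] - lam1 * pc1[i] * pc1[j] for j in range(d)]
            for i in range(d)]
lam2, pc2 = power_iteration(deflated)

# Scores: projection of each time step onto the two components.
scores1 = [sum(a * b for a, b in zip(row, pc1)) for row in centered]
scores2 = [sum(a * b for a, b in zip(row, pc2)) for row in centered]

before = sum(scores2[:CHANGE_AT]) / CHANGE_AT
after = sum(scores2[CHANGE_AT:]) / (STEPS - CHANGE_AT)
print("variance along PC1, PC2:", lam1, lam2)
print("mean PC2 score before/after change:", before, after)
```

In this toy setting the shared periodic load carries most of the variance, so the step change in a single application shows up mainly as a level shift in the second-PC scores. Real traffic would include noise and many more applications, and in practice a linear-algebra library would replace the hand-written power iteration.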