PerfDoc: Automatic Performance Bug Diagnosis in Production Cloud Computing Infrastructures

Zilong Wen, Weiqi Dai, Deqing Zou, Hai Jin
DOI: https://doi.org/10.1109/trustcom.2016.0126
2016-01-01
Abstract: Performance bugs are difficult to diagnose in production cloud computing environments because they often appear only under specific conditions, are hard to reproduce, and provide little diagnostic information. In this paper, we propose PerfDoc, an automatic online diagnosis tool for performance bugs. PerfDoc helps developers analyze software performance anomalies and identify their causes. PerfDoc uses the system calls issued by the software to model its system call behavior with a self-organizing map, and identifies anomalous system calls with this behavior model by comparing the behavior of different execution units. PerfDoc does not require source code or any runtime instrumentation of the software. We have implemented a prototype of PerfDoc and tested it using real-world performance bugs in five popular open source server systems (Apache, Tomcat, Hadoop, HDFS, and MySQL). The results show that PerfDoc is able to locate the most suspicious system calls related to the performance bugs. Moreover, PerfDoc imposes only trivial runtime overhead (2.2% on average) on the tested software.
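The abstract does not specify PerfDoc's feature extraction, map size, or ranking rule, so the following is only a minimal sketch of the general idea: train a self-organizing map on system call frequency vectors from normally behaving execution units, then score a new unit by its distance to the best matching map node and rank system calls by how far they deviate from that node. The class `TinySOM`, the `SYSCALLS` feature order, the sample counts, and all parameters are hypothetical illustrations, not PerfDoc's actual implementation.

```python
import math
import random

# Hypothetical feature order: one dimension per system call type.
SYSCALLS = ["read", "write", "futex", "poll"]


def normalize(counts):
    """Turn raw system call counts into relative frequencies."""
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class TinySOM:
    """A small rectangular self-organizing map (illustrative only)."""

    def __init__(self, rows, cols, dim, seed=0):
        rng = random.Random(seed)
        self.weights = {(r, c): [rng.random() for _ in range(dim)]
                        for r in range(rows) for c in range(cols)}

    def bmu(self, vec):
        """Grid coordinates of the best matching unit for vec."""
        return min(self.weights, key=lambda n: sq_dist(self.weights[n], vec))

    def train(self, data, epochs=200, lr0=0.5, sigma0=1.0):
        for epoch in range(epochs):
            frac = epoch / epochs
            lr = lr0 * (1.0 - frac)                  # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.01     # shrinking neighborhood
            for vec in data:
                br, bc = self.bmu(vec)
                for (r, c), w in self.weights.items():
                    grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                    for i in range(len(w)):
                        w[i] += lr * h * (vec[i] - w[i])

    def score(self, vec):
        """Anomaly score: distance from vec to its best matching unit."""
        return math.sqrt(sq_dist(self.weights[self.bmu(vec)], vec))

    def suspicious(self, vec, names):
        """System call names ordered by deviation from the best matching unit."""
        w = self.weights[self.bmu(vec)]
        diffs = [(abs(v - wi), n) for v, wi, n in zip(vec, w, names)]
        return [n for _, n in sorted(diffs, reverse=True)]


# Made-up counts for four normal execution units and one anomalous unit.
normal = [normalize(c) for c in
          ([50, 30, 10, 10], [48, 32, 11, 9], [52, 29, 9, 10], [49, 31, 10, 10])]
anomalous = normalize([10, 5, 80, 5])

som = TinySOM(2, 2, len(SYSCALLS))
som.train(normal)
print(som.score(anomalous), som.suspicious(anomalous, SYSCALLS))
```

In this sketch the anomalous unit spends most of its calls on `futex`, so it lies far from every map node learned from the normal units and `futex` ranks first in the suspicious list. A real deployment would collect the counts from a system call tracer rather than hard-coded lists.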