VECROsim: A Versatile Metric-oriented Microservice Fault Simulation System (Tools and Artifact Track)

Tingzhu Bi, Yicheng Pan, Xinrui Jiang, Meng Ma, Ping Wang
DOI: https://doi.org/10.1109/issre55969.2022.00037
2022-01-01
Abstract: Automated fault diagnosis of microservice systems has been an active research topic in recent years. Because most incidents in real commercial cloud systems are not publicly available, researchers have put considerable effort into developing various experimental systems. However, previous tools cannot quickly refactor their functionality, scale their architecture, or customize fault characteristics. We therefore develop VECROsim, a versatile metric-oriented microservice fault simulation system, and release the VECROsim benchmark dataset. VECROsim is a highly customizable toolkit that automatically generates datasets of abnormal performance metrics for microservice systems on demand. Validation of representative services from the benchmark dataset confirms that VECROsim can generate realistic performance metrics for diverse real-world systems. Our case studies on root cause analysis and dynamic correlation discovery demonstrate the advantages of VECROsim. We also observe that the VECROsim dataset poses new research challenges for state-of-the-art fault diagnosis schemes. VECROsim supports both industry microservice developers and academic researchers working on fault diagnosis or broader research topics.