Trace-based Multi-Dimensional Root Cause Localization of Performance Issues in Microservice Systems

Chenxi Zhang,Zhen Dong,Xin Peng,Bicheng Zhang,Miao Chen
DOI: https://doi.org/10.1145/3597503.3639088
2024-01-01
Abstract:Modern microservice systems have become increasingly complicated due to the dynamic and complex interactions and runtime environment. It leads to the system vulnerable to performance issues caused by a variety of reasons, such as the runtime environments, communications, coordinations, or implementations of services. Traces record the detailed execution process of a request through the system and have been widely used in performance issues diagnosis in microservice systems. By identifying the execution processes and attribute value combinations that are common in anomalous traces but rare in normal traces, engineers may localize the root cause of a performance issue into a smaller scope. However, due to the complex structure of traces and the large number of attribute combinations, it is challenging to find the root cause from the huge search space. In this paper, we propose TraceContrast, a trace-based multidimensional root cause localization approach. TraceContrast uses a sequence representation to describe the complex structure of a trace with attributes of each span. Based on the representation, it combines contrast sequential pattern mining and spectrum analysis to localize multidimensional root causes efficiently. Experimental studies on a widely used microservice benchmark show that TraceContrast outperforms existing approaches in both multidimensional and instance-dimensional root cause localization with significant accuracy advantages. Moreover, Trace-Contrast is efficient and its efficiency can be further improved by parallel execution.
What problem does this paper attempt to address?