LabelEase: A Semi-Automatic Tool for Efficient and Accurate Trace Labeling in Microservices

Shenglin Zhang, Zeyu Che, Zhongjie Pan, Xiaohui Nie, Yongqian Sun, Lemeng Pan, Dan Pei
DOI: https://doi.org/10.1109/issre62328.2024.00032
2024-01-01
Abstract: Trace data is crucial for system observability and maintainability in microservice architectures, and many operations algorithms, including anomaly detection and root cause analysis, depend heavily on it. However, the actual performance of these algorithms can be unsatisfactory because high-quality labeled datasets for training and evaluation are lacking. Since billions of traces can be generated daily in large-scale microservice systems, labeling overhead is the main hurdle to obtaining high-quality trace datasets. In this paper, we propose LabelEase, a novel semi-automatic trace labeling tool that uses active learning techniques to achieve efficient and accurate trace labeling. For anomaly trace labeling, LabelEase clusters similar traces with a graph-based trace representation technique and selects a few representative traces for human labeling, so that most traces never need to be labeled by hand. For root cause labeling, LabelEase aggregates the labeled anomalous traces and identifies service failures for operators to label. Our systematic experiments on two large-scale datasets show that LabelEase achieves an F1-score above 0.98 in anomaly trace labeling and a precision of 0.89 for failure detection in root cause labeling, while reducing operators' labeling overhead by more than 99.9%. To the best of our knowledge, we are the first to propose a semi-automatic trace labeling tool capable of efficient and accurate trace labeling.
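The cluster-then-label idea in the abstract can be illustrated with a minimal sketch. This is not LabelEase's implementation: the abstract does not specify the graph-based representation or the active learning procedure, so the sketch substitutes a deliberately crude, hypothetical signature (the set of caller-to-callee edges plus a coarse latency bucket) and propagates one human label per cluster. The trace layout, `trace_signature`, `label_by_representatives`, the `bucket_ms` parameter, and the example operator are all assumptions made for illustration.

```python
from collections import defaultdict


def trace_signature(trace, bucket_ms=100):
    """Hypothetical stand-in for a learned graph representation:
    the sorted caller->callee edges plus a coarse latency bucket."""
    edges = tuple(sorted((s["parent"], s["service"]) for s in trace["spans"]))
    return edges, trace["duration_ms"] // bucket_ms


def label_by_representatives(traces, ask_operator):
    """Group traces by signature, ask for one label per group,
    and copy that label to every trace in the group."""
    clusters = defaultdict(list)
    for t in traces:
        clusters[trace_signature(t)].append(t)

    labels = {}
    for members in clusters.values():
        verdict = ask_operator(members[0])  # one human query per cluster
        for t in members:
            labels[t["id"]] = verdict
    return labels, len(clusters)


def make_trace(trace_id, duration_ms):
    # Toy two-span trace: client -> gateway -> cart (assumed layout).
    spans = [
        {"parent": "client", "service": "gateway"},
        {"parent": "gateway", "service": "cart"},
    ]
    return {"id": trace_id, "duration_ms": duration_ms, "spans": spans}


def example_operator(trace):
    # Stands in for a human operator; the 500 ms threshold is arbitrary.
    return "anomalous" if trace["duration_ms"] > 500 else "normal"


traces = [make_trace("t1", 40), make_trace("t2", 55),
          make_trace("t3", 950), make_trace("t4", 980)]
labels, queries = label_by_representatives(traces, example_operator)
overhead_reduction = 1 - queries / len(traces)
```

In this toy run the number of operator queries equals the number of clusters rather than the number of traces, which is the source of the overhead reduction; the quality of the propagated labels depends entirely on how well the representation groups traces that share a label.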