MTracer: A Trace-Oriented Monitoring Framework for Medium-Scale Distributed Systems

Jingwen Zhou, Zhenbang Chen, Haibo Mi, Ji Wang
DOI: https://doi.org/10.1109/SOSE.2014.37
2014-01-01
Abstract: Trace-oriented runtime monitoring is a very effective method to improve the reliability of distributed systems. However, for medium-scale distributed systems, existing trace-oriented monitoring frameworks are either not powerful or efficient enough, or too complex and expensive to deploy and maintain. In this paper, we present MTracer, a lightweight trace-oriented monitoring system for medium-scale distributed systems. We have proposed and implemented several optimizations to improve the efficiency of the monitor server in MTracer. A web-based frontend is also provided to visualize a monitored system from different perspectives. We have validated MTracer in a real medium-scale environment. The results indicate that MTracer has very low overhead and can handle more than 4000 events per second.