Terminator: An Efficient and Light-weight Fault Localization Framework

Yuxing Li, Hu Zheng, Chengqiang Huang, Ke Pei, Jinghui Li, Longbo Huang
DOI: https://doi.org/10.1109/INFOCOMWKSHPS50562.2020.9163055
2020-01-01
Abstract: Network fault localization has always been a major challenge for efficient data center operation. In this paper, we propose a faulty link localization framework named “Terminator,” which provides an efficient hierarchical link probing scheme for fault localization. With both local and global probing, Terminator conserves spine link bandwidth and localizes faulty links with high accuracy. We implement a prototype of Terminator and show that it outperforms existing benchmarks, including 007 [7], TOMO [5], PLL [6], and NetBouncer [2]. In particular, Terminator achieves an average internal problem-solving rate of 37.5% and improves the localization accuracy of 007 to nearly 100% in fat-tree topologies.