Online Learning Based Internet Service Fault Diagnosis Using Active Probing

Cheng Li, Shihong Zou, Lingwei Chu
DOI: https://doi.org/10.1109/icnsc.2009.4919376
2009-01-01
Abstract: One of the great challenges in Internet service fault management in noisy and uncertain environments is the difficulty of acquiring the prior distribution of faults. To address this problem, this paper proposes an active-probing-based approach for Internet services. A dynamic probabilistic dependency model based on a hidden Markov model (HMM) is chosen as the fault propagation model (FPM). A forward-backward (F-B) learning procedure is employed to estimate the FPM. The F-B procedure takes both uncertainty and excessive probing traffic load into account, revising the FPM with active probing and online learning techniques. Detection probes and diagnosis probes are employed separately, in the fault detection phase and the fault diagnosis phase respectively. The selection of diagnosis probes is integrated into the online model learning procedure. For fault diagnosis, a Viterbi N-best based approach is proposed to record the N most likely faulty components, using the probing information gain from the F-B learning procedure. As a result, the approach reduces the complexity of acquiring the fault prior distribution and improves detection accuracy. Simulation results support the validity and efficiency of the HMM-based FPM and the proposed approaches.
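The abstract does not give the model's parameters or the exact form of the Viterbi N-best procedure, so the following is only a minimal sketch of the general idea: forward-backward smoothing on a small discrete HMM, followed by a ranking of the N most probable states after a series of probe results. The state names, probabilities, and the function names `forward_backward` and `top_n_states` are all hypothetical, and the ranking uses posterior marginals rather than the paper's Viterbi N-best search.

```python
def forward_backward(pi, A, B, obs):
    """Posterior state marginals P(state_t | all observations) for a discrete HMM.

    pi[i]   : initial probability of state i
    A[i][j] : transition probability from state i to state j
    B[i][o] : probability of observing symbol o in state i
    obs     : list of observed symbols (here: probe outcomes)
    """
    n, T = len(pi), len(obs)

    # Forward pass, normalised at each step to avoid numerical underflow.
    alpha = []
    for t in range(T):
        if t == 0:
            cur = [pi[j] * B[j][obs[0]] for j in range(n)]
        else:
            cur = [B[j][obs[t]] * sum(alpha[t - 1][i] * A[i][j] for i in range(n))
                   for j in range(n)]
        total = sum(cur)
        alpha.append([c / total for c in cur])

    # Backward pass, also normalised at each step.
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        cur = [sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(n))
               for i in range(n)]
        total = sum(cur)
        beta[t] = [c / total for c in cur]

    # Combine and renormalise to obtain the smoothed posteriors.
    gamma = []
    for t in range(T):
        g = [alpha[t][i] * beta[t][i] for i in range(n)]
        total = sum(g)
        gamma.append([x / total for x in g])
    return gamma


def top_n_states(posterior, names, n):
    """Return the n (name, probability) pairs with the highest posterior."""
    ranked = sorted(zip(names, posterior), key=lambda p: p[1], reverse=True)
    return ranked[:n]


# Hypothetical three-state model: one healthy state and two faulty components.
states = ["healthy", "web_fault", "db_fault"]
pi = [0.90, 0.05, 0.05]
A = [[0.90, 0.05, 0.05],
     [0.10, 0.90, 0.00],
     [0.10, 0.00, 0.90]]
# Observation symbols: 0 = probe succeeded, 1 = probe failed (assumed values).
B = [[0.95, 0.05],
     [0.30, 0.70],
     [0.10, 0.90]]

probe_results = [0, 1, 1, 1]  # one success followed by three failures
gamma = forward_backward(pi, A, B, probe_results)
ranked = top_n_states(gamma[-1], states, 2)
```

With these made-up numbers, three consecutive failed probes move most of the posterior mass away from the healthy state, and `ranked` lists `db_fault` ahead of `web_fault`. In an online setting the same pass could be rerun as each new probe result arrives; how the paper selects diagnosis probes from the information gain is not specified in the abstract and is not modelled here.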