Tele-Knowledge Pre-training for Fault Analysis
Zhuo Chen, Wen Zhang, Yufeng Huang, Mingyang Chen, Yuxia Geng, Hongtao Yu, Zhen Bi, Yichi Zhang, Zhen Yao, Wenting Song, Xinliang Wu, Yi Yang, Mingyi Chen, Zhaoyang Lian, Yingying Li, Lei Cheng, Huajun Chen
DOI: https://doi.org/10.1109/ICDE55515.2023.00265
2023-01-01
Abstract: In this work, we share our experience on tele-knowledge pre-training for fault analysis, a crucial task in telecommunication applications that requires a wide range of knowledge normally found in both machine log data and product documents. To organize this knowledge from experts uniformly, we propose to create a Tele-KG (tele-knowledge graph). Using this valuable data, we further propose a tele-domain language pre-training model, TeleBERT, and its knowledge-enhanced version, a tele-knowledge re-training model, KTeleBERT, which includes effective prompt hints, adaptive numerical data encoding, and two knowledge injection paradigms. Concretely, our proposal includes two stages: first, pre-training TeleBERT on 20 million tele-related corpora, and then re-training it on 1 million causal and machine-related corpora to obtain KTeleBERT. Our evaluation on multiple tasks related to fault analysis in tele-applications, including root-cause analysis, event association prediction, and fault chain tracing, shows that pre-training a language model with tele-domain data is beneficial for downstream tasks. Moreover, the KTeleBERT re-training further improves the performance of task models, highlighting the effectiveness of incorporating diverse tele-knowledge into the model.