Error Log Clustering of Internet Software

Shi-wen CHENG, Dan PEI, Chang-jin WANG
DOI: https://doi.org/10.3969/j.issn.1000-1220.2018.05.001
2018-01-01
Abstract: In ICPs' day-to-day operations, the services maintained by the operations team often encounter a variety of problems. One critical goal of troubleshooting is therefore to cluster the large volume of error logs and feed the results back to developers. To address the challenge posed by the sheer volume of non-standard error logs, a method for clustering the error logs of Internet software is proposed. The method reduces the log scale by extracting log templates and compressing the logs; improves clustering accuracy and reduces data dimensionality by computing document frequency to extract feature words; and improves the clustering result by combining Canopy clustering with K-means clustering. Experimental results from an Internet company's operations show that the proposed method not only achieves satisfactory clustering results but also meets the performance requirements of the production environment.
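The abstract names the stages of the pipeline but gives no implementation details, so the following is only a minimal sketch of how such a pipeline might be wired together, not the authors' implementation. The masking regex, the document-frequency bounds (min_df, max_df), the single Canopy threshold (t2), the binary word vectors, the Euclidean distance, and the sample log lines are all assumptions made for illustration. The Canopy step is simplified to choosing the initial centers (and hence the number of clusters) that seed K-means.

```python
import re
from collections import Counter

def to_template(line):
    """Mask variable fields (hex ids, IPs, numbers) so similar logs share one template."""
    masked = re.sub(r"0x[0-9a-fA-F]+|\d+(?:\.\d+)*", "<*>", line)
    return " ".join(masked.lower().split())

def feature_words(templates, min_df, max_df):
    """Keep words whose document frequency lies in [min_df, max_df]."""
    df = Counter(w for t in templates for w in set(t.split()))
    return sorted(w for w, c in df.items() if min_df <= c <= max_df)

def vectorize(template, vocab):
    """Binary bag-of-words vector over the selected feature words."""
    words = set(template.split())
    return [1.0 if w in words else 0.0 for w in vocab]

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def canopy_centers(points, t2):
    """Simplified Canopy: take a point as a center, discard points within t2 of it."""
    remaining = list(points)
    centers = []
    while remaining:
        c = remaining.pop(0)
        centers.append(c)
        remaining = [p for p in remaining if dist(p, c) > t2]
    return centers

def kmeans(points, centers, iters=10):
    """Plain K-means (fixed iteration count) starting from the given centers."""
    labels = []
    for _ in range(iters):
        labels = [min(range(len(centers)), key=lambda k: dist(p, centers[k]))
                  for p in points]
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # leave an empty cluster's center unchanged
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def cluster_logs(lines, min_df=1, max_df=None, t2=1.5):
    """Return ({template: cluster label}, Counter of template occurrences)."""
    counts = Counter(to_template(l) for l in lines)  # compression: one entry per template
    templates = list(counts)
    max_df = max_df or len(templates)
    vocab = feature_words(templates, min_df, max_df)
    points = [vectorize(t, vocab) for t in templates]
    centers = [list(c) for c in canopy_centers(points, t2)]
    labels = kmeans(points, centers)
    return dict(zip(templates, labels)), counts

# Hypothetical error-log lines, invented for this example.
SAMPLE_LOGS = [
    "Connection timeout to 10.0.0.1 after 30 ms",
    "Connection timeout to 10.0.0.2 after 45 ms",
    "NullPointerException in handler 0x1f3a",
    "NullPointerException in handler 0x2b7c",
    "Connection refused by 10.0.0.3 after 12 ms",
]

labels, counts = cluster_logs(SAMPLE_LOGS, min_df=1, max_df=2, t2=2.5)
for template, label in labels.items():
    print(label, counts[template], template)
```

With these made-up inputs, the five lines compress to three templates, the placeholder token is dropped because it appears in every template, and the two connection-related templates end up in one cluster while the null-pointer template forms another. In practice the thresholds would need tuning against real logs.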