Experience Mining Google's Production Console Logs.

Wei Xu,Ling Huang,Michael I. Jordan
2010-01-01
Abstract:We describe our early experience in applying our console log mining techniques [19, 20] to logs from production Google systems with thousands of nodes. This data set is five orders of magnitude in size and contains almost 20 times as many messages types as the Hadoop data set we used in [19]. It also has many properties that are unique to large scale production deployments (e.g., the system stays on for several months and multiple versions of the software can run concurrently). Our early experience shows that our techniques, including source code based log parsing, state and sequence based feature creation and problem detection, work well on this production data set. We also discuss our experience in using our log parser to assist the log sanitization.
What problem does this paper attempt to address?