Research on the construction of log parsing system based on regular expression

Fenfen Wang, Jun Zhang
DOI: https://doi.org/10.1109/ICITBS55627.2022.00137
2022-03-01
Abstract: As information systems and IT equipment proliferate, the types and contents of the logs they generate are becoming increasingly complex. Log data produced by different systems and equipment differ in structure, and most of it is unstructured, which makes log analysis and access difficult. To address the identification and parsing of heterogeneous log data, this paper proposes a log data grouping extraction method based on regular expressions. The method separates log format description from log content parsing: script documents written in XML describe the log format and the attributes of each log data item. Serving as the log parsing configuration document, each script corresponds to a log type and is independent of the specific parser. Taking Apache logs as an example, the paper develops a prototype system. The results show that the method not only achieves high parsing efficiency but also improves the flexibility and scalability of log parsing.
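The separation of format description from parsing that the abstract describes can be illustrated with a minimal sketch. This is not the authors' implementation: the XML element and attribute names (`logformat`, `pattern`, `field`, `group`, `name`, `type`), the helper functions `load_format` and `parse_line`, and the regular expression for the Apache Common Log Format are all assumptions chosen for illustration. The idea shown is that a generic parser reads the regular expression and the group-to-field mapping from an XML document, so supporting a new log type would only require a new XML description.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical format description for Apache Common Log Format lines.
# The schema (element and attribute names) is an assumption, not the paper's.
FORMAT_XML = r"""
<logformat type="apache_common">
  <pattern>^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\d+|-)$</pattern>
  <field group="1" name="host" type="string"/>
  <field group="2" name="ident" type="string"/>
  <field group="3" name="user" type="string"/>
  <field group="4" name="time" type="string"/>
  <field group="5" name="request" type="string"/>
  <field group="6" name="status" type="int"/>
  <field group="7" name="size" type="int"/>
</logformat>
"""


def load_format(xml_text):
    """Read the regex and the group-to-field mapping from the XML description."""
    root = ET.fromstring(xml_text.strip())
    regex = re.compile(root.findtext("pattern").strip())
    fields = [
        (int(f.get("group")), f.get("name"), f.get("type"))
        for f in root.findall("field")
    ]
    return regex, fields


def parse_line(line, regex, fields):
    """Extract one record from a log line; return None if the line does not match."""
    match = regex.match(line)
    if match is None:
        return None
    record = {}
    for group, name, ftype in fields:
        raw = match.group(group)
        if ftype == "int":
            # A "-" (no value) in a numeric field is stored as None.
            record[name] = int(raw) if raw.isdigit() else None
        else:
            record[name] = raw
    return record


if __name__ == "__main__":
    regex, fields = load_format(FORMAT_XML)
    sample = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
              '"GET /apache_pb.gif HTTP/1.0" 200 2326')
    print(parse_line(sample, regex, fields))
```

In this sketch the parser contains no Apache-specific logic; everything specific to the log type lives in the XML text, which is the kind of decoupling the abstract attributes to the proposed method.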
Computer Science