LogParser-LLM: Advancing Efficient Log Parsing with Large Language Models

Aoxiao Zhong, Dengyao Mo, Guiyang Liu, Jinbu Liu, Qingda Lu, Qi Zhou, Jiesheng Wu, Quanzheng Li, Qingsong Wen

2024-08-25

Abstract: Logs are ubiquitous digital footprints, playing an indispensable role in system diagnostics, security analysis, and performance optimization. The extraction of actionable insights from logs is critically dependent on the log parsing process, which converts raw logs into structured formats for downstream analysis. Yet, the complexities of contemporary systems and the dynamic nature of logs pose significant challenges to existing automatic parsing techniques. The emergence of Large Language Models (LLMs) offers new horizons. With their expansive knowledge and contextual prowess, LLMs have been transformative across diverse applications. Building on this, we introduce LogParser-LLM, a novel log parser integrated with LLM capabilities. This union seamlessly blends semantic insights with statistical nuances, obviating the need for hyper-parameter tuning and labeled training data, while ensuring rapid adaptability through online parsing. Further deepening our exploration, we address the intricate challenge of parsing granularity, proposing a new metric and integrating human interactions to allow users to calibrate granularity to their specific needs. Our method's efficacy is empirically demonstrated through evaluations on the Loghub-2k and the large-scale LogPub benchmark. In evaluations on the LogPub benchmark, involving an average of 3.6 million logs per dataset across 14 datasets, our LogParser-LLM requires only 272.5 LLM invocations on average, achieving a 90.6% F1 score for grouping accuracy and 81.1% for parsing accuracy. These results demonstrate the method's high efficiency and accuracy, outperforming current state-of-the-art log parsers, including pattern-based, neural network-based, and existing LLM-enhanced approaches.

Software Engineering, Artificial Intelligence

What problem does this paper attempt to address?

The paper aims to improve the efficiency and accuracy of log parsing. Specifically:

- **Improving Log Parsing Efficiency and Accuracy**: The paper introduces LogParser-LLM, a method that uses large language models (LLMs) to make log parsing more efficient and more accurate. By combining semantic understanding with statistical features, it adapts quickly to new logs without hyperparameter tuning or labeled training data.
- **Reducing LLM Invocation Frequency**: By combining a prefix tree structure with an LLM template extractor, the method significantly reduces the number of LLM invocations, lowering computational overhead.
- **Addressing Parsing Granularity Issues**: The research also examines granularity in log parsing and proposes a new metric, "Granularity Distance", to measure the differences between parsing results. By integrating user feedback, it lets users adjust the parsing granularity to their specific needs.
- **Validating Method Effectiveness**: Evaluations on the Loghub-2k and large-scale LogPub benchmark datasets show that LogParser-LLM improves grouping accuracy and parsing accuracy by 48.3% and 32.0%, respectively, compared to the current state-of-the-art log parsers. After calibration with a small amount of labeled data, these figures rise to 56.8% and 69.7%.

In summary, the paper improves on traditional log parsing methods by introducing LLM technology for a more efficient and accurate parsing process, and it addresses granularity issues through a new evaluation metric and a user interaction mechanism.
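The way a prefix tree can cut down LLM invocations may be easier to see in code. The following is a minimal sketch, not the paper's implementation: the summary above does not specify the tree layout, the matching rules, or the prompt, so `TemplateCache`, `Node`, and the `<*>` wildcard convention are illustrative assumptions, and `fake_llm_extract` is a hypothetical stand-in that masks digit-bearing tokens instead of calling a real model. The idea shown is only that a log line is first matched against cached templates, and the extractor is consulted only on a miss.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

WILDCARD = "<*>"  # assumed placeholder for a variable token


@dataclass
class Node:
    children: Dict[str, "Node"] = field(default_factory=dict)
    template: Optional[str] = None  # set on the node where a template ends


class TemplateCache:
    """Prefix tree over template tokens; calls the extractor only on a miss."""

    def __init__(self, extract: Callable[[str], str]):
        self.root = Node()
        self.extract = extract      # stand-in for the LLM template extractor
        self.extractor_calls = 0    # counts simulated LLM invocations

    def _insert(self, template: str) -> None:
        node = self.root
        for tok in template.split():
            node = node.children.setdefault(tok, Node())
        node.template = template

    def _match(self, node: Node, tokens: List[str], i: int) -> Optional[str]:
        if i == len(tokens):
            return node.template
        # Try the literal token first, then the wildcard branch.
        # Simplification: a wildcard matches exactly one token.
        for key in (tokens[i], WILDCARD):
            child = node.children.get(key)
            if child is not None:
                found = self._match(child, tokens, i + 1)
                if found is not None:
                    return found
        return None

    def parse(self, line: str) -> str:
        hit = self._match(self.root, line.split(), 0)
        if hit is not None:
            return hit              # cache hit: no extractor call
        self.extractor_calls += 1   # cache miss: ask the extractor once
        template = self.extract(line)
        self._insert(template)
        return template


def fake_llm_extract(line: str) -> str:
    """Hypothetical stub: masks any token that contains a digit."""
    return " ".join(WILDCARD if re.search(r"\d", t) else t for t in line.split())


logs = [
    "Connection from 10.0.0.1 closed",
    "Connection from 10.0.0.2 closed",
    "Disk usage at 91%",
    "Connection from 192.168.1.7 closed",
]
cache = TemplateCache(fake_llm_extract)
templates = [cache.parse(line) for line in logs]
print(templates[0])            # Connection from <*> closed
print(cache.extractor_calls)   # 2
```

In this toy run, four log lines trigger only two extractor calls, because the second and fourth lines match a template already stored in the tree. A real system would need looser matching (for example, variables spanning several tokens) and a way to revise templates the extractor got wrong.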
