Chatting with Logs: An exploratory study on Finetuning LLMs for LogQL

Vishwanath Seshagiri, Siddharth Balyan, Vaastav Anand, Kaustubh Dhole, Ishan Sharma, Avani Wildani, José Cambronero, Andreas Züfle
2024-12-04
Abstract: Logging is a critical function in modern distributed applications, but the lack of standardization in log query languages and formats creates significant challenges. Developers currently must write ad hoc queries in platform-specific languages, which requires expertise in both the query language and application-specific log details, an impractical expectation given the variety of platforms and the volume of logs and applications. While generating these queries with large language models (LLMs) seems intuitive, we show that current LLMs struggle with log-specific query generation due to a lack of exposure to domain-specific knowledge. We propose a novel natural language (NL) interface to address these inconsistencies and aid log query generation, enabling developers to create queries in a target log query language by providing NL inputs. We further introduce NL2QL, a manually annotated, real-world dataset of natural language questions paired with corresponding LogQL queries spread across three log formats, to promote the training and evaluation of NL-to-log query systems. Using NL2QL, we subsequently fine-tune and evaluate several state-of-the-art LLMs, and demonstrate their improved capability to generate accurate LogQL queries. We perform further ablation studies to demonstrate the effect of additional training data and the transferability across different log formats. In our experiments, we find that fine-tuned models improve by up to 75% at generating LogQL queries compared to non-fine-tuned models.
Databases, Artificial Intelligence, Programming Languages
What problem does this paper attempt to address?
The paper addresses the lack of standardization and the poor usability of log query languages in modern distributed applications, using LogQL (the query language of Grafana Loki) as the target language. Specifically, developers face the following challenges when writing log queries on different platforms:

1. **Lack of standardization**: Each log platform uses its own query language, and the languages share almost no syntax. Developers must master several of them and relearn a new one whenever they switch platforms.
2. **Complex log formats**: Log data comes in many formats and structures, and writing an accurate query requires a deep understanding of the specific format. For example, when building a Grafana dashboard, developers must know the label names, the label values, and the exact log line syntax (such as "sshd[" and ": session opened for").
3. **Steep learning curve**: Existing log query tools are hard to learn, and only a few technical experts can use them fully. New team members often spend a great deal of time getting familiar with them.
4. **Lack of context information**: In a DevOps environment, the person who writes the query (usually an operator) and the person who wrote the code that produces the log (usually a developer) are often different people, so the operator lacks context when writing the query.

To address these problems, the authors propose generating log queries from natural language by fine-tuning large language models (LLMs). They created a manually annotated dataset, NL2QL, which pairs natural language questions with their corresponding LogQL queries. Using this dataset, they fine-tuned several state-of-the-art LLMs to improve their ability to generate accurate LogQL queries. The approach has the following goals:

- Provide an intuitive interface that lets developers generate the required LogQL queries from natural language input.
- Lower the learning threshold for log queries so that more people can perform log analysis.
- Improve query accuracy and efficiency by reducing syntax errors and improving label matching and time aggregation.

In summary, the paper aims to reduce the standardization and usability problems of log querying by fine-tuning LLMs, thereby making log analysis more efficient and more accessible.
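
To make the Grafana example above concrete, the following is a minimal Python sketch of what a developer has to know in order to write such a query by hand: a label selector, the exact substrings to filter log lines on, and a time window for aggregation. The helper names and the label `app="openssh"` are hypothetical and chosen only for illustration; they are not taken from the paper or its dataset. The generated strings follow standard LogQL syntax (a stream selector in braces, `|=` line filters, and `count_over_time` over a range).

```python
def escape(value: str) -> str:
    """Escape backslashes and double quotes for a double-quoted LogQL string."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def build_selector(labels: dict[str, str]) -> str:
    """Build a stream selector such as {app="openssh"} from label name/value pairs."""
    matchers = ", ".join(
        f'{name}="{escape(value)}"' for name, value in sorted(labels.items())
    )
    return "{" + matchers + "}"


def build_log_query(labels: dict[str, str], line_filters: list[str]) -> str:
    """Append one `|=` (line contains) filter per substring to the selector."""
    query = build_selector(labels)
    for text in line_filters:
        query += f' |= "{escape(text)}"'
    return query


def count_over_time(log_query: str, window: str) -> str:
    """Wrap a log query in a count_over_time range aggregation, e.g. window="1h"."""
    return f"count_over_time({log_query} [{window}])"


if __name__ == "__main__":
    # Hypothetical label; the substrings are the ones quoted in the text above.
    log_query = build_log_query(
        {"app": "openssh"}, ["sshd[", ": session opened for"]
    )
    print(log_query)
    print(count_over_time(log_query, "1h"))
```

Even in this small case the author of the query needs three pieces of knowledge (label name, label value, and exact line substrings) that a natural language question such as "how many SSH sessions were opened in the last hour?" does not contain, which is the gap the fine-tuned models are meant to close.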