Abstract: Configuration settings are essential for tailoring software behavior to meet specific performance requirements. However, incorrect configurations are widespread, and identifying those that impact system performance is challenging due to the vast number and complexity of possible settings. In this work, we present PerfSense, a lightweight framework that leverages Large Language Models (LLMs) to efficiently identify performance-sensitive configurations with minimal overhead. PerfSense employs LLM agents to simulate interactions between developers and performance engineers using advanced prompting techniques such as prompt chaining and retrieval-augmented generation (RAG). Our evaluation of seven open-source Java systems demonstrates that PerfSense achieves an average accuracy of 64.77% in classifying performance-sensitive configurations, outperforming both our LLM baseline (50.36%) and the previous state-of-the-art method (61.75%). Notably, our prompt chaining technique improves recall by 10% to 30% while maintaining similar precision levels. Additionally, a manual analysis of 362 misclassifications reveals common issues, including LLMs' misunderstandings of requirements (26.8%). In summary, PerfSense significantly reduces manual effort in classifying performance-sensitive configurations and offers valuable insights for future LLM-based code analysis research.

What problem does this paper attempt to address?

### The Problem Addressed by the Paper

This paper addresses the problem of identifying performance-sensitive configurations in software systems. Specifically, it uses code analysis to efficiently identify the configuration settings that significantly affect system performance.

#### Background and Challenges

- **Importance of Configuration**: Software systems typically expose a large number of configuration options that can be customized for specific needs. However, incorrect configuration settings are very common, and because the space of possible settings is vast and complex, identifying which ones affect system performance is challenging.
- **Performance-Sensitive Configurations**: Changes to certain configurations directly affect system performance. For example, memory allocation settings can significantly affect performance, whereas the application name is usually not performance-sensitive.
- **Limitations of Existing Methods**: Traditional performance testing is time-consuming and requires substantial manual work. In addition, performance engineers often need help from developers to understand the complex interactions between the codebase and its components when analyzing the impact of a configuration.

#### Proposed Method

The paper proposes a lightweight framework named PerfSense, which leverages large language models (LLMs) to identify performance-sensitive configurations. PerfSense employs two main agents:

- **DevAgent**: Retrieves the source code and documentation related to a configuration and conducts a performance-aware code review.
- **PerfAgent**: Uses the information provided by DevAgent to classify whether the configuration is performance-sensitive.

#### Main Contributions

- **Accuracy Improvement**: In an evaluation on seven open-source Java systems, PerfSense achieved an average accuracy of 64.77%, outperforming both the previous state-of-the-art method (61.75%) and the LLM baseline (50.36%).
- **Prompt Chaining Technique**: Prompt chaining improved recall by 10% to 30% while maintaining similar precision.
- **Manual Analysis**: A manual analysis of 362 misclassified configurations revealed common issues, including LLMs' misunderstanding of requirements (26.8%).

In summary, PerfSense provides an efficient method for identifying performance-sensitive configurations through multi-agent collaboration and advanced prompting techniques, reducing the manual workload for developers and offering valuable insights for future LLM-assisted code analysis research.
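The division of labor between DevAgent and PerfAgent can be illustrated with a minimal prompt-chaining sketch. This is not the paper's implementation: the prompt wording, the `ConfigOption` fields, the function names, and the keyword-based `fake_llm` stub are all assumptions made for illustration. The retrieval step (RAG) is not modeled; the related code is simply passed in.

```python
from dataclasses import dataclass
from typing import Callable

# Assumption: an "LLM" is any function mapping a prompt string to a reply string.
LLM = Callable[[str], str]


@dataclass
class ConfigOption:
    """Hypothetical container for one configuration option and its context."""
    name: str
    description: str
    related_code: str  # code that reads the option (would come from retrieval)


def dev_agent(llm: LLM, option: ConfigOption) -> str:
    """First link of the chain: a performance-aware review of the related code."""
    prompt = (
        "You are a developer. Review the code below with performance in mind.\n"
        f"Configuration: {option.name}\n"
        f"Documentation: {option.description}\n"
        f"Code:\n{option.related_code}\n"
        "Summarise operations that could affect time or memory."
    )
    return llm(prompt)


def perf_agent(llm: LLM, option: ConfigOption, review: str) -> bool:
    """Second link: classify the option using the developer's review as input."""
    prompt = (
        "You are a performance engineer.\n"
        f"Configuration: {option.name}\n"
        f"Developer review:\n{review}\n"
        "Answer with exactly one word, SENSITIVE or INSENSITIVE."
    )
    answer = llm(prompt).strip().upper()
    return answer.startswith("SENSITIVE")


def classify(llm: LLM, option: ConfigOption) -> bool:
    """Chain the two prompts: the review output feeds the classification prompt."""
    review = dev_agent(llm, option)
    return perf_agent(llm, option, review)


def fake_llm(prompt: str) -> str:
    """Keyword-based stand-in for a real model, only to make the sketch runnable."""
    if prompt.startswith("You are a developer"):
        if "allocate" in prompt:
            return "The option sizes a buffer allocated on every request."
        return "The option is only used in log messages."
    return "SENSITIVE" if "buffer allocated" in prompt else "INSENSITIVE"


if __name__ == "__main__":
    options = [
        ConfigOption("cache.size.mb", "Cache size in megabytes.",
                     "buf = allocate(conf.get('cache.size.mb'))"),
        ConfigOption("app.name", "Display name of the application.",
                     "log.info(conf.get('app.name'))"),
    ]
    for opt in options:
        print(opt.name, classify(fake_llm, opt))
```

With the stub, the memory-related option is classified as sensitive and the application name is not, mirroring the example given under Background and Challenges. Swapping `fake_llm` for a call to a real model would leave the chaining logic unchanged.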

Identifying Performance-Sensitive Configurations in Software Systems through Code Analysis with LLM Agents
