Identifying Performance-Sensitive Configurations in Software Systems through Code Analysis with LLM Agents

Zehao Wang, Dong Jae Kim, Tse-Hsun Chen
2024-06-19
Abstract: Configuration settings are essential for tailoring software behavior to meet specific performance requirements. However, incorrect configurations are widespread, and identifying those that impact system performance is challenging due to the vast number and complexity of possible settings. In this work, we present PerfSense, a lightweight framework that leverages Large Language Models (LLMs) to efficiently identify performance-sensitive configurations with minimal overhead. PerfSense employs LLM agents to simulate interactions between developers and performance engineers using advanced prompting techniques such as prompt chaining and retrieval-augmented generation (RAG). Our evaluation of seven open-source Java systems demonstrates that PerfSense achieves an average accuracy of 64.77% in classifying performance-sensitive configurations, outperforming both our LLM baseline (50.36%) and the previous state-of-the-art method (61.75%). Notably, our prompt chaining technique improves recall by 10% to 30% while maintaining similar precision levels. Additionally, a manual analysis of 362 misclassifications reveals common issues, including LLMs' misunderstandings of requirements (26.8%). In summary, PerfSense significantly reduces manual effort in classifying performance-sensitive configurations and offers valuable insights for future LLM-based code analysis research.
Software Engineering, Artificial Intelligence
What problem does this paper attempt to address?
### The Problem Addressed by the Paper

This paper addresses the problem of identifying performance-sensitive configurations in software systems. Specifically, it uses code analysis to efficiently identify the configuration settings that significantly affect system performance.

#### Background and Challenges

- **Importance of Configuration**: Software systems typically expose a large number of configuration options that can be customized to specific needs. Incorrect settings are very common, and the sheer number and complexity of possible settings make it hard to tell which ones affect system performance.
- **Performance-Sensitive Configurations**: Changes to certain configurations directly affect system performance. For example, memory allocation settings can significantly affect performance, whereas the application name usually does not.
- **Limitations of Existing Methods**: Traditional performance testing is time-consuming and requires substantial manual work. In addition, performance engineers often need help from developers to understand the complex interactions between the codebase and its components when analyzing the impact of a configuration.

#### Proposed Method

The paper proposes a lightweight framework named PerfSense, which leverages Large Language Models (LLMs) to identify performance-sensitive configurations. PerfSense employs two main agents:

- **DevAgent**: Retrieves the source code and documentation related to a configuration and conducts performance-aware code reviews.
- **PerfAgent**: Uses the information provided by DevAgent to classify whether a configuration is performance-sensitive.

#### Main Contributions

- **Accuracy Improvement**: In an evaluation on seven open-source Java systems, PerfSense achieved an average accuracy of 64.77%, outperforming the previous state-of-the-art method (61.75%) and the LLM baseline (50.36%).
- **Prompt Chaining Technique**: Prompt chaining improved recall by 10% to 30% while maintaining similar precision.
- **Manual Analysis**: A manual analysis of 362 misclassified configurations revealed common issues, including LLMs' misunderstanding of requirements (26.8%).

In summary, PerfSense provides an efficient method for identifying performance-sensitive configurations through multi-agent collaboration and advanced prompting techniques, reducing the manual effort of classification and offering valuable insights for future LLM-assisted code analysis research.
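
The hand-off between DevAgent and PerfAgent described under Proposed Method can be sketched roughly as below. This is a minimal illustration, not the paper's implementation: the class layout, the prompt wording, the `code_index` dictionary (a stand-in for real retrieval over code and documentation), the `classify_config` helper, and the keyword-based `fake_llm` (a stand-in for a real model) are all assumptions introduced so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical LLM interface: takes a prompt string, returns the model's text.
LLM = Callable[[str], str]


@dataclass
class DevAgent:
    """Retrieves code related to a configuration and reviews it (assumed design)."""
    llm: LLM
    code_index: Dict[str, List[str]]  # stand-in for real retrieval over a codebase

    def retrieve(self, config: str) -> List[str]:
        return self.code_index.get(config, [])

    def review(self, config: str) -> str:
        snippets = "\n".join(self.retrieve(config))
        prompt = f"REVIEW how configuration '{config}' is used:\n{snippets}"
        return self.llm(prompt)


@dataclass
class PerfAgent:
    """Classifies a configuration from the DevAgent's review (assumed design)."""
    llm: LLM

    def classify(self, config: str, review: str) -> bool:
        prompt = (
            f"CLASSIFY '{config}' as performance-sensitive (yes/no).\n"
            f"Review: {review}"
        )
        return self.llm(prompt).strip().lower().startswith("yes")


def classify_config(config: str, dev: DevAgent, perf: PerfAgent) -> bool:
    # Prompt chaining: the review prompt's output feeds the classification prompt.
    review = dev.review(config)
    return perf.classify(config, review)


def fake_llm(prompt: str) -> str:
    """Deterministic keyword-based stand-in for a real model (assumption)."""
    if prompt.startswith("REVIEW"):
        return "costly operation found" if "allocate(" in prompt else "nothing costly"
    return "yes" if "costly operation found" in prompt else "no"


# Toy index mirroring the summary's example: a memory setting vs. the app name.
index = {
    "buffer.size": ['byte[] buf = allocate(conf.getInt("buffer.size"));'],
    "app.name": ['log.info(conf.get("app.name"));'],
}
dev, perf = DevAgent(fake_llm, index), PerfAgent(fake_llm)
for name in index:
    print(name, classify_config(name, dev, perf))
```

With the stub model, the memory-related setting is flagged and the application name is not. Replacing `fake_llm` with a call to a real model, and `code_index` with actual retrieval, would be the natural next steps; the two-step chain itself stays the same.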