Abstract: Code written by developers often suffers from efficiency problems and contains various performance bugs. These inefficiencies motivate research on automated refactoring methods for code optimization. Early research in code optimization employs rule-based methods that target specific inefficiency issues; these methods are labor-intensive and have low coverage. Recent work treats the task as a sequence generation problem and resorts to deep learning (DL) techniques such as large language models (LLMs). These methods typically prompt LLMs to directly generate optimized code. Although they show state-of-the-art performance, this one-step generation paradigm struggles to reach an optimal solution. First, complex optimization methods, such as combinatorial ones, are hard for LLMs to capture. Second, the one-step generation paradigm makes it difficult to precisely infuse the knowledge required for effective code optimization into LLMs, resulting in under-optimized code. To address these problems, we propose to model this task from the search perspective, and propose a search-based LLM framework named SBLLM that enables iterative refinement and discovery of improved optimization methods.
SBLLM synergistically integrates LLMs with evolutionary search and consists of three key components: 1) an execution-based representative sample selection part that evaluates the fitness of each existing optimized code and prioritizes promising ones to pilot the generation of improved code; 2) an adaptive optimization pattern retrieval part that infuses targeted optimization patterns into the model to guide LLMs towards rectifying and progressively enhancing their optimization methods; and 3) a genetic operator-inspired chain-of-thought prompting part that aids LLMs in combining different optimization methods and generating improved optimization methods.
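
The iterative loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`evaluate`, `select_representatives`, `search`), the use of best-of-N wall-clock time as the fitness signal, and the `propose` callback standing in for the LLM are all assumptions made for illustration. The toy candidates are plain Python source strings, and `exec` is used only because the inputs are trusted.

```python
import timeit
from typing import Callable, List, Tuple

INF = float("inf")


def evaluate(source: str, func_name: str, arg, expected, repeats: int = 3) -> float:
    """Assumed fitness: best observed runtime in seconds; INF if wrong or crashing."""
    namespace: dict = {}
    try:
        exec(source, namespace)  # toy setting only; real candidates need a sandbox
        func = namespace[func_name]
        if func(arg) != expected:  # reject candidates that change behavior
            return INF
        return min(timeit.repeat(lambda: func(arg), number=1, repeat=repeats))
    except Exception:
        return INF


def select_representatives(scored: List[Tuple[float, str]], k: int) -> List[Tuple[float, str]]:
    """Keep the k fastest distinct candidates that passed the correctness check."""
    picked, seen = [], set()
    for runtime, source in sorted(scored):
        if runtime == INF or source in seen:
            continue
        seen.add(source)
        picked.append((runtime, source))
        if len(picked) == k:
            break
    return picked


def search(seed: str, propose: Callable[[List[str]], List[str]], func_name: str,
           arg, expected, iterations: int = 3, k: int = 2) -> str:
    """Alternate between execution-based selection and candidate generation."""
    population = [(evaluate(seed, func_name, arg, expected), seed)]
    for _ in range(iterations):
        parents = select_representatives(population, k)
        if not parents:
            break
        # `propose` is a hypothetical stand-in for prompting an LLM with the parents.
        children = propose([src for _, src in parents])
        population = parents + [(evaluate(c, func_name, arg, expected), c) for c in children]
    best = select_representatives(population, 1)
    return best[0][1] if best else seed


# Toy usage: a stub "LLM" that always proposes one good and one incorrect rewrite.
SLOW = "def total(n):\n    s = 0\n    for i in range(n):\n        s += i\n    return s\n"
FAST = "def total(n):\n    return n * (n - 1) // 2\n"
WRONG = "def total(n):\n    return 0\n"


def stub_propose(parents: List[str]) -> List[str]:
    return [FAST, WRONG]


N = 200_000
best = search(SLOW, stub_propose, "total", N, sum(range(N)))
```

In this sketch the incorrect candidate is filtered out by the correctness check before timing, so only behavior-preserving rewrites can be selected as parents for the next round.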

What problem does this paper attempt to address?

The paper addresses inefficiencies and performance bugs in developer-written code through automated code optimization. Traditional methods rely on rule-based approaches that target specific types of efficiency problems, but these methods are labor-intensive and have limited coverage. More recent research uses Large Language Models (LLMs) for code optimization. While these methods achieve state-of-the-art performance, the one-step generation paradigm limits them: complex optimization techniques are hard for LLMs to capture, and it is difficult to precisely integrate the knowledge required for effective code optimization into LLMs, which leads to under-optimized code.

To address these issues, the paper proposes a new framework, SBLLM (Search-Based Large Language Models), which models the code optimization task from a search perspective. SBLLM combines LLMs with evolutionary search strategies and includes three main components:

1. **Execution-based Representative Sample Selection**: Evaluates the fitness of existing optimized code by execution and prioritizes samples with efficient and unique optimization methods to guide further optimization.
2. **Adaptive Optimization Pattern Retrieval**: An adaptive retrieval mechanism injects domain knowledge into LLMs, guiding them to correct and gradually improve their optimization methods.
3. **Genetic Operator-Inspired Chain-of-Thought Prompting**: A Chain-of-Thought (CoT) prompting method uses crossover and mutation operations to help LLMs develop improved optimized code.

Experimental results show that SBLLM significantly outperforms baseline methods in improving the efficiency of Python and C++ code. Specifically, program execution efficiency increased by up to 209.59%, and speedup rates improved by 8.75% to 28.06% and 1.15% to 9.56% under different LLMs compared to baseline methods. This demonstrates the effectiveness of SBLLM in enhancing code efficiency.
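
The second and third components can be pictured as a retrieval step feeding a structured prompt. The sketch below is only an illustration under stated assumptions: the summary does not say how patterns are retrieved, so plain textual similarity from Python's `difflib` is used as a stand-in, and the pattern base, the function names (`retrieve_patterns`, `build_prompt`), and the prompt wording are all hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def retrieve_patterns(code: str, pattern_base: List[Tuple[str, str]], top_n: int = 1) -> List[str]:
    """Return advice for the stored slow snippets most similar to `code`.

    Textual similarity is an assumed stand-in for the paper's retrieval mechanism.
    """
    ranked = sorted(
        pattern_base,
        key=lambda pair: SequenceMatcher(None, code, pair[0]).ratio(),
        reverse=True,
    )
    return [advice for _, advice in ranked[:top_n]]


def build_prompt(slow_code: str, candidates: List[str], patterns: List[str]) -> str:
    """Assemble a chain-of-thought prompt with crossover and mutation steps."""
    parts = ["Original code:", slow_code, ""]
    for index, candidate in enumerate(candidates, start=1):
        parts += [f"Candidate {index}:", candidate, ""]
    parts.append("Known optimization patterns:")
    parts += [f"- {pattern}" for pattern in patterns]
    parts += [
        "",
        "Step 1 (Crossover): identify the optimizations used by each candidate "
        "and decide which of them can be combined.",
        "Step 2 (Mutation): using the patterns above, propose an optimization "
        "that none of the candidates applies yet.",
        "Step 3: output the improved code.",
    ]
    return "\n".join(parts)


# Hypothetical pattern base: (slow snippet, advice) pairs.
PATTERNS = [
    ("total = 0\nfor x in range(n):\n    total += x", "Replace the summation loop with a closed form."),
    ("if item in items_list:\n    hits += 1", "Use a set for membership tests."),
]

slow = "s = 0\nfor i in range(n):\n    s += i"
advice = retrieve_patterns(slow, PATTERNS, top_n=1)
prompt = build_prompt(slow, ["s = sum(range(n))"], advice)
```

The resulting `prompt` string would then be sent to whichever LLM is in use; that call is deliberately left out because it depends on the provider.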

Search-Based LLMs for Code Optimization
