Abstract: The widespread use of Large Language Models (LLMs) in software engineering has intensified the need for improved model and resource efficiency. In particular, for neural code generation, LLMs are used to translate a function/method signature and DocString into executable code. DocStrings, which capture user requirements for the code and are used as the prompt for LLMs, often contain redundant information. Recent advancements in prompt compression have shown promising results in Natural Language Processing (NLP), but their applicability to code generation remains uncertain. Our empirical study shows that state-of-the-art prompt compression methods achieve only about 10% reduction, as further reductions would cause significant performance degradation. In our study, we propose a novel compression method, ShortenDoc, dedicated to DocString compression for code generation. Our extensive experiments on six code generation datasets, five open-source LLMs (1B to 10B parameters), and one closed-source LLM, GPT-4o, confirm that ShortenDoc achieves 25-40% compression while preserving the quality of generated code, outperforming other baseline methods at similar compression levels. The benefit of this research is improved efficiency and reduced cost while maintaining the quality of the generated code, especially when calling third-party APIs, where ShortenDoc can reduce the token processing cost by 25-40%.
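To make the 25-40% figure concrete, here is a minimal arithmetic sketch of how a compression ratio translates into input-token cost. It assumes that cost scales linearly with the number of prompt tokens; the token counts and the price `PRICE_PER_1K_TOKENS` are hypothetical values chosen for illustration, not numbers from the paper or from any provider.

```python
# Hypothetical price per 1,000 input tokens (illustrative only).
PRICE_PER_1K_TOKENS = 0.01


def compression_ratio(original_tokens: int, compressed_tokens: int) -> float:
    """Fraction of tokens removed; 0.0 means nothing was removed."""
    if original_tokens <= 0:
        raise ValueError("original_tokens must be positive")
    return (original_tokens - compressed_tokens) / original_tokens


def prompt_cost(num_tokens: int, price_per_1k: float = PRICE_PER_1K_TOKENS) -> float:
    """Cost of sending num_tokens, assuming cost is linear in token count."""
    return num_tokens / 1000 * price_per_1k


# Assumed example: a 200-token DocString compressed to 140 tokens.
original, compressed = 200, 140
ratio = compression_ratio(original, compressed)
saving = prompt_cost(original) - prompt_cost(compressed)
print(f"compression ratio: {ratio:.0%}, cost saved per call: {saving:.4f}")
```

Under the linear-cost assumption, the relative saving on the compressed part of the prompt equals the compression ratio itself, which is why the compression range and the cost-reduction range in the abstract coincide.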

What problem does this paper attempt to address?

The problem this paper attempts to solve: in code generation with large language models (LLMs), redundant information in DocStrings wastes computing resources and reduces efficiency.

### Problem Background

1. **Importance of DocString**:
   - A DocString is an important part of the code, usually located at the beginning of a function or method, and describes the function, its parameters, return values, and possible exceptions. It not only improves the readability and maintainability of the code, but also serves as a prompt that guides LLMs to generate code that meets the requirements.
2. **Existing Problems**:
   - Existing DocStrings may contain a large amount of redundant information. This redundancy increases the computing cost and reduces model inference efficiency. When calling third-party APIs in particular, overly long prompts also increase the financial cost.
   - Current prompt compression techniques (such as Selective_Context and LLMLingua2) have limited effectiveness when applied to DocString compression. When the compression rate exceeds 10%, the quality of the generated code declines significantly.

### Paper Objectives

To solve the above problems, the paper proposes a new compression method, ShortenDoc, aiming to optimize the quality of DocStrings, improve the efficiency of the code generation task, and reduce costs. Specifically:

- **Compression Effectiveness**: Existing compression methods perform poorly on DocStrings, and further compression leads to a decline in the quality of the generated code. The paper demonstrates this through experiments and proposes a new compression method.
- **Flexibility**: Existing methods require the compression ratio to be set manually, which makes them difficult to adapt to different code generation scenarios. The method proposed in the paper dynamically adjusts the compression according to the importance of each token, without manual setting of the compression ratio.
- **Retention of Key Information**: The new method ensures that key information is retained during compression, avoiding the information loss from over-compression that would affect the quality of code generation.

### Main Contributions

1. **Feasibility and Limitation Analysis**: Experiments demonstrate the feasibility and limitations of existing DocString compression methods in the code generation task.
2. **Proposing a New Method, ShortenDoc**: An adaptive DocString compression method is designed, which compresses more effectively than existing methods.
3. **In-depth Insights**: The paper explores DocString compression techniques and reports the resulting insights.

### Conclusion

The paper verifies the superior performance of ShortenDoc through extensive experiments. It achieves a compression rate of 25% to 40% while maintaining the quality of the generated code, thereby improving the efficiency of the code generation task and reducing costs.
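The Flexibility point above (compression driven by per-token importance rather than a preset ratio) can be illustrated with a toy sketch. This is not ShortenDoc's algorithm, which the summary does not specify. The stop-word set `LOW_IMPORTANCE`, the scorer `toy_importance`, the function `compress_docstring`, and the `threshold` parameter are all hypothetical stand-ins for a learned importance model.

```python
from typing import Callable

# Hypothetical stop-word list standing in for a learned importance model.
LOW_IMPORTANCE = {"a", "an", "the", "of", "to", "in", "is", "are", "that", "given"}


def toy_importance(token: str) -> float:
    """Toy score: 0.0 for listed filler words, 1.0 for everything else."""
    word = token.strip(".,:;()").lower()
    return 0.0 if word in LOW_IMPORTANCE else 1.0


def compress_docstring(
    docstring: str,
    importance: Callable[[str], float] = toy_importance,
    threshold: float = 0.5,
) -> str:
    """Keep only tokens whose importance meets the threshold.

    No target ratio is passed in: how much gets removed depends on
    how many low-importance tokens the DocString happens to contain.
    """
    tokens = docstring.split()  # whitespace tokens, a simplifying assumption
    kept = [tok for tok in tokens if importance(tok) >= threshold]
    return " ".join(kept)


verbose = "Return the sum of the even numbers in the given list."
terse = "Sort integers ascending."
for doc in (verbose, terse):
    short = compress_docstring(doc)
    removed = 1 - len(short.split()) / len(doc.split())
    print(f"{short!r} (removed {removed:.0%})")
```

The verbose DocString loses its filler words while the terse one passes through unchanged, so the effective compression ratio adapts to the input. A real system would replace `toy_importance` with a model-based score and would need to verify that the generated code's quality is preserved.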

Less is More: DocString Compression in Code Generation

When to Stop? Towards Efficient Code Generation in LLMs with Excess Token Prevention

LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models

LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression

LanguaShrink: Reducing Token Overhead with Psycholinguistics

Code Less, Align More: Efficient LLM Fine-tuning for Code Generation with Data Pruning

Natural Is The Best: Model-Agnostic Code Simplification for Pre-trained Large Language Models

Brevity is the soul of wit: Pruning long files for code generation

LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression

500xCompressor: Generalized Prompt Compression for Large Language Models

AceCoder: An Effective Prompting Technique Specialized in Code Generation

Say More with Less: Understanding Prompt Learning Behaviors through Gist Compression

Style-Compress: An LLM-Based Prompt Compression Framework Considering Task-Specific Styles

Prompt Compression for Large Language Models: A Survey

Parse Trees Guided LLM Prompt Compression

Towards Greener Yet Powerful Code Generation via Quantization: An Empirical Study

Extending Context Window of Large Language Models via Semantic Compression

Anchor Attention, Small Cache: Code Generation with Large Language Models

Semantic Compression With Large Language Models

Effi-Code: Unleashing Code Efficiency in Language Models

Improving Natural Language Capability of Code Large Language Model