Multi-Programming Language Ensemble for Code Generation in Large Language Model

Tengfei Xue, Xuefeng Li, Tahir Azim, Roman Smirnov, Jianhui Yu, Arash Sadrieh, Babak Pahlavan

2024-09-06

Abstract: Large language models (LLMs) have significantly improved code generation, particularly in one-pass code generation. However, most existing approaches focus solely on generating code in a single programming language, overlooking the potential of leveraging the multi-language capabilities of LLMs. LLMs have varying patterns of errors across different languages, suggesting that a more robust approach could be developed by leveraging these multi-language outputs. In this study, we propose Multi-Programming Language Ensemble (MPLE), a novel ensemble-based method that utilizes code generation across multiple programming languages to enhance overall performance. By treating each language-specific code generation process as an individual "weak expert" and effectively integrating their outputs, our method mitigates language-specific errors and biases. This multi-language ensemble strategy leverages the complementary strengths of different programming languages, enabling the model to produce more accurate and robust code. Our approach can be seamlessly integrated with commonly used techniques such as the reflection algorithm and Monte Carlo tree search to improve code generation quality further. Experimental results show that our framework consistently enhances baseline performance by up to 17.92% on existing benchmarks (HumanEval and HumanEval-plus), with a standout result of 96.25% accuracy on the HumanEval benchmark, achieving new state-of-the-art results across various LLM models. The code will be released at https://github.com/NinjaTech-AI/MPLE.

Computation and Language, Artificial Intelligence

What problem does this paper attempt to address?

The paper addresses the following problem: existing approaches to code generation with large language models (LLMs) mostly generate code in a single programming language and ignore the models' multi-language capabilities. LLMs show different error patterns in different programming languages, which suggests that combining outputs across languages can yield a more robust method. The paper proposes an ensemble method, Multi-Programming Language Ensemble (MPLE), that generates code in multiple programming languages to improve overall performance. By treating each language-specific code generation process as a "weak expert" and integrating their outputs, the method mitigates language-specific errors and biases and produces more accurate and robust code. The main contributions of the paper are:

1. A multi-language ensemble framework for LLM code generation that improves the robustness and accuracy of generated code by exploiting the complementary strengths of different programming languages.
2. A demonstration that the framework integrates seamlessly with existing techniques, such as the reflection algorithm and Monte Carlo tree search, to further improve code quality.
3. Extensive experiments on the HumanEval and HumanEval-plus benchmarks that verify the method's effectiveness, with improvements over baselines of up to 17.92% and new state-of-the-art results.
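The "weak expert" idea described above can be illustrated with a minimal Python sketch. The summary does not say how the language-specific outputs are integrated, so the strategy here is an assumption: try each language in turn, translate the candidate back into the target language, and accept the first candidate that passes a set of visible tests. The names `mple_sketch`, `fake_generate`, `fake_translate`, and `run_visible_tests` are hypothetical, and the two `fake_*` functions are stubs standing in for LLM calls.

```python
from typing import Callable, Optional, Sequence

# Hypothetical callables (assumptions, not the paper's API):
#   generate(task, language)           -> source code in `language`
#   translate(code, source, target)    -> the same program in `target`
#   passes_tests(code)                 -> True if visible tests pass
Generate = Callable[[str, str], str]
Translate = Callable[[str, str, str], str]
Check = Callable[[str], bool]


def mple_sketch(
    task: str,
    target: str,
    languages: Sequence[str],
    generate: Generate,
    translate: Translate,
    passes_tests: Check,
) -> Optional[str]:
    """Treat each language as a weak expert; return the first passing candidate."""
    for language in languages:
        candidate = generate(task, language)
        if language != target:
            # Bring the candidate back into the target language before testing.
            candidate = translate(candidate, language, target)
        if passes_tests(candidate):
            return candidate
    return None  # no expert produced a passing solution


def fake_generate(task: str, language: str) -> str:
    # Stub: the Python "expert" makes a mistake, the Java "expert" does not.
    if language == "python":
        return "def add(a, b):\n    return a - b\n"
    return "int add(int a, int b) { return a + b; }"


def fake_translate(code: str, source: str, target: str) -> str:
    # Stub: pretend the model translated the Java solution into Python.
    return "def add(a, b):\n    return a + b\n"


def run_visible_tests(code: str) -> bool:
    # Execute the candidate in a fresh namespace and check one visible test.
    namespace: dict = {}
    try:
        exec(code, namespace)
        return namespace["add"](2, 3) == 5
    except Exception:
        return False


result = mple_sketch(
    "add two integers", "python", ["python", "java"],
    fake_generate, fake_translate, run_visible_tests,
)
print(result)
```

In this toy run the Python candidate fails the visible test, so the sketch falls back to the Java candidate and returns its Python translation. A real setup would replace the stubs with model calls and run the tests in a sandbox rather than with `exec`.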

MultiPL-E: A Scalable and Polyglot Approach to Benchmarking Neural Code Generation

MultiPL-E: A Scalable and Extensible Approach to Benchmarking Neural Code Generation

Planning-Driven Programming: A Large Language Model Programming Workflow

Enhancing Large Language Models in Coding Through Multi-Perspective Self-Consistency

Evaluating Large Language Models in Class-Level Code Generation

CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis

CodeGen2: Lessons for Training LLMs on Programming and Natural Languages

McEval: Massively Multilingual Code Evaluation

L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models

Improving Natural Language Capability of Code Large Language Model

Enhancing Program Synthesis with Large Language Models Using Many-Objective Grammar-Guided Genetic Programming

Enabling Programming Thinking in Large Language Models Toward Code Generation

From Code to Play: Benchmarking Program Search for Games Using Large Language Models

CodeApex: A Bilingual Programming Evaluation Benchmark for Large Language Models

Exploring Multi-Lingual Bias of Large Code Models in Code Generation

ML-Bench: Large Language Models Leverage Open-source Libraries for Machine Learning Tasks

Self-planning Code Generation with Large Language Models

A Pair Programming Framework for Code Generation Via Multi-Plan Exploration and Feedback-Driven Refinement

Effi-Code: Unleashing Code Efficiency in Language Models