Abstract: Code assistance refers to the use of tools, techniques, and models that help developers during software development. As coding tasks become increasingly complex, code assistants play a pivotal role in enhancing developer productivity, reducing errors, and supporting a more efficient coding workflow. This assistance takes several forms, including code autocompletion, error detection and correction, code generation, documentation support, and context-aware suggestions. Language models have become integral components of code assistance, giving developers intelligent suggestions and generated code snippets. In this paper, we propose new hybrid models for code generation that combine the pre-trained language models BERT, RoBERTa, ELECTRA, and LUKE with the Marian Causal Language Model. We selected these models for their strong performance on a range of natural language processing tasks. We evaluate them on two datasets, CoNaLa and DJANGO, and compare them with existing state-of-the-art models. Our aim is to investigate how far pre-trained transformer language models can improve the precision and efficiency of code generation in complex coding scenarios. We also conduct an error analysis and refine the generated code. Our results show that these models, when combined with the Marian decoder, significantly improve code generation accuracy and efficiency. Notably, the RoBERTa-Marian model achieved a maximum BLEU score of 35.74 and an exact match accuracy of 13.8% on CoNaLa, while LUKE-Marian attained a BLEU score of 89.34 and an exact match accuracy of 78.50% on DJANGO. The implementation of this work is available at https://github.com/AhmedSSoliman/Leveraging-Pretrained-Language-Models-for-Code-Generation .
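To make the exact match metric reported above concrete, the following is a minimal sketch of how such a score could be computed. It is not taken from the authors' repository: the function names and the whitespace normalization step are assumptions made for illustration, and the paper's own evaluation script may normalize or tokenize differently.

```python
from typing import Sequence


def normalize(code: str) -> str:
    """Collapse runs of whitespace (an assumed normalization, not the paper's)."""
    return " ".join(code.split())


def exact_match_accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions identical to their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    matches = sum(
        normalize(pred) == normalize(ref)
        for pred, ref in zip(predictions, references)
    )
    return matches / len(references)


if __name__ == "__main__":
    # Hypothetical generated snippets and gold snippets, for illustration only.
    preds = ["x = 1", "print(a)"]
    refs = ["x  =  1", "print(b)"]
    print(exact_match_accuracy(preds, refs))
```

Under this definition a prediction counts only if it reproduces the reference snippet in full, which is why exact match is typically much lower than BLEU on the same outputs.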

Enhancing Language Generation with Effective Checkpoints of Pre-trained Language Model

Leveraging Pre-trained Checkpoints for Sequence Generation Tasks

Exploring the Efficacy of Pre-trained Checkpoints in Text-to-Music Generation Task

Toward Human-Like Evaluation for Natural Language Generation with Error Analysis

A Study on Performance Improvement of Prompt Engineering for Generative AI with a Large Language Model

Optimizing Language Augmentation for Multilingual Large Language Models: A Case Study on Korean

Accelerating Multilingual Language Model for Excessively Tokenized Languages

Benchmarking Large Language Model Capabilities for Conditional Generation

Leveraging pre-trained language models for code generation

The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models

Improving the Language Understanding Capabilities of Large Language Models Using Reinforcement Learning

Beyond Sparse Rewards: Enhancing Reinforcement Learning with Language Model Critique in Text Generation

Self-Evaluation Improves Selective Generation in Large Language Models

A Retrieval-Augmented Generation Based Large Language Model Benchmarked On a Novel Dataset

ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework for Natural Language Generation

Prompt Perturbation in Retrieval-Augmented Generation based Large Language Models

Extrapolating Multilingual Understanding Models as Multilingual Generators

Pragmatic Competence Evaluation of Large Language Models for the Korean Language

Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization

An Automatic and Cost-Efficient Peer-Review Framework for Language Generation Evaluation

Learning to Compare for Better Training and Evaluation of Open Domain Natural Language Generation Models