Abstract: Code generation aims to produce source code that implements human requirements expressed as natural language specifications. With the rapid development of intelligent software engineering, automated code generation has become a hot research topic in both artificial intelligence and software engineering, and researchers have made significant progress on it. More recently, large language models (LLMs) have demonstrated outstanding performance on code generation tasks; for example, ChatGPT, released by OpenAI, shows strong potential for automated code generation. However, existing studies are limited to exploring LLMs' ability to generate code snippets that solve simple programming problems, and the task of competition-level code generation has never been investigated. The specifications of programming competitions are typically complicated, and they require a specific input/output format as well as high-level algorithmic reasoning. In this study, we conduct the first large-scale empirical study of ChatGPT's zero-shot learning ability for solving competition programming problems. Specifically, we first refine the prompt design using the HumanEval dataset. We then apply the resulting prompt to the competition-level code generation dataset APPS to further explore the effectiveness of ChatGPT for solving competition problems. We collect ChatGPT's outputs on 5,000 code competition problems, and the evaluation results show that it passes 25.4% of the test cases. By feeding extra information (e.g., information about failed tests) to ChatGPT, we observe that it has the potential to turn a partially passing program into a fully passing one. Moreover, comparing the LLM-generated solutions with existing solutions, we find that ChatGPT tends to copy code directly rather than rewrite it when facing more difficult problems. Finally, we evaluate the quality of the code generated by ChatGPT in terms of “code cleanness” and observe that the generated code has small functions and small file sizes, in line with the standards of clean code.
