Abstract: Code generation aims to produce source code that implements human requirements expressed as natural language specifications. With the rapid development of intelligent software engineering, automated code generation has become a hot research topic in both artificial intelligence and software engineering, and researchers have made significant progress on it. More recently, large language models (LLMs) have demonstrated outstanding performance on code generation tasks; for example, ChatGPT, released by OpenAI, shows great potential for automated code generation. However, existing studies are limited to exploring LLMs' ability to generate code snippets that solve simple programming problems, and the task of competition-level code generation has never been investigated. The specifications of programming competition problems are typically complicated, requiring specific input/output formats as well as high-level algorithmic reasoning. In this study, we conduct the first large empirical study to investigate the zero-shot learning ability of ChatGPT for solving competition programming problems. Specifically, we first refine the prompt design using the Human-Eval dataset. Then, we apply the resulting prompt to the competition-level code generation dataset APPS to further explore the effectiveness of ChatGPT in solving competition problems. We collect ChatGPT's outputs on 5,000 code competition problems, and the evaluation results show that it passes 25.4% of the test cases. By further feeding extra information (e.g., information about failed tests) to ChatGPT, we observe that it has the potential to turn a partially passing program into a fully passing one. Moreover, comparing the solutions generated by the LLM with existing solutions, we find that it tends to copy code directly rather than rewrite it when facing more difficult problems. Finally, we evaluate the quality of the code generated by ChatGPT in terms of “code cleanness” and observe that the generated code has small functions and file sizes, in line with clean code standards.

An Empirical Understanding of Code Clone Detection by ChatGPT

Code Clone Detection: A Literature Review

Assessing the Code Clone Detection Capability of Large Language Models

Assessing and Improving Dataset and Evaluation Methodology in Deep Learning for Code Clone Detection

ChatGPT Code Detection: Techniques for Uncovering the Source of Code

Towards Understanding the Capability of Large Language Models on Code Clone Detection: A Survey

Investigating the Efficacy of Large Language Models for Code Clone Detection

GPTCloneBench: A comprehensive benchmark of semantic clones and cross-language clones using GPT-3 model and SemanticCloneBench

Evaluation of Contrastive Learning with Various Code Representations for Code Clone Detection

On the Use of Deep Learning Models for Semantic Clone Detection

Unveiling the potential of large language models in generating semantic and cross-language clones

Exploring the Potential of ChatGPT in Automated Code Refinement: An Empirical Study

GRRLN: Gated Recurrent Residual Learning Networks for code clone detection

A Closer Look at Different Difficulty Levels Code Generation Abilities of ChatGPT.

SCCD-GAN: An Enhanced Semantic Code Clone Detection Model Using GAN

Detecting Code Clones with Graph Neural Network and Flow-Augmented Abstract Syntax Tree

Neural Detection of Semantic Code Clones Via Tree-Based Convolution

A Machine Learning Based Framework for Code Clone Validation

Goner: Building Tree-Based N-Gram-Like Model for Semantic Code Clone Detection

Discriminating Human-authored from ChatGPT-Generated Code Via Discernable Feature Analysis

Go-clone: Graph-Embedding Based Clone Detector for Golang