Abstract: To release developers from time-consuming software development, many approaches have been proposed to generate source code automatically from software requirements. With significant advances in deep learning and natural language processing, deep learning-based approaches have been proposed to generate source code from natural language descriptions. The key insight is that, given a large corpus of software requirements and their corresponding implementations, advanced deep learning techniques may learn how to translate software requirements into source code that fulfills them. Although such approaches are reported to be highly accurate, they are evaluated on datasets that are rather small, lack diversity, and differ significantly from real-world software requirements. To this end, we build a large-scale dataset composed of longer requirements and validated implementations. We evaluate the state-of-the-art approaches on this new dataset, and the results suggest that their performance on it is significantly lower than on existing datasets when measured with the common metric, BLEU. The evaluation results also suggest that the generated programs often contain syntactic and semantic errors, and that none of them can pass even a single predefined test case. Further analysis reveals that the state-of-the-art approaches learn little from software requirements, and that most of the successfully generated statements are popular statements in the training programs. Based on this finding, we propose a popularity-based approach that always generates the most popular statements in the training programs, regardless of the input (software requirements). The evaluation results suggest that none of the state-of-the-art approaches can outperform this simple statistics-based approach. In conclusion, deep learning-based program generation requires significant improvement in the future, and our dataset may serve as a basis for future research in this direction.
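The popularity-based approach described in the abstract can be illustrated with a minimal sketch. The abstract does not give the details, so everything here is an assumption made for illustration: a "statement" is taken to be one non-empty line, "popularity" is the number of training programs containing that line, and the names `most_popular_statements`, `PopularityBaseline`, and the cutoff `k` are hypothetical rather than taken from the paper.

```python
from collections import Counter


def most_popular_statements(training_programs, k):
    """Return the k statements that appear in the most training programs.

    Assumption: a statement is one non-empty line of a program.
    """
    counts = Counter()
    for program in training_programs:
        # Count each distinct statement at most once per program.
        statements = {line.strip() for line in program.splitlines() if line.strip()}
        counts.update(statements)
    # Rank by descending count; break ties alphabetically so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [statement for statement, _ in ranked[:k]]


class PopularityBaseline:
    """Hypothetical baseline that emits the same statements for every requirement."""

    def __init__(self, training_programs, k=3):
        self.output = "\n".join(most_popular_statements(training_programs, k))

    def generate(self, requirement):
        # The requirement text is deliberately ignored.
        return self.output


# Toy training set, invented purely for this illustration.
training_programs = [
    "import sys\nn = int(input())\nprint(n * 2)",
    "import sys\nn = int(input())\nprint(n + 1)",
    "n = int(input())\nprint(n)",
]
baseline = PopularityBaseline(training_programs, k=2)
```

Calling `baseline.generate(...)` with any two different requirement strings returns the same text, which is the point of the baseline: a learned model that cannot outperform it is making little use of the requirements.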

Code Generation from Flowcharts with Texts: A Benchmark Dataset and an Approach.

Research and Application of Code Automatic Generation Algorithm Based on Structured Flowchart.

Unstructured Error Check and Automatic Code Generation for Flowchart

Automatic Conversion of Structured Flowcharts into Problem Analysis Diagram for Generation of Codes.

From Words to Structured Visuals: A Benchmark and Framework for Text-to-Diagram Generation and Editing

Deep Learning Based Code Generation from Requirements Text: Are We There Yet?

A Code Automatic Generation Algorithm Based on Structured Flowchart

Deep Learning Based Program Generation from Requirements Text: Are We There Yet?

A Novel Algorithm of Error Check and Code Generation for Structured Flowchart

GAP-Gen: Guided Automatic Python Code Generation

SOEN-101: Code Generation by Emulating Software Process Models Using Large Language Model Agents

CFlow: Supporting Semantic Flow Analysis of Students' Code in Programming Problems at Scale

A Bidirectional Generation Method of SmartC Models and Codes

CodeBenchGen: Creating Scalable Execution-based Code Generation Benchmarks

SkCoder: A Sketch-based Approach for Automatic Code Generation

Code Generation with Hybrid of Structural and Semantic Features Retrieval

Code Generation Approach Supporting Complex System Modeling Based on Graph Pattern Matching

Think Outside the Code: Brainstorming Boosts Large Language Models in Code Generation

Procedure2Command: an AI-based Nuclear Power Plant Control Command Code Generation Prototype System

Outline, Then Details: Syntactically Guided Coarse-To-Fine Code Generation

Natural Language to Code: How Far Are We?