A Survey of Neural Code Intelligence: Paradigms, Advances and Beyond

Qiushi Sun, Zhirui Chen, Fangzhi Xu, Kanzhi Cheng, Chang Ma, Zhangyue Yin, Jianing Wang, Chengcheng Han, Renyu Zhu, Shuai Yuan, Qipeng Guo, Xipeng Qiu, Pengcheng Yin, Xiaoli Li, Fei Yuan, Lingpeng Kong, Xiang Li, Zhiyong Wu

2024-09-01

Abstract: Neural Code Intelligence -- leveraging deep learning to understand, generate, and optimize code -- holds immense potential for transformative impacts on the whole society. Bridging the gap between Natural Language and Programming Language, this domain has drawn significant attention from researchers in both research communities over the past few years. This survey presents a systematic and chronological review of the advancements in code intelligence, encompassing over 50 representative models and their variants, more than 20 categories of tasks, and an extensive coverage of over 680 related works. We follow the historical progression to trace the paradigm shifts across different research phases (e.g., from modeling code with recurrent neural networks to the era of Large Language Models). Concurrently, we highlight the major technical transitions in models, tasks, and evaluations spanning through different stages. For applications, we also observe a co-evolving shift. It spans from initial endeavors to tackling specific scenarios, through exploring a diverse array of tasks during its rapid expansion, to currently focusing on tackling increasingly complex and varied real-world challenges. Building on our examination of the developmental trajectories, we further investigate the emerging synergies between code intelligence and broader machine intelligence, uncovering new cross-domain opportunities and illustrating the substantial influence of code intelligence across various domains. Finally, we delve into both the opportunities and challenges associated with this field, alongside elucidating our insights on the most promising research directions. An ongoing, dynamically updated project and resources associated with this survey have been released at https://github.com/QiushiSun/NCISurvey.

Software Engineering, Artificial Intelligence, Computation and Language, Programming Languages

What problem does this paper attempt to address?

This paper is a survey: it reviews and analyzes progress in the field of Neural Code Intelligence. Specifically, it aims to:

1. **Systematic Review**: Provide a systematic and chronological review covering more than 50 representative models and their variants, more than 20 task categories, and over 680 related works.
2. **Technical Evolution**: Trace the paradigm shifts across research stages, such as the move from modeling code with Recurrent Neural Networks (RNNs) to the era of Large Language Models (LLMs).
3. **Technical Transformation**: Highlight the major technical transitions in models, tasks, and evaluations across development stages.
4. **Application Evolution**: Follow how applications evolved from initial attempts at specific scenarios, through the exploration of diverse tasks, to the current focus on increasingly complex and varied real-world challenges.
5. **Cross-Domain Fusion**: Explore the emerging synergies between code intelligence and broader machine intelligence, reveal new cross-domain opportunities, and demonstrate the substantial influence of code intelligence across various domains.
6. **Opportunities and Challenges**: Examine the opportunities and challenges in this field and present the authors' insights on the most promising research directions.

Through these objectives, the paper seeks to give researchers and practitioners a comprehensive perspective on recent progress and development trends in Neural Code Intelligence.

Deep Learning for Code Intelligence: Survey, Benchmark and Toolkit

Survey of Code Search Based on Deep Learning

A Survey on Artificial Intelligence for Source Code: A Dialogue Systems Perspective

Recent Advances in Intelligent Source Code Generation: A Survey on Natural Language Based Studies

Towards Cognitive AI Systems: a Survey and Prospective on Neuro-Symbolic AI

Coding for Intelligence from the Perspective of Category

Unifying the Perspectives of NLP and Software Engineering: A Survey on Language Models for Code

Towards Data-and Knowledge-Driven AI: A Survey on Neuro-Symbolic Computing

Towards Data-and Knowledge-Driven Artificial Intelligence: A Survey on Neuro-Symbolic Computing

A Closer Look into Transformer-Based Code Intelligence Through Code Transformation: Challenges and Opportunities

DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence

CodeNet: A Large-Scale AI for Code Dataset for Learning a Diversity of Coding Tasks

Understanding Neural Code Intelligence Through Program Simplification

Deep Learning for Code Generation: a Survey

A Comprehensive Survey of Neural Architecture Search: Challenges and Solutions

Large Language Models Meet NL2Code: A Survey

SciCode: A Research Coding Benchmark Curated by Scientists

A Survey of Deep Learning Models for Structural Code Understanding

Neural Code Search Revisited: Enhancing Code Snippet Retrieval through Natural Language Intent