Progress of Code Naturalness and Its Application

CHEN Zhe-Zhe, YAN Meng, XIA Xin, LIU Zhong-Xin, XU Zhou, LEI Yan
2021-01-01
Abstract: Code naturalness is a research hotspot shared by natural language processing and software engineering. It aims to solve a range of software engineering tasks by building code naturalness models based on natural language processing techniques. In recent years, as the volume of source code and data in the open source software community has continued to grow, more researchers have turned their attention to the information contained in source code, and a series of research results have been achieved. At the same time, code naturalness research faces many challenges in code corpus construction, model building, and task application. This paper therefore reviews and summarizes recent progress in code naturalness research and its applications from these three perspectives. The main contents are as follows: (1) it introduces the basic concept of code naturalness and gives an overview of research on it; (2) it summarizes the corpora currently used in code naturalness research, and classifies and summarizes the modeling methods for code naturalness; (3) it summarizes the experimental validation methods and evaluation metrics for code naturalness models; (4) it summarizes and categorizes the current applications of code naturalness; (5) it summarizes the key issues of code naturalness techniques; (6) it discusses prospects for the future development of code naturalness techniques.
Received: 2021-01-29; Revised: 2021-03-25, 2021-04-14; Accepted: 2021-04-27; Published online (JOS): 2021-05-20. Journal of Software (软件学报)