Development of Software Tools to Improve the Work of the Code Completion Mechanism Using Machine Learning Algorithms in an Integrated Development Environment for Python

Andrey O. Matveev, Alexander V. Bystrov, Vitaly I. Bibaev, Nikita I. Povarov
DOI: https://doi.org/10.25205/1818-7900-2020-18-2-62-75
2020-01-01
Abstract: Auto-completion is an essential feature of any popular editor for a programming language. It spares users from typing long expressions by hand. Much work has been done in this direction, both in scientific research and in commercial products. These works vary widely: they either use special features and heuristics to improve code completion or apply machine learning techniques. Most of these approaches rely on synthetic data and do not take the behavior of real users into account. This article proposes an approach to improving the automatic code completion mechanism for the Python language by collecting information about how real users use this mechanism. The collected data is used to train a machine learning model that ranks completion variants. Two types of features are used to train the model: contextual and elemental. Contextual features describe the code near the cursor position in the text editor. Elemental features describe the characteristics of the proposed variant, for example, the length of the matching prefix. When building the model, it is important to take into account the limits on its response time and its size. The paper also considers various approaches to assessing the quality of the final model.
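To make the two feature types concrete, the following is a minimal sketch in Python and not the authors' model. It assumes one elemental feature named in the abstract (the length of the matching prefix), one further elemental feature (candidate length), and one hypothetical contextual feature (`after_dot`, whether the cursor follows a dot). The names `Candidate`, `score`, and `rank`, as well as the hand-picked weights, are illustrative assumptions; in the approach described above, a model trained on real usage data would take the place of the fixed weights.

```python
from dataclasses import dataclass


def matching_prefix_length(typed: str, candidate: str) -> int:
    """Elemental feature: number of leading characters shared by both strings."""
    length = 0
    for a, b in zip(typed, candidate):
        if a != b:
            break
        length += 1
    return length


@dataclass
class Candidate:
    text: str
    is_method: bool = False  # hypothetical property of the completion variant


def score(typed: str, after_dot: bool, cand: Candidate) -> float:
    # Hand-picked weights for illustration only (an assumption);
    # a trained ranking model would learn them from collected user data.
    prefix = matching_prefix_length(typed, cand.text)
    # Hypothetical contextual signal: methods are more likely after a dot.
    context_bonus = 0.5 if (after_dot and cand.is_method) else 0.0
    return prefix + context_bonus - 0.01 * len(cand.text)


def rank(typed: str, after_dot: bool, candidates: list) -> list:
    """Return candidates sorted from highest to lowest score."""
    return sorted(candidates, key=lambda c: score(typed, after_dot, c), reverse=True)


candidates = [Candidate("property"), Candidate("print"), Candidate("parse")]
ranked = [c.text for c in rank("pri", False, candidates)]
print(ranked)
```

Because scoring is a few arithmetic operations per candidate, a ranker of this shape stays well within the response-time and size limits that the abstract identifies as constraints.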