Abstract: Code completion is an important feature of Integrated Development Environments (IDEs). In recent years, researchers have devoted considerable effort to intelligent code completion. However, existing work either considers only production code or does not distinguish between production code and test code. It remains unclear how effective existing completion models are on test code, and whether their performance there can be improved further. In this work, we focus on the completion of test code. We first show experimentally that completion models built for production code are suboptimal for test code completion. We then analyze the specific characteristics of test code and observe that it exhibits inter- and intra-project similarities, as well as a strong relationship with its focal class and with the other production classes that the focal class depends on (i.e., focal-related code). By fine-tuning existing models on test code from other projects, we leverage the inter-project similarity of test code to improve the completion of tokens specific to test code. By introducing a local component that uses the existing test code and the focal-related code in the project as references, we enhance existing code completion models with the intra-project similarity and the focal-related code of test code. Experiments show that each characteristic of test code we exploit brings substantial improvement to test code completion, and that our integrated framework outperforms the baseline frameworks. Compared to the base completion model, our best model for test code completion achieves relative improvements of 7.68% in all-token accuracy and 19.96% in identifier accuracy on token-level completion, and relative improvements of 8.89% in edit-distance similarity and 22.82% in exact match on line-level completion. Moreover, we perform an error analysis and point out potential directions for future work.
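The line-level metrics named in the abstract can be illustrated with a minimal Python sketch. It makes several assumptions that the abstract does not confirm: edit-distance similarity is taken to be a character-level Levenshtein distance normalized by the longer string, exact match ignores surrounding whitespace, and "relative improvement" is the percentage gain over the base score. The function names and the example strings are hypothetical and are not taken from the paper.

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insert, delete, substitute), via dynamic programming."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def edit_similarity(prediction: str, reference: str) -> float:
    """Assumed definition: 1 - distance / length of the longer string, in [0, 1]."""
    longest = max(len(prediction), len(reference))
    if longest == 0:
        return 1.0  # two empty lines are identical
    return 1.0 - levenshtein(prediction, reference) / longest


def exact_match(prediction: str, reference: str) -> bool:
    """Assumed definition: the lines are equal after stripping surrounding whitespace."""
    return prediction.strip() == reference.strip()


def relative_improvement(base: float, improved: float) -> float:
    """Percentage gain of `improved` over `base` (base must be non-zero)."""
    return (improved - base) * 100.0 / base


if __name__ == "__main__":
    # Hypothetical predicted and reference test-code lines, for illustration only.
    predicted = "assertEquals(1, result);"
    expected = "assertEquals(2, result);"
    print(exact_match(predicted, expected))
    print(round(edit_similarity(predicted, expected), 4))
    print(relative_improvement(50.0, 55.0))
```

Under these assumptions, a prediction that differs from the reference by a single character still scores high on edit-distance similarity while failing exact match, which is why the two metrics are usually reported together.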
