Abstract: Code completion, one of the most useful features of Integrated Development Environments (IDEs), can accelerate software development by suggesting libraries, APIs, and method names in real time. Recent studies have shown that statistical language models can improve the performance of code completion tools by learning from large-scale software repositories. However, these models suffer from three major drawbacks: a) the hierarchical structural information of programs is not fully utilized in the program representation; b) semantic relationships in programs can span very long distances, and existing language models based on recurrent neural networks are not sufficient to model such long-term dependencies; c) existing approaches perform a single task in one model, which leads to the underuse of information from related tasks. To address these challenges, we propose a self-attentional neural architecture for code completion with multi-task learning. To utilize the hierarchical structural information of programs, we present a novel method that considers the path from the predicting node to the root node. To capture long-term dependencies in the input programs, we adopt a self-attention-based network as the base language model. To enable knowledge sharing between related tasks, we propose a Multi-Task Learning (MTL) framework that jointly learns two related tasks in code completion. Experiments on three real-world datasets demonstrate the effectiveness of our model compared with state-of-the-art methods.
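
The idea of using the path from the predicting node to the root node can be illustrated with a minimal sketch. It uses Python's standard `ast` module as a stand-in for the program ASTs; the function name `path_to_root` and the example program are hypothetical, and the abstract does not say how the model actually encodes this path, so this shows only the path extraction step.

```python
import ast


def path_to_root(tree, target):
    """Return the node-type names from `target` up to the root of `tree`."""
    # Build a child -> parent map by walking the whole tree once.
    parent = {}
    for node in ast.walk(tree):
        for child in ast.iter_child_nodes(node):
            parent[child] = node

    # Follow parent links until the root (which has no parent) is reached.
    path = []
    node = target
    while node is not None:
        path.append(type(node).__name__)
        node = parent.get(node)
    return path


# Hypothetical example program; the "predicting node" here is the `x + 1` expression.
source = "def f(x):\n    return x + 1\n"
tree = ast.parse(source)
target = next(n for n in ast.walk(tree) if isinstance(n, ast.BinOp))

print(path_to_root(tree, target))
# -> ['BinOp', 'Return', 'FunctionDef', 'Module']
```

A sequence of node types like this could then be fed to a model as additional structural context alongside the flattened token sequence.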

Convolutional Neural Networks over Tree Structures for Programming Language Processing

When Are Tree Structures Necessary for Deep Learning of Representations?

Tree-based convolution: A new architecture for sentence modeling

Discriminative Neural Sentence Modeling by Tree-Based Convolution

Learning Program Representations with a Tree-Structured Transformer

Tree-based Convolution for Sentence Modeling.

Natural Language Inference by Tree-Based Convolution and Heuristic Matching

Recognizing Entailment and Contradiction by Tree-based Convolution.

Context-Aware Tree-Based Convolutional Neural Networks for Natural Language Inference.

Modular Tree Network for Source Code Representation Learning

TreeBERT: A Tree-Based Pre-Trained Model for Programming Language

Tree-to-tree Neural Networks for Program Translation

On Tree-Based Neural Sentence Modeling

FCNN: Simple neural networks for complex code tasks

Dynamic Compositional Neural Networks over Tree Structure

TreeNet: Learning Sentence Representations with Unconstrained Tree Structure.

Attention-driven tree-structured convolutional LSTM for high dimensional data understanding

Multi-View Graph Representation for Programming Language Processing: An Investigation into Algorithm Detection

A Self-Attentional Neural Architecture for Code Completion with Multi-Task Learning.

TreeGen: A Tree-Based Transformer Architecture for Code Generation

A Grammar-Based Structural CNN Decoder for Code Generation