MVG: Multi-view Graph Representation Learning for Programming Language Processing

Ting Long, Yutong Xie, Xianyu Chen, Weinan Zhang, Qinxiang Cao, Yong Yu
2022-01-01
Abstract: Program representation, which aims at automatically extracting features from source code and representing programs as vectors, is a fundamental problem in programming language processing (PLP). Recent works try to represent programs with neural networks based on the structures of source code. However, such methods often focus on the syntax and consider only a single perspective of programs, limiting the representation power of models. In this paper, we propose a multi-view graph (MVG) representation method. MVG pays more attention to the semantics of code and includes both data flow and control flow simultaneously as multiple views. We combine these views and process them with a graph neural network (GNN) to obtain a program representation that covers various aspects. Our proposed MVG approach is thoroughly evaluated on algorithm detection, an important and challenging subfield of PLP. Specifically, we use a public dataset \texttt{POJ-104} and also construct a new dataset \texttt{ALG-109} to test our method. In experiments, MVG outperforms previous methods significantly, demonstrating our model's strong capability for representing source code.
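To make the idea of combining views concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the toy program, the edge lists, the scalar node features, and the fixed per-view weights are all hypothetical, and a real GNN would use learned weight matrices, vector features, and nonlinearities. The sketch only shows how control-flow and data-flow edges over the same nodes can be merged into one typed graph, passed through one round of message passing, and pooled into a program-level value.

```python
from collections import defaultdict

# Hypothetical toy program with one node per statement:
#   0: x = 1
#   1: y = x + 2
#   2: if y > 0:
#   3:     print(y)
NUM_NODES = 4

# Each view is a list of directed (source, target) edges over the same nodes.
# These edge lists are written by hand for illustration only.
VIEWS = {
    "control_flow": [(0, 1), (1, 2), (2, 3)],
    "data_flow": [(0, 1), (1, 2), (1, 3)],
}


def combine_views(views):
    """Merge the views into one typed edge list: (source, target, view_name)."""
    return [(s, t, name) for name, edges in sorted(views.items()) for s, t in edges]


def message_pass(features, typed_edges, view_weight):
    """One round of sum aggregation; each message is scaled by its view's weight."""
    incoming = defaultdict(float)
    for s, t, name in typed_edges:
        incoming[t] += view_weight[name] * features[s]
    # Each node keeps its own feature and adds the aggregated messages.
    return [features[n] + incoming[n] for n in range(len(features))]


def readout(features):
    """Mean-pool node features into a single program-level value."""
    return sum(features) / len(features)


typed_edges = combine_views(VIEWS)
features = [1.0] * NUM_NODES  # placeholder initial node features (assumption)
view_weight = {"control_flow": 0.5, "data_flow": 0.25}  # fixed here, not learned
updated = message_pass(features, typed_edges, view_weight)
program_vector = readout(updated)
```

Keeping the view name on every edge lets the aggregation treat each view differently while all views still share one node set, which is one simple way to read "combining views" in a single graph.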