Utilizing Edge Attention in Graph-Based Code Search

Wei Zhao, Yan Liu
DOI: https://doi.org/10.18293/seke2022-078
2022-01-01
Abstract: Code search is the task of retrieving code snippets with a specific function from a large codebase, given a natural language description. Classic code search approaches, built on information retrieval techniques, fail to exploit code semantics and introduce noisy, irrelevant keywords. In recent years, the number of deep learning-based approaches has grown substantially; these embed lexical semantics into unified vectors to achieve a higher-level mapping between natural language queries and source code. However, these approaches still struggle to mine and utilize deep code semantics. In this work, we study how to leverage deeper source code semantics in graph-based source code search, given that graph-based representation is a promising way of capturing program knowledge and offers rich explainability. We propose a novel code search approach called EAGCS (Edge Attention-based Graph Code Search), which is composed of a novel code graph representation method called APDG (Advanced Program Dependence Graph) and a graph neural network called EAGGNN (Edge Attention-based GGNN) that learns the latent code semantics of APDG. Experimental results demonstrate that our model outperforms the GGNN-based search model and DeepCS. Moreover, our comparison study shows that different edge enhancement strategies contribute differently to learning code semantics.