FineLocator: A Novel Approach to Method-Level Fine-Grained Bug Localization by Query Expansion.

Wen Zhang, Ziqiang Li, Qing Wang, Juan Li
DOI: https://doi.org/10.1016/j.infsof.2019.03.001
IF: 3.9
2019-01-01
Information and Software Technology
Abstract:
Context: Bug localization, that is, locating suspicious snippets in source code files so that developers can fix a bug, is crucial for software quality assurance and software maintenance. An effective bug localization technique reduces the effort developers spend on bug resolution. State-of-the-art bug localization techniques concentrate on file-level, coarse-grained localization by lexically matching bug reports against source code files. However, this leaves developers with the heavy burden of finding, within those files, the code snippets that must be changed to fix the bug.
Objective: This paper proposes a novel approach called FineLocator for method-level fine-grained bug localization, which uses semantic similarity, temporal proximity and call dependency for method expansion.
Method: First, the bug reports and the methods of the source code are represented as numeric vectors using word embedding (word2vec) and the TF-IDF method. Second, we propose three query expansion scores, namely a semantic similarity score, a temporal proximity score and a call dependency score, to address the representation sparseness problem caused by the short lengths of methods in the source code. The representation of a short method is then augmented with elements of its neighboring methods through query expansion. Third, when a new bug report arrives, FineLocator retrieves methods from the source code by ranking the augmented methods according to their similarity to the bug report.
Results: We collect the bug repositories of the ArgoUML, Maven, Kylin, Ant and AspectJ projects to investigate the performance of the proposed FineLocator approach. Experimental results demonstrate that FineLocator improves the performance of method-level bug localization by 20%, 21% and 17% on average, as measured by the Top-N indicator, MAP and MRR respectively, in comparison with state-of-the-art techniques.
Conclusion: This is the first paper to demonstrate how to make use of method expansion to address the representation sparseness problem for method-level fine-grained bug localization.
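
Illustration: the expand-then-rank idea described under Method can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. It uses plain TF-IDF bag-of-words vectors in place of the word2vec-based representation, and it assumes the three expansion scores have already been combined into a single weight per neighbor. The method names, tokens, the `expansion` table and the weight 0.5 are all hypothetical.

```python
import math
from collections import Counter


def build_idf(docs):
    """Smoothed inverse document frequency over tokenized documents."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + n) / (1 + count)) + 1.0 for tok, count in df.items()}


def tfidf(tokens, idf):
    """Sparse TF-IDF vector; tokens unseen in the corpus are ignored."""
    tf = Counter(tok for tok in tokens if tok in idf)
    total = max(len(tokens), 1)
    return {tok: (count / total) * idf[tok] for tok, count in tf.items()}


def augment(vectors, expansion):
    """Add each neighbor's vector, scaled by its (assumed) expansion weight."""
    augmented = {}
    for name, vec in vectors.items():
        merged = dict(vec)
        for neighbor, weight in expansion.get(name, {}).items():
            for tok, value in vectors[neighbor].items():
                merged[tok] = merged.get(tok, 0.0) + weight * value
        augmented[name] = merged
    return augmented


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * b[tok] for tok, value in a.items() if tok in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical methods, already tokenized. "loadSettings" is short and
# shares no words with the bug report below.
methods = {
    "parseConfig": ["parse", "config", "file"],
    "loadSettings": ["load", "settings"],
    "renderPage": ["render", "page", "html"],
}

# Assumed combined expansion weight (semantic + temporal + call dependency),
# e.g. because loadSettings calls parseConfig.
expansion = {"loadSettings": {"parseConfig": 0.5}}

idf = build_idf(list(methods.values()))
vectors = {name: tfidf(tokens, idf) for name, tokens in methods.items()}
augmented = augment(vectors, expansion)

bug_report = tfidf(["config", "file", "parse", "error"], idf)
ranking = sorted(
    ((name, cosine(bug_report, vec)) for name, vec in augmented.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy setup the short method `loadSettings` has no lexical overlap with the bug report on its own. It only receives a nonzero similarity because it borrows terms from its neighbor, which is the sparseness problem that method expansion is meant to address.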