Lightweight global and local contexts guided method name recommendation with prior knowledge

Shangwen Wang, Ming Wen, Bo Lin, Xiaoguang Mao
DOI: https://doi.org/10.1145/3468264.3468567
2021-01-01
Abstract: The quality of method names is critical for the readability and maintainability of source code. However, it is often challenging to construct concise method names. To alleviate this problem, a number of approaches have been proposed to automatically recommend high-quality names for methods. Despite being effective, existing approaches are limited mainly in two respects: (1) the information they leverage is restricted to the target method itself; and (2) they do not distinguish between the contributions of tokens extracted from different program contexts. Through a large-scale empirical analysis of 12M+ methods from 14K+ real-world projects, we found that (1) the tokens composing a method's name can frequently be observed in its callers/callees; and (2) tokens extracted from different specific contexts have diverse probabilities of composing the target method's name. Motivated by our findings, we propose, in this paper, a context-guided method name recommender, which mainly embodies two key ideas: (1) apart from the local context, which is extracted from the target method itself, we also consider the global context, which is extracted from other methods in the project that have call relations with the target method, to include more useful information; and (2) we utilize our empirical results as prior knowledge to guide the generation of method names and also to restrict the number of tokens extracted from the global contexts. We implemented the idea as Cognac and performed extensive experiments to assess its effectiveness. Results reveal that Cognac can (1) perform better than existing approaches on the method name recommendation task (e.g., it achieves F-scores of 63.2%, 60.8%, 66.3%, and 68.5%, respectively, on four widely-used datasets, all of which outperform existing techniques); and (2) achieve higher performance than existing techniques on the method name consistency checking task (e.g., its overall accuracy reaches 76.6%, outperforming the state-of-the-art MNire by 11.2%). Further results reveal that the caller/callee information and the prior knowledge both contribute significantly to the overall performance of Cognac.
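The first empirical observation, that the tokens of a method's name often appear in its callers/callees, can be illustrated with a small sketch. This is not Cognac's implementation: the function names `split_tokens` and `name_token_coverage`, the sub-token splitting rule, and the sample identifiers are all assumptions made for illustration.

```python
import re


def split_tokens(name):
    """Split a camelCase or snake_case identifier into lowercase sub-tokens.

    The splitting rule is an assumption for this sketch, not the paper's tokenizer.
    """
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def name_token_coverage(target_name, context_names):
    """Return the fraction of the target name's sub-tokens that also occur
    in the names of its callers/callees (hypothetical measure)."""
    target_tokens = split_tokens(target_name)
    if not target_tokens:
        return 0.0
    context_tokens = set()
    for context_name in context_names:
        context_tokens.update(split_tokens(context_name))
    hits = sum(1 for token in target_tokens if token in context_tokens)
    return hits / len(target_tokens)


# Hypothetical target method with two callers and one callee.
coverage = name_token_coverage(
    "getUserName", ["printUserName", "loadProfile", "readName"]
)
print(f"{coverage:.2f}")
```

Here two of the three sub-tokens of `getUserName` (`user` and `name`) occur in the surrounding methods, which is the kind of signal the global context is meant to capture.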