Pre-Implementation Method Name Prediction for Object-Oriented Programming

Shangwen Wang, Ming Wen, Bo Lin, Yepang Liu, Tegawendé F. Bissyandé, Xiaoguang Mao
DOI: https://doi.org/10.1145/3597203
IF: 3.685
2023-05-13
ACM Transactions on Software Engineering and Methodology
Abstract: Method naming is a challenging development task in object-oriented programming. In recent years, several research efforts have been undertaken to provide automated tool support for assisting developers in this task. In general, approaches in the literature assume the availability of a method's implementation to infer its name. Methods, however, are usually named before they are implemented. In this work, we fill the gap in the literature about method name prediction by developing an approach that predicts the names of all methods to be implemented within a class. Our work considers the class name as the input: the overall intuition is that classes with semantically similar names tend to provide similar functionalities, and hence similar method names. We first conduct a large-scale empirical analysis on 258K+ classes from real-world projects to validate our hypotheses. Then, we propose a hybrid big code-driven approach, Mario, to predict method names based on the class name: we combine a deep learning model with heuristics summarized from code analysis. Extensive experiments on 22K+ classes yielded promising results: compared to the state-of-the-art code2seq model (which leverages method implementation data), our approach achieves comparable results in terms of F-score at token level prediction; our approach, additionally, outperforms code2seq in prediction at the name level. We further show that our approach significantly outperforms several other baselines.
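The intuition stated in the abstract (classes with similar names tend to expose similar method names) can be illustrated with a toy retrieval sketch. This is not Mario, which combines a deep learning model with heuristics; it is only a naive nearest-neighbour lookup under stated assumptions. The helper names (`split_identifier`, `jaccard`, `suggest_methods`), the use of Jaccard similarity over name tokens, and the three-class `TOY_CORPUS` are all hypothetical and chosen purely for illustration.

```python
import re

# Hypothetical toy corpus: class name -> method names seen in that class.
TOY_CORPUS = {
    "FileReader": ["read", "close", "ready"],
    "HttpServer": ["start", "stop", "handleRequest"],
    "UserRepository": ["findById", "save", "delete"],
}


def split_identifier(name):
    """Split a camelCase/PascalCase identifier into lowercase tokens."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def jaccard(a, b):
    """Jaccard similarity between two token lists (assumed similarity measure)."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def suggest_methods(class_name, corpus):
    """Return the method names of the corpus class whose name is most similar."""
    query = split_identifier(class_name)
    best = max(corpus, key=lambda known: jaccard(query, split_identifier(known)))
    return corpus[best]


if __name__ == "__main__":
    # A new, not-yet-implemented class borrows names from its nearest neighbour.
    print(suggest_methods("BufferedFileReader", TOY_CORPUS))
```

Here `BufferedFileReader` shares the tokens `file` and `reader` with `FileReader`, so the sketch proposes that class's method names. A lexical token overlap is only a crude stand-in for the semantic similarity the abstract refers to.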
computer science, software engineering