Abstract: Identifiers play an important role in helping developers analyze and comprehend source code. However, many identifiers are inconsistent with the applicable code conventions or with the semantics of the code they name, which makes them flawed. Hence, identifiers need to be renamed regularly. Although researchers have proposed several approaches that detect identifiers in need of renaming and suggest correct replacements, these approaches cover only one or a few identifier granularities rather than all of them, and they suggest a series of sub-tokens for composing identifiers rather than generating complete new identifiers. In this article, we propose a novel identifier renaming prediction and suggestion approach. Specifically, given a set of training source code, we first extract all identifiers at multiple granularities. We then design and extract five groups of features that capture the inherent properties of the identifiers themselves and their relationships to code conventions, other related code entities, enclosing files, and change history. By parsing the change history of each identifier, we determine whether it has been renamed. The identifier features and their renaming history are used to train a Random Forest classifier, which then predicts whether a given new identifier needs to be renamed. Subsequently, for the identifiers that need renaming, we extract all related code entities and their renaming change history. Based on the intuition that identifiers co-evolve with their related code entities, following similar patterns and renaming sequences, we suggest a series of new identifiers for them. We conduct extensive experiments to validate our approach on both Java projects and Android projects. Experimental results demonstrate that our approach identifies identifiers that need renaming with an average F-measure of more than 89%, outperforming the state-of-the-art approach by 8.30% on the Java projects and 21.38% on the Android projects. In addition, when suggesting correct identifiers, our approach achieves a Hit@10 of 48.58% on the Java projects and 40.97% on the Android projects, outperforming the state-of-the-art approach by 29.62% and 15.75%, respectively.
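For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how an F-measure and Hit@k are commonly computed. It is not the authors' evaluation code: the function names `f_measure` and `hit_at_k` are hypothetical, the F-measure shown is the standard F1 for the positive class "needs renaming", and a hit is assumed to mean an exact match between the true new identifier and one of the top-k suggestions.

```python
from typing import Sequence


def f_measure(predicted: Sequence[bool], actual: Sequence[bool]) -> float:
    """F1 score for the positive class (True = identifier needs renaming)."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)        # correctly flagged
    fp = sum(1 for p, a in pairs if p and not a)    # flagged but never renamed
    fn = sum(1 for p, a in pairs if a and not p)    # renamed but not flagged
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def hit_at_k(suggestions: Sequence[Sequence[str]],
             truths: Sequence[str],
             k: int = 10) -> float:
    """Fraction of identifiers whose true new name is among the top-k suggestions.

    Assumption: each inner list is ranked best-first and a hit is an exact match.
    """
    if not truths:
        return 0.0
    hits = sum(1 for ranked, truth in zip(suggestions, truths)
               if truth in ranked[:k])
    return hits / len(truths)
```

Under these assumptions, a Hit@10 of 48.58% would mean that for roughly half of the renamed identifiers the name developers actually chose appears among the ten highest-ranked suggestions.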

An Accurate Identifier Renaming Prediction and Suggestion Approach
