Abstract: Abbreviations are widely used in identifiers. However, they have a severe negative impact on program comprehension and on IR-based software maintenance activities, e.g., concept location, software clustering, and recovery of traceability links. Consequently, a number of efficient approaches have been proposed to expand abbreviations in identifiers. Most of these approaches rely heavily on dictionaries and rarely exploit the specific, fine-grained context of identifiers. As a result, they are less accurate in expanding abbreviations (especially short ones) that may match multiple dictionary words. To this end, we propose an automatic approach that improves the accuracy of abbreviation expansion by exploiting this specific, fine-grained context. It focuses on a special but common category of abbreviations, namely abbreviations in parameter names, and can therefore exploit the type of the enclosing parameter as well as the corresponding formal (or actual) parameter name. A recent empirical study on parameters suggests that actual parameters are often lexically similar to their corresponding formal parameters. Consequently, an abbreviation in a formal parameter is likely to find its full terms in the corresponding actual parameter, and vice versa. Based on this assumption, we propose a series of heuristics that look for full terms in the corresponding actual (or formal) parameter names. To the best of our knowledge, we are the first to expand abbreviations by exploiting the lexical similarity between actual and formal parameters. We also search for full terms in the data type of the enclosing parameter. Only if all such heuristics fail does the approach turn to traditional abbreviation dictionaries. We evaluate the proposed approach on seven well-known open-source projects. Evaluation results suggest that, when only parameter abbreviations are involved, the proposed approach can improve precision from 26 to 95 percent and recall from 26 to 65 percent compared with the state-of-the-art general-purpose approach. Consequently, the proposed approach could be employed as a useful supplement to existing approaches for expanding parameter abbreviations.
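The lookup order described in the abstract (corresponding parameter name first, then the parameter's data type, then a dictionary) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`split_terms`, `is_subsequence`, `expand`), the subsequence-matching rule, and the example identifiers are all assumptions made for illustration, and the paper's actual heuristics may differ.

```python
import re


def split_terms(identifier):
    """Split a camelCase or snake_case identifier into lowercase terms."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", identifier)
    return [p.lower() for p in parts]


def is_subsequence(abbr, word):
    """True if abbr and word share a first letter and abbr's letters
    appear in word in the same order (assumed matching rule)."""
    if not abbr or not word or abbr[0] != word[0]:
        return False
    remaining = iter(word)
    return all(ch in remaining for ch in abbr)


def find_in_terms(abbr, terms):
    """Return the first term that is strictly longer than abbr and matches it."""
    for term in terms:
        if len(term) > len(abbr) and is_subsequence(abbr, term):
            return term
    return None


def expand(abbr, counterpart_name, type_name, dictionary):
    """Try the counterpart parameter name, then the type name,
    and fall back to the dictionary only if both fail."""
    abbr = abbr.lower()
    for source in (counterpart_name, type_name):
        match = find_in_terms(abbr, split_terms(source))
        if match is not None:
            return match
    return dictionary.get(abbr)
```

For example, under these assumptions `expand("ctx", "requestContext", "Handler", {})` resolves the abbreviation from the counterpart parameter name, while `expand("idx", "i", "int", {"idx": "index"})` reaches the dictionary because neither the counterpart name nor the type contains a longer matching term.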

Automatic and Accurate Expansion of Abbreviations in Parameters
