Abstract: With the rapid spread of information, abbreviations are used more and more commonly because they are convenient. However, duplicated abbreviations can cause confusion in tasks such as information management and information retrieval, and this confusion frustrates users. Inferring the full name from an abbreviation therefore has significant practical value. Most previous studies inferred full names with rule-based methods, statistical models, representation similarity, and related techniques. However, these methods cannot make proper use of contexts of varying granularity. In this paper, we propose a flexible framework, the Multi-attention mask Abbreviation Context and Full name language model (MACF), to address this problem. Given an abbreviation and its context as inputs, MACF automatically predicts the full name by generation. The context can be of any granularity, from coarse to fine: a paragraph, several sentences, or even just a few keywords. We also propose a novel multi-attention mask mechanism that allows the model to learn the relationships among abbreviations, contexts, and full names, and so make the most of contexts of varying granularity. To evaluate the proposed framework, we analyzed three corpora covering different languages and fields and measured performance with seven metrics that cover different aspects. In the experiments, MACF yielded better and more consistent results than the baseline methods. We also discuss the significance of the findings and present case studies that show the performance of the framework in real applications.
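The abstract does not specify how the multi-attention mask is built, so the following is only a minimal sketch of one common way to mask attention when a single language model reads a source segment (context plus abbreviation) and generates a target segment (full name). It assumes a prefix-style sequence-to-sequence mask: source tokens attend to each other in both directions, while full-name tokens attend to the whole source and only to earlier full-name tokens. The function name `build_seq2seq_mask`, its parameters, and the token layout are hypothetical and are not taken from the MACF paper.

```python
from typing import List


def build_seq2seq_mask(n_context: int, n_abbr: int, n_full: int) -> List[List[bool]]:
    """Return mask[i][j] == True when position i may attend to position j.

    Assumed (hypothetical) token layout: [context | abbreviation | full name].
    This is a generic prefix-LM mask, not the mask defined by MACF.
    """
    n_source = n_context + n_abbr          # context and abbreviation form the source
    total = n_source + n_full
    mask = [[False] * total for _ in range(total)]
    for i in range(total):
        for j in range(total):
            if j < n_source:
                # Every position may read every source token.
                mask[i][j] = True
            elif i >= n_source and j <= i:
                # Full-name tokens see themselves and earlier full-name tokens only.
                mask[i][j] = True
    return mask


# Example: 3 context tokens, 2 abbreviation tokens, 2 full-name tokens.
mask = build_seq2seq_mask(3, 2, 2)
for row in mask:
    print("".join("1" if allowed else "0" for allowed in row))
```

Under this assumption, shortening the context from a paragraph to a few keywords only changes `n_context`; the mask construction stays the same, which is one way a single model could accept contexts of varying granularity.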

A Context-Enhanced Transformer with Abbr-Recover Policy for Chinese Abbreviation Prediction

A Context-Enhanced Generate-then-Evaluate Framework for Chinese Abbreviation Prediction

A Sequence-to-Sequence Model for Large-scale Chinese Abbreviation Database Construction

Enhancing Chinese abbreviation prediction with LLM generation and contrastive evaluation

Abbreviation Prediction Using Conditional Random Field and Web Data

An Improved Chinese Named Entity Recognition Method with TB-LSTM-CRF

A transformer-based neural network framework for full names prediction with abbreviations and contexts

Coarse-grained Candidate Generation and Fine-grained Re-ranking for Chinese Abbreviation Prediction

Predicting Chinese Abbreviations from Definitions: an Empirical Learning Approach Using Support Vector Regression

Predicting Chinese Abbreviations with Minimum Semantic Unit and Global Constraints

Text-conditioned Transformer for Automatic Pronunciation Error Detection

Non-Autoregressive Transformer ASR with CTC-Enhanced Decoder Input

A Local Information Perception Enhancement–Based Method for Chinese NER

Dependency syntax guided BERT-BiLSTM-GAM-CRF for Chinese NER

Generalized Abbreviation Prediction with Negative Full Forms and Its Application on Improving Chinese Web Search

A hybrid Transformer approach for Chinese NER with features augmentation

MECT: Multi-Metadata Embedding based Cross-Transformer for Chinese Named Entity Recognition

CFNAM-PG: Bridging Phonetic and Glyphic Information for Chinese Full Name and Abbreviation Matching Based on Simbert and DenseNet

Chinese Abbreviation Identification Using Abbreviation-Template Features and Context Information

Generating Abbreviations for Chinese Named Entities Using Recurrent Neural Network with Dynamic Dictionary

Automatic Spelling Correction with Transformer for CTC-based End-to-End Speech Recognition