Abstract: Chinese spelling correction (CSC) is a long-standing task in natural language processing that detects and corrects spelling errors in text, and it underpins many downstream language tasks. Many CSC methods use multimodal information, namely the character itself, its pronunciation, and its shape, to link incorrect characters to their correct counterparts. Research indicates that a majority of spelling errors stem from pinyin similarity, with character shape similarity accounting for half of the errors. Effectively modeling the relationship between a character and its pinyin is therefore a key challenge in CSC. In this study, we enhance CSC with an auxiliary pinyin character prediction task. We apply an adaptive weighting method to this auxiliary task so that predictions are handled at a finer granularity and the two prediction tasks stay balanced. The proposed model, SPMSpell, uses ChineseBERT as an encoder to capture multimodal features simultaneously, and it incorporates three parallel decoders: one for character prediction, one for pinyin prediction, and one for self-distillation. To mitigate potential overfitting to pinyin, the self-distillation module prioritizes character information in predictions. Extensive experiments on three SIGHAN benchmarks show that the proposed model achieves superior performance. These results support the effectiveness of both the adaptively weighted pinyin character prediction task and the self-distillation module.

Correcting Chinese Spelling Errors with Phonetic Pre-training

PSDSpell: Pre-Training with Self-Distillation Learning for Chinese Spelling Correction

Disentangled Phonetic Representation for Chinese Spelling Correction

Improving Chinese Spelling Correction by Ranking

Visual and Phonological Feature Enhanced Siamese BERT for Chinese Spelling Error Correction

Exploration and Exploitation: Two Ways to Improve Chinese Spelling Correction Models

Investigating Glyph Phonetic Information for Chinese Spell Checking: What Works and What's Next

A Simple yet Effective Training-free Prompt-free Approach to Chinese Spelling Correction Based on Large Language Models

Improve Chinese Spelling Check by Reevaluation

Self-Distillation and Pinyin Character Prediction for Chinese Spelling Correction Based on Multimodality

Automatic Chinese Spelling Checking and Correction Based on Character-Based Pre-trained Contextual Representations

Spelling Error Correction with Soft-Masked BERT

Is Chinese Spelling Check ready? Understanding the correction behavior in real-world scenarios

An Error-Guided Correction Model for Chinese Spelling Error Correction

Chinese Spelling Correction Based on Knowledge Enhancement and Contrastive Learning

Improving Chinese Spelling Check by Character Pronunciation Prediction: The Effects of Adaptivity and Granularity

MCSSpell: Optimal Path Selection of Candidate Characters by Integrating Multimodal Information and Copy Mechanism for Chinese Spelling Correction

SpellGCN: Incorporating Phonological and Visual Similarities into Language Models for Chinese Spelling Check

MISpeller: Multimodal Information Enhancement for Chinese Spelling Correction

Chinese Spelling Error Detection and Correction Based on Knowledge Graph

The Past Mistake is the Future Wisdom: Error-driven Contrastive Probability Optimization for Chinese Spell Checking