Abstract: When labeled data is insufficient, semi-supervised learning with the pseudo-labeling technique can significantly improve the performance of automatic speech recognition. However, pseudo-labels are often noisy, containing numerous incorrect tokens. Taking noisy labels as ground truth in the loss function results in suboptimal performance. Previous works attempted to mitigate this issue by either filtering out the noisiest pseudo-labels or improving the overall quality of pseudo-labels. While these methods are effective to some extent, it is unrealistic to entirely eliminate incorrect tokens in pseudo-labels. In this work, we propose a novel framework named alternative pseudo-labeling to tackle the issue of noisy pseudo-labels from the perspective of the training objective. The framework comprises three components. First, a generalized CTC loss function is introduced to handle noisy pseudo-labels by accepting alternative tokens in the positions of incorrect tokens. Applying this loss function in pseudo-labeling requires detecting incorrect tokens in the predicted pseudo-labels. We adopt a confidence-based error detection method that identifies incorrect tokens by comparing their confidence scores with a given threshold, which requires the confidence score to be discriminative. Hence, the second proposed technique is a contrastive CTC loss function that widens the confidence gap between correctly and incorrectly predicted tokens, thereby improving the error detection ability. Finally, obtaining satisfactory performance with confidence-based error detection typically requires extensive threshold tuning. Instead, we propose an automatic thresholding method that uses labeled data as a proxy for determining the threshold, thus avoiding manual tuning.
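The confidence-based error detection and the labeled-data thresholding described in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the function names, the candidate grid, and the rule of picking the threshold that maximizes error-detection F1 on labeled data are all assumptions made for illustration, since the abstract does not state the selection criterion. The confidence values are toy numbers, not results.

```python
from typing import List, Sequence


def flag_low_confidence(confidences: Sequence[float], threshold: float) -> List[bool]:
    """Mark a token as suspected-incorrect when its confidence is below the threshold."""
    return [c < threshold for c in confidences]


def detection_f1(confidences: Sequence[float],
                 is_error: Sequence[bool],
                 threshold: float) -> float:
    """F1 of error detection against known token errors (available on labeled data)."""
    flagged = flag_low_confidence(confidences, threshold)
    tp = sum(f and e for f, e in zip(flagged, is_error))
    fp = sum(f and not e for f, e in zip(flagged, is_error))
    fn = sum((not f) and e for f, e in zip(flagged, is_error))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def pick_threshold(confidences: Sequence[float],
                   is_error: Sequence[bool],
                   candidates: Sequence[float]) -> float:
    """Hypothetical selection rule (an assumption, not from the abstract):
    choose the candidate threshold with the best detection F1 on labeled data."""
    return max(candidates, key=lambda t: detection_f1(confidences, is_error, t))


# Toy labeled data: token confidences from the model, and whether each
# predicted token disagrees with the reference transcript.
labeled_conf = [0.95, 0.40, 0.88, 0.30, 0.91, 0.55]
labeled_err = [False, True, False, True, False, True]

threshold = pick_threshold(labeled_conf, labeled_err, candidates=[0.2, 0.5, 0.7, 0.9])

# Apply the chosen threshold to the token confidences of an unlabeled utterance.
# Flagged positions are where a loss accepting alternative tokens would apply.
suspected = flag_low_confidence([0.97, 0.42, 0.81], threshold)
print(threshold, suspected)
```

The sketch only covers detection and threshold choice. How flagged positions enter the generalized CTC loss, and how the contrastive CTC loss shapes the confidence scores, are outside what the abstract specifies.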

Advanced pseudo-labeling approach in mixing-based text data augmentation method

Explainability-Based Mix-Up Approach for Text Data Augmentation

Mixing Approach for Text Data Augmentation Based on an Ensemble of Explainable Artificial Intelligence Methods

DoubleMix: Simple Interpolation-Based Data Augmentation for Text Classification

ASMix: an Attention-based Smooth Data Augmentation Approach

Synergistic Training: Harnessing Active Learning and Pseudo-Labeling for Enhanced Model Performance in Deep Learning

ADVMIX: Data Augmentation for Accurate Scene Text Spotting

Adversarial Word Dilution as Text Data Augmentation in Low-Resource Regime

Enhancing Effectiveness and Robustness in a Low-Resource Regime via Decision-Boundary-aware Data Augmentation

MixGen: A New Multi-Modal Data Augmentation

Toward Robustness in Multi-label Classification: A Data Augmentation Strategy against Imbalance and Noise

MSMix: An Interpolation-Based Text Data Augmentation Method Manifold Swap Mixup

Alternative Pseudo-Labeling for Semi-Supervised Automatic Speech Recognition

WeMix: How to Better Utilize Data Augmentation

SnapMix: Semantically Proportional Mixing for Augmenting Fine-grained Data

TransformMix: Learning Transformation and Mixing Strategies from Data

For Better or For Worse? Learning Minimum Variance Features With Label Augmentation

ATPL: Mutually enhanced adversarial training and pseudo labeling for unsupervised domain adaptation

Learning with Different Amounts of Annotation: From Zero to Many Labels

AugGPT: Leveraging ChatGPT for Text Data Augmentation

Self-Evolution Learning for Mixup: Enhance Data Augmentation on Few-Shot Text Classification Tasks