Reinforcement Learning-Guided Semi-Supervised Learning

Marzi Heidari, Hanping Zhang, Yuhong Guo

2024-05-03

Abstract: In recent years, semi-supervised learning (SSL) has gained significant attention due to its ability to leverage both labeled and unlabeled data to improve model performance, especially when labeled data is scarce. However, most current SSL methods rely on heuristics or predefined rules for generating pseudo-labels and leveraging unlabeled data. They are limited to exploiting loss functions and regularization methods within the standard norm. In this paper, we propose a novel Reinforcement Learning (RL) Guided SSL method, RLGSSL, that formulates SSL as a one-armed bandit problem and deploys an innovative RL loss based on weighted reward to adaptively guide the learning process of the prediction model. RLGSSL incorporates a carefully designed reward function that balances the use of labeled and unlabeled data to enhance generalization performance. A semi-supervised teacher-student framework is further deployed to increase the learning stability. We demonstrate the effectiveness of RLGSSL through extensive experiments on several benchmark datasets and show that our approach achieves consistently superior performance compared to state-of-the-art SSL methods.

Machine Learning, Artificial Intelligence

What problem does this paper attempt to address?

The paper proposes RLGSSL, a Reinforcement Learning (RL)-guided Semi-Supervised Learning (SSL) method, to address the problem of learning effectively from a small amount of labeled data and a large amount of unlabeled data. Current SSL methods mainly rely on heuristics or predefined rules to generate pseudo-labels and to exploit unlabeled data. RLGSSL instead formulates SSL as a one-armed bandit problem and uses an RL loss based on weighted reward to adaptively guide the training of the prediction model. Its reward function is designed to balance the use of labeled and unlabeled data in order to improve generalization, and a semi-supervised teacher-student framework is added to make learning more stable. In experiments on several benchmark datasets, the authors report that RLGSSL consistently outperforms state-of-the-art SSL methods.
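To make the idea of a reward-weighted loss and a teacher-student update more concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not the paper's actual formulation: the summary above does not give the reward function or the loss, so `reward_weighted_loss` (cross-entropy scaled by a per-sample reward) and `ema_update` (an exponential-moving-average teacher) are hypothetical stand-ins, and all names and the toy numbers are invented for this example.

```python
import math

def cross_entropy(target, probs, eps=1e-12):
    """Cross-entropy between a soft target and predicted class probabilities."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, probs))

def reward_weighted_loss(pseudo_labels, predictions, rewards):
    """Mean per-sample cross-entropy, each term scaled by its reward.

    Assumption: a hypothetical stand-in for a reward-weighted loss, in which
    pseudo-labels that earned a higher reward contribute more to the update.
    """
    terms = [
        r * cross_entropy(y, p)
        for y, p, r in zip(pseudo_labels, predictions, rewards)
    ]
    return sum(terms) / len(terms)

def ema_update(teacher, student, decay=0.99):
    """Assumption: the teacher tracks the student by exponential moving average."""
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]

# Toy data: two unlabeled samples, two classes.
pseudo = [[1.0, 0.0], [0.0, 1.0]]   # pseudo-labels
preds = [[0.9, 0.1], [0.4, 0.6]]    # student predictions

loss_equal = reward_weighted_loss(pseudo, preds, [1.0, 1.0])
loss_down = reward_weighted_loss(pseudo, preds, [1.0, 0.1])  # second sample down-weighted
new_teacher = ema_update([1.0, 2.0], [0.0, 0.0], decay=0.9)
```

In this toy setup, lowering the reward of the second sample shrinks its share of the loss, which is the general effect a reward-weighted objective is meant to have. How RLGSSL actually computes rewards and updates the teacher is specified in the paper itself.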

Meta-Semi: A Meta-learning Approach for Semi-supervised Learning

SemiReward: A General Reward Model for Semi-supervised Learning

LaSSL: Label-Guided Self-Training for Semi-supervised Learning

Robust Deep Semi-Supervised Learning: A Brief Introduction

Robust Semi-Supervised Learning when Not All Classes have Labels

Learning Safe Prediction for Semi-Supervised Regression

When Semi-Supervised Learning Meets Transfer Learning: Training Strategies, Models and Datasets

Generalized Semi-Supervised Learning via Self-Supervised Feature Adaptation

FlexSSL: A Generic and Efficient Framework for Semi-Supervised Learning

Towards Automated Semi-Supervised Learning

Semi-Supervised Empirical Risk Minimization: Using unlabeled data to improve prediction

Rethinking Semi-supervised Learning with Language Models

Towards Semi-supervised Learning with Non-random Missing Labels

From Obstacles to Resources: Semi-supervised Learning Faces Synthetic Data Contamination

Semi-Supervised Reward Modeling via Iterative Self-Training

Safe Deep Semi-Supervised Learning for Unseen-Class Unlabeled Data

Improving Barely Supervised Learning by Discriminating Unlabeled Samples with Super-Class

Robust Pseudo-Label Selection for Holistic Semi-Supervised Learning

Learning Label Refinement and Threshold Adjustment for Imbalanced Semi-Supervised Learning

Robust Self-Tuning Semi-Supervised Learning