Abstract: Co-training, an advanced form of self-training, allows multiple base models to learn collaboratively, leading to superior performance in semi-supervised learning tasks. However, its widespread adoption is hindered by high computational costs and intricate design choices. To address these challenges, we present Multi-Head Co-Training, a streamlined and efficient framework that consolidates individual models into a multi-head structure, adding minimal extra parameters. Each classification head in this unified model collaborates with others via a “Weak and Strong Augmentation” strategy, with diversity organically introduced through robust data augmentation. Consequently, our approach implicitly promotes diversity while incurring only a minor increase in computational overhead, making co-training more accessible. We validate the effectiveness of Multi-Head Co-Training through an empirical study on standard semi-supervised learning benchmarks. For example, our method achieves up to a 3.1% accuracy improvement on the semi-supervised CIFAR dataset compared to recent methods. Recognizing the necessity for more practical performance metrics beyond accuracy, we assess our framework from three additional perspectives: robust generalization, uncertainty, and computational efficiency. To evaluate robust generalization, we expand the conventional SSL experimental setting to a more comprehensive open-set semi-supervised learning scenario. For uncertainty assessment, we conduct experiments on model calibration and selective classification benchmarks. For example, our method achieves up to a 4.3% accuracy improvement on the open-set semi-supervised CIFAR dataset. Our extensive experiments confirm that our proposed framework better captures prediction confidence and uncertainty, rendering it more suitable for SSL deployment in open environments. The code is available at https://github.com/chenmc1996/Multi-Head-Co-Training.
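The abstract does not spell out how the heads exchange supervision, so the following is only a minimal sketch under stated assumptions: each head is assumed to receive a pseudo-label for an unlabeled example only when all the other heads agree on their predictions for the weakly augmented view, and that label is assumed to serve as the training target for the head's prediction on the strongly augmented view. The function names and the unanimous-agreement rule are hypothetical illustrations, not taken from the paper or its repository.

```python
def pseudo_labels_for_head(weak_preds, head):
    """Return pseudo-labels for one head from the agreement of the other heads.

    weak_preds[h][i] is the class predicted by head h for the weakly
    augmented view of unlabeled example i. An example gets a label only if
    every other head predicts the same class (assumed rule); otherwise None.
    """
    others = [preds for h, preds in enumerate(weak_preds) if h != head]
    labels = []
    for votes in zip(*others):
        # Keep the label only on unanimous agreement among the other heads.
        labels.append(votes[0] if len(set(votes)) == 1 else None)
    return labels


def co_training_targets(weak_preds):
    """Build one pseudo-label list per head (hypothetical helper)."""
    return [pseudo_labels_for_head(weak_preds, h) for h in range(len(weak_preds))]


# Toy predictions from three heads on four weakly augmented examples.
weak_preds = [
    [0, 1, 2, 1],  # head 0
    [0, 1, 1, 1],  # head 1
    [0, 2, 2, 1],  # head 2
]
targets = co_training_targets(weak_preds)
for head, labels in enumerate(targets):
    print(f"head {head}: {labels}")
```

In a full training loop, entries that are `None` would simply be masked out of the unsupervised loss for that head, so each head learns only from examples on which its peers agree.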

Multi-co-training for document classification using various document representations: TF–IDF, LDA, and Doc2Vec

A multiclass classification framework for document categorization

A Fused Multi-feature Based Co-training Approach for Document Clustering

Multi-head Co-Training: an Uncertainty-Aware and Robust Semi-Supervised Learning Framework

Exploring multi-tasking learning in document attribute classification

Deep Learning for Technical Document Classification

Convolutional Long Short-term Memory for Long Length Document Classification

Self-paced Multi-view Co-training

Semi-Supervised Co-Training Model Using Convolution and Transformer for Hyperspectral Image Classification

VLCDoC: Vision-Language Contrastive Pre-Training Model for Cross-Modal Document Classification

Towards a Multi-modal, Multi-task Learning based Pre-training Framework for Document Representation Learning

Hierarchical Multi-modal Transformer for Cross-modal Long Document Classification

Semi-Supervised Learning with Multi-Head Co-Training

Improving Document Classification with Multi-Sense Embeddings

Stacked Co-Training for Semi-Supervised Multi-Label Learning

Diverse Cotraining Makes Strong Semi-Supervised Segmentor

Temporal-Frequency Co-training for Time Series Semi-supervised Learning

Sparse Multiple Instance Learning As Document Classification

A document image classification system fusing deep and machine learning models

Co-Training Transformer for Remote Sensing Image Classification, Segmentation, and Detection

Inter-training: Exploiting Unlabeled Data in Multi-Classifier Systems