Abstract: Existing knowledge distillation methods mostly focus on distilling the teacher’s predictions and intermediate activations. However, the structured representation, which is arguably one of the most critical ingredients of deep models, is largely overlooked. In this work, we propose a novel semantic representational distillation (SRD) method dedicated to distilling representational knowledge semantically from a pretrained teacher to a target student. The key idea is to leverage the teacher’s classifier as a semantic critic that evaluates the representations of both teacher and student and distills the semantic knowledge with high-order structured information over all feature dimensions. This is accomplished by introducing the notion of a cross-network logit, computed by passing the student’s representation into the teacher’s classifier. Further, by considering the set of seen classes as a basis for the semantic space from a combinatorial perspective, we scale SRD to unseen classes, enabling effective exploitation of widely available, arbitrary unlabeled training data. At the problem level, this establishes an interesting connection between knowledge distillation and open-set semi-supervised learning (SSL). Extensive experiments show that SRD significantly outperforms previous state-of-the-art knowledge distillation methods on both coarse object classification and fine face recognition tasks, as well as on the less studied yet practically crucial binary network distillation. Under the more realistic open-set SSL settings we introduce, we reveal that knowledge distillation is generally more effective than existing out-of-distribution sample detection, and that our proposed SRD is superior to both previous distillation and SSL competitors. The source code is available at https://github.com/jingyang2017/SRD_ossl .
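
The cross-network logit described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions not stated in the abstract: the teacher's classifier is a single linear layer, the student and teacher representations have the same dimension, and the discrepancy between the two logit vectors is measured with a mean squared error. The function names `linear_classifier` and `cross_network_logit_loss` are hypothetical and are not taken from the released code.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]


def linear_classifier(weights: Matrix, bias: Vector, feature: Vector) -> Vector:
    """Apply a linear classifier: one logit per class (one row of `weights`)."""
    return [
        sum(w * x for w, x in zip(row, feature)) + b
        for row, b in zip(weights, bias)
    ]


def cross_network_logit_loss(
    teacher_weights: Matrix,
    teacher_bias: Vector,
    teacher_feature: Vector,
    student_feature: Vector,
) -> float:
    """Mean squared error between teacher logits and cross-network logits.

    Assumption: MSE is used here only for illustration; the abstract does
    not specify the discrepancy measure.
    """
    # Teacher logits: the teacher's representation through its own classifier.
    teacher_logits = linear_classifier(teacher_weights, teacher_bias, teacher_feature)
    # Cross-network logits: the student's representation through the
    # teacher's classifier, which acts as the shared "semantic critic".
    cross_logits = linear_classifier(teacher_weights, teacher_bias, student_feature)
    squared_errors = [(c - t) ** 2 for c, t in zip(cross_logits, teacher_logits)]
    return sum(squared_errors) / len(squared_errors)


if __name__ == "__main__":
    # Toy teacher classifier with 3 classes over 2-dimensional features.
    W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    b = [0.0, 0.0, 0.0]
    f_teacher = [1.0, 2.0]
    f_student = [1.0, 1.0]
    print(cross_network_logit_loss(W, b, f_teacher, f_student))
```

Because every logit is a weighted combination of all feature dimensions, matching logits in this way penalizes the student for differences along the directions the teacher's classifier treats as semantically relevant, rather than treating each feature dimension independently. In the toy example, a student representation identical to the teacher's yields a loss of zero.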

Label Distribution-based Open-world Semi-supervised Learning

Robust Semi-Supervised Learning for Self-learning Open-World Classes

Rethinking Open-World Semi-Supervised Learning: Distribution Mismatch and Inductive Inference

Open-Set Semi-Supervised Learning for 3D Point Cloud Understanding

Promote knowledge mining towards open-world semi-supervised learning

Robust Semi-Supervised Learning when Not All Classes have Labels

An Empirical Study and Analysis on Open-Set Semi-Supervised Learning

Open-World Semi-Supervised Learning for Node Classification

Multi-Task Curriculum Framework for Open-Set Semi-Supervised Learning

Towards Realistic Long-tailed Semi-supervised Learning in an Open World

Semi-Supervised Dual Relation Learning for Multi-Label Classification

A Graph-Theoretic Framework for Understanding Open-World Semi-Supervised Learning

Open-Domain Semi-Supervised Learning via Glocal Cluster Structure Exploitation

Towards Unbiased Training in Federated Open-world Semi-supervised Learning

DeLaLA: Semisupervised Learning via Determinately Labeling and Kernelized Large Margin Projection

Knowledge Distillation Meets Open-Set Semi-supervised Learning

On Non-Random Missing Labels in Semi-Supervised Learning

Improving Barely Supervised Learning by Discriminating Unlabeled Samples with Super-Class

On Pseudo-Labeling for Class-Mismatch Semi-Supervised Learning

Semi-Supervised Learning via Weight-aware Distillation under Class Distribution Mismatch

LaSSL: Label-Guided Self-Training for Semi-supervised Learning