Abstract: Minimizing computational complexity is essential for the adoption of deep networks in practical applications. Most current research attempts to accelerate deep networks by designing new network structures or compressing network parameters, while transfer learning techniques such as knowledge distillation are used to preserve model performance. In this paper, we focus on accelerating deep models and relieving the computational burden by using low-resolution (LR) images as inputs while maintaining competitive performance, a direction rarely explored in the current literature. Deep networks may suffer serious performance degradation with LR inputs because many details are unavailable in LR images. Moreover, existing approaches may fail to learn discriminative features for LR images because of the dramatic appearance variations between LR and high-resolution (HR) images. To tackle these problems, we propose a resolution-aware knowledge distillation (RKD) framework that narrows the cross-resolution variations by transferring knowledge from the HR domain to the LR domain. The proposed framework consists of an HR teacher network and an LR student network. First, we introduce a discriminator and propose an adversarial learning strategy to shrink the variations between inputs of different resolutions. Then we design a cross-resolution knowledge distillation (CRKD) loss to train a discriminative student network by exploiting the knowledge of the teacher network. The CRKD loss consists of a resolution-aware distillation loss, a pair-wise constraint, and a maximum mean discrepancy loss. Experimental results on person re-identification, image classification, face recognition, and defect segmentation tasks demonstrate that RKD outperforms traditional knowledge distillation methods, achieving better performance with lower computational complexity. Furthermore, CRKD surpasses state-of-the-art knowledge distillation methods in transferring knowledge across different resolutions under the RKD framework, especially when coping with large resolution differences.
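
The abstract names three components of the CRKD loss but does not give their formulas. The following is a minimal sketch of what a loss with these three terms could look like, not the authors' formulation. It assumes a standard temperature-softened KL divergence for the distillation term, a mean squared distance between paired HR and LR features for the pair-wise constraint, and a biased RBF-kernel estimate of squared maximum mean discrepancy. All function names, the weights `alpha` and `beta`, the temperature, and the kernel width `gamma` are hypothetical, and the adversarial discriminator is omitted.

```python
import math

def log_softmax(logits, temperature=1.0):
    # Numerically stable log-softmax of temperature-scaled logits.
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    lse = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - lse for z in scaled]

def distillation_loss(teacher_logits, student_logits, temperature=4.0):
    # Assumed form: KL(teacher || student) on softened outputs, scaled by T^2.
    log_p = log_softmax(teacher_logits, temperature)
    log_q = log_softmax(student_logits, temperature)
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return kl * temperature ** 2

def pairwise_loss(hr_feats, lr_feats):
    # Assumed form: mean squared distance between features of the same image
    # seen by the HR teacher and the LR student.
    total = sum(
        sum((a - b) ** 2 for a, b in zip(h, l))
        for h, l in zip(hr_feats, lr_feats)
    )
    return total / len(hr_feats)

def rbf_kernel(x, y, gamma=1.0):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def mmd2(xs, ys, gamma=1.0):
    # Biased estimate of squared MMD between two feature batches.
    kxx = sum(rbf_kernel(a, b, gamma) for a in xs for b in xs) / len(xs) ** 2
    kyy = sum(rbf_kernel(a, b, gamma) for a in ys for b in ys) / len(ys) ** 2
    kxy = sum(rbf_kernel(a, b, gamma) for a in xs for b in ys) / (len(xs) * len(ys))
    return kxx + kyy - 2.0 * kxy

def combined_loss(teacher_logits, student_logits, hr_feats, lr_feats,
                  alpha=1.0, beta=1.0, temperature=4.0):
    # Hypothetical weighted sum of the three terms over a batch.
    kd = sum(
        distillation_loss(t, s, temperature)
        for t, s in zip(teacher_logits, student_logits)
    ) / len(teacher_logits)
    return kd + alpha * pairwise_loss(hr_feats, lr_feats) + beta * mmd2(hr_feats, lr_feats)
```

In this sketch the distillation term acts per sample on the logits, while the pair-wise and MMD terms act on intermediate features: the first aligns each HR/LR pair individually, the second aligns the two feature distributions as a whole.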

Self-Knowledge Distillation with Learning from Role-Model Samples

Self-Knowledge Distillation via Progressive Associative Learning

Self-Referenced Deep Learning

Self-Distillation from the Last Mini-Batch for Consistency Regularization

Knowledge Distillation Meets Self-Supervision

Self-knowledge Distillation Based on Knowledge Transfer from Soft to Hard Examples

Multi-target Knowledge Distillation Via Student Self-reflection

Lightweight Self-Knowledge Distillation with Multi-source Information Fusion

Neighbor Self-Knowledge Distillation

Restructuring the Teacher and Student in Self-Distillation

Semi-Online Knowledge Distillation

Small Scale Data-Free Knowledge Distillation

Boosting Knowledge Distillation Via Intra-class Logit Distribution Smoothing

Self-Knowledge Distillation in Natural Language Processing

Resolution-Aware Knowledge Distillation for Efficient Inference

Self Regulated Learning Mechanism for Data Efficient Knowledge Distillation

Category contrastive distillation with self-supervised classification

Self-knowledge distillation via dropout

Knowledge Distillation Meets Open-Set Semi-supervised Learning

Revisiting Knowledge Distillation Via Label Smoothing Regularization

Extending Label Smoothing Regularization with Self-Knowledge Distillation