Knowledge Distillation with Multi-granularity Mixture of Priors for Image Super-Resolution

Simiao Li, Yun Zhang, Wei Li, Hanting Chen, Wenjia Wang, Bingyi Jing, Shaohui Lin, Jie Hu
2024-04-03
Abstract: Knowledge distillation (KD) is a promising yet challenging model compression technique that transfers rich learned representations from a well-performing but cumbersome teacher model to a compact student model. Previous methods for image super-resolution (SR) mostly compare the feature maps directly or after standardizing their dimensions with basic algebraic operations (e.g., average, dot-product). However, these methods overlook the intrinsic semantic differences among feature maps, which are caused by the disparate expressive capacity of the teacher and student networks. This work presents MiPKD, a multi-granularity mixture-of-priors KD framework, which facilitates efficient SR models through feature mixture in a unified latent space and stochastic network block mixture. Extensive experiments demonstrate the effectiveness of the proposed MiPKD method.
Computer Vision and Pattern Recognition