Abstract: Few-shot image classification involves classifying images using very few training examples. Recent vision foundation models show excellent few-shot transfer abilities, but are large and slow at inference. Using knowledge distillation, the capabilities of high-performing but slow models can be transferred to tiny, efficient models. However, common distillation methods require a large set of unlabeled data, which is not available in the few-shot setting. To overcome this lack of data, there has been recent interest in using synthetic data. We expand on this work by presenting a novel diffusion model inversion technique (TINT) that combines the diversity of textual inversion with the specificity of null-text inversion. Using this method in a few-shot distillation pipeline leads to state-of-the-art accuracy among small student models on popular benchmarks, while being significantly faster than prior work. This allows us to push even tiny models to high accuracy using only a tiny application-specific dataset, albeit relying on extra data for pre-training. Popular few-shot benchmarks involve evaluation over a large number of episodes, which is computationally cumbersome for methods involving synthetic data generation. Therefore, we also present a theoretical analysis of how the variance of the accuracy estimator depends on the number of episodes and query examples, and use these results to lower the computational effort required for method evaluation. In addition, to further motivate the use of generative models in few-shot distillation, we demonstrate that our method outperforms training on real data mined from the dataset used to train the diffusion model. Source code will be made available at https://github.com/pixwse/tiny2.
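The abstract's point about the accuracy estimator's variance depending on both the number of episodes and the number of query examples can be illustrated with a minimal sketch. This is not the paper's derivation. It is the standard law-of-total-variance decomposition, under the assumptions that episodes are i.i.d. and that, within an episode, each query is classified correctly independently with that episode's accuracy. The function name and parameters are hypothetical.

```python
def accuracy_estimator_variance(mean_acc, var_acc, n_episodes, n_queries):
    """Variance of the mean accuracy over episodes (illustrative sketch).

    Assumptions (not taken from the paper):
      - each episode has a true accuracy p drawn i.i.d. with mean `mean_acc`
        and variance `var_acc`;
      - given p, each of the `n_queries` query examples is correct
        independently with probability p.
    """
    if not 0.0 <= mean_acc <= 1.0:
        raise ValueError("mean_acc must lie in [0, 1]")
    if n_episodes < 1 or n_queries < 1:
        raise ValueError("n_episodes and n_queries must be positive")
    # E[p(1 - p)] = mean(1 - mean) - Var(p): average within-episode Bernoulli variance
    within = (mean_acc * (1.0 - mean_acc) - var_acc) / n_queries
    # Between-episode variance plus within-episode sampling noise, averaged over episodes
    return (var_acc + within) / n_episodes


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show the two effects
    print(accuracy_estimator_variance(0.8, 0.01, n_episodes=600, n_queries=15))
    print(accuracy_estimator_variance(0.8, 0.01, n_episodes=600, n_queries=150))
```

Under these assumptions, adding query examples shrinks only the within-episode term, while adding episodes shrinks both terms. The variance therefore cannot fall below `var_acc / n_episodes`, however many queries are used per episode.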

Diversified in-domain synthesis with efficient fine-tuning for few-shot classification

Disentangled Feature Representation for Few-shot Image Classification

Learning task-specific discriminative embeddings for few-shot image classification

DataDream: Few-shot Guided Dataset Generation

Few and Fewer: Learning Better from Few Examples Using Fewer Base Classes

Diverse and Tailored Image Generation for Zero-shot Multi-label Classification

Cross-Domain Few-Shot Classification via Dense-Sparse-Dense Regularization

Cross-Domain Few-Shot classification via class-shared and class-specific dictionaries

Few-Shot Object Detection in Unseen Domains

Few-shot Learning for Domain-specific Fine-grained Image Classification

DBDC-SSL: Deep Brownian Distance Covariance With Self-Supervised Learning for Few-Shot Image Classification

Boosting Few-Shot Segmentation via Instance-Aware Data Augmentation and Local Consensus Guided Cross Attention

The More, the Better? Active Silencing of Non-Positive Transfer for Efficient Multi-Domain Few-Shot Classification

Explore the Power of Synthetic Data on Few-shot Object Detection

Tiny models from tiny data: Textual and null-text inversion for few-shot distillation

Cross-Level Distillation and Feature Denoising for Cross-Domain Few-Shot Classification

Boosting Few-Shot Object Detection with Discriminative Representation and Class Margin

Diversify, Don't Fine-Tune: Scaling Up Visual Recognition Training with Synthetic Images

Cross-Domain Few-Shot Classification Via Adversarial Task Augmentation

Adaptive Semantic Consistency for Cross-domain Few-shot Classification

Discriminative Sample-Guided and Parameter-Efficient Feature Space Adaptation for Cross-Domain Few-Shot Learning