Task-oriented Feature Hallucination for Few-Shot Image Classification.

Sining Wu, Xiang Gao, Xiaopeng Hu
DOI: https://doi.org/10.1049/ipr2.12886
IF: 2.3
2023-01-01
IET Image Processing
Abstract: Data hallucination generates additional training examples for novel classes to alleviate the data scarcity problem in few-shot learning (FSL). Existing hallucination-based FSL methods normally first train a general embedding model on base classes that have abundant data, and then build hallucinators on top of the trained embedding model to generate data for novel classes. Because these methods rely on general-purpose embeddings, their ability to generate task-oriented samples for novel classes is limited. Recent studies have shown that task-specific embedding models, which are adapted to novel tasks, can achieve better classification performance. To improve example hallucination for individual tasks, the proposed method uses a task-oriented embedding model to perform task-oriented generation. After initialization, the hallucinator is fine-tuned with the task-oriented embedding model under the guidance of a teacher-student mechanism. The proposed task-oriented hallucination method consists of two steps. In the first step, an initial embedding network and an initial hallucinator are trained on a base dataset. The second step comprises a pseudo-labelling process, in which the base dataset is pseudo-labelled using the support data of the few-shot task, and a task-oriented fine-tuning process, in which the embedding network and the hallucinator are adjusted simultaneously. Both the embedding network and the hallucinator are updated with the support set and the pseudo-labelled base dataset using knowledge distillation. Experiments are conducted on four popular few-shot datasets. The results demonstrate that the proposed approach outperforms state-of-the-art methods, with increases of 0.8% to 4.08% in classification accuracy for 5-way 5-shot tasks, and achieves accuracy comparable to state-of-the-art methods for 5-way 1-shot tasks.
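The pseudo-labelling and distillation steps described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not specify how pseudo-labels are computed or which distillation loss is used, so everything below is an assumption made for illustration: pseudo-labels come from a softmax over cosine similarity to support-set class prototypes, and the distillation loss is the cross-entropy between teacher soft targets and student predictions. The function names and the `temperature` parameter are hypothetical and do not come from the paper.

```python
import numpy as np

def class_prototypes(support_feats, support_labels, n_way):
    """Mean embedding of each novel class, computed from the support set."""
    return np.stack(
        [support_feats[support_labels == c].mean(axis=0) for c in range(n_way)]
    )

def pseudo_label(base_feats, prototypes, temperature=0.1):
    """Soft pseudo-labels for base samples (assumed scheme, not the paper's):
    softmax over cosine similarity between each base feature and each prototype."""
    b = base_feats / np.linalg.norm(base_feats, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    logits = b @ p.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

def distillation_loss(student_logits, teacher_probs):
    """Cross-entropy between teacher soft targets and student predictions
    (a common distillation objective, assumed here)."""
    z = student_logits - student_logits.max(axis=1, keepdims=True)
    log_q = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return float(-(teacher_probs * log_q).sum(axis=1).mean())

# Toy 5-way 5-shot task with random 64-dimensional features.
rng = np.random.default_rng(0)
n_way, k_shot, dim = 5, 5, 64
support_feats = rng.normal(size=(n_way * k_shot, dim))
support_labels = np.repeat(np.arange(n_way), k_shot)
base_feats = rng.normal(size=(100, dim))

prototypes = class_prototypes(support_feats, support_labels, n_way)
soft_labels = pseudo_label(base_feats, prototypes)          # shape (100, 5)
student_logits = rng.normal(size=(100, n_way))               # stand-in for a student network
loss = distillation_loss(student_logits, soft_labels)
```

In a full pipeline, the soft labels would come from a teacher model and the loss would drive gradient updates of both the embedding network and the hallucinator; the sketch covers only the label assignment and the loss computation.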