Abstract: Few-Shot Learning (FSL) is a challenging task, which aims to recognize novel classes with few examples. Pre-training based methods effectively tackle the problem by pre-training a feature extractor and then performing class prediction via a cosine classifier with mean-based prototypes. Nevertheless, due to data scarcity, the mean-based prototypes are usually biased. In this paper, we attempt to diminish the prototype bias by regarding it as a prototype optimization problem. To this end, we propose a novel prototype optimization framework to rectify prototypes, i.e., introducing a meta-optimizer to optimize prototypes. Although the existing meta-optimizers can also be adapted to our framework, they all overlook a crucial gradient bias issue, i.e., the mean-based gradient estimation is also biased on sparse data. To address this issue, we regard the gradient and its flow as meta-knowledge and then propose a novel Neural Ordinary Differential Equation (ODE)-based meta-optimizer to optimize prototypes, called MetaNODE. Although MetaNODE has shown superior performance, it suffers from a huge computational burden. To further improve its computation efficiency, we conduct a detailed analysis of MetaNODE and then design an effective and efficient extension of MetaNODE (called E2MetaNODE). It consists of two novel modules, E2GradNet and E2Solver, which aim to estimate accurate gradient flows and solve for optimal prototypes in an effective and efficient manner, respectively. Extensive experiments show that 1) our methods achieve superior performance over previous FSL methods and 2) our E2MetaNODE significantly improves computation efficiency without performance degradation.
What problem does this paper attempt to address?
This paper addresses the prototype bias problem in **Few-Shot Learning (FSL)**. The goal of FSL is to recognize new classes from only a small number of labeled samples. Existing pre-training methods pre-train a feature extractor and then predict classes with a cosine classifier over mean-based prototypes. However, because data is scarce, mean-based prototypes are usually biased: the mean computed from a few samples differs from the true class prototype.
### Problem Background
1. **Few-Shot Learning (FSL)**: FSL aims to recognize new classes from a small number of labeled samples. Existing methods mainly rely on a pre-trained feature extractor and mean-based prototypes for classification.
2. **Prototype Bias Problem**: Because data is scarce, mean-based prototypes are often biased, which lowers classification accuracy. As shown in Figure 1 of the paper, the mean-based prototype (orange square) usually lies far from the true prototype (triangle), so queries near the class boundary are misclassified.
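
The baseline described above can be illustrated with a minimal NumPy sketch. It is not the paper's code: the Gaussian feature model, the `true_prototype` vector, and the helper names `mean_prototypes`, `cosine_predict`, and `prototype_bias` are assumptions made for illustration. It shows how a mean-based prototype is computed, how a cosine classifier uses it, and why the mean drifts further from the true prototype when only a few shots are available.

```python
import numpy as np

def mean_prototypes(features, labels, num_classes):
    """Average the support features of each class into one prototype per class."""
    return np.stack([features[labels == k].mean(axis=0) for k in range(num_classes)])

def cosine_predict(queries, prototypes):
    """Label each query with the class whose prototype has the highest cosine similarity."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    return (q @ p.T).argmax(axis=1)

# Synthetic setup (an assumption for illustration): the features of one class
# are Gaussian around a known "true prototype".
rng = np.random.default_rng(0)
dim = 64
true_prototype = rng.normal(size=dim)

def prototype_bias(num_shots):
    """Distance between the mean-based prototype and the true prototype."""
    support = true_prototype + rng.normal(size=(num_shots, dim))
    labels = np.zeros(num_shots, dtype=int)
    estimate = mean_prototypes(support, labels, num_classes=1)[0]
    return np.linalg.norm(estimate - true_prototype)

bias_1_shot = prototype_bias(1)      # large: a single sample is a noisy estimate
bias_500_shot = prototype_bias(500)  # small: the mean converges with more data

# Tiny 2-way classification example with hand-written features.
support = np.array([[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.2, 0.8]])
support_labels = np.array([0, 0, 1, 1])
prototypes = mean_prototypes(support, support_labels, num_classes=2)
predictions = cosine_predict(np.array([[0.9, 0.1], [0.1, 0.9]]), prototypes)
```

In this toy setting the 1-shot estimate is much further from `true_prototype` than the 500-shot estimate, which is the gap the paper's prototype optimization is meant to close.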
### Solution
To solve this problem, the paper proposes a new meta-learning framework that treats prototype bias reduction as an optimization problem and introduces a new meta-optimizer (MetaNODE) to optimize the prototypes. The specific contributions are as follows:
1. **New Perspective**: Treat prototype bias reduction as a prototype optimization problem and propose a new meta-learning framework to improve FSL performance.
2. **Gradient Bias Problem**: Identify a key weakness of existing meta-optimizers, the gradient bias problem. Their mean-based gradient estimates are themselves biased on sparse data, which degrades the optimization.
3. **Neural ODE-based Meta-optimizer (MetaNODE)**: To alleviate the gradient bias problem, the paper proposes MetaNODE, a meta-optimizer based on Neural Ordinary Differential Equations (Neural ODEs) that optimizes the prototypes by modeling their continuous-time dynamics.
4. **Efficient Extended Version (E2MetaNODE)**: To reduce the computational cost of MetaNODE, the paper develops an efficient extension, E2MetaNODE. It consists of two modules: E2GradNet, which estimates accurate gradient flows, and E2Solver, which solves for the optimal prototypes.
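
The idea of optimizing a prototype by following a continuous-time gradient flow can be sketched as below. This is only an illustration under stated assumptions, not MetaNODE itself: in the paper the flow is estimated by a learned network, whereas here `toy_flow` is a hand-written, hypothetical stand-in that pulls the prototype toward a fixed `target`, and a plain forward Euler loop (`euler_solve`, also a hypothetical helper) takes the place of the ODE solver.

```python
import numpy as np

def euler_solve(flow_fn, p0, t_end, num_steps):
    """Integrate dp/dt = flow_fn(p, t) from t = 0 to t = t_end with forward Euler."""
    p = np.array(p0, dtype=float)
    dt = t_end / num_steps
    for step in range(num_steps):
        p = p + dt * flow_fn(p, step * dt)
    return p

# Hypothetical stand-in for a learned gradient-flow estimator: the negative
# gradient of 0.5 * ||p - target||^2, which moves p toward `target`.
target = np.array([1.0, 2.0])

def toy_flow(p, t):
    return -(p - target)

p0 = np.array([0.0, 0.0])  # plays the role of the initial (biased) prototype
p_final = euler_solve(toy_flow, p0, t_end=1.0, num_steps=100)

# For this linear flow the exact solution is known, so the solver can be checked.
p_exact = target + (p0 - target) * np.exp(-1.0)
```

The number of solver steps is the main cost in a scheme like this, since each step calls the flow estimator once; that trade-off between integration accuracy and compute is the kind of burden the efficient extension targets.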
### Method Overview
- **Pre-training Phase**: Pre-train a feature extractor on all base classes to obtain a good image representation.
- **Meta-training Phase**: Introduce a meta-optimizer $g_{\theta_g}(\cdot)$ that refines the prototypes, and train it by minimizing the negative log-likelihood on the query set $Q$. The specific steps are:
  - Compute the initial prototype $p(0)$ for each class.
  - Apply the meta-optimizer $g_{\theta_g}(\cdot)$ to the initial prototype to obtain the optimal prototype $p(M)$.
  - Compute the probability $P(y = k \mid x_i, S, Q', \theta_g)$ that each sample $x_i$ belongs to class $k$, using cosine similarity to the optimized prototypes.
- **Meta-testing Phase**: The procedure mirrors meta-training, but the learned meta-optimizer is applied directly to few-shot classification.
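
The last meta-training step, turning cosine similarities into class probabilities and scoring them with a negative log-likelihood, can be sketched as follows. The softmax form and the `scale` factor are assumptions (a common choice for cosine classifiers, not confirmed by this summary), and the function names and toy data are hypothetical.

```python
import numpy as np

def class_probabilities(queries, prototypes, scale=10.0):
    """Softmax over scaled cosine similarities between queries and prototypes."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    logits = scale * (q @ p.T)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)

def negative_log_likelihood(probs, labels):
    """Mean negative log-probability assigned to the correct class."""
    return -np.log(probs[np.arange(len(labels)), labels]).mean()

# Toy episode: two optimized prototypes and two labeled query features.
prototypes = np.array([[1.0, 0.0], [0.0, 1.0]])
queries = np.array([[0.9, 0.1], [0.2, 0.8]])
query_labels = np.array([0, 1])

probs = class_probabilities(queries, prototypes)
loss = negative_log_likelihood(probs, query_labels)
```

During meta-training a loss of this kind would be backpropagated into the meta-optimizer's parameters; at meta-test time only the forward pass (`class_probabilities` followed by an argmax) is needed.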
### Conclusion
Extensive experiments show that MetaNODE and E2MetaNODE outperform previous few-shot learning methods, and that E2MetaNODE significantly improves computational efficiency without degrading performance.