Supplementary Material for Few-Shot Class-Incremental Learning

Xiaoyu Tao, Xiaopeng Hong, Xinyuan Chang, Songlin Dong, Xing Wei, Yihong Gong, Peng Cheng
2020-01-01
Abstract: To train an NG net of N nodes on the base class data, we first extract the feature set F^(1) from D. We initialize the NG net by randomly selecting N feature vectors from F^(1) as the initial centroid vectors of the N nodes. The number of nodes is determined according to the diversity of F. We ensure that the number of NG nodes is larger than the number of classes, so that each class has at least one corresponding node. Heuristically, we set N = 400 for all datasets. Each node is adapted to f using Eq. (3) in the main paper. For node r_i whose rank is i, the contribution of the input vector f is measured using the decay function e^(−i/α). That is, if node r_i has a large rank i ≫ α, its distance to the input, d(f, m_{r_i}), is very large, and we neglect its adaptation to speed up training. For this purpose, we can set α to a small value (e.g., α = 10 in our experiments). This can reduce the time complexity of "sorting" from O(N log_2 N) to O(log_2 N). The topology-preserving mechanism is achieved by competitive Hebbian learning, where a topological connection between nodes i and j is established and maintained if the two nodes always respond to the input simultaneously (i.e., they are the nearest and second nearest to the input). The "age" a_ij of the connection records how long the two nodes have not been activated simultaneously. If a_ij > T, the connection is removed. We set T = 200 for training on the base class data according to [8]. Noting that
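The procedure described above (random initialization from the feature set, rank-weighted adaptation, and edge aging) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The update rule is the classical neural gas form and is only assumed to match Eq. (3); the learning rate `eta`, the cutoff `top_k`, the convention that the nearest node has rank 0, and the choice to age only the edges incident to the nearest node are all assumptions made for illustration. The function names `init_ng` and `ng_step` are hypothetical.

```python
import numpy as np


def init_ng(features, n_nodes=400, seed=0):
    """Pick n_nodes distinct feature vectors as initial centroids; no edges yet."""
    rng = np.random.default_rng(seed)
    picks = rng.choice(features.shape[0], size=n_nodes, replace=False)
    return features[picks].copy(), {}


def ng_step(centroids, edges, f, eta=0.05, alpha=10.0, max_age=200, top_k=None):
    """One adaptation step for a single feature vector f.

    centroids: float array of shape (N, d) with N >= 2, updated in place.
    edges: dict mapping (i, j), i < j, to the age of that connection.
    Returns the indices of the nearest and second-nearest nodes.
    """
    n = centroids.shape[0]
    if top_k is None:
        # Assumed cutoff: beyond rank 5 * alpha the weight e^(-i/alpha) is below 1%.
        top_k = max(2, min(n, int(5 * alpha)))

    dists = np.linalg.norm(centroids - f, axis=1)
    # Select the top_k nearest nodes without fully sorting all N distances,
    # then order only those top_k by distance.
    nearest = np.argpartition(dists, top_k - 1)[:top_k]
    ranked = nearest[np.argsort(dists[nearest])]

    # Rank-weighted update (assumed form): m += eta * e^(-i/alpha) * (f - m).
    weights = eta * np.exp(-np.arange(top_k) / alpha)
    centroids[ranked] += weights[:, None] * (f - centroids[ranked])

    # Competitive Hebbian learning: age the winner's edges, refresh the edge
    # between the two nearest nodes, and drop edges older than max_age.
    first, second = int(ranked[0]), int(ranked[1])
    for key in list(edges):
        if first in key:
            edges[key] += 1
    edges[tuple(sorted((first, second)))] = 0
    for key in [k for k, age in edges.items() if age > max_age]:
        del edges[key]
    return first, second


if __name__ == "__main__":
    # Synthetic stand-in for the extracted feature set; not real data.
    feats = np.random.default_rng(1).normal(size=(2000, 16))
    cents, edges = init_ng(feats, n_nodes=400)
    for f in feats:
        ng_step(cents, edges, f)
    print(len(edges), "connections kept")
```

Restricting the update to the `top_k` nearest nodes is one way to realize the "neglect the adaptation" shortcut for ranks much larger than α. Whether it matches the complexity figures quoted above depends on details the text does not give.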