Hyperspherical Classification with Dynamic Label-to-Prototype Assignment

Mohammad Saeed Ebrahimi Saadabadi, Ali Dabouei, Sahar Rahimi Malakshan, Nasser M. Nasrabadi
2024-03-26
Abstract: Aiming to enhance the utilization of metric space by the parametric softmax classifier, recent studies suggest replacing it with a non-parametric alternative. Although a non-parametric classifier may provide better metric space utilization, it introduces the challenge of capturing inter-class relationships. A shared characteristic among prior non-parametric classifiers is the static assignment of labels to prototypes during training, i.e., each prototype consistently represents a class throughout the training course. Orthogonal to previous works, we present a simple yet effective method to optimize the category assigned to each prototype (label-to-prototype assignment) during training. To this aim, we formalize the problem as a two-step optimization objective over network parameters and label-to-prototype assignment mapping. We solve this optimization using a sequential combination of gradient descent and bipartite matching. We demonstrate the benefits of the proposed approach by conducting experiments on balanced and long-tail classification problems using different backbone network architectures. In particular, our method outperforms its competitors by 1.22% accuracy on CIFAR-100, and 2.15% on ImageNet-200 using a metric space dimension half of the size of its competitors. Code:
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses two weaknesses of existing image classification methods: poor performance on class-imbalanced datasets and limited exploitation of the metric space. Specifically:

1. **Class-imbalance problem**: The traditional parametric softmax classifier (PSC) is trained with the cross-entropy loss. On datasets with long-tailed distributions, minority classes frequently receive "passive updates", which leaves these classes less well separated and limits the model's ability to recognize minority-class samples.
2. **Insufficient use of metric space**: Existing non-parametric classifiers utilize the metric space better, but they usually adopt a static label-to-prototype assignment: each prototype represents the same class throughout training. This strategy ignores the intrinsic relationships between classes, so the model cannot fully utilize the metric space, especially with multi-modal data distributions.

To solve these problems, the paper proposes a classification framework that dynamically adjusts the label-to-prototype assignment. Its components are as follows:

- **Fixed but evenly distributed prototypes**: Before training, an optimization algorithm determines a set of prototypes evenly distributed on the hypersphere. These prototypes remain fixed throughout training to ensure maximum separation between classes.
- **Dynamic label-to-prototype assignment**: Unlike previous methods, the framework allows the label represented by each prototype to change during training. Solving a bipartite matching problem yields the best label-to-prototype assignment, which better captures the intrinsic relationships between classes.
- **Input-to-prototype mapping**: For each training sample, the mapping from input to prototype is learned by minimizing the angular distance between the sample feature and its assigned prototype. This not only reduces the computational burden but also improves the model's ability to recognize minority classes.

With these changes, the method improves performance on multiple datasets, especially class-imbalanced ones. Experimental results show classification accuracies 1.22% and 2.15% higher than existing methods on CIFAR-100 and ImageNet-200, respectively, and the method maintains good performance when the dimension of the metric space is reduced.
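
The three steps summarized above (fixed prototypes, bipartite label-to-prototype matching, and an angular input-to-prototype loss) can be illustrated with a minimal NumPy/SciPy sketch. It is not the authors' implementation. The function names, the nearest-neighbor repulsion used to spread the prototypes, the class-mean cosine cost used for matching, and the `1 - cosine` loss are all assumptions chosen for brevity. The network and its gradient-descent update are omitted, so the features here are synthetic.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def normalize(x):
    """Project vectors onto the unit hypersphere."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def make_prototypes(num_classes, dim, steps=200, lr=0.1, seed=0):
    """Spread unit vectors apart before training (assumed stand-in procedure).

    Each step pushes every prototype away from its most similar neighbor
    and re-projects onto the sphere.
    """
    rng = np.random.default_rng(seed)
    protos = normalize(rng.normal(size=(num_classes, dim)))
    for _ in range(steps):
        sim = protos @ protos.T
        np.fill_diagonal(sim, -np.inf)      # ignore self-similarity
        nearest = sim.argmax(axis=1)        # most similar other prototype
        protos = normalize(protos - lr * protos[nearest])
    return protos


def assign_labels(features, labels, protos):
    """Solve the label-to-prototype matching with the Hungarian algorithm.

    Assumed cost: 1 - cosine between each class-mean feature and each
    prototype. Every class must appear at least once in `labels`.
    Returns an array where entry c is the prototype index for class c.
    """
    num_classes = protos.shape[0]
    class_means = np.stack(
        [features[labels == c].mean(axis=0) for c in range(num_classes)]
    )
    cost = 1.0 - normalize(class_means) @ protos.T
    _, cols = linear_sum_assignment(cost)   # rows come back as 0..C-1
    return cols


def angular_loss(features, labels, protos, assignment):
    """Mean (1 - cosine) between features and their assigned prototypes."""
    targets = protos[assignment[labels]]
    cos = np.sum(normalize(features) * targets, axis=1)
    return float(np.mean(1.0 - cos))


# Toy check: place each class exactly on a hidden prototype and verify
# that the matching step recovers the hidden permutation.
protos = make_prototypes(num_classes=5, dim=8)
hidden = np.random.default_rng(1).permutation(5)
labels = np.repeat(np.arange(5), 4)         # 4 samples per class
features = protos[hidden[labels]]
assignment = assign_labels(features, labels, protos)
loss = angular_loss(features, labels, protos, assignment)
```

In a full training loop, a gradient-descent phase that updates the network under `angular_loss` would alternate with calls to `assign_labels`, mirroring the sequential combination of gradient descent and bipartite matching described in the abstract. How often the assignment is re-solved is left open here.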