Inverse Image Frequency for Long-tailed Image Recognition

Konstantinos Panagiotis Alexandridis, Shan Luo, Anh Nguyen, Jiankang Deng, Stefanos Zafeiriou
DOI: https://doi.org/10.1109/TIP.2023.3321461
2023-10-07
Abstract: The long-tailed distribution is a common phenomenon in the real world. Extracted large scale image datasets inevitably demonstrate the long-tailed property and models trained with imbalanced data can obtain high performance for the over-represented categories, but struggle for the under-represented categories, leading to biased predictions and performance degradation. To address this challenge, we propose a novel de-biasing method named Inverse Image Frequency (IIF). IIF is a multiplicative margin adjustment transformation of the logits in the classification layer of a convolutional neural network. Our method achieves stronger performance than similar works and it is especially useful for downstream tasks such as long-tailed instance segmentation as it produces fewer false positive detections. Our extensive experiments show that IIF surpasses the state of the art on many long-tailed benchmarks such as ImageNet-LT, CIFAR-LT, Places-LT and LVIS, reaching 55.8% top-1 accuracy with ResNet50 on ImageNet-LT and 26.2% segmentation AP with MaskRCNN on LVIS. Code available at https://github.com/kostas1515/iif
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper primarily aims to address the issue of class imbalance in long-tailed distribution datasets, particularly in image recognition tasks. Specifically:

1. **Problem Background**:
   - Real-world datasets often exhibit a long-tailed distribution, meaning the data categories are extremely imbalanced, with some categories having far more samples than others.
   - Models trained on such imbalanced data perform well on common categories (head categories) but poorly on rare categories (tail categories).

2. **Objectives**:
   - Propose a new debiasing method called Inverse Image Frequency (IIF) to improve classification performance on long-tailed datasets.
   - Focus particularly on reducing false positive detections in downstream tasks such as long-tailed instance segmentation.

3. **Method Advantages**:
   - IIF is a multiplicative margin adjustment technique: it rescales the logits of the classification layer class by class, shifting the decision margins in favor of rare categories.
   - Compared to additive margin adjustment methods, IIF maintains the sign of the original prediction values, thereby reducing false positive detections.

4. **Experimental Results**:
   - Extensive experiments on multiple long-tailed benchmark datasets (such as ImageNet-LT, CIFAR-LT, Places-LT, and LVIS) show that the IIF method outperforms existing methods.
   - Using a ResNet50 backbone on ImageNet-LT, IIF achieved a top-1 accuracy of 55.8% and improved the performance on rare categories by 17.5% on the LVIS dataset.

Overall, this paper aims to address the class imbalance issue in long-tailed datasets by proposing the IIF method, with a particular emphasis on its effectiveness in instance segmentation tasks.
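
The sign-preservation argument can be illustrated with a minimal sketch. It is not the authors' implementation: the weight formula `log(total / count)`, the function names, and the toy class counts and logits are all assumptions chosen for illustration, and the paper may define its weights differently. The sketch only shows why multiplying a logit by a positive per-class weight keeps its sign, while adding a per-class offset can turn a negative logit positive.

```python
import math


def inverse_frequency_weights(class_counts):
    """Illustrative per-class weights log(total / count).

    This formula is an assumption for the sketch, not taken from the paper.
    Each weight is strictly positive as long as no single class holds all samples.
    """
    if any(n <= 0 for n in class_counts):
        raise ValueError("every class needs at least one sample")
    total = sum(class_counts)
    return [math.log(total / n) for n in class_counts]


def multiplicative_adjust(logits, weights):
    """Scale each logit by its class weight; positive weights preserve the sign."""
    return [z * w for z, w in zip(logits, weights)]


def additive_adjust(logits, weights):
    """Shift each logit by its class weight; a large offset can flip the sign."""
    return [z + w for z, w in zip(logits, weights)]


# Toy long-tailed setup: one head class, one medium class, one tail class.
counts = [900, 90, 10]
logits = [2.0, 0.5, -0.5]  # the model rejects the tail class (negative logit)

weights = inverse_frequency_weights(counts)
scaled = multiplicative_adjust(logits, weights)
shifted = additive_adjust(logits, weights)

print("weights:", weights)
print("multiplicative:", scaled)  # tail-class logit stays negative
print("additive:", shifted)       # tail-class logit becomes positive
```

In this toy case the tail class starts with a negative logit. The multiplicative adjustment enlarges its magnitude but leaves it negative, whereas the additive offset pushes it above zero. In a detector that scores each class independently, a flip of this kind would show up as an extra false positive, which is the behavior the summary attributes to additive methods.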