Inverse Image Frequency for Long-tailed Image Recognition

Konstantinos Panagiotis Alexandridis, Shan Luo, Anh Nguyen, Jiankang Deng, Stefanos Zafeiriou

DOI: https://doi.org/10.1109/TIP.2023.3321461

2023-10-07

Abstract: The long-tailed distribution is a common phenomenon in the real world. Extracted large scale image datasets inevitably demonstrate the long-tailed property and models trained with imbalanced data can obtain high performance for the over-represented categories, but struggle for the under-represented categories, leading to biased predictions and performance degradation. To address this challenge, we propose a novel de-biasing method named Inverse Image Frequency (IIF). IIF is a multiplicative margin adjustment transformation of the logits in the classification layer of a convolutional neural network. Our method achieves stronger performance than similar works and it is especially useful for downstream tasks such as long-tailed instance segmentation as it produces fewer false positive detections. Our extensive experiments show that IIF surpasses the state of the art on many long-tailed benchmarks such as ImageNet-LT, CIFAR-LT, Places-LT and LVIS, reaching 55.8% top-1 accuracy with ResNet50 on ImageNet-LT and 26.2% segmentation AP with MaskRCNN on LVIS. Code available at https://github.com/kostas1515/iif

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

### Problems the Paper Aims to Solve

This paper primarily aims to address the issue of class imbalance in long-tailed distribution datasets, particularly in image recognition tasks. Specifically:

1. **Problem Background**:
   - Real-world datasets often exhibit a long-tailed distribution, meaning the data categories are extremely imbalanced, with some categories having far more samples than others.
   - Models trained on such imbalanced data perform well on common categories (head categories) but poorly on rare categories (tail categories).
2. **Objectives**:
   - Propose a new debiasing method called Inverse Image Frequency (IIF) to improve classification performance on long-tailed datasets.
   - Focus particularly on reducing false positive detections in downstream tasks such as long-tailed instance segmentation.
3. **Method Advantages**:
   - IIF is a multiplicative margin adjustment technique that reallocates weights by adjusting the logits of the classification layer.
   - Compared to additive margin adjustment methods, IIF maintains the sign of the original prediction values, thereby reducing false positive detections.
4. **Experimental Results**:
   - Extensive experiments on multiple long-tailed benchmark datasets (such as ImageNet-LT, CIFAR-LT, Places-LT, and LVIS) show that the IIF method outperforms existing methods.
   - Using a ResNet50 backbone on ImageNet-LT, IIF achieved a top-1 accuracy of 55.8%, and on the LVIS dataset it improved the performance on rare categories by 17.5%.

Overall, this paper aims to address the class imbalance issue in long-tailed datasets by proposing the IIF method, with a particular emphasis on its effectiveness in instance segmentation tasks.
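The contrast between multiplicative and additive logit adjustment can be illustrated with a minimal sketch. It assumes a per-class weight of the form log(N / n_c), chosen by analogy with inverse document frequency; the paper's exact weighting variant may differ. The function names `iif_weights`, `multiplicative_adjust`, and `additive_adjust`, as well as the class counts and logits, are made up for illustration and are not the authors' implementation.

```python
import math


def iif_weights(class_counts):
    """Assumed IIF-style weights: w_c = log(N / n_c), N = total image count.

    Each weight is positive as long as no single class holds every image.
    """
    total = sum(class_counts)
    return [math.log(total / n) for n in class_counts]


def multiplicative_adjust(logits, weights):
    """Scale each logit by its positive class weight; the sign cannot change."""
    return [w * z for z, w in zip(logits, weights)]


def additive_adjust(logits, class_counts, tau=1.0):
    """Additive baseline for comparison: z_c - tau * log(n_c / N)."""
    total = sum(class_counts)
    return [z - tau * math.log(n / total) for z, n in zip(logits, class_counts)]


# Toy long-tailed setup (illustrative values): one head class, two rarer classes.
counts = [900, 90, 10]
logits = [2.0, -0.5, -1.0]

weights = iif_weights(counts)
mult = multiplicative_adjust(logits, weights)
add = additive_adjust(logits, counts)

for name, values in (("original", logits), ("multiplicative", mult), ("additive", add)):
    print(name, [round(v, 3) for v in values])
```

With these toy values, the multiplicative version leaves both negative logits negative, while the additive shift pushes the two rarer-class logits above zero. Under a sigmoid classifier, a logit above zero corresponds to a score above 0.5, which is one way to read the claim that preserving the sign reduces false positives.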

Inverse Weight-Balancing for Deep Long-Tailed Learning

Hierarchical Equalization Loss for Long-Tailed Instance Segmentation

Data-Free Network Debiasing for Long-Tailed Visual Recognition

Divide and Retain: A Dual-Phase Modeling for Long-Tailed Visual Recognition

The Equalization Losses: Gradient-Driven Training for Long-tailed Object Recognition

Equalized Focal Loss for Dense Long-Tailed Object Detection

Decoupling Representation and Classifier for Long-Tailed Recognition

Bag of Tricks for Long-Tailed Visual Recognition with Deep Convolutional Neural Networks

Adaptive Class Suppression Loss for Long-Tail Object Detection

Inflated Episodic Memory With Region Self-Attention for Long-Tailed Visual Recognition

Towards Reversal-Invariant Image Representation

Feature Re-Balancing for Long-Tailed Visual Recognition

Long-tailed image recognition through balancing discriminant quality

A Deep Learning Model for Long-Tail Visual Recognition

Lp-Norm IDF for Large Scale Image Search

Latent-based Diffusion Model for Long-tailed Recognition

Fractal Calibration for long-tailed object detection

EFLNet: Enhancing Feature Learning for Infrared Small Target Detection

Feature-Balanced Loss for Long-Tailed Visual Recognition

Exploring the auxiliary learning for long-tailed visual recognition