Abstract: Large-scale image datasets are often partially labeled: only a few categories' labels are known for each image. Assigning pseudo-labels to the unknown labels to gain additional training signal has become a prevalent way to train deep classification models. However, some pseudo-labels are inevitably incorrect, which notably degrades the model's classification performance. In this paper, we propose a novel method called Category-wise Fine-Tuning (CFT), which aims to reduce the model inaccuracies caused by wrong pseudo-labels. In particular, CFT uses only the known labels, without pseudo-labels, to fine-tune the logistic regressions of a trained model one category at a time, calibrating the model's predictions for each category. A genetic algorithm, seldom used for training deep models, is also used in CFT to maximize classification performance directly. Unlike most existing methods, which train models from scratch, CFT is applied to well-trained models. Hence, CFT is general and compatible with models trained with different methods and schemes, as demonstrated through extensive experiments. Calibration takes only a few seconds per category on consumer-grade GPUs. We achieve state-of-the-art results on three benchmark datasets: the CheXpert chest X-ray competition dataset (ensemble mAUC 93.33%, single model 91.82%), partially labeled MS-COCO (average mAP 83.69%), and Open Image V3 (mAP 85.31%), outperforming the previous best results by 0.28%, 2.21%, 2.50%, and 0.91%, respectively. The single model on CheXpert has been officially evaluated by the competition server, which confirms the correctness of that result. The strong results and generalizability suggest that CFT could become a substantial and widely used component of classification model development. Code is available at:
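The per-category calibration step described in the abstract can be illustrated with a small sketch. This is not the authors' released code: it is a minimal NumPy illustration that assumes the backbone is frozen and exposes a feature vector per image, that each category has its own logistic-regression head (`w`, `b`), and that unknown labels are encoded as `-1`. The function names, the encoding, the learning rate, and the synthetic data are all hypothetical, and the genetic-algorithm variant mentioned in the abstract is not shown.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(features, labels, w, b):
    """Binary cross-entropy of one category's head, over known labels only."""
    known = labels != -1
    p = sigmoid(features[known] @ w + b)
    y = labels[known]
    eps = 1e-12  # avoids log(0)
    return float(-np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)))

def fine_tune_category(features, labels, w, b, lr=0.1, steps=200):
    """Fine-tune a single category's logistic regression by gradient descent.

    labels: 1 = positive, 0 = negative, -1 = unknown (assumed encoding).
    Unknown entries are skipped entirely; no pseudo-labels are used.
    """
    known = labels != -1
    x = features[known]
    y = labels[known].astype(float)
    w = w.copy()
    b = float(b)
    for _ in range(steps):
        grad_logit = sigmoid(x @ w + b) - y       # d(BCE)/d(logit) per sample
        w -= lr * (x.T @ grad_logit) / len(y)
        b -= lr * grad_logit.mean()
    return w, b

# Synthetic stand-in for one category of a trained model (all values made up).
rng = np.random.default_rng(0)
n, d = 300, 8
feats = rng.normal(size=(n, d))                   # frozen backbone features
w_true = rng.normal(size=d)
labels = (feats @ w_true + 0.3 * rng.normal(size=n) > 0).astype(int)
labels[rng.random(n) < 0.5] = -1                  # about half the labels unknown

# A head that was perturbed, e.g. by training on wrong pseudo-labels.
w0 = w_true + 0.8 * rng.normal(size=d)
b0 = 1.0

loss_before = bce(feats, labels, w0, b0)
w_ft, b_ft = fine_tune_category(feats, labels, w0, b0)
loss_after = bce(feats, labels, w_ft, b_ft)
print(f"known-label BCE before: {loss_before:.4f}, after: {loss_after:.4f}")
```

Because each category is handled independently, the same function would simply be called once per category in this sketch, which is consistent with the abstract's description of calibrating categories individually.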

An Editor Labeling Model for Training Set Expansion in Web Categorization

An Experimental Study on Large-Scale Web Categorization

Site Abstraction for Rare Category Classification in Large-Scale Web Directory

Exploiting Textual and Visual Features for Image Categorization

Collaborative Work with Linear Classifier and Extreme Learning Machine for Fast Text Categorization

Webly-Supervised Fine-Grained Visual Categorization Via Deep Domain Adaptation

Category-wise Fine-Tuning: Resisting Incorrect Pseudo-Labels in Multi-Label Image Classification with Partial Labels

Weak Learning Algorithm For Multi-Label Multiclass Text Categorization

Fast text categorization based on collaborative work in the semantic and class spaces

Text Categorization Based on Regularization Extreme Learning Machine

Augmenting Labeled Probabilistic Topic Model for Web Service Classification

Learning Outliers to Refine a Corpus for Chinese Webpage Categorization

New Automatic Categorization Algorithm for Chinese Homepages

An Integrated System for Building Enterprise Taxonomies

Learning Semantic Similarity For Multi-Label Text Categorization

Exploiting Web Images for Fine-Grained Visual Recognition by Eliminating Open-Set Noise and Utilizing Hard Examples

Web Video Categorization Using Category-Predictive Classifiers and Category-Specific Concept Classifiers

A New Centroid-Based Classification Model for Text Categorization

Accumulative Categorization: Online 3D Shape Classification for Progressive Collections

Towards Good Practices for Efficiently Annotating Large-Scale Image Classification Datasets

Enhancing Robust Text Classification via Category Description