Abstract: Webly supervised learning has attracted increasing attention for its effectiveness in exploring publicly accessible data at scale without manual annotation. However, most existing methods for learning from web datasets struggle with label noise, and they rely on limited assumptions about clean samples under various kinds of noise. For instance, web images retrieved with the queries "tiger cat" (a cat species) and "drumstick" (a musical instrument) are dominated by images of tigers and chickens, respectively, which exacerbates the challenge of fine-grained visual concept learning. In this case, exploiting both web images and their associated texts is necessary to combat real-world noise. In this paper, we propose Cross-modality Aligned Prototypes (CAPro), a unified prototypical contrastive learning framework that learns visual representations with correct semantics. First, we leverage textual prototypes, which stem from the distinct concept definitions of classes, to select clean images by text matching and thus disambiguate the formation of visual prototypes. Second, to handle missing and mismatched noisy texts, we resort to the visual feature space to complete and enhance individual texts and thereby improve text matching. These semantically aligned visual prototypes are further refined with high-quality samples and are used for both cluster regularization and noise removal. In addition, we propose collective bootstrapping, which encourages smoother and wiser label reference from appearance-similar instances in the manner of a dictionary look-up. Extensive experiments on WebVision1k and NUS-WIDE (Web) demonstrate that CAPro handles realistic noise well under both single-label and multi-label scenarios. CAPro achieves new state-of-the-art performance and exhibits robustness in open-set recognition. Code is available at https://github.com/yuleiqin/capro.
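The idea of selecting clean images by text matching and then forming visual prototypes can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `build_visual_prototypes`, the top-k selection rule, the use of cosine similarity, and the toy feature vectors are all assumptions made for illustration only.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length; all-zero vectors stay zero."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def build_visual_prototypes(img_feats, txt_feats, web_labels, text_protos, top_k=2):
    """Hypothetical helper (not from the paper's code).

    For each class, rank the web images carrying that label by the cosine
    similarity between their associated text embedding and the class's
    textual prototype, keep the top_k best matches, and average their
    visual features to form the visual prototype.
    """
    img_feats = l2_normalize(np.asarray(img_feats, dtype=float))
    txt_feats = l2_normalize(np.asarray(txt_feats, dtype=float))
    text_protos = l2_normalize(np.asarray(text_protos, dtype=float))

    num_classes = text_protos.shape[0]
    protos = np.zeros((num_classes, img_feats.shape[1]))
    for c in range(num_classes):
        idx = np.flatnonzero(web_labels == c)
        if idx.size == 0:
            continue  # no web images for this class; prototype stays zero
        sims = txt_feats[idx] @ text_protos[c]   # text-matching scores
        keep = idx[np.argsort(-sims)[:top_k]]    # best-matching samples
        protos[c] = img_feats[keep].mean(axis=0)
    return l2_normalize(protos)


# Toy data (assumed): 3-d visual features, 2-d text features, 2 classes.
# The third image is labeled class 0, but its text matches class 1 (noise).
img_feats = np.array([[1.0, 0.0, 0.0],
                      [0.9, 0.1, 0.0],
                      [0.0, 0.0, 1.0],
                      [0.0, 1.0, 0.0],
                      [0.1, 0.9, 0.0]])
txt_feats = np.array([[1.0, 0.0],
                      [0.8, 0.2],
                      [0.0, 1.0],
                      [0.0, 1.0],
                      [0.1, 0.9]])
web_labels = np.array([0, 0, 0, 1, 1])
text_protos = np.eye(2)

protos = build_visual_prototypes(img_feats, txt_feats, web_labels, text_protos)
print(protos)
```

In this toy setting the mislabeled third image is excluded from the class-0 prototype because its text does not match the class's textual prototype. The abstract additionally describes text enhancement through the visual feature space and collective bootstrapping, which this sketch does not cover.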

ProPC: A Dataset for In-Domain and Cross-Domain Proposition Classification Tasks

Proposition from the Perspective of Chinese Language: A Chinese Proposition Classification Evaluation Benchmark

MuCPAD: A Multi-Domain Chinese Predicate-Argument Dataset

Text Classification via Large Language Models

The Proposition Bank: An Annotated Corpus of Semantic Roles

LogicPrpBank: A Corpus for Logical Implication and Equivalence

Topic-bridged PLSA for Cross-Domain Text Classification

Cross-domain Constituency Parsing by Leveraging Heterogeneous Data

CAPro: Webly Supervised Learning with Cross-Modality Aligned Prototypes

Classifying multilingual party manifestos: Domain transfer across country, time, and genre

InProC: Industry and Product/Service Code Classification

Probing Classifiers: Promises, Shortcomings, and Advances

Semantic Role Classification Based on Peking University Chinese NetBank

A Benchmark for Cross-Domain Argumentative Stance Classification on Social Media

Exploiting Ontological Reasoning In Argumentation Based Multi-Agent Collaborative Classification

CORECODE: A Common Sense Annotated Dialogue Dataset with Benchmark Tasks for Chinese Large Language Models

LegalPro-BERT: Classification of Legal Provisions by fine-tuning BERT Large Language Model

Enhancing Formal Theorem Proving: A Comprehensive Dataset for Training AI Models on Coq Code

Exploring Non-Verbal Predicates in Semantic Role Labeling: Challenges and Opportunities

Query of CC: Unearthing Large Scale Domain-Specific Knowledge from Public Corpora

Cross-Domain Labeled LDA for Cross-Domain Text Classification