Abstract: Zero-shot learning (ZSL) aims to predict unseen classes without using samples of these classes in model training. ZSL has been widely used in knowledge-based models and applications to predict various parameters, including categories, subjects, and anomalies, in different domains. Nonetheless, most existing ZSL methods require pre-defined semantics or attributes tied to a particular data environment. These methods are therefore difficult to apply to general data environments, such as ImageNet and other real-world datasets and applications. Recent research has tried to use open knowledge to adapt ZSL methods to an open data environment. However, the performance of these methods is relatively low, with accuracy normally below 10%, because the semantics that can be drawn from open knowledge are inadequate. Moreover, the latest methods suffer from a significant "semantic gap" between the generated features of unseen classes and the real features of seen classes. To this end, this paper proposes a multi-view graph representation with a similarity diffusion model that applies ZSL to general data environments. The model uses a multi-view graph to enrich the semantics and proposes an innovative diffusion method to augment the graph representation. In addition, a feature diffusion method is proposed to augment the multi-view graph representation and bridge the semantic gap to realize zero-shot prediction. Extensive experiments in general data environments and on benchmark datasets show that the proposed method achieves new state-of-the-art results in general zero-shot learning. Furthermore, seven ablation studies analyze in detail how the settings and individual modules of the proposed method affect its performance and demonstrate the effectiveness of each module.

Multi-level Fusion of Multi-modal Semantic Embeddings for Zero Shot Learning

Joint Learning of Attended Zero-Shot Features and Visual-Semantic Mapping

Dual Collaborative Visual-Semantic Mapping for Multi-Label Zero-Shot Image Recognition

Multi-modal Generative Adversarial Network for Zero-Shot Learning

Adaptive multi-scale semantic fusion network for zero-shot learning

Multi-Label Zero-Shot Learning with Structured Knowledge Graphs

Meta-Transfer Networks for Zero-Shot Learning

OntoZSL: Ontology-enhanced Zero-shot Learning

Semantic Softmax Loss for Zero-Shot Learning

Transductive multi-view zero-shot learning

Transductive Multi-label Zero-shot Learning

Transductive Multi-class and Multi-label Zero-shot Learning

Manifold Regularized Cross-Modal Embedding for Zero-Shot Learning

Multi-modal Multi-grained Embedding Learning for Generalized Zero-Shot Video Classification

Multi-view graph representation with similarity diffusion for general zero-shot learning

Weakly Supervised Classification Model for Zero‐shot Semantic Segmentation

Multi-Knowledge Fusion for New Feature Generation in Generalized Zero-Shot Learning

Transductive Unbiased Embedding for Zero-Shot Learning

Multi-Semantic Hypergraph Neural Network for Effective Few-Shot Learning

Semantic-visual shared knowledge graph for zero-shot learning

Zero-Shot Recognition Using Dual Visual-Semantic Mapping Paths