Abstract: Hierarchical label structures are common in large-scale classification tasks, where labels can be organized into a tree. Nodes near the root represent coarser labels, while nodes near the leaves represent finer ones. In hierarchical classification, an unseen sample is labeled along a path from the root toward a leaf, which yields multigranularity predictions. Sometimes a leaf decision cannot be reached because of uncertainty or incomplete information. In this case, the classifier should stop at an internal node rather than proceed rashly. However, most existing hierarchical classification models aim to maximize the percentage of correct predictions and do not take the risk of misclassification into account. Such risk is critically important in some real-world applications, and can be measured by the distance between the ground truth and the predicted class in the class hierarchy. In this work, we utilize the semantic hierarchy to define the classification risk and design an optimization technique to reduce it. By defining the conservative risk and the precipitant risk as two competing risk factors, we construct the balanced conservative/precipitant semantic (BCPS) risk matrix across all nodes in the semantic hierarchy, with user-defined weights that adjust the tradeoff between the two kinds of risk. We then model the classification process on the semantic hierarchy as a sequential decision-making task and design an algorithm to derive risk-minimized predictions. The model has two modules: 1) multitask hierarchical learning and 2) deep reinforced multigranularity learning. The first learns classification confidence scores at multiple levels. These scores are then fed into the second to obtain a global risk-minimized prediction with flexible granularity. Experimental results show that the proposed model outperforms state-of-the-art methods on seven large-scale classification datasets with a semantic tree.
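
The idea of a weighted risk matrix over tree nodes can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the paper's definition: the toy hierarchy, the function names, the reading of conservative risk as edges of the true path left unreached and precipitant risk as edges descended off the true path, and the default weights. The sketch also swaps the deep reinforcement learning module for a plain expected-risk minimization over given leaf probabilities.

```python
# Toy hierarchy (hypothetical): child -> parent; the root has parent None.
PARENT = {
    "root": None,
    "animal": "root", "vehicle": "root",
    "cat": "animal", "dog": "animal",
    "car": "vehicle", "bus": "vehicle",
}


def path_to_root(node):
    """Return the list [node, parent, ..., root]."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(node):
    """Number of edges between the node and the root."""
    return len(path_to_root(node)) - 1


def lowest_common_ancestor(a, b):
    """Deepest node that lies on both root paths."""
    ancestors_of_a = set(path_to_root(a))
    for node in path_to_root(b):  # walks upward from b
        if node in ancestors_of_a:
            return node
    raise ValueError("nodes do not share a root")


def risk(true_leaf, predicted, w_conservative=1.0, w_precipitant=2.0):
    """Assumed risk: a weighted tree distance between truth and prediction.

    under: edges of the true path the prediction did not reach (stopped early).
    over:  edges the prediction descended away from the true path (went too far).
    """
    lca = lowest_common_ancestor(true_leaf, predicted)
    under = depth(true_leaf) - depth(lca)
    over = depth(predicted) - depth(lca)
    return w_conservative * under + w_precipitant * over


def min_risk_prediction(leaf_probs, **weights):
    """Pick the node (internal or leaf) with the lowest expected risk."""
    expected = {
        node: sum(p * risk(leaf, node, **weights) for leaf, p in leaf_probs.items())
        for node in PARENT
    }
    best = min(expected, key=expected.get)
    return best, expected


confident = {"cat": 0.9, "dog": 0.05, "car": 0.03, "bus": 0.02}
uncertain = {"cat": 0.5, "dog": 0.5, "car": 0.0, "bus": 0.0}
print(min_risk_prediction(confident)[0])  # leaf-level decision
print(min_risk_prediction(uncertain)[0])  # stops at an internal node
```

Under these assumed weights, a confident posterior yields a leaf decision, while a posterior split evenly between two sibling leaves yields their shared parent. Raising `w_conservative` relative to `w_precipitant` pushes decisions toward the leaves, and lowering it pushes them toward the root.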

Optimize Hierarchical Softmax with Word Similarity Knowledge

Hierarchical Topic Mining via Joint Spherical Tree and Text Embedding

Learning Semantic Hierarchies Via Word Embeddings

Lexical semantics enhanced neural word embeddings

Learning Semantic Hierarchies: a Continuous Vector Space Approach

PE: A Poincare Explanation Method for Fast Text Hierarchy Generation

Hierarchical Semantic Structure Preserving Hashing for Cross-Modal Retrieval

Bridging the Semantic Latent Space Between Brain and Machine: Similarity is All You Need

Global Hierarchical Neural Networks using Hierarchical Softmax

Semantic Word Cloud Generation Based on Word Embeddings

Towards Hierarchical Importance Attribution: Explaining Compositional Semantics for Neural Sequence Models

Chinese Word Similarity Computing Based on Semantic Tree

Inducing Semantic Hierarchy Structure in Empirical Risk Minimization with Optimal Transport Measures

Learning Visual Hierarchies with Hyperbolic Embeddings

Learning Effective Word Embedding Using Morphological Word Similarity

Revisiting the Architectures like Pointer Networks to Efficiently Improve the Next Word Distribution, Summarization Factuality, and Beyond

Learning with Hierarchical Complement Objective

Hierarchical Semantic Risk Minimization for Large-Scale Classification

Hierarchical Latent Semantic Mapping for Automated Topic Generation

Revisit Word Embeddings with Semantic Lexicons for Modeling Lexical Contrast