Abstract: Node classification is the task of inferring or predicting missing node attributes from information available for other nodes in a network. This paper presents a general prediction model for hierarchical multi-label classification (HMC), where the attributes to be inferred can be specified as a strict poset. It is based on a top-down classification approach that addresses hierarchical multi-label classification with supervised learning by building a local classifier per class. The proposed model is showcased with a case study on the prediction of gene functions for Oryza sativa Japonica, a variety of rice. It is compared to the Hierarchical Binomial-Neighborhood, a probabilistic model, by evaluating both approaches in terms of prediction performance and computational cost. The results in this work support the working hypothesis that the proposed model can achieve good levels of prediction efficiency, while scaling up in relation to the state of the art.

### What problem does this paper attempt to solve?

This paper addresses node classification in networks, in particular multi-label classification when the node attributes have a hierarchical structure. Specifically, it focuses on **Hierarchical Multi-label Classification (HMC)**, where each node can belong to multiple classes simultaneously and the classes are related by a strict partial order (that is, the class hierarchy forms a directed acyclic graph, or DAG). The paper proposes a top-down supervised learning method for this task.

#### Main problem description:

1. **Node classification problem**:
   - **Definition**: Infer or predict missing node attributes from the information available for other nodes in the network.
   - **Background**: Most existing techniques classify each class independently, ignoring the relationships between classes, which may lead to inconsistent predictions.
2. **Hierarchical multi-label classification problem**:
   - **Definition**: Given the network and the class hierarchy, predict the associations between nodes and classes so that the predictions conform to the "true-path rule": if a node is predicted to belong to a class, it must also be predicted to belong to all of that class's ancestors.
   - **Challenge**: Existing methods either ignore the hierarchical relationships between classes or are too computationally expensive to scale to large data sets.

#### Specific contributions of the paper:

- **A new top-down supervised learning model**: A binary classifier is built for each class, and classification proceeds from the root classes down to the leaf classes to keep the predictions consistent.
- **A correction mechanism**: Cumulative probabilities are used so that the predictions satisfy the true-path rule and inconsistent predictions are avoided.
- **A case study**: The effectiveness and computational efficiency of the model are evaluated on gene function prediction for the rice variety Oryza sativa Japonica.

#### Formula representation:

- **Strict Poset**: \[ (C, \prec) \] where \( C \) is the set of classes and \(\prec\) is the strict partial order between classes, which is irreflexive, asymmetric, and transitive.
- **Cumulative probability calculation**: \[ P(v, C) = \prod_{A \in \text{ancestors}(C)} P(v, A) \] where \( P(v, C) \) represents the probability that node \( v \) belongs to class \( C \), and \(\text{ancestors}(C)\) represents all the ancestor classes of class \( C \).

The paper reports that this approach achieves good prediction performance at a lower computational cost than the Hierarchical Binomial-Neighborhood model, which makes it applicable to larger data sets.
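The cumulative probability correction can be illustrated with a small Python sketch. This is not the paper's implementation: the class hierarchy, the local probabilities, and the function names (`ancestors`, `cumulative_probabilities`, `predict`) are hypothetical, and the sketch assumes that a class's own local probability is multiplied in together with those of its ancestors.

```python
from math import prod


def ancestors(cls, parents):
    """Return every ancestor of `cls` in a class DAG given as {class: [parents]}."""
    seen = set()
    stack = list(parents.get(cls, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, []))
    return seen


def cumulative_probabilities(local_prob, parents):
    """Multiply each class's local probability by those of all its ancestors.

    Assumption: the class's own local probability is part of the product.
    """
    return {
        cls: p * prod(local_prob[a] for a in ancestors(cls, parents))
        for cls, p in local_prob.items()
    }


def predict(local_prob, parents, threshold=0.5):
    """Return the classes whose cumulative probability reaches the threshold."""
    cumulative = cumulative_probabilities(local_prob, parents)
    return {cls for cls, p in cumulative.items() if p >= threshold}


# Hypothetical hierarchy: "leaf" has two parents, so this is a DAG, not a tree.
parents = {"root": [], "a": ["root"], "b": ["root"], "leaf": ["a", "b"]}
# Hypothetical outputs of the per-class binary classifiers for one node v.
local_prob = {"root": 1.0, "a": 0.8, "b": 0.5, "leaf": 0.9}

cumulative = cumulative_probabilities(local_prob, parents)
predicted = predict(local_prob, parents, threshold=0.4)
```

Because every factor lies in [0, 1], a class's cumulative probability under this assumption can never exceed that of any of its ancestors. Thresholding the cumulative values therefore yields predictions that respect the true-path rule, whereas thresholding the local probabilities directly could accept `leaf` (0.9) while rejecting one of its parents.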

A Top-down Supervised Learning Approach to Hierarchical Multi-label Classification in Networks

Hierarchical Multilabel Ship Classification in Remote Sensing Images Using Label Relation Graphs

A Bayesian Network nearest k-labels method for Multi-label classification

HmcNet: A General Approach for Hierarchical Multi-Label Classification

Hierarchy exploitation to detect missing annotations on hierarchical multi-label classification

Semi-Supervised Hierarchical Multi-Label Classifier Based on Local Information

Semi-Supervised Hierarchical Graph Classification

Hierarchical Multi-label Text Classification: an Attention-based Recurrent Network Approach

Label Relation Graphs Enhanced Hierarchical Residual Network for Hierarchical Multi-Granularity Classification

Consistency-aware Multi-modal Network for Hierarchical Multi-label Classification in Online Education System

Semantic Guided Level-Category Hybrid Prediction Network for Hierarchical Image Classification

Multi-Label Classification Neural Networks with Hard Logical Constraints

Multi-label Classification using Labels as Hidden Nodes

A Capsule Network for Hierarchical Multi-Label Image Classification

LA-HCN: Label-based Attention for Hierarchical Multi-label Text Classification Neural Network

Hierarchical Multi-label Text Classification: Self-adaption Semantic Awareness Network Integrating Text Topic and Label Level Information

Self-Paced Unified Representation Learning for Hierarchical Multi-Label Classification

Cognitive structure learning model for hierarchical multi-label text classification

Hyperbolic Interaction Model For Hierarchical Multi-Label Classification

Hierarchical Multilabel Text Classification Via Multitask Learning

A Hierarchical Fine-Tuning Approach Based on Joint Embedding of Words and Parent Categories for Hierarchical Multi-label Text Classification