Hierarchy-Aware and Label Balanced Model for Hierarchical Text Classification

Jun Zhang, Yubin Li, Fanfan Shen, Chenxi Xia, Hai Tan, Yanxiang He
DOI: https://doi.org/10.1016/j.knosys.2024.112153
IF: 8.139
2024-06-27
Knowledge-Based Systems
Abstract: Hierarchical text classification, in which labels are organized as a hierarchy, is a special sub-task of multi-label text classification. Current methods mainly improve performance by modeling label dependencies. HGCLR, a strong existing model, obtains hierarchy-aware text representations through contrastive learning, but its degree of hierarchy awareness is insufficient. To address this problem, a new multi-label negative supervision method is proposed that pushes the text representations of two samples farther apart the more their labels differ. In addition, to tackle label imbalance in hierarchical text classification, asymmetric loss is used as the classification loss, so that the model focuses on learning from difficult samples and the contributions of positive and negative labels to the loss tend to be balanced. In summary, by adding multi-label negative supervision to HGCLR and replacing its classification loss with asymmetric loss, we propose a hierarchy-aware and label balanced model (HALB) for hierarchical text classification. Experimental results demonstrate that HALB outperforms several classical models for hierarchical text classification. HALB achieves the best results on both Micro-F1 and Macro-F1 and, compared with the strongest baseline model HGCLR, obtains average improvements of 0.51% and 1.28%, respectively, across four datasets.
computer science, artificial intelligence
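
The asymmetric loss mentioned in the abstract can be illustrated with a minimal sketch. It follows the general form of asymmetric loss commonly used for multi-label classification: separate focusing exponents for positive and negative labels, plus a probability margin that zeroes out easy negatives. The abstract does not give HALB's exact formulation or hyperparameters, so the function name `asymmetric_loss` and the defaults `gamma_pos`, `gamma_neg`, and `margin` below are illustrative assumptions, not the authors' settings.

```python
import math

def asymmetric_loss(logits, targets, gamma_pos=0.0, gamma_neg=4.0,
                    margin=0.05, eps=1e-8):
    """Asymmetric loss for one sample, summed over its labels.

    logits  : raw per-label scores (floats)
    targets : 0/1 per label
    Defaults are illustrative assumptions, not values from the paper.
    """
    total = 0.0
    for z, y in zip(logits, targets):
        p = 1.0 / (1.0 + math.exp(-z))  # sigmoid probability
        if y == 1:
            # Positive label: focal-style weighting with exponent gamma_pos.
            total += -((1.0 - p) ** gamma_pos) * math.log(max(p, eps))
        else:
            # Negative label: shift the probability down by the margin so
            # that easy negatives (p <= margin) contribute nothing, then
            # down-weight the rest with the larger exponent gamma_neg.
            p_m = max(p - margin, 0.0)
            total += -(p_m ** gamma_neg) * math.log(max(1.0 - p_m, eps))
    return total

# One positive label and two negative labels (one easy, one hard).
print(asymmetric_loss([2.0, -5.0, 0.5], [1, 0, 0]))
```

With `gamma_pos = gamma_neg = 0` and `margin = 0` this reduces to ordinary binary cross-entropy. Raising `gamma_neg` shrinks the contribution of the many negative labels, which is the balancing effect the abstract describes.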