Instances and Labels: Hierarchy-aware Joint Supervised Contrastive Learning for Hierarchical Multi-Label Text Classification

Simon Yu, Jie He, Víctor Gutiérrez-Basulto, Jeff Z. Pan
2024-06-19
Abstract: Hierarchical multi-label text classification (HMTC) aims at utilizing a label hierarchy in multi-label classification. Recent approaches to HMTC deal with the problem of imposing an over-constrained premise on the output space by using contrastive learning on generated samples in a semi-supervised manner to bring text and label embeddings closer. However, the generation of samples tends to introduce noise as it ignores the correlation between similar samples in the same batch. One solution to this issue is supervised contrastive learning, but it remains an underexplored topic in HMTC due to its complex structured labels. To overcome this challenge, we propose **HJCL**, a **H**ierarchy-aware **J**oint Supervised **C**ontrastive **L**earning method that bridges the gap between supervised contrastive learning and HMTC. Specifically, we employ both instance-wise and label-wise contrastive learning techniques and carefully construct batches to fulfill the contrastive learning objective. Extensive experiments on four multi-path HMTC datasets demonstrate that HJCL achieves promising results and confirm the effectiveness of contrastive learning for HMTC.
Computation and Language, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
The paper asks how to use the label hierarchy and the correlations between labels to improve classification performance in Hierarchical Multi-Label Text Classification (HMTC). Existing HMTC methods face the following challenges when dealing with hierarchical labels:

1. **Complexity of label hierarchy**: Existing methods often overlook the correlation between labels across different paths and different levels of abstraction.
2. **Difficulty in applying contrastive learning**: Traditional contrastive learning methods are hard to apply directly to HMTC, because constructing meaningful positive and negative sample pairs is difficult when labels are hierarchically structured.
3. **Limitations of data augmentation methods**: Previous methods construct positive sample pairs through data augmentation, but they mainly focus on pushing apart samples of different categories and do not fully use the information shared within the same category.

To address these issues, the paper proposes HJCL (Hierarchy-aware Joint Supervised Contrastive Learning), a method based on supervised contrastive learning that improves HMTC in the following ways:

1. **Instance-level contrastive learning**: Contrastive learning is performed at the instance level, so that samples with similar label structures are brought closer in the embedding space.
2. **Hierarchy-aware label-enhanced contrastive learning**: A new contrastive loss function (HiLeCon) adjusts the intensity of contrastive learning based on the similarity and hierarchical structure of the labels.

Experiments on four multi-path HMTC datasets show that HJCL outperforms existing baseline methods, and it is reported to do particularly well on datasets with complex hierarchical structures.
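
The idea of pulling together instances with similar label structures can be illustrated with a small, self-contained sketch. This is not the paper's HiLeCon loss or its instance-wise loss. It is a generic supervised contrastive loss in which each pair of samples is weighted by the Jaccard overlap of their label sets, which is an assumption made here for illustration. The function names (`jaccard`, `normalize`, `weighted_supcon_loss`), the `temperature` value, and the toy labels are all hypothetical.

```python
import math


def jaccard(a, b):
    """Overlap between two label sets, in [0, 1]; 0.0 if both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def weighted_supcon_loss(embeddings, label_sets, temperature=0.1):
    """Supervised contrastive loss with label-overlap weights (illustrative).

    For each anchor i, every other sample j in the batch acts as a soft
    positive with weight jaccard(labels_i, labels_j). Anchors whose weights
    are all zero contribute nothing to the loss.
    """
    z = [normalize(v) for v in embeddings]
    n = len(z)
    if n < 2:
        return 0.0
    losses = []
    for i in range(n):
        # Temperature-scaled cosine similarity to every other sample.
        logits = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n) if j != i
        }
        # Numerically stable log of the softmax denominator.
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits.values()))
        weights = {j: jaccard(label_sets[i], label_sets[j]) for j in logits}
        total = sum(weights.values())
        if total == 0:
            continue
        # Weighted average of negative log-probabilities of the soft positives.
        loss_i = -sum(w * (logits[j] - log_denom) for j, w in weights.items()) / total
        losses.append(loss_i)
    return sum(losses) / len(losses) if losses else 0.0


# Toy batch: label sets include ancestor labels, so samples that share a
# path in the hierarchy overlap more than samples on unrelated paths.
labels = [
    {"news", "news/sports"},
    {"news", "news/sports"},
    {"science", "science/physics"},
]
aligned = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]      # same-label samples are close
misaligned = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]   # same-label samples are far apart
print(weighted_supcon_loss(aligned, labels) < weighted_supcon_loss(misaligned, labels))
```

Because the label sets here are closed under ancestors, the Jaccard weight gives partial credit to samples that share only higher-level labels. That is one simple way to make the weighting sensitive to the hierarchy, and the paper's own weighting scheme may differ.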