Hierarchical Text Classification Based on the End-to-End MCHA-BERT

Wei Huang, Guiquan Liu
DOI: https://doi.org/10.1109/icftic54370.2021.9647279
2021-01-01
Abstract: Hierarchical text classification (HTC) is an important task in which each text has multiple labels that form a hierarchical structure. Because of its practical value, HTC has attracted the attention of researchers in natural language processing. Existing methods tend to apply the same strategy to label prediction at every level, which ignores the differences between levels. Moreover, they do not use effective upper-level information and discard part of the hierarchical attributes. In this paper, we propose an end-to-end MCHA-BERT model, which introduces a multi-granularity convolution module (MCM) and a hierarchical attention mechanism (HAM) on top of a multi-task structure. First, label prediction at each level is treated as a separate task, and different strategies are adopted for different tasks: in MCM, 1D convolutions of multiple sizes encode features of various granularities for label prediction at different levels. Second, HAM is proposed to fully integrate the multi-granularity features of different levels for subsequent prediction. Experiments show that the proposed MCHA-BERT outperforms other major methods on HTC. Ablation experiments further demonstrate the effectiveness of MCM and HAM individually.
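The multi-granularity idea behind MCM can be illustrated with a small sketch. The abstract does not give the architecture details, so everything here is an assumption made for illustration: the helper names (`conv1d`, `multi_granularity_features`, `random_kernel`), the random toy embeddings standing in for BERT token outputs, the assignment of kernel widths 1, 3, and 5 to three levels, and the use of max-over-time pooling. It is not the authors' implementation, and HAM is not modeled.

```python
import random


def conv1d(seq, kernel):
    """Slide a kernel over time (cross-correlation, no padding).

    seq:    list of T token vectors, each of dimension D
    kernel: list of k weight vectors, each of dimension D
    Returns T - k + 1 scalar activations, one per window position.
    """
    k = len(kernel)
    out = []
    for t in range(len(seq) - k + 1):
        s = 0.0
        for i in range(k):
            for x, w in zip(seq[t + i], kernel[i]):
                s += x * w
        out.append(s)
    return out


def multi_granularity_features(seq, kernels_by_level):
    """For each level, apply that level's filters and max-pool over time."""
    return {
        level: [max(conv1d(seq, kernel)) for kernel in kernels]
        for level, kernels in kernels_by_level.items()
    }


def random_kernel(rng, width, dim):
    """Hypothetical helper: an untrained filter with random weights."""
    return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(width)]


rng = random.Random(0)
T, D = 6, 4  # toy sequence length and embedding size

# Stand-in for contextual token embeddings (random values, not BERT outputs).
token_embeddings = [[rng.uniform(-1, 1) for _ in range(D)] for _ in range(T)]

# Assumed mapping of kernel width to hierarchy level; the abstract only says
# that different sizes serve different levels, not which size goes where.
kernels_by_level = {
    "level_1": [random_kernel(rng, 1, D) for _ in range(3)],
    "level_2": [random_kernel(rng, 3, D) for _ in range(3)],
    "level_3": [random_kernel(rng, 5, D) for _ in range(3)],
}

features = multi_granularity_features(token_embeddings, kernels_by_level)
for level, vec in features.items():
    print(level, [round(v, 3) for v in vec])
```

Each level ends up with its own feature vector computed from windows of a different width, which is the sense in which one classifier head per level can see the text at a different granularity.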