Adaptive micro- and macro-knowledge incorporation for hierarchical text classification
Zijian Feng, Kezhi Mao, Hanzhang Zhou
DOI: https://doi.org/10.1016/j.eswa.2024.123374
IF: 8.5
2024-02-16
Expert Systems with Applications
Abstract: Hierarchical text classification (HTC) aims to classify a text into multiple categories organized in a hierarchical structure. State-of-the-art HTC methods usually employ graph networks, in which label graphs are constructed and label representations are learned to interact with text representations for classification. In general, label graphs are built on the intrinsic label hierarchy, label semantic similarity, or label co-occurrence. Such graphs have proven effective, but they exploit only knowledge from training data or simple label descriptions and do not consider the vast external knowledge available in open sources. External knowledge from open sources could bring in complementary information that enhances the representation power of the label graph. Motivated by these considerations, we explore the use of external knowledge for improving HTC. We categorize knowledge into micro-knowledge and macro-knowledge, defined respectively as the fundamental concepts related to a single class label and the correlations among class labels. To incorporate the two types of knowledge into representation learning and classification in a tailored way, we propose the Adaptive Micro- and Macro-Knowledge Incorporation for Hierarchical Text Classification (AMKI-HTC) model. The micro-knowledge incorporation helps capture class-relevant keywords in the text and hence produces discriminative representations, while the macro-knowledge incorporation improves the accuracy of label graphs. Finally, a confidence maximization fusion strategy is developed for adaptive aggregation of multi-view features. Extensive experiments on three benchmark HTC datasets demonstrate that AMKI-HTC consistently outperforms state-of-the-art models.
computer science, artificial intelligence; engineering, electrical & electronic; operations research & management science