GUDN: A novel guide network with label reinforcement strategy for extreme multi-label text classification

Qing Wang, Jia Zhu, Hongji Shu, Kwame Omono Asamoah, Jianyang Shi, Cong Zhou
DOI: https://doi.org/10.1016/j.jksuci.2023.03.009
IF: 9.006
2023-03-26
Journal of King Saud University - Computer and Information Sciences
Highlights:
• A novel guide network (GUDN) with a label reinforcement strategy is proposed for extreme multi-label text classification (XMTC).
• We conduct careful experiments on several datasets to explore the impact of label semantic information on XMTC.
• Experimental results show that GUDN and the label reinforcement strategy are both helpful.
Abstract: Extreme multi-label text classification (XMTC) is an emerging and essential task in natural language processing. Its objective is to retrieve the most relevant labels for a text from a large set of labels while balancing time and accuracy. Although large-scale pre-trained models have brought new perspectives to this task, more attention should be given to valuable fine-tuning methods and to the significant semantic gap between texts and labels. In this paper, we propose a novel guide network (GUDN) with a label reinforcement strategy based on label semantics to help fine-tune pre-trained models for classification. Experimental results demonstrate that GUDN outperforms state-of-the-art methods on Eurlex-4k and achieves competitive results on other popular datasets. In addition, a separate experiment shows that meaningless tokens can harm the classification accuracy of Transformer-based models. We conclude that GUDN is effective in the presence of solid semantics. Our source code is available at https://t.hk.uy/aFSH.
computer science, information systems