Towards Better Representations for Multi-Label Text Classification with Multi-granularity Information

Fangfang Li, Puzhen Su, Junwen Duan, Weidong Xiao
DOI: https://doi.org/10.18653/v1/2023.findings-emnlp.635
2023-01-01
Abstract: Multi-label text classification (MLTC) aims to assign multiple labels to a given text. Previous works have focused on text representation learning and label correlation modeling using pre-trained language models (PLMs). However, studies have shown that PLMs generate word frequency-oriented text representations, causing texts with different labels to be closely distributed in a narrow region, which makes them difficult to classify. To address this, we present a novel framework, CL (Contrastive Learning)-MIL (Multi-granularity Information Learning), to refine the text representation for the MLTC task. We first use contrastive learning to generate a uniform initial text representation and incorporate label frequency implicitly. Then, we design a multi-task learning module to integrate multi-granularity information (diverse text-label correlations, label-label relations, and label frequency) into text representations, enhancing their discriminative ability. Experimental results demonstrate the complementarity of the modules in CL-MIL, improving the quality of text representations and yielding stable and competitive improvements for MLTC.