Imbalanced Chinese Multi-label Text Classification Based on Alternating Attention

Hongliang Bi, Han Hu, Pengyuan Liu
2020-01-01
Abstract: In this work, we construct an imbalanced Chinese multi-label text classification dataset, IMCM. The imbalance is mainly reflected in two ways: (1) the degree of discrimination among labels differs, and (2) the distribution of labels is moderately imbalanced. We then apply several multi-label classification methods and evaluate them thoroughly; the results show that even the most competitive models struggle on this dataset. To tackle these imbalance problems, we propose an alternating attention model, AltXML. Two attention heads that alternately read the sequence enable the model to capture different parts of the document rather than a single point. Experimental results show that our proposed model significantly outperforms the state-of-the-art baselines on our IMCM dataset and also achieves quite good results on several public datasets.