Multiple weak supervision for short text classification
Li-Ming Chen, Bao-Xin Xiu, Zhao-Yun Ding
DOI: https://doi.org/10.1007/s10489-021-02958-3
IF: 5.3
2022-01-01
Applied Intelligence
Abstract: Short text classification faces three major challenges: insufficient labeled data, data sparsity, and class imbalance. To address them, we propose multiple weak supervision, which labels unlabeled data automatically. Unlike prior work, the proposed method generates probabilistic labels through a conditionally independent model. Experiments on public, real-world, and synthetic datasets verify its effectiveness and show that multiple weak supervision can effectively solve the unlabeled, imbalanced short text classification problem. Notably, adding distant supervision clustering improves recall and F1-score without reducing precision, which can be used to meet different application needs.
computer science, artificial intelligence