Aspect Term Extraction via Contrastive Learning over Self-augmented Data

Yu Hong, Qingting Xu, Jianmin Yao, Jiaxiang Chen, Yuchen Pan
DOI: https://doi.org/10.1109/IJCNN55064.2022.9891883
2022-07-18
Abstract: Aspect Term Extraction (ATE) is a natural language processing task that identifies the words or phrases describing product attributes. Such expressions are referred to as aspect terms in this field. Current neural ATE models suffer from the sparsity of available training data. As a result, they overfit and fail to be robust in real applications. More seriously, neural networks cannot sufficiently learn the underlying distinguishable features, which causes high misjudgement rates. Deliberate data expansion by human annotation undoubtedly helps to alleviate the problem, but it is time-consuming. To overcome this bottleneck, we use the Regularized Dropout (R-Drop) approach to implement self data augmentation, creating variant distributed representations for learning during the real-time computation of the neural network. More importantly, we propose to conduct contrastive learning over the self-augmented data, which leverages the variant distributed representations to explore the distinguishable features. We experiment on four widely used benchmark datasets (R14-16 and L14) from the shared tasks of Semantic Evaluation (SemEval). Experimental results show that contrastive learning over self-augmented data yields significant performance gains, with an improvement of up to 1.9% in F1-score. In addition, the experiments demonstrate that our method outperforms the state of the art on two test sets (R14 and R16) while achieving competitive performance on the remaining test sets.
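The abstract does not spell out the loss functions, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that the two "self-augmented" views come from passing the same token representations through dropout twice (the core of R-Drop), that the consistency term is a symmetric KL divergence between the two output distributions, and that the contrastive term is an InfoNCE-style loss in which the two views of the same token are positives and other tokens are negatives. All function names, the temperature, and the toy vectors are hypothetical.

```python
import math
import random


def dropout(vec, p, rng):
    """Inverted dropout: zero each unit with probability p, rescale survivors."""
    keep = 1.0 - p
    return [x / keep if rng.random() < keep else 0.0 for x in vec]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def symmetric_kl(p, q):
    """R-Drop-style consistency term between two output distributions."""
    return 0.5 * (kl(p, q) + kl(q, p))


def cosine(a, b, eps=1e-12):
    """Cosine similarity; returns 0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb + eps)


def info_nce(view_a, view_b, temperature=0.1):
    """InfoNCE (assumed objective): view_a[i] should match view_b[i];
    every other row of view_b acts as a negative."""
    losses = []
    for i, anchor in enumerate(view_a):
        sims = [cosine(anchor, other) / temperature for other in view_b]
        m = max(sims)
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims))
        losses.append(log_denom - sims[i])
    return sum(losses) / len(losses)


# Toy hidden vectors standing in for encoder outputs of three tokens.
rng = random.Random(0)
hidden = [[0.9, 0.1, 0.4, 0.7], [0.2, 0.8, 0.5, 0.1], [0.3, 0.3, 0.9, 0.6]]

# Two stochastic forward passes over the same input give two views.
view_a = [dropout(h, 0.1, rng) for h in hidden]
view_b = [dropout(h, 0.1, rng) for h in hidden]

# Treat the first three units as tag logits (hypothetical B/I/O tags).
consistency = sum(
    symmetric_kl(softmax(a[:3]), softmax(b[:3])) for a, b in zip(view_a, view_b)
) / len(hidden)
contrastive = info_nce(view_a, view_b)
print("consistency:", consistency, "contrastive:", contrastive)
```

In a real model these two terms would be added, with weights chosen by the practitioner, to the usual sequence-labelling loss; how the paper combines them is not stated in the abstract.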
Computer Science