Chinese keyword extraction model with distributed computing

Tiantian Ding, Wenzhong Yang, Fuyuan Wei, Chao Ding, Peng Kang, Wenxiu Bu
DOI: https://doi.org/10.1016/j.compeleceng.2021.107639
2022-01-01
Abstract: News text is relatively complex in both content and structure, which makes it difficult to capture its core content effectively. Existing supervised models cannot achieve good results in domains with little annotated data. To this end, we propose a new Chinese keyword extraction model with a distributed computing method. Specifically, we first fuse Bidirectional Encoder Representations from Transformers (BERT) with Conditional Random Fields (CRF) so that each word learns its relationship with its context while reducing errors; second, adversarial training encourages the model to retain knowledge from a small number of annotated samples, which helps it extract keywords from unannotated samples; and third, because the model contains a large number of time-consuming components, it uses distributed computing to reduce overall computing time. The results show that our model steadily improves keyphrase extraction performance in domains with insufficient labeled samples.
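As an illustration of the CRF step mentioned in the abstract, the following is a minimal sketch of Viterbi decoding over per-character tag scores, followed by keyword span extraction. It is not the authors' implementation. The B/I/O tagging scheme, the function names `viterbi` and `extract_keywords`, and all scores are assumptions made for this example; in a real model the emission scores would come from BERT and the transition scores would be learned.

```python
from typing import Dict, List, Tuple

# Assumed tagging scheme: B = keyword begins, I = inside keyword, O = outside.
TAGS = ["B", "I", "O"]


def viterbi(emissions: List[Dict[str, float]],
            transitions: Dict[Tuple[str, str], float]) -> List[str]:
    """Return the highest-scoring tag sequence under a linear-chain CRF.

    emissions[t][tag] is the per-position score (a stand-in for encoder output).
    transitions[(prev, cur)] is the score of moving from prev to cur;
    missing pairs score 0.0. ("START", tag) scores the first tag.
    """
    best = [{tag: emissions[0][tag] + transitions.get(("START", tag), 0.0)
             for tag in TAGS}]
    back: List[Dict[str, str]] = []
    for scores in emissions[1:]:
        row, ptr = {}, {}
        for cur in TAGS:
            # Pick the predecessor that maximizes path score into `cur`.
            prev = max(TAGS,
                       key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0))
            row[cur] = best[-1][prev] + transitions.get((prev, cur), 0.0) + scores[cur]
            ptr[cur] = prev
        best.append(row)
        back.append(ptr)
    # Backtrack from the best final tag.
    tag = max(TAGS, key=lambda t: best[-1][t])
    path = [tag]
    for ptr in reversed(back):
        tag = ptr[tag]
        path.append(tag)
    return path[::-1]


def extract_keywords(tokens: List[str], tags: List[str]) -> List[str]:
    """Join characters of each B(I)* span into one keyword string."""
    phrases, current = [], []
    for tok, tag in zip(tokens, tags):
        if tag == "B":
            if current:
                phrases.append("".join(current))
            current = [tok]
        elif tag == "I" and current:
            current.append(tok)
        else:
            if current:
                phrases.append("".join(current))
            current = []
    if current:
        phrases.append("".join(current))
    return phrases


# Hand-picked illustrative scores (hypothetical, not taken from the paper).
TOKENS = ["新", "闻", "关", "键", "词"]
EMISSIONS = [
    {"B": 0.1, "I": 0.2, "O": 1.0},
    {"B": 0.2, "I": 0.9, "O": 0.8},  # a per-position argmax would pick "I" here
    {"B": 1.0, "I": 0.3, "O": 0.2},
    {"B": 0.1, "I": 1.0, "O": 0.3},
    {"B": 0.1, "I": 1.0, "O": 0.2},
]
# Penalize tag sequences where "I" does not follow "B" or "I".
TRANSITIONS = {("O", "I"): -10.0, ("START", "I"): -10.0}

tags = viterbi(EMISSIONS, TRANSITIONS)
print(tags)                            # ['O', 'O', 'B', 'I', 'I']
print(extract_keywords(TOKENS, tags))  # ['关键词']
```

In this toy input, choosing the best tag independently at the second character would give an "I" that follows an "O", which is not a valid span. Decoding the sequence jointly with transition scores avoids that, which is one way to read the abstract's remark that the BERT and CRF combination reduces errors.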
engineering, electrical & electronic; computer science, interdisciplinary applications, hardware & architecture