New Cyber Word Discovery Using Chinese Word Segmentation

Hao Wang, Bing Wang, MengYu Zou, JianYong Duan
DOI: https://doi.org/10.1109/ITNEC.2019.8729065
2019-01-01
Abstract: New cyber words keep emerging, and if they are not identified effectively they seriously reduce the accuracy of word segmentation and create great difficulties for related work. To address the problem that Web text is non-specific and short, this paper proposes a new word discovery method that combines statistics and rules. The method optimizes the original word segmentation result through two passes of new-word extraction followed by rule filtering. Experimental results show that the method effectively finds new cyber words and improves the accuracy of word segmentation.
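The general idea of combining a statistical score with rule filtering can be illustrated with a minimal sketch. The abstract does not say which statistic or which rules the authors use, so everything specific here is an assumption made for illustration: pointwise mutual information (PMI) over adjacent tokens as the statistic, a small hypothetical stop-token list as the rule filter, and the function names `discover_new_words` and `merge_new_words`. The sketch also runs a single extraction pass rather than the two passes the abstract mentions.

```python
import math
from collections import Counter

# Hypothetical stop-token list for the rule filter (an assumption, not from the paper).
STOP_TOKENS = {"的", "了", "是", "我", "你", "很", "在"}


def discover_new_words(segmented, min_count=2, min_pmi=1.0):
    """Score adjacent token pairs by PMI and keep those that pass the rules.

    `segmented` is a list of sentences, each a list of tokens produced by
    some existing word segmenter. Returns {candidate_word: pmi_score}.
    """
    unigrams = Counter(tok for sent in segmented for tok in sent)
    bigrams = Counter(
        (a, b) for sent in segmented for a, b in zip(sent, sent[1:])
    )
    total_uni = sum(unigrams.values())
    total_bi = sum(bigrams.values())

    found = {}
    for (a, b), n in bigrams.items():
        # Statistical filter 1: the pair must recur often enough.
        if n < min_count:
            continue
        # Rule filter: drop pairs touching a stop token, punctuation, or digits.
        if a in STOP_TOKENS or b in STOP_TOKENS or not (a + b).isalpha():
            continue
        # Statistical filter 2: PMI of the pair against its parts.
        p_pair = n / total_bi
        p_a = unigrams[a] / total_uni
        p_b = unigrams[b] / total_uni
        pmi = math.log2(p_pair / (p_a * p_b))
        if pmi >= min_pmi:
            found[a + b] = pmi
    return found


def merge_new_words(sentence, new_words):
    """Re-join adjacent tokens whose concatenation is a discovered word."""
    merged, i = [], 0
    while i < len(sentence):
        if i + 1 < len(sentence) and sentence[i] + sentence[i + 1] in new_words:
            merged.append(sentence[i] + sentence[i + 1])
            i += 2
        else:
            merged.append(sentence[i])
            i += 1
    return merged


# Toy corpus in which a segmenter has split the cyber word "给力" in two.
corpus = [
    ["这个", "电影", "真", "给", "力"],
    ["今天", "天气", "很", "给", "力"],
    ["他", "的", "表现", "给", "力"],
]
new_words = discover_new_words(corpus)
resegmented = [merge_new_words(sent, new_words) for sent in corpus]
```

On this toy corpus only the pair "给" + "力" recurs often enough to be kept, so the merge step restores "给力" as a single token in each sentence. A real system would need a far larger corpus, tuned thresholds, and candidates longer than two tokens.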