New words discovery in microblog content

Shuai Huo, Min Zhang, Yiqun Liu, Shaoping Ma
DOI: https://doi.org/10.3969/j.issn.1003-6059.2014.02.007
2014-01-01
Abstract: New word discovery is of great significance in natural language processing, and it is more difficult in microblog text than in other corpora. This paper proposes an algorithm based on context entropy, in which new word candidates are filtered according to their context. To improve precision, lexical features are introduced and an algorithm combining them with term frequency is put forward. As a result, precision and recall are greatly improved, and the F-measure reaches 89.6%.
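The abstract does not give the exact formula, thresholds, or lexical features used in the paper, so the following is only a minimal sketch of the general idea of context-entropy filtering, not the authors' implementation. It assumes the common formulation in which a candidate string is scored by the Shannon entropy of the characters seen immediately to its left and to its right; a candidate that appears with many different neighbours on both sides is more likely to be a free-standing word. The function names, the `max_len`, `min_freq`, and `min_entropy` parameters, and the toy corpus are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def context_stats(corpus, max_len=4):
    """Collect frequency and left/right neighbour counts for every
    substring of length 2..max_len in a list of strings."""
    freq = Counter()
    left = defaultdict(Counter)   # candidate -> counts of preceding characters
    right = defaultdict(Counter)  # candidate -> counts of following characters
    for text in corpus:
        n = len(text)
        for i in range(n):
            for k in range(2, max_len + 1):
                if i + k > n:
                    break
                cand = text[i:i + k]
                freq[cand] += 1
                if i > 0:
                    left[cand][text[i - 1]] += 1
                if i + k < n:
                    right[cand][text[i + k]] += 1
    return freq, left, right


def entropy(counter):
    """Shannon entropy (in bits) of a neighbour-count distribution."""
    total = sum(counter.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counter.values())


def filter_candidates(corpus, max_len=4, min_freq=2, min_entropy=1.0):
    """Keep candidates that are frequent enough and whose left AND right
    context entropies both reach min_entropy (assumed thresholds)."""
    freq, left, right = context_stats(corpus, max_len)
    kept = []
    for cand, f in freq.items():
        if f < min_freq:
            continue
        if min(entropy(left[cand]), entropy(right[cand])) >= min_entropy:
            kept.append(cand)
    return sorted(kept)


if __name__ == "__main__":
    # Toy corpus (hypothetical): "ab" occurs with four different neighbours
    # on each side, so it survives the filter.
    toy = ["xabq", "yabr", "zabs", "wabt"]
    print(filter_candidates(toy))
```

Taking the minimum of the two entropies is one design choice among several; it rejects strings that are fixed on either side and are therefore probably fragments of a longer word. The lexical features and the term-frequency combination mentioned in the abstract are not modelled here.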