Applying rough sets to feature extraction in POS tagging

Wei Jiang, Xiaolong Wang, Yi Guan, Zhiming Xu
DOI: https://doi.org/10.3321/j.issn:1002-0470.2006.10.002
2006-01-01
Abstract: To extract complicated contextual features for the part-of-speech tagging task, this paper presents a novel approach based on rough sets that effectively collects complex and long-distance features from the corpus and overcomes the noise and inconsistent samples present in it. The resulting rough rules are then added to the maximum entropy model. In the experiment, the approach achieved a precision of 96.29%, increasing tagging precision by 0.83% over the former model.
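The abstract does not describe the rule-extraction procedure, so the following is only a minimal illustrative sketch of how rough-set style rules might be mined from a tagged corpus with inconsistent samples. The function name `extract_rules`, the feature names `prev_tag` and `suffix`, the thresholds, and the toy data are all assumptions made for illustration, not the authors' method. Samples that share the same condition attributes form an equivalence class; a rule is kept when the class is large enough and its dominant tag is certain enough, which tolerates some noise. Setting `min_certainty=1.0` keeps only fully consistent classes, which corresponds to the classical rough-set lower approximation.

```python
from collections import Counter, defaultdict


def extract_rules(samples, attrs, min_support=2, min_certainty=0.9):
    """Mine decision rules from (features, tag) samples.

    samples: list of (dict of feature name -> value, tag) pairs.
    attrs:   tuple of feature names used as condition attributes.
    Returns a dict mapping attribute-value tuples to (tag, certainty).
    """
    # Group samples into equivalence classes by their condition attributes.
    classes = defaultdict(Counter)
    for features, tag in samples:
        key = tuple(features[a] for a in attrs)
        classes[key][tag] += 1

    rules = {}
    for key, tag_counts in classes.items():
        support = sum(tag_counts.values())
        tag, count = tag_counts.most_common(1)[0]
        certainty = count / support
        # Keep a rule only if the class is frequent and mostly consistent.
        if support >= min_support and certainty >= min_certainty:
            rules[key] = (tag, certainty)
    return rules


# Toy, hand-made data (hypothetical): one class contains an inconsistent sample.
samples = [
    ({"prev_tag": "DT", "suffix": "ing"}, "NN"),
    ({"prev_tag": "DT", "suffix": "ing"}, "NN"),
    ({"prev_tag": "DT", "suffix": "ing"}, "NN"),
    ({"prev_tag": "DT", "suffix": "ing"}, "VBG"),
    ({"prev_tag": "TO", "suffix": "e"}, "VB"),
    ({"prev_tag": "TO", "suffix": "e"}, "VB"),
]

attrs = ("prev_tag", "suffix")
rules = extract_rules(samples, attrs, min_support=2, min_certainty=0.7)
strict_rules = extract_rules(samples, attrs, min_support=2, min_certainty=1.0)
```

In a setup like the one the abstract describes, each retained rule could then be exposed to a maximum entropy tagger as a binary feature that fires when the rule's conditions match the current context; how the authors actually integrate the rules is not stated here.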