Research on Automatic Recognition of Separable Words in Modern Chinese
Bo Liu, Xue-dong Tian, Xin Fu Li
DOI: https://doi.org/10.4028/www.scientific.net/AMM.670-671.1493
2014-10-01
Abstract: Separable words have important applications in many fields, such as Chinese information processing, Chinese-English translation, and teaching Chinese as a foreign language. About five thousand separable words are distributed across Chinese corpora, and they occur more frequently in novels, so the study of their identification is significant. This paper takes verb-object separable words with a higher separation frequency as its object of study. By examining how extended components manifest in different separable words, and by summarizing and classifying those components in detail on a large-scale corpus, it designs a new approach based on word segmentation and the structure type of the extended component. In experiments on identifying and marking verb-object separable words, the average recall is 89.54% and the average precision is 87.43% in the open test. The experimental results show that the method is effective.
Mathematics, Linguistics, Computer Science
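The abstract describes recognition built on word segmentation plus the structure type of the extended (inserted) component, evaluated by recall and precision. The sketch below is one possible illustration of that idea, not the authors' algorithm: the lexicon `SEPARABLE_PAIRS`, the set `INSERTABLE`, the limit `MAX_GAP`, and both function names are hypothetical, and the input is assumed to be already segmented into tokens.

```python
from typing import List, Set, Tuple

# Hypothetical toy lexicon of verb-object separable words: (verb, object).
SEPARABLE_PAIRS: Set[Tuple[str, str]] = {("洗", "澡"), ("见", "面"), ("帮", "忙")}

# Hypothetical tokens allowed as extended components between verb and object
# (aspect markers, measure words, numerals, pronouns, the particle "的").
INSERTABLE: Set[str] = {"了", "过", "着", "个", "一", "次", "他", "的", "大"}

# Assumed upper bound on the number of tokens inserted between the two parts.
MAX_GAP = 4


def find_separable_words(tokens: List[str]) -> List[Tuple[int, int, str]]:
    """Return (verb_index, object_index, word) for each match in a token list."""
    hits = []
    for i, tok in enumerate(tokens):
        for verb, obj in SEPARABLE_PAIRS:
            if tok != verb:
                continue
            # Scan at most MAX_GAP inserted tokens, then the object itself.
            for j in range(i + 1, min(i + MAX_GAP + 2, len(tokens))):
                if tokens[j] == obj:
                    hits.append((i, j, verb + obj))
                    break
                if tokens[j] not in INSERTABLE:
                    break  # a disallowed token ends the candidate span
    return hits


def precision_recall(predicted: Set, gold: Set) -> Tuple[float, float]:
    """Precision and recall of predicted items against gold annotations."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall
```

For example, `find_separable_words(["他", "洗", "了", "个", "澡"])` matches 洗澡 across the inserted tokens 了 and 个. A real system would need a far larger lexicon and a classification of extended components by structure type, which this toy set only approximates.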