Automatic Recognition of Verb-Complement Separable Words Based on BCC

Yuxin Peng, Jian Wu
DOI: https://doi.org/10.1007/978-3-031-28956-9_30
2023-01-01
Abstract: This research aims to summarize rules for automatically recognizing verb-complement separable words from the perspective of information processing. First, using the separable-word retrieval function of the BLCU Corpus Center (abbreviated BCC throughout), the study performs an exhaustive search for all verb-complement separable words listed in The Contemporary Chinese Dictionary and calculates the positive rate of their identification. Second, it summarizes and counts the components inserted inside verb-complement separable words across tens of thousands of corpus instances, observes the components that precede and follow them, and generalizes these into the corresponding separated forms. The study identifies the reasons why the corpus incorrectly recognizes strings as verb-complement separable words: the machine's word segmentation is wrong; the separated form of a verb-object separable word is recognized as the separated form of a verb-complement separable word; a single morpheme of a compound directional verb is extracted on its own; two independent verbs are treated as one separated form; the overlapping usage of verb-complement separable words cannot be recognized accurately; and the handling of polysemous and polysyllabic words is weak. Finally, the separated forms described from a linguistic perspective are transformed into corresponding rules that a computer can recognize, and a program is written for experimentation. Building on the corpus's own separable-word retrieval, this program makes further judgments and raises the recognition accuracy for these separable words to more than 90%.
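The rule-based step described in the abstract can be illustrated with a minimal sketch. It is not the authors' program: the abstract does not list the actual rules, so the insertion set below (the potential-form markers 得 and 不), the two-character word assumption, and the function names `separated_form_pattern`, `is_separated_use`, and `positive_rate` are all hypothetical choices made for illustration. The positive rate is assumed here to be the share of retrieved hits that are judged to be genuine separated uses.

```python
import re
from typing import Iterable

# Assumption: only the markers 得 and 不 are treated as valid insertions.
# The paper's real rule set is not given in the abstract.
INSERTIONS = ("得", "不")


def separated_form_pattern(word: str) -> re.Pattern:
    """Build a regex for verb + inserted marker + complement (hypothetical rule)."""
    if len(word) != 2:
        raise ValueError("this sketch expects a two-character verb-complement word")
    verb, complement = word[0], word[1]
    inserted = "|".join(re.escape(marker) for marker in INSERTIONS)
    return re.compile(f"{re.escape(verb)}(?:{inserted}){re.escape(complement)}")


def is_separated_use(word: str, sentence: str) -> bool:
    """Return True if the sentence contains the word in a separated form."""
    return separated_form_pattern(word).search(sentence) is not None


def positive_rate(judgments: Iterable[bool]) -> float:
    """Share of retrieved hits judged to be genuine separated uses."""
    results = list(judgments)
    if not results:
        return 0.0
    return sum(results) / len(results)


if __name__ == "__main__":
    hits = ["我看不见黑板", "他看得见远处的山", "我看了电影,又见了朋友"]
    judged = [is_separated_use("看见", sentence) for sentence in hits]
    print(judged, positive_rate(judged))
```

A filter of this kind would run after the corpus's own retrieval, rejecting hits such as two independent verbs that merely appear in the same sentence. The other error types named in the abstract, such as segmentation errors or confusion with verb-object separable words, would need additional rules that this sketch does not attempt.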