Resolution of Personal Pronoun Anaphora in Chinese Micro-blog
Yuanyuan Peng, Yangsen Zhang, Shujing Huang, Ruoyu Chen, Jianqing You
DOI: https://doi.org/10.1007/978-3-030-04015-4_51
2018-01-01
Abstract: Anaphora resolution plays an important role in information mining from Chinese micro-blogs. Based on the linguistic features of personal pronouns in Chinese micro-blog texts, this paper proposes a multi-strategy method for resolving personal pronoun anaphora. First, personal pronouns and their candidate antecedents are extracted from the texts using part-of-speech tagging and named entity recognition, and rules are established for judging whether a personal pronoun and a candidate antecedent are consistent in grammar, semantics, gender, and number (singular or plural). Candidates that are inconsistent with the personal pronoun in these four aspects are filtered out first, yielding Candidate Set 1. Then, an SVM classifies the antecedents in Candidate Set 1, and those that have an anaphoric relation with the current personal pronoun form Candidate Set 2. Finally, the best antecedent is selected from Candidate Set 2 by a priority selection policy that combines four linguistic characteristics: grammatical role, co-occurrence relation, reference distance, and appositive dependency. In addition, an antecedent-extension strategy is provided for cases in which the above method finds no antecedent for the pronoun. The method is validated on the NLPCC2013 micro-blog corpus, where the experimental results show an F value of 91.7% on Chinese micro-blog texts.
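The three-stage flow described in the abstract (rule-based filtering, classification, priority selection) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. Every name in it (`Mention`, `filter_consistent`, `classify`, `select_best`, `ROLE_PRIORITY`) is hypothetical. The sketch also makes several simplifying assumptions that the abstract does not state: only gender and number agreement are checked, an "unknown" attribute is treated as compatible with anything, only candidates preceding the pronoun are considered, the SVM is replaced by an arbitrary caller-supplied predicate, and the priority policy uses only grammatical role and reference distance.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Mention:
    """A pronoun or candidate antecedent (hypothetical structure)."""
    text: str
    position: int   # token offset within the post
    gender: str     # "m", "f", or "unknown"
    number: str     # "sg", "pl", or "unknown"
    role: str       # "subject", "object", or "other"


def agrees(a: str, b: str) -> bool:
    # Assumption: an unknown attribute never rules a candidate out.
    return a == "unknown" or b == "unknown" or a == b


def filter_consistent(pronoun: Mention, candidates: List[Mention]) -> List[Mention]:
    """Stage 1: drop candidates that disagree in gender or number."""
    return [
        c for c in candidates
        if c.position < pronoun.position       # assumption: antecedent precedes pronoun
        and agrees(c.gender, pronoun.gender)
        and agrees(c.number, pronoun.number)
    ]


def classify(pronoun: Mention, candidates: List[Mention],
             is_anaphoric: Callable[[Mention, Mention], bool]) -> List[Mention]:
    """Stage 2: keep candidates the classifier accepts (stand-in for the SVM)."""
    return [c for c in candidates if is_anaphoric(pronoun, c)]


# Assumed ordering: subjects are preferred over objects, then everything else.
ROLE_PRIORITY = {"subject": 0, "object": 1, "other": 2}


def select_best(pronoun: Mention, candidates: List[Mention]) -> Optional[Mention]:
    """Stage 3: pick by grammatical role first, then by shortest distance."""
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda c: (ROLE_PRIORITY.get(c.role, 2), pronoun.position - c.position),
    )


def resolve(pronoun: Mention, candidates: List[Mention],
            is_anaphoric: Callable[[Mention, Mention], bool]) -> Optional[Mention]:
    set1 = filter_consistent(pronoun, candidates)   # Candidate Set 1
    set2 = classify(pronoun, set1, is_anaphoric)    # Candidate Set 2
    return select_best(pronoun, set2)
```

A trained classifier could be passed in as `is_anaphoric`. When `resolve` returns `None`, a fallback such as the antecedent-extension strategy mentioned in the abstract would apply, which this sketch does not attempt to model.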