CorefDPR: A Joint Model for Coreference Resolution and Dropped Pronoun Recovery in Chinese Conversations
Jingxuan Yang, Si Li, Sheng Gao, Jun Guo
DOI: https://doi.org/10.1109/taslp.2022.3140545
2022-01-01
IEEE/ACM Transactions on Audio, Speech, and Language Processing
Abstract: In this work, we argue that coreference resolution and dropped pronoun recovery are two strongly related tasks in Chinese conversations: recovering a dropped pronoun first requires identifying its referent, while an omitted entity mention should be recovered before its coreferences are resolved. This motivates us to propose CorefDPR, a novel model that resolves the two tasks jointly so that each enhances the other. CorefDPR first uses a pre-trained language model to encode the tokens in the conversation snippet. Then, the coreference resolution layer detects all entity mentions among the candidate text spans and groups them into coreferent mention clusters based on the contextualized token states. Next, the pronoun recovery layer identifies the referent of each dropped pronoun among the coreferent mention clusters and predicts a probability distribution over pronoun categories for each token. Finally, a general conditional random field (GCRF) globally optimizes the pronoun recovery sequence of the snippet by modeling both intra-utterance and cross-utterance pronoun dependencies, and the recovered pronouns are linked back to their corresponding mention clusters to complete them. Experimental results on the benchmark demonstrate that the proposed model outperforms state-of-the-art baselines on both tasks, and exploratory experiments show that the two tasks benefit each other.
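The data flow described in the abstract (candidate spans, coreferent mention clusters, recovered pronouns linked back to clusters) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `enumerate_spans`, `build_clusters`, and `link_recovered_pronoun` are hypothetical, the antecedent links are supplied by hand instead of being scored by a neural model, and representing a recovered pronoun as a zero-width span is an assumption made only for this example.

```python
from typing import Dict, List, Optional, Tuple

# Inclusive (start, end) token indices. Assumption for this sketch only:
# a recovered dropped pronoun is stored as a zero-width span (p, p - 1),
# meaning "inserted just before token p".
Span = Tuple[int, int]


def enumerate_spans(num_tokens: int, max_width: int) -> List[Span]:
    """List every candidate span covering at most max_width tokens."""
    return [
        (start, end)
        for start in range(num_tokens)
        for end in range(start, min(start + max_width, num_tokens))
    ]


def build_clusters(antecedent: Dict[Span, Optional[Span]]) -> List[List[Span]]:
    """Group mentions into clusters by following antecedent links.

    antecedent maps each detected mention to an earlier mention it
    corefers with, or to None if it starts a new entity.
    """
    cluster_of: Dict[Span, int] = {}
    clusters: List[List[Span]] = []
    # Process mentions in text order so antecedents are seen first.
    for mention in sorted(antecedent):
        prev = antecedent[mention]
        if prev is not None and prev in cluster_of:
            idx = cluster_of[prev]
        else:
            idx = len(clusters)
            clusters.append([])
        clusters[idx].append(mention)
        cluster_of[mention] = idx
    return clusters


def link_recovered_pronoun(
    clusters: List[List[Span]], referent_idx: int, position: int
) -> List[List[Span]]:
    """Return a copy of clusters with the recovered pronoun added."""
    updated = [list(cluster) for cluster in clusters]
    updated[referent_idx].append((position, position - 1))
    return updated


# Hand-written antecedent links standing in for model predictions.
links = {(0, 0): None, (2, 2): None, (4, 4): (0, 0), (6, 7): (2, 2)}
clusters = build_clusters(links)
completed = link_recovered_pronoun(clusters, referent_idx=0, position=9)
```

In this toy run, the pronoun recovered before token 9 is attached to the first entity's cluster, which mirrors the abstract's final step of completing mention clusters with recovered pronouns. The GCRF-based sequence optimization is not modeled here.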