Seq2seq is All You Need for Coreference Resolution

Wenzheng Zhang, Sam Wiseman, Karl Stratos
DOI: https://doi.org/10.48550/arXiv.2310.13774
2023-10-21
Abstract: Existing works on coreference resolution suggest that task-specific models are necessary to achieve state-of-the-art performance. In this work, we present compelling evidence that such models are not necessary. We finetune a pretrained seq2seq transformer to map an input document to a tagged sequence encoding the coreference annotation. Despite the extreme simplicity, our model outperforms or closely matches the best coreference systems in the literature on an array of datasets. We also propose an especially simple seq2seq approach that generates only tagged spans rather than the spans interleaved with the original text. Our analysis shows that the model size, the amount of supervision, and the choice of sequence representations are key factors in performance.
Computation and Language
What problem does this paper attempt to address?
### The Problem the Paper Attempts to Solve

This paper attempts to demonstrate that for the task of coreference resolution, it is not necessary to design task-specific models to achieve state-of-the-art performance. Specifically, the authors propose a method based on a pre-trained sequence-to-sequence (seq2seq) transformer that maps input documents to annotated sequences to encode coreference annotations. Despite the simplicity of the method, their model surpasses or approaches the best coreference systems on multiple datasets.

### Main Contributions

1. **Simplified Model**: The authors show that by fine-tuning a pre-trained seq2seq transformer, it is possible to achieve performance comparable to or better than existing best coreference systems without any architectural modifications.
2. **Flexible Target Representation**: The authors propose various target sequence representations, including full linearization and partial linearization, and find significant performance differences among these methods.
3. **Integer-Free Representation**: To avoid using a large number of integer labels in the document, the authors introduce a new "integer-free" representation, generating correct coreference clusters through a post-processing step.
4. **Alignment Issue**: For the alignment issue in partial linearization, the authors propose an efficient alignment algorithm to ensure that the predicted mentions correctly correspond to their actual positions in the input document.

### Experimental Results

1. **English OntoNotes**: Using the T0 model with 3B parameters, the authors' seq2seq model achieved an average F1 score of 82.9 on the OntoNotes test set, surpassing existing non-seq2seq models and some seq2seq models.
2. **PreCo and LitBank**: On the PreCo dataset, the authors' model achieved an average F1 score of 88.5, outperforming all previous work. On the smaller LitBank dataset, the model also showed competitive performance, with significant improvements when jointly trained.
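To make the "full linearization" idea from the contributions above more concrete, here is a minimal sketch of how a document and its coreference clusters could be turned into a tagged target sequence and then decoded back. The tag vocabulary (`<m>`, `|`, `</m>`), the integer cluster labels, and the function names `linearize` and `parse` are assumptions made for illustration; they are not taken from the paper and may differ from the authors' actual format. The sketch also assumes mentions are either nested or disjoint (no crossing spans).

```python
from collections import defaultdict


def linearize(tokens, clusters):
    """Interleave tokens with mention tags (hypothetical format).

    clusters: list of clusters, each a list of (start, end) token spans
    with inclusive indices. A mention is rendered as
    `<m> tok ... tok | cluster_id </m>`.
    """
    opens = defaultdict(list)   # start index -> [(end, cluster_id)]
    closes = defaultdict(list)  # end index -> [(start, cluster_id)]
    for cid, spans in enumerate(clusters, start=1):
        for start, end in spans:
            opens[start].append((end, cid))
            closes[end].append((start, cid))

    out = []
    for i, tok in enumerate(tokens):
        # One opening tag per mention that starts at this token.
        out.extend("<m>" for _ in opens[i])
        out.append(tok)
        # Close inner mentions (larger start index) before outer ones.
        for _start, cid in sorted(closes[i], reverse=True):
            out.extend(["|", str(cid), "</m>"])
    return out


def parse(tagged):
    """Recover (tokens, clusters) from a tagged sequence using a stack."""
    tokens, stack = [], []
    clusters = defaultdict(list)
    i = 0
    while i < len(tagged):
        tok = tagged[i]
        is_close = (
            tok == "|"
            and stack
            and i + 2 < len(tagged)
            and tagged[i + 1].isdigit()
            and tagged[i + 2] == "</m>"
        )
        if tok == "<m>":
            stack.append(len(tokens))  # index of the mention's first token
            i += 1
        elif is_close:
            start = stack.pop()
            clusters[int(tagged[i + 1])].append((start, len(tokens) - 1))
            i += 3
        else:
            tokens.append(tok)  # ordinary document token
            i += 1
    return tokens, [sorted(clusters[c]) for c in sorted(clusters)]


if __name__ == "__main__":
    doc = "Alice said she likes her dog".split()
    gold = [[(0, 0), (2, 2), (4, 4)], [(4, 5)]]
    print(" ".join(linearize(doc, gold)))
    # <m> Alice | 1 </m> said <m> she | 1 </m> likes <m> <m> her | 1 </m> dog | 2 </m>
```

In a seq2seq setup, the output of `linearize` would play the role of the training target and `parse` would be applied to the model's generated sequence. A real system would also need to handle malformed generations (unbalanced tags, altered tokens), which this sketch does not attempt.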
### Conclusion

This paper experimentally demonstrates that for the task of coreference resolution, standard seq2seq models can achieve or even exceed the performance of existing task-specific models, questioning the necessity of developing task-specific solutions. The authors' method is not only simple but also performs excellently on multiple datasets, providing new directions for future research.