Abstract: Event coreference resolution (ECR) is an important task in Natural Language Processing (NLP), and nearly all existing approaches to this task rely on event argument information. However, these methods tend to suffer from error propagation from the event argument extraction stage. Besides, not every event mention contains all arguments of an event, so argument information may mislead a model that expects events to have arguments when it detects event coreference in real text. Furthermore, the context of an event is useful for inferring coreference between events. Thus, to reduce the errors propagated from event argument extraction and to use context information effectively, we propose a multi-loss neural network model that needs no argument information for the within-document event coreference resolution task and achieves significantly better performance than the state-of-the-art methods.
What problem does this paper attempt to address?
This paper addresses two weaknesses of existing approaches to the Event Coreference Resolution (ECR) task in Natural Language Processing (NLP). First, existing methods rely on event argument information, so errors made in the event argument extraction stage propagate into coreference decisions; moreover, not every event mention contains all of an event's arguments, which can confuse the model's coreference judgments. Second, existing methods do not fully use context information to infer coreference relations between events.
To solve these problems, the authors propose a Multi-Loss Neural Network (MLNN) model that performs within-document event coreference resolution without using any event argument information. Specifically, the MLNN model aims to avoid the errors propagated from event argument extraction and to use context information more effectively. With this approach, the authors aim to improve event coreference resolution performance significantly, surpassing the existing state-of-the-art methods.
### Formula Summary
1. **Cross-Entropy Loss Function** (for the Classifier Network, CN):
\[
L_1(\theta_1)=-\frac{1}{n}\sum_{i = 0}^{n}[y_i\log\hat{y}_i+(1 - y_i)\log(1 - \hat{y}_i)]
\]
where
\[
\hat{y}_i = P(y_i = 1|x_i),\quad1 - \hat{y}_i = P(y_i = 0|x_i)
\]
2. **Similarity Difference Loss Function** (for the Scorer Network, SN):
\[
L_2(\theta_2)=\sum_{i = 0}^{n}\log|m_i - s_i|
\]
where
\[
m_i=\begin{cases}
1, & \text{if }y_i = 0\\
- 1, & \text{if }y_i = 1
\end{cases}
\]
\(s_i\) is the cosine similarity score of the two events in the input event pair.
3. **Jointly Trained Multi-Loss Function**:
\[
L_{\text{all}}(\theta_{\text{all}})=L_1(\theta_1)+L_2(\theta_2)
\]
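The three formulas above can be sketched in plain Python to make the arithmetic concrete. This is a minimal illustration that computes the losses exactly as transcribed in this summary, not the authors' implementation: the function names, the `EPS` constant (added only to avoid `log(0)`), and the choice to average over the number of pairs passed in are all assumptions.

```python
import math

# Assumption: small constant to keep log() finite; it is not part of the formulas.
EPS = 1e-12


def cross_entropy_loss(y_true, y_prob):
    """L_1: mean binary cross-entropy over event pairs (classifier network)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, EPS), 1.0 - EPS)  # clamp so both logs are defined
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)


def cosine_similarity(u, v):
    """s_i: cosine similarity between two event representations."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_difference_loss(y_true, scores):
    """L_2: sum of log|m_i - s_i|, with m_i = -1 if y_i = 1 and m_i = 1 if y_i = 0."""
    total = 0.0
    for y, s in zip(y_true, scores):
        m = -1.0 if y == 1 else 1.0
        total += math.log(abs(m - s) + EPS)
    return total


def multi_loss(y_true, y_prob, scores):
    """L_all: unweighted sum of the two losses."""
    return cross_entropy_loss(y_true, y_prob) + similarity_difference_loss(y_true, scores)


if __name__ == "__main__":
    labels = [1, 0]          # hypothetical gold labels for two event pairs
    probs = [0.9, 0.2]       # hypothetical classifier outputs P(y_i = 1 | x_i)
    sims = [cosine_similarity([1.0, 0.0], [1.0, 1.0]), 0.1]  # hypothetical scores
    print(multi_loss(labels, probs, sims))
```

In a real training setup both terms would be computed on tensors in an automatic-differentiation framework so that the gradient of the summed loss reaches both networks; the scalar version here only shows how the terms combine.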
### Method Overview
- **Event Mention Extraction**: A multi-layer Feed-forward Neural Network (FNN) extracts event mentions, using the candidate word, the words within its context window, POS tags, and stems as features.
- **Event Coreference Detection**: A multi-loss neural network (MLNN) is built from a classifier network (CN) and a scorer network (SN). The classifier network predicts whether an event pair is coreferential, and the scorer network computes a similarity score for the pair.
- **Event Clustering**: Based on the classification and scoring results, event mentions are clustered into event chains using a dynamic connectivity algorithm.
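The clustering step can be illustrated with union-find, a standard data structure for dynamic connectivity. The sketch below is an assumption about how such a step might look, not the paper's procedure: the function name, the input layout `(i, j, prob, score)`, both thresholds, and the rule that a pair is linked only when the classifier probability and the similarity score each exceed their threshold are all hypothetical.

```python
def cluster_mentions(num_mentions, pair_predictions,
                     prob_threshold=0.5, score_threshold=0.0):
    """Group event mentions into chains with union-find.

    pair_predictions: iterable of (i, j, prob, score), where i and j are
    mention indices, prob is the classifier output, and score is the
    scorer output. Thresholds and the linking rule are assumptions.
    """
    parent = list(range(num_mentions))

    def find(x):
        # Walk up to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Attach the larger root index to the smaller one.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    for i, j, prob, score in pair_predictions:
        if prob > prob_threshold and score > score_threshold:
            union(i, j)

    chains = {}
    for mention in range(num_mentions):
        chains.setdefault(find(mention), []).append(mention)
    return sorted(chains.values())


if __name__ == "__main__":
    # Hypothetical predictions for five mentions.
    predictions = [
        (0, 1, 0.9, 0.8),   # linked
        (1, 2, 0.7, 0.6),   # linked, so 0, 1, 2 form one chain
        (3, 4, 0.2, 0.9),   # rejected by the classifier threshold
        (0, 3, 0.9, -0.5),  # rejected by the score threshold
    ]
    print(cluster_mentions(5, predictions))
```

Because links are transitive under union-find, mentions 0 and 2 end up in the same chain even though that pair was never scored directly.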
### Experimental Results
The experiments were carried out on the ECB+ corpus. The results show that the MLNN model significantly outperforms the existing state-of-the-art methods on multiple evaluation metrics (B³, MUC, CEAF-e, and CoNLL-F1), with the improvement most pronounced on CoNLL-F1.
### Conclusion and Future Work
The proposed method significantly improves event coreference resolution performance by not relying on event argument information and by using context information effectively. As future work, the authors plan to design a joint model that handles event extraction, argument extraction, and event coreference resolution together, to further reduce error propagation.