What problem does this paper attempt to address?

The problem this paper addresses is Event Factuality Prediction (EFP) in natural language processing. Specifically, the EFP task involves labeling the factuality of an event, i.e., whether the event referred to by an event-denoting phrase (or its headword) occurred or not. The paper proposes two neural network models and their variants, which significantly improve performance on three existing event factuality datasets (FactBank, UW, and MEANTIME). It also extends the "It Happened" part of the Universal Decompositional Semantics (UDS) dataset, forming the largest event factuality dataset currently available. In addition, the paper explores the effectiveness of multi-task training and model integration.

### Models and Methods

1. **Neural Network Models**:
   - **Stacked Bidirectional Linear Chain LSTM (Linear Chain Bidirectional LSTM)**: Captures context information through the standard stacked bidirectional linear chain LSTM model.
   - **Stacked Bidirectional Child-Sum Dependency Tree LSTM (Tree-structure Bidirectional LSTM)**: Captures context information, especially the interaction between internal and external contexts, through a bidirectional LSTM model with a dependency tree structure.
2. **Regression Model**:
   - Uses the last-layer hidden state of the LSTM model as input and predicts the factuality value \(\hat{v}_t\) of the event through a two-layer regression model.
3. **Multi-task Training**:
   - **Single-task Specific**: Train a separate model instance for each dataset.
   - **Single-task General**: Train one model instance jointly on all datasets.
   - **Multi-task Simple**: Similar to single-task general, but with independent regression parameters for each dataset.
   - **Multi-task Balanced**: Builds on multi-task simple by upsampling the smaller datasets so that samples from each dataset are seen with the same frequency.
   - **Multi-task Focused**: Upsamples a specific target dataset so that its samples are seen with 50% frequency, with samples from the other datasets evenly distributed.

### Datasets

- **FactBank**: Based on the TimeBank corpus; contains annotations of event factuality.
- **UW**: An event factuality dataset with crowdsourced annotations, based on TempEval-3 data.
- **MEANTIME**: A smaller dataset that contains similar discrete factuality annotations.
- **UDS-IH2**: An extension of the UDS-IH1 dataset that covers the entire English Universal Dependencies v1.2 treebank, making it the largest event factuality dataset currently available.

### Experimental Results

- The paper evaluates model performance under five experimental settings, covering both single-task and multi-task training.
- The results show that the LSTM model with a linear-chain structure outperforms the LSTM model with a tree structure in most cases.
- Multi-task training and model integration further improve performance, especially on larger datasets.

### Main Contributions

1. Proposed two new neural network models that significantly improve the performance of event factuality prediction.
2. Extended the existing event factuality datasets, forming the largest dataset currently available.
3. Explored the effectiveness of multi-task training and model integration, providing a reference for future research.

Through these methods and experiments, the paper provides new solutions and technical paths for the event factuality prediction task in natural language processing.
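The difference between the Multi-task Simple, Balanced, and Focused settings described above comes down to how often each dataset is sampled during training. The following is a minimal sketch of that idea, not the paper's implementation. The function name `dataset_sampling_probs`, the strategy strings, and the dataset sizes are all hypothetical. It also assumes that "evenly distributed" in the focused setting means the remaining 50% is split equally among the non-target datasets.

```python
from typing import Dict, Optional


def dataset_sampling_probs(
    sizes: Dict[str, int],
    strategy: str = "simple",
    target: Optional[str] = None,
) -> Dict[str, float]:
    """Probability that the next training example is drawn from each dataset.

    Hypothetical helper for illustration only.
    sizes: number of training examples per dataset.
    strategy: "simple", "balanced", or "focused".
    target: dataset to focus on (required for "focused").
    """
    if not sizes:
        raise ValueError("sizes must contain at least one dataset")
    names = list(sizes)

    if strategy == "simple":
        # No upsampling: each dataset is seen in proportion to its size.
        total = sum(sizes.values())
        return {name: sizes[name] / total for name in names}

    if strategy == "balanced":
        # Upsample small datasets so every dataset is seen equally often.
        return {name: 1.0 / len(names) for name in names}

    if strategy == "focused":
        if target not in sizes:
            raise ValueError("focused strategy needs a target present in sizes")
        others = [name for name in names if name != target]
        if not others:
            return {target: 1.0}
        # The target gets half of the samples; the rest is split equally
        # among the other datasets (an assumption about "evenly distributed").
        probs = {target: 0.5}
        for name in others:
            probs[name] = 0.5 / len(others)
        return probs

    raise ValueError(f"unknown strategy: {strategy}")


# Illustrative sizes only; these are NOT the real dataset sizes.
example_sizes = {"FactBank": 1000, "UW": 1500, "MEANTIME": 200, "UDS-IH2": 3000}

for name in ("simple", "balanced", "focused"):
    print(name, dataset_sampling_probs(example_sizes, name, target="MEANTIME"))
```

In a training loop, these probabilities could be used to pick the source dataset for each batch, for example with `random.choices`. Each batch would then be routed to the regression parameters belonging to its dataset.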

Neural models of factuality

Lexicosyntactic Inference in Neural Models

Survey on Factuality in Large Language Models: Knowledge, Retrieval and Domain-Specificity

Exploring Factual Entailment with NLI: A News Media Study

Models See Hallucinations: Evaluating the Factuality in Video Captioning

Measuring text summarization factuality using atomic facts entailment metrics in the context of retrieval augmented generation

Annotating and Modeling Fine-grained Factuality in Summarization

End-to-end event factuality prediction using directional labeled graph recurrent network

Neural models for Factual Inconsistency Classification with Explanations

Factuality Enhanced Language Models for Open-Ended Text Generation

Evaluating the Tradeoff Between Abstractiveness and Factuality in Abstractive Summarization

Findings of Factify 2: Multimodal Fake News Detection

Language Models Hallucinate, but May Excel at Fact Verification

Logically at Factify 2022: Multimodal Fact Verification

He Thinks He Knows Better than the Doctors: BERT for Event Factuality Fails on Pragmatics

INO at Factify 2: Structure Coherence based Multi-Modal Fact Verification

Improving Model Factuality with Fine-grained Critique-based Evaluator

Are Factuality Checkers Reliable? Adversarial Meta-evaluation of Factuality in Summarization

Explaining Veracity Predictions with Evidence Summarization: A Multi-Task Model Approach

ChronoFact: Timeline-based Temporal Fact Verification

Enhancing Fact Retrieval in PLMs through Truthfulness