An Improved Hierarchical Neural Network Model with Local and Global Feature Matching for Script Event Prediction

Pengpeng Zhou, Bin Wu, Caiyong Wang, Longzhu He
DOI: https://doi.org/10.1016/j.eswa.2024.125325
2025-01-01
Abstract: Script event prediction aims to predict subsequent events based on given context event sequences, which requires a deep understanding of script contexts. Many hierarchical neural network models have been proposed for this purpose. However, existing models often neglect high-level context modeling and focus primarily on low-level event arguments and middle-level event relations. To address this issue, we propose an improved hierarchical neural network model that integrates local and global feature matching for script event prediction. First, we extract low-level and middle-level event features using an event encoding layer and an event relation layer, respectively. In particular, the event encoding layer employs stacked transformers with learnable positional encoding to comprehensively capture the connections between event arguments. Then, instead of the local score-level event matching used in previous work, we introduce a novel feature-level event matching layer to globally match the context event chain and candidate events. This layer matches the candidate event against each context event using multiple dimensionless similarity measures and aggregates the resulting local matching vectors into a global matching vector with an attention mechanism, thereby capturing holistic contextual information. Finally, the global matching vector is fed into an event prediction layer comprising a classification network (e.g., a multilayer perceptron) to compute the relatedness score. By stacking these four layers bottom-up, our model learns multi-level event interactions and deeply understands the script context, resulting in more accurate event predictions. Experimental results on the New York Times corpus demonstrate that our model outperforms state-of-the-art baselines while being computationally efficient, stable, and quick to converge during training.
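The feature-level matching step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the three similarity measures (cosine, Euclidean-based, Manhattan-based), the single attention weight vector `attn_w`, and the one-layer scorer standing in for the classification network are all assumptions chosen for illustration, and every function name here is hypothetical. Event embeddings are plain lists of floats, as if already produced by the event encoding and event relation layers.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def euclidean_sim(u, v):
    """Similarity in (0, 1] derived from Euclidean distance (assumed measure)."""
    return 1.0 / (1.0 + math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v))))


def manhattan_sim(u, v):
    """Similarity in (0, 1] derived from Manhattan distance (assumed measure)."""
    return 1.0 / (1.0 + sum(abs(a - b) for a, b in zip(u, v)))


def local_match(context_event, candidate):
    """Local matching vector: one entry per similarity measure."""
    return [
        cosine(context_event, candidate),
        euclidean_sim(context_event, candidate),
        manhattan_sim(context_event, candidate),
    ]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def global_match(context_events, candidate, attn_w):
    """Aggregate local matching vectors into one global matching vector.

    attn_w is a hypothetical learnable vector; here it is passed in by hand.
    """
    local_vecs = [local_match(e, candidate) for e in context_events]
    # Attention logit for each context event: dot product with attn_w.
    logits = [sum(w * x for w, x in zip(attn_w, vec)) for vec in local_vecs]
    weights = softmax(logits)
    dim = len(local_vecs[0])
    # Attention-weighted sum of the local matching vectors.
    global_vec = [
        sum(wt * vec[k] for wt, vec in zip(weights, local_vecs))
        for k in range(dim)
    ]
    return global_vec, weights


def relatedness(global_vec, out_w, out_b):
    """One linear layer plus sigmoid, standing in for the classification network."""
    z = sum(w * x for w, x in zip(out_w, global_vec)) + out_b
    return 1.0 / (1.0 + math.exp(-z))
```

In a trained model the attention and scoring parameters would be learned jointly with the lower layers; here they are fixed by hand so that the data flow from local matching vectors to a single relatedness score is easy to follow.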