Multimodal Chinese Event Extraction on Text and Audio.

Xinlang Zhang, Zhongqing Wang, Peifeng Li
DOI: https://doi.org/10.1109/IJCNN54540.2023.10191258
2023-01-01
Abstract: Previous work on event extraction has mainly focused on the text modality. As multimodal research has deepened in recent years, a few studies have addressed multimodal event extraction, most of them aiming at the bimodal fusion of text and images, where images provide evidence that improves text-based methods. However, few of these multimodal methods use the audio modality. Audio carries useful information of its own that can help event extraction. This paper therefore proposes a novel multimodal event extraction model over the text and audio modalities. Experimental results on the Chinese ACE 2005 dataset show that the proposed model effectively improves event extraction performance by using audio features.