Multimodal Cross-Lingual Features and Weight Fusion for Cross-Cultural Humor Detection

Heng Xie, Jizhou Cui, Yuhang Cao, Junjie Chen, Jianhua Tao, Cunhang Fan, Xuefei Liu, Zhengqi Wen, Heng Lu, Yuguang Yang, Zhao Lv, Yongwei Li
DOI: https://doi.org/10.1145/3606039.3613110
2023-01-01
Abstract: Sentiment analysis plays a crucial role in interpreting human interactions across different modalities. This paper addresses the Cross-Cultural Multilingual Humour Detection Sub-Challenge of the MuSe 2023 Multi-modal Sentiment Analysis Challenge, which aims to identify instances of humor in cross-cultural settings using multimodal data. We explore window-length processing tailored to the audio data in order to capture humor information more accurately and completely. Furthermore, we apply data augmentation to the text modality to bridge the gap between the training set and the test set: German and English samples in the database are translated into each other, and samples in both languages are also translated into Chinese. These techniques aim to improve the model's generalization across languages and to alleviate the scarcity of humor samples. For multimodal fusion, we apply late fusion to different Gated Recurrent Unit (GRU) models, with modality-specific weights obtained through gradient descent. Experimental results demonstrate the effectiveness of our approach, which achieves an Area Under the Curve (AUC) score of 0.872 on the MuSe-Humor test set and ranked second in the sub-challenge.
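To make the fusion step concrete, the following is a minimal sketch of late fusion with modality-specific weights learned by gradient descent. It is not the authors' implementation: the function name `learn_fusion_weights`, the softmax parameterization of the weights, the binary cross-entropy objective, the learning rate, and the synthetic two-modality data are all assumptions made for illustration. In the paper's setting, each column of `preds` would instead hold the humor probabilities produced by one modality-specific GRU model.

```python
import numpy as np


def learn_fusion_weights(preds, labels, lr=0.5, steps=500):
    """Learn one late-fusion weight per modality by gradient descent.

    preds  : (n_samples, n_modalities) per-modality humor probabilities
    labels : (n_samples,) binary ground truth
    All design choices here are illustrative assumptions.
    """
    n_samples, n_modalities = preds.shape
    theta = np.zeros(n_modalities)  # unconstrained parameters, uniform start
    for _ in range(steps):
        w = np.exp(theta - theta.max())
        w /= w.sum()  # softmax keeps weights positive and summing to 1
        fused = np.clip(preds @ w, 1e-6, 1 - 1e-6)
        # Gradient of the mean binary cross-entropy w.r.t. the fused score.
        d_fused = (fused - labels) / (fused * (1.0 - fused)) / n_samples
        grad_w = preds.T @ d_fused
        # Chain rule through the softmax parameterization.
        grad_theta = w * (grad_w - w @ grad_w)
        theta -= lr * grad_theta
    w = np.exp(theta - theta.max())
    return w / w.sum()


# Synthetic stand-in for two modality-specific model outputs (hypothetical).
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=200)
informative = np.where(labels == 1, 0.9, 0.1)     # tracks the labels
uninformative = rng.uniform(0.0, 1.0, size=200)   # unrelated to the labels
preds = np.stack([informative, uninformative], axis=1)

weights = learn_fusion_weights(preds, labels)
fused_scores = preds @ weights  # weighted late fusion of the two modalities
```

In this toy setup the learned weights favor the modality whose predictions track the labels. Parameterizing the weights through a softmax is only one way to keep them normalized; unconstrained weights fitted on a validation set would be an equally plausible reading of the abstract.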