AMTN: Attention-Enhanced Multimodal Temporal Network for Humor Detection

Yangyang Xu, Peng Zou, Rui Wang, Qi Li, Chengpeng Xu, Zhuoer Zhao, Xun Yang, Xiao Sun, Dan Guo, Meng Wang
DOI: https://doi.org/10.1145/3689062.3689375
2024-01-01
Abstract: In this paper, we introduce the Attention-Enhanced Multimodal Temporal Network (AMTN) to address the MuSe 2024 Cross-Cultural Humor Detection Sub-Challenge (MuSe-Humor), which targets humor detection in a cross-cultural context using multimodal information. Specifically, we employ a Temporal Convolutional Network (TCN) to capture the temporal dynamics within each modality's features. We then apply an attention mechanism to refine the integration of information across modalities and temporal sequences, and use the integrated features for humor detection. Furthermore, we investigate the effectiveness of an end-to-end approach for this challenge. Finally, we obtain a more robust outcome by aggregating multiple experimental results, which constitutes our final submission. Our solution achieves an AUC score of 0.8833 on the test dataset, outperforming the baseline by 0.0151 and securing 2nd place in the MuSe-Humor sub-challenge.
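The abstract does not say how the multiple experimental results are aggregated or how AUC is computed in the challenge tooling. As an illustration only, the sketch below assumes the simplest option: averaging per-sample humor probabilities across runs, then scoring with the rank-based definition of AUC. The function names (`average_scores`, `roc_auc`) and the toy labels and scores are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def average_scores(runs: Sequence[Sequence[float]]) -> List[float]:
    """Average per-sample scores across runs (assumed aggregation rule)."""
    n = len(runs[0])
    if any(len(r) != n for r in runs):
        raise ValueError("all runs must score the same samples")
    return [sum(r[i] for r in runs) / len(runs) for i in range(n)]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random
    negative, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = humorous segment, 0 = not humorous.
labels = [1, 0, 1, 0, 1, 0]
run_a = [0.9, 0.6, 0.4, 0.2, 0.7, 0.5]
run_b = [0.6, 0.3, 0.8, 0.7, 0.5, 0.1]
run_c = [0.7, 0.2, 0.6, 0.4, 0.3, 0.5]

ensemble = average_scores([run_a, run_b, run_c])

for name, scores in [("run_a", run_a), ("run_b", run_b),
                     ("run_c", run_c), ("ensemble", ensemble)]:
    print(f"{name}: AUC = {roc_auc(labels, scores):.4f}")
```

On this toy data each individual run ranks 7 of the 9 positive-negative pairs correctly, while the averaged scores rank all 9 correctly. This illustrates why aggregation can be more robust when runs make different errors; it is not a reproduction of the reported 0.8833.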