Abstract: Sarcasm is a linguistic phenomenon in which the literal meaning of an utterance differs from its implied intention. It is commonly used on blogs, e-commerce platforms, and social media. Because of this gap, sarcasm is difficult to detect, which hampers NLP tasks such as opinion mining and sentiment analysis. Traditional techniques concentrated mostly on textual incongruity. Recent research has shown that incorporating commonsense knowledge into sarcasm detection is effective. However, existing techniques neither capture sentence “incongruity” information effectively nor take full advantage of external knowledge, which limits detection performance. In this work, we propose new modules that make fuller use of the text, the commonsense knowledge, and their interplay. First, we propose an adaptive incongruity extraction module to compute the distance between each word in the text and commonsense knowledge. Two adaptive incongruity extraction modules are applied to the text and the commonsense knowledge, respectively, yielding two adaptive incongruity attention matrices. As a result, each word in the sequence receives a new representation with enhanced incongruity semantics. Second, we propose an incongruity cross-attention module to extract the incongruity between the text and the corresponding commonsense knowledge, which allows us to select the commonsense knowledge that is useful for sarcasm detection. In addition, we propose an improved gate module that fuses text and commonsense knowledge features and determines how much information from each should be considered. Experimental results on publicly available datasets show that our method achieves state-of-the-art performance on three datasets and offers improved interpretability.
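The gate-based fusion mentioned in the abstract can be illustrated with a minimal sketch. This is a generic element-wise sigmoid gate, not the paper's improved gate module, whose exact form the abstract does not give. The function name `gated_fusion`, the per-dimension weights `w_text` and `w_know`, and `bias` are all hypothetical and chosen only for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(text_vec, know_vec, w_text, w_know, bias):
    """Fuse a text feature vector with a knowledge feature vector.

    Assumed (hypothetical) form, per dimension i:
        g_i     = sigmoid(w_text_i * t_i + w_know_i * k_i + bias_i)
        fused_i = g_i * t_i + (1 - g_i) * k_i
    A gate near 1 keeps the text feature; a gate near 0 keeps the
    commonsense-knowledge feature.
    """
    fused, gates = [], []
    for t, k, wt, wk, b in zip(text_vec, know_vec, w_text, w_know, bias):
        g = sigmoid(wt * t + wk * k + b)
        gates.append(g)
        fused.append(g * t + (1.0 - g) * k)
    return fused, gates


if __name__ == "__main__":
    # With zero weights and bias, every gate is 0.5 and the result is the mean.
    fused, gates = gated_fusion([1.0, 2.0], [3.0, 4.0],
                                [0.0, 0.0], [0.0, 0.0], [0.0, 0.0])
    print(fused, gates)
```

In a trained model the gate parameters would be learned, so the gate values indicate, per dimension, how much the prediction relies on the text versus the external knowledge.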

Modeling Incongruity Between Modalities for Multimodal Sarcasm Detection

Dual-level adaptive incongruity-enhanced model for multimodal sarcasm detection

An attention-based, context-aware multimodal fusion method for sarcasm detection using inter-modality inconsistency

Sarcasm Detection of Dual Multimodal Contrastive Attention Networks

A Semantic Enhancement Framework for Multimodal Sarcasm Detection

Mutual-Enhanced Incongruity Learning Network for Multi-Modal Sarcasm Detection

Modelling Visual Semantics via Image Captioning to extract Enhanced Multi-Level Cross-Modal Semantic Incongruity Representation with Attention for Multimodal Sarcasm Detection

MIAN: Multi-head Incongruity Aware Attention Network with Transfer Learning for Sarcasm Detection

Multi-Modal Sarcasm Detection with Sentiment Word Embedding

Multi-Modal Sarcasm Detection Based on Contrastive Attention Mechanism

Towards Multi-Modal Sarcasm Detection via Hierarchical Congruity Modeling with Knowledge Enhancement

Sarcasm driven by sentiment: A sentiment-aware hierarchical fusion network for multimodal sarcasm detection

CofiPara: A Coarse-to-fine Paradigm for Multimodal Sarcasm Target Identification with Large Multimodal Models

Multi-View Incongruity Learning for Multimodal Sarcasm Detection

Knowledge-Enhanced Multi-perspective Incongruity Perception Network for Multimodal Sarcasm Detection

AMuSeD: An Attentive Deep Neural Network for Multimodal Sarcasm Detection Incorporating Bi-modal Data Augmentation

DIP: Dual Incongruity Perceiving Network for Sarcasm Detection

Sarcasm Detection Base on Adaptive Incongruity Extraction Network and Incongruity Cross-Attention

Multi-modal sarcasm detection based on emotion perception and cross-modality attention fusion

Multi-Modal Sarcasm Detection In Twitter With Hierarchical Fusion Model

Fusion and Discrimination: A Multimodal Graph Contrastive Learning Framework for Multimodal Sarcasm Detection