KERMIT: Knowledge-EmpoweRed model in harmful meme deTection

Biagio Grasso, Valerio La Gatta, Vincenzo Moscato, Giancarlo Sperlì
DOI: https://doi.org/10.1016/j.inffus.2024.102269
IF: 18.6
2024-01-28
Information Fusion
Abstract: Internet memes, while often humorous in nature, can be used to spread hate speech, toxic content, and disinformation across the digital information ecosystem. As a result, detecting harmful memes has become a crucial task for maintaining online safety and fostering responsible online behavior. Prior research in this field has mainly targeted multimodal internal aspects of memes, specifically the image and text modalities, and has sought to interpret their significance by analyzing intra- and inter-modality signals via sophisticated visual-language models. However, understanding the message of a (harmful) meme entails tacit background knowledge, which is not explicitly expressed in the meme itself, but rather relies on cultural references, shared knowledge, and social context. In this paper, we propose KERMIT (Knowledge-EmpoweRed Model In harmful meme deTection), a novel framework which incorporates external knowledge into the process of identifying harmful memes. Specifically, KERMIT builds the meme's knowledge-enriched information network by integrating internal entities of the meme with relevant external knowledge obtained from ConceptNet. Subsequently, the framework employs a dynamic learning mechanism that leverages memory-augmented neural networks and attention mechanisms to discern the most informative knowledge for accurate classification of harmful memes. Our experiments on four benchmark datasets demonstrate that KERMIT effectively utilizes external knowledge to improve classification performance compared to several state-of-the-art baselines. Overall, the findings of this study shed light on the complex nature of Internet memes and highlight the importance of knowledge-informed decision-making for harmful meme detection.
computer science, artificial intelligence, theory & methods
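The abstract describes using attention over a memory of external knowledge to pick out the most informative facts for a given meme. The snippet below is only a minimal sketch of that general idea (a single dot-product attention read over memory slots), not the paper's actual architecture. The names `attend`, `meme_query`, and `knowledge_memory`, as well as the toy three-dimensional embeddings, are hypothetical and chosen purely for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, memory):
    """One attention read: score each memory slot against the query,
    normalize the scores, and return (weights, weighted-sum readout)."""
    scores = [sum(q * k for q, k in zip(query, slot)) for slot in memory]
    weights = softmax(scores)
    dim = len(query)
    readout = [
        sum(w * slot[i] for w, slot in zip(weights, memory))
        for i in range(dim)
    ]
    return weights, readout


# Hypothetical toy embeddings (assumption: in a real system these would be
# learned representations of the meme and of retrieved knowledge facts).
meme_query = [1.0, 0.0, 1.0]
knowledge_memory = [
    [1.0, 0.0, 1.0],  # fact closely related to the meme
    [0.0, 1.0, 0.0],  # unrelated fact
    [0.5, 0.0, 0.5],  # partially related fact
]

weights, readout = attend(meme_query, knowledge_memory)
print("attention weights:", [round(w, 3) for w in weights])
print("knowledge readout:", [round(v, 3) for v in readout])
```

In this toy setup the slot most similar to the query receives the largest weight, so the readout is dominated by the most relevant fact; a classifier could then consume the readout alongside the meme's own features.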