Abstract: Internet memes, while often humorous in nature, can be used to spread hate speech, toxic content, and disinformation across the digital information ecosystem. As a result, detecting harmful memes has become a crucial task for maintaining online safety and fostering responsible online behavior. Prior research in this field has mainly targeted the internal multimodal aspects of memes, specifically the image and text modalities, and has sought to interpret their significance by analyzing intra- and inter-modality signals via sophisticated visual-language models. However, understanding the message of a (harmful) meme entails tacit background knowledge, which is not explicitly expressed in the meme itself, but rather relies on cultural references, shared knowledge, and social context. In this paper, we propose KERMIT (Knowledge-EmpoweRed Model In harmful meme deTection), a novel framework that incorporates external knowledge into the process of identifying harmful memes. Specifically, KERMIT builds the meme's knowledge-enriched information network by integrating internal entities of the meme with relevant external knowledge obtained from ConceptNet. Subsequently, the framework employs a dynamic learning mechanism that leverages memory-augmented neural networks and attention mechanisms to discern the most informative knowledge for accurate classification of harmful memes. Our experiments on four benchmark datasets demonstrate that KERMIT effectively utilizes external knowledge to improve classification performance compared to several state-of-the-art baselines. Overall, the findings of this study shed light on the complex nature of Internet memes and highlight the importance of knowledge-informed decision-making for harmful meme detection.
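As a rough illustration of the idea of attending over external knowledge, the sketch below scores a set of knowledge entries against a meme representation and reads out a weighted summary. It is a generic dot-product attention read over a small memory, not the KERMIT architecture: the function names, the toy vectors, and the entity labels are all hypothetical, and the abstract does not specify how the actual model computes its scores.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_over_knowledge(query, memory):
    """Dot-product attention read over a knowledge memory.

    query  : list of floats, a (hypothetical) meme representation
    memory : dict mapping a knowledge-entity label to its vector
    Returns (weights per entity, attention-weighted summary vector).
    """
    names = list(memory)
    # Relevance of each knowledge entry to the meme: plain dot product (assumption).
    scores = [sum(q * k for q, k in zip(query, memory[n])) for n in names]
    weights = softmax(scores)
    # Weighted sum of memory vectors, one coordinate at a time.
    summary = [
        sum(w * memory[n][i] for w, n in zip(weights, names))
        for i in range(len(query))
    ]
    return dict(zip(names, weights)), summary


# Toy example with made-up two-dimensional vectors.
query = [1.0, 0.0]
memory = {"frog": [2.0, 0.0], "hat": [0.0, 2.0], "sad": [1.0, 1.0]}
weights, summary = attend_over_knowledge(query, memory)
```

In this toy setup the entry most aligned with the query receives the largest weight, which is the sense in which attention can "discern the most informative knowledge"; a trained model would learn the vectors and the scoring function rather than use fixed ones.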

KERMIT: Knowledge-EmpoweRed model in harmful meme deTection

Just KIDDIN: Knowledge Infusion and Distillation for Detection of INdecent Memes

Beneath the Surface: Unveiling Harmful Memes with Multimodal Reasoning Distilled from Large Language Models

MOMENTA: A Multimodal Framework for Detecting Harmful Memes and Their Targets

Detecting and Understanding Harmful Memes: A Survey

Feels Bad Man: Dissecting Automated Hateful Meme Detection Through the Lens of Facebook's Challenge

MemeGuard: An LLM and VLM-based Framework for Advancing Content Moderation via Meme Intervention

Towards Explainable Harmful Meme Detection through Multimodal Debate between Large Language Models

A Multimodal Framework for the Detection of Hateful Memes

Contextualizing Internet Memes Across Social Media Platforms

OSPC: Detecting Harmful Memes with Large Language Model as a Catalyst

PolyMeme: Fine-Grained Internet Meme Sensing

Detecting hate speech in memes: a review

Meme-ingful Analysis: Enhanced Understanding of Cyberbullying in Memes Through Multimodal Explanations

Towards Comprehensive Detection of Chinese Harmful Memes

On the Evolution of (Hateful) Memes by Means of Multimodal Contrastive Learning

GOAT-Bench: Safety Insights to Large Multimodal Models through Meme-Based Social Abuse

MATK: The Meme Analytical Tool Kit

MemeGraphs: Linking Memes to Knowledge Graphs

Capturing Pertinent Symbolic Features for Enhanced Content-Based Misinformation Detection

MemeFier: Dual-stage Modality Fusion for Image Meme Classification