Abstract: Over the years, social media has emerged as one of the most popular platforms where people express their views and share their thoughts on a wide range of topics. Social media content now includes text, images, videos, and other components. One content type of particular interest is memes, which often combine text and images. Because social media is an unregulated platform, discriminatory, offensive, and hateful content is sometimes posted, and such content adversely affects the online well-being of users. It is therefore important to develop computational models that automatically detect such content so that appropriate corrective action can be taken. Research efforts on automatic detection have so far focused mainly on text. However, the fusion of multimodal data (as in memes) creates various challenges for computational models, more so in the case of low-resource languages. A major challenge is the lack of suitable datasets for handling memes in low-resource languages. This work attempts to bridge this research gap by providing a large curated dataset comprising 5,054 memes in Hindi-English code-mixed language, manually annotated by three independent annotators. The dataset supports two subtasks: (i) Subtask-1 (binary classification, tagging a meme as misogynous or non-misogynous), and (ii) Subtask-2 (multi-label classification of memes into different categories). The data quality is evaluated by computing Krippendorff's alpha. Different computational models are then applied to the data in three settings: text-only, image-only, and multimodal models using fusion techniques.
The results show that the proposed multimodal method using the fusion technique may be the preferred choice for the identification of misogyny in multimodal Internet content and that the dataset is suitable for advancing research and development in the area.
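
The agreement check mentioned in the abstract can be illustrated with a minimal sketch of Krippendorff's alpha for nominal labels. The abstract does not say which variant or tool the authors used, so the nominal-distance formulation, the function name `krippendorff_alpha_nominal`, and the toy labels below are all assumptions made for illustration and are not taken from the dataset.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal labels (illustrative sketch).

    units: one list per item (e.g. per meme), holding the label given by
    each annotator; None marks a missing annotation.
    """
    # Coincidence matrix: every ordered pair of labels from two different
    # annotators of the same item contributes 1 / (m - 1), where m is the
    # number of labels available for that item.
    coincidences = Counter()
    for ratings in units:
        values = [r for r in ratings if r is not None]
        m = len(values)
        if m < 2:
            continue  # an item with fewer than two labels carries no pairs
        for a, b in permutations(values, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    # Marginal totals per label and overall number of pairable labels.
    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    )
    if expected == 0:
        # Only one label value was ever used, so alpha is undefined.
        return float("nan")
    return 1.0 - (n - 1) * observed / expected


# Hypothetical toy labels: 4 memes, 3 annotators, 1 = misogynous.
toy_labels = [[1, 1, 0], [0, 0, 0], [1, 1, 1], [0, 0, 1]]
alpha = krippendorff_alpha_nominal(toy_labels)
```

A value of 1 indicates perfect agreement and a value near 0 indicates agreement no better than chance. For a multi-label task such as Subtask-2, one option would be to compute alpha separately per category, though the abstract does not state how this was handled.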

MIMIC: Misogyny Identification in Multimodal Internet Content in Hindi-English Code-Mixed Language