Abstract: Over the years, social media has become one of the most popular platforms for people to express their views and share their thoughts. Social media content now spans text, images, videos, and more. Memes, which often combine text and images, are of particular interest. Because social media is an unregulated platform, discriminatory, offensive, and hateful content is sometimes posted, and such content adversely affects the online well-being of users. It is therefore important to develop computational models that detect such content automatically so that appropriate corrective action can be taken. Research on automatic detection has so far focused mainly on text. However, the fusion of multimodal data (as in memes) poses various challenges for computational models, more so for low-resource languages, where the lack of suitable datasets is a major problem. This work attempts to bridge this research gap by providing a large curated dataset of 5,054 memes in Hindi-English code-mixed language, manually annotated by three independent annotators. The dataset supports two subtasks: (i) Subtask-1, binary classification of a meme as misogynous or non-misogynous, and (ii) Subtask-2, multi-label classification of memes into different categories. Data quality is evaluated by computing Krippendorff's alpha. Different computational models are then applied to the data in three settings: text-only, image-only, and multimodal models using fusion techniques. The results show that the proposed multimodal method using the fusion technique may be the preferred choice for identifying misogyny in multimodal Internet content, and that the dataset is suitable for advancing research and development in the area.
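
As an illustration of the agreement measure named in the abstract, the following is a minimal sketch of Krippendorff's alpha for nominal labels, computed from a coincidence matrix. It assumes the nominal distance metric and one label per annotator per meme; the abstract does not state which metric or tooling was used, and the function name `krippendorff_alpha_nominal` and the `toy_labels` data are hypothetical, not taken from the dataset.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data.

    units: list of lists, one inner list of labels per item
           (one label per annotator; items with < 2 labels are skipped).
    """
    # Coincidence matrix: each ordered pair of labels from two different
    # annotators on the same item contributes 1 / (m - 1).
    coincidences = Counter()
    for ratings in units:
        m = len(ratings)
        if m < 2:
            continue
        for a, b in permutations(ratings, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    # Marginal totals per label and overall number of pairable labels.
    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    # Observed and expected disagreement (nominal metric: 0 if equal, else 1).
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(totals[a] * totals[b] for a in totals for b in totals if a != b)
    if expected == 0:
        return float("nan")  # alpha is undefined when only one label occurs
    return 1.0 - (n - 1) * observed / expected


# Hypothetical toy data: 4 memes, 3 annotators, 1 = misogynous, 0 = not.
toy_labels = [[1, 1, 0], [0, 0, 0], [1, 1, 1], [0, 0, 1]]
alpha = krippendorff_alpha_nominal(toy_labels)
print(round(alpha, 4))  # prints 0.3889
```

For a multi-label task such as Subtask-2, one option under the same assumptions would be to compute alpha separately for each category, treating each as a binary label.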

MIMIC: Misogyny Identification in Multimodal Internet Content in Hindi-English Code-Mixed Language
