Towards Comprehensive Detection of Chinese Harmful Memes

Junyu Lu, Bo Xu, Xiaokun Zhang, Hongbo Wang, Haohao Zhu, Dongyu Zhang, Liang Yang, Hongfei Lin
2024-10-03
Abstract: This paper has been accepted in the NeurIPS 2024 D & B Track. Harmful memes have proliferated on the Chinese Internet, while research on detecting Chinese harmful memes significantly lags behind due to the absence of reliable datasets and effective detectors. To this end, we focus on the comprehensive detection of Chinese harmful memes. We construct ToxiCN MM, the first Chinese harmful meme dataset, which consists of 12,000 samples with fine-grained annotations for various meme types. Additionally, we propose a baseline detector, Multimodal Knowledge Enhancement (MKE), incorporating contextual information of meme content generated by the LLM to enhance the understanding of Chinese memes. During the evaluation phase, we conduct extensive quantitative experiments and qualitative analyses on multiple baselines, including LLMs and our MKE. The experimental results indicate that detecting Chinese harmful memes is challenging for existing models while demonstrating the effectiveness of MKE. The resources for this paper are available at https://github.com/DUT-lujunyu/ToxiCN_MM.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses the detection of harmful Chinese memes, covering the various types that circulate on the Chinese internet. Some of these memes target specific social entities; others have no specific target but still propagate negative values. Research in this area has lagged because reliable Chinese datasets and effective detectors are lacking. To address these issues, the paper makes the following contributions:

1. **Constructing the ToxiCN MM Dataset**: This is the first Chinese harmful meme dataset, containing 12,000 samples with fine-grained annotations for various meme types. It covers both harmful memes aimed at specific targets and memes that have no clear target but are still potentially toxic.
2. **Proposing the Multimodal Knowledge Enhancement (MKE) Detector**: MKE is a baseline detector that improves the understanding of meme content by integrating contextual information generated by a large language model.
3. **Experimental Evaluation**: The paper conducts extensive quantitative experiments and qualitative analyses on a range of baseline models, including traditional pre-trained language models and large language models, alongside MKE.

Together, these results show that existing models struggle to detect harmful Chinese memes and that MKE is effective on this task.