Chumor 2.0: Towards Benchmarking Chinese Humor Understanding

Ruiqi He, Yushu He, Longju Bai, Jiarui Liu, Zhenjie Sun, Zenghao Tang, He Wang, Hanchen Xia, Rada Mihalcea, Naihao Deng
2024-12-24
Abstract: Existing humor datasets and evaluations predominantly focus on English, leaving limited resources for culturally nuanced humor in non-English languages like Chinese. To address this gap, we construct Chumor, the first Chinese humor explanation dataset that exceeds the size of existing humor datasets. Chumor is sourced from Ruo Zhi Ba, a Chinese Reddit-like platform known for sharing intellectually challenging and culturally specific jokes. We test ten LLMs through direct and chain-of-thought prompting, revealing that Chumor poses significant challenges to existing LLMs, with their accuracy slightly above random and far below human performance. In addition, our analysis highlights that human-annotated humor explanations are significantly better than those generated by GPT-4o and ERNIE-4-turbo. We release Chumor at https://huggingface.co/datasets/dnaihao/Chumor; our project page is at https://dnaihao.github.io/Chumor-dataset/, our leaderboard is at https://huggingface.co/spaces/dnaihao/Chumor, and our codebase is at https://github.com/dnaihao/Chumor-dataset.
Computation and Language, Artificial Intelligence