"A good pun is its own reword": Can Large Language Models Understand Puns?

Zhijun Xu, Siyu Yuan, Lingjie Chen, Deqing Yang
2024-06-16
Abstract: Puns play a vital role in academic research due to their distinct structure and clear definition, which aid in the comprehensive analysis of linguistic humor. However, the understanding of puns in large language models (LLMs) has not been thoroughly examined, limiting their use in creative writing and humor creation. In this paper, we leverage three popular tasks, i.e., pun recognition, explanation, and generation, to systematically evaluate the capabilities of LLMs in pun understanding. In addition to adopting the automated evaluation metrics from prior research, we introduce new evaluation methods and metrics that are better suited to the in-context learning paradigm of LLMs. These new metrics offer a more rigorous assessment of an LLM's ability to understand puns and align more closely with human cognition than previous metrics. Our findings reveal the "lazy pun generation" pattern and identify the primary challenges LLMs encounter in understanding puns.
Computation and Language
What problem does this paper attempt to address?
This paper systematically evaluates how well large language models (LLMs) understand puns. Although LLMs have been widely studied across natural language understanding and generation tasks, their ability to understand puns has not yet been systematically explored, which limits their application in creative writing and humor creation. To fill this gap, the paper evaluates LLMs on three tasks (pun recognition, pun explanation, and pun generation) and introduces new evaluation methods and metrics adapted to the in-context learning paradigm of LLMs. These are intended to assess pun understanding more rigorously and to align the assessment more closely with human cognition.
The main contributions of the paper include:
- Systematically evaluating the ability of LLMs to understand puns for the first time.
- Proposing several novel evaluation methods and metrics, such as a double-bias prompt query, a punchline check, and an overlap metric for evaluating the originality of pun generation.
- Through extensive experiments, providing a detailed analysis that reveals the main difficulties LLMs face in understanding puns and offers useful insights for future research.
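This summary does not define the overlap metric, so the following is only a rough illustration of how word overlap could flag unoriginal pun generation. It is a minimal sketch: the function names `tokenize` and `overlap_ratio`, the set-based definition, and the example sentences are all hypothetical and are not taken from the paper.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of distinct word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))


def overlap_ratio(generated: str, reference: str) -> float:
    """Fraction of the generated pun's distinct words that also occur in the reference.

    Hypothetical definition (an assumption, not the paper's metric):
    a value near 1.0 suggests the output mostly reuses the reference wording,
    a value near 0.0 suggests the wording is new.
    """
    gen_tokens = tokenize(generated)
    if not gen_tokens:
        return 0.0
    return len(gen_tokens & tokenize(reference)) / len(gen_tokens)


# Illustrative sentences made up for this sketch.
reference = "I used to be a banker but I lost interest"
copied = "I used to be a banker but I lost interest"
fresh = "The river loan officer always checks the bank"

print(overlap_ratio(copied, reference))  # every word is reused
print(overlap_ratio(fresh, reference))   # no word is shared
```

Under this assumed definition, a high ratio would indicate that a generated pun largely copies existing wording rather than producing something original. A real metric would need a precise choice of reference text and tokenization.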