Overview of the NLPCC 2023 Shared Task 10: Learn to Watch TV: Multimodal Dialogue Understanding and Response Generation.

Yueqian Wang, Yuxuan Wang, Dongyan Zhao
DOI: https://doi.org/10.1007/978-3-031-44699-3_37
2023-01-01
Abstract: In this paper, we present an overview of NLPCC 2023 Shared Task 10, Multimodal Dialogue Understanding and Response Generation, which includes four sub-tasks: dialogue scene identification, dialogue session identification, dialogue response retrieval, and dialogue response generation. A bilingual multi-modal dialogue dataset consisting of 100M utterances was made public for the shared task. This dataset contains 119K dialogue scene boundaries and 62K dialogue session boundaries annotated manually. This paper presents the details of the shared task, the dataset, the evaluation metrics, and the evaluation results.