SkatingVerse: A large‐scale benchmark for comprehensive evaluation on human action understanding
Ziliang Gan, Lei Jin, Yi Cheng, Yu Cheng, Yinglei Teng, Zun Li, Yawen Li, Wenhan Yang, Zheng Zhu, Junliang Xing, Jian Zhao
DOI: https://doi.org/10.1049/cvi2.12287
IF: 1.484
2024-05-31
IET Computer Vision
Abstract: SkatingVerse, a comprehensive benchmark for human action understanding (HAU) tasks, including action recognition, segmentation, proposal, and assessment, is introduced. By using figure skating as the task object, SkatingVerse overcomes biases in previous datasets and provides a large‐scale and diverse resource for studying HAU foundation models. The experimental results demonstrate the effectiveness of SkatingVerse in improving HAU methods and its potential for facilitating practical applications in the field. Human action understanding (HAU) is a broad topic that involves specific tasks, such as action localisation, recognition, and assessment. However, most popular HAU datasets are tied to a single task built on a particular set of actions. Combining different but related HAU tasks into a unified action understanding system is challenging because the actions differ across datasets. A large‐scale and comprehensive benchmark, namely SkatingVerse, is constructed for action recognition, segmentation, proposal, and assessment. SkatingVerse focuses on fine‐grained sports actions; figure skating is therefore chosen as the task object, which eliminates the object, scene, and space biases that exist in most previous datasets. In addition, skating actions are inherently complex and similar to one another, which poses an enormous challenge for current algorithms. A total of 1687 official figure skating competition videos were collected, with a total duration of 184.4 h, more than four times that of other datasets on a similar topic. SkatingVerse makes it possible to formulate a unified task that outputs fine‐grained human action classification and assessment results from a raw figure skating competition video. In addition, SkatingVerse can facilitate the study of HAU foundation models owing to its large scale and abundant categories. Moreover, an image modality is incorporated into SkatingVerse for the human pose estimation task.
Extensive experimental results show that (1) SkatingVerse significantly helps the training and evaluation of HAU methods, (2) the performance of existing HAU methods has much room to improve, and SkatingVerse helps to reduce such gaps, and (3) unifying relevant tasks in HAU through a uniform dataset can facilitate more practical applications. SkatingVerse will be publicly available to facilitate further studies on relevant problems.
computer science, artificial intelligence; engineering, electrical & electronic