CLaM: an Open-Source Library for Performance Evaluation of Text-driven Human Motion Generation

Xiaodong Chen, Kunlang He, Wu Liu, Xinchen Liu, Zheng-Jun Zha, Tao Mei
DOI: https://doi.org/10.1145/3664647.3685523
2024-01-01
Abstract: Text-driven human motion generation, which creates motion sequences from textual descriptions, has attracted great attention in the multimedia and artificial intelligence communities. By parsing and comprehending textual information and converting it into specific human movements, it realizes a direct transformation from human semantics to motion sequences. New text-driven human motion generators keep emerging, each aiming for better performance. However, the field still lacks well-trained evaluators that can effectively estimate the consistency between text prompts and the motions that existing generators produce from them. To address this issue, we propose an open-source library with a powerful Contrastive Language-and-Motion (CLaM) pre-training evaluator, which can be used to evaluate a variety of text-driven human motion generation algorithms. We perform a thorough performance evaluation of existing algorithms on various metrics, such as R-Precision. As a by-product, we build a large-scale HumanML3D-synthesis dataset consisting of 14,616 motion sequences and 547,102 textual descriptions, which is ten times larger than the widely used HumanML3D dataset. The source code and models for CLaM are available at https://github.com/SheldongChen/CLaM/.
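To illustrate the kind of metric the abstract mentions, the following is a minimal sketch of an R-Precision-style retrieval score. It is not taken from the CLaM library. The function name `r_precision`, the use of cosine similarity, and the choice of the whole batch as the candidate pool are assumptions made for this example. The sketch also assumes that text and motion embeddings already come from some paired encoders, that row i of each array is a matched text-motion pair, and that no embedding is the zero vector.

```python
import numpy as np


def r_precision(text_emb, motion_emb, top_k=3):
    """Top-1..top-k retrieval accuracy for matched text/motion embeddings.

    Illustrative only: row i of `text_emb` is assumed to describe row i of
    `motion_emb`, and every other row in the batch acts as a distractor.
    """
    text_emb = np.asarray(text_emb, dtype=float)
    motion_emb = np.asarray(motion_emb, dtype=float)

    # L2-normalise so that the dot product equals cosine similarity.
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    m = motion_emb / np.linalg.norm(motion_emb, axis=1, keepdims=True)

    # sim[i, j] compares motion i with text j.
    sim = m @ t.T

    # Rank of the matching text = number of texts scoring strictly higher.
    # Ties are therefore resolved in favour of the matching text.
    true_scores = np.diag(sim)[:, None]
    ranks = (sim > true_scores).sum(axis=1)

    # Fraction of motions whose matching text lands in the top 1, 2, ..., k.
    return [float((ranks < k).mean()) for k in range(1, top_k + 1)]
```

Calling `r_precision(text_emb, motion_emb)` returns a list of top-1, top-2, and top-3 accuracies. A higher value means the evaluator's embedding space places each generated motion closer to its own prompt than to the other prompts in the pool.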