Abstract: We introduce HUMAN4D, a large multimodal 4D dataset that contains a variety of human activities captured simultaneously by a professional marker-based MoCap system, a volumetric capture system and an audio recording system. By capturing two female and two male professional actors performing various full-body movements and expressions, HUMAN4D provides a diverse set of motions and poses encountered in single- and multi-person daily, physical and social activities (jumping, dancing, etc.), along with multi-RGBD (mRGBD), volumetric and audio data. Although multi-view color datasets captured with hardware (HW) synchronization exist, to the best of our knowledge HUMAN4D is the first and only public resource that provides volumetric depth maps with high synchronization precision, owing to the use of intra- and inter-sensor HW-SYNC. Moreover, a spatio-temporally aligned, scanned and rigged 3D character complements HUMAN4D to enable joint research on time-varying and high-quality dynamic meshes. We provide evaluation baselines by benchmarking HUMAN4D with state-of-the-art human pose estimation and 3D compression methods. For the former, we apply 2D and 3D pose estimation algorithms on both single- and multi-view data cues. For the latter, we benchmark open-source 3D codecs on volumetric data with respect to online volumetric video encoding and steady bit-rates. Furthermore, a qualitative and quantitative visual comparison between mesh-based volumetric data reconstructed at different quality levels showcases the available options for 4D representations. HUMAN4D is introduced to the computer vision and graphics research communities to enable joint research on spatio-temporally aligned pose, volumetric, mRGBD and audio data cues. The dataset and its code are available at https://tofis.github.io/myurls/human4d.

HuMoMM: A Multi-Modal Dataset and Benchmark for Human Motion Analysis

MOtion Human Parsing - A New Benchmark for 3D Human Parsing.

HuMMan: Multi-Modal 4D Human Dataset for Versatile Sensing and Modeling

MotionBank: A Large-scale Video Motion Benchmark with Disentangled Rule-based Annotations

The MI-Motion Dataset and Benchmark for 3D Multi-Person Motion Prediction

Motion-X: A Large-scale 3D Expressive Whole-body Human Motion Dataset

HIMO: A New Benchmark for Full-Body Human Interacting with Multiple Objects

DiverseMotion: Towards Diverse Human Motion Generation Via Discrete Diffusion

PKU-MMD: A Large Scale Benchmark for Continuous Multi-Modal Human Action Understanding.

New multi-view human motion capture framework

EMHI: A Multimodal Egocentric Human Motion Dataset with HMD and Body-Worn IMUs

HUMAN4D: A Human-Centric Multimodal Dataset for Motions and Immersive Media

CIMI4D: A Large Multimodal Climbing Motion Dataset under Human-scene Interactions

CZU-MHAD: A multimodal dataset for human action recognition utilizing a depth camera and 10 wearable inertial sensors

MMVP: A Multimodal MoCap Dataset with Vision and Pressure Sensors

A Benchmark Dataset and Comparison Study for Multi-Modal Human Action Analytics

MoVi: A Large Multipurpose Motion and Video Dataset

Large Motion Model for Unified Multi-Modal Motion Generation

HUMBI: A Large Multiview Dataset of Human Body Expressions and Benchmark Challenge

HOI-M3: Capture Multiple Humans and Objects Interaction within Contextual Environment

Multi-Transmotion: Pre-trained Model for Human Motion Prediction