Abstract: Articulated skeleton extraction or learning has been extensively studied for 2D (e.g., images and video) and 3D (e.g., volume sequences, motion capture, and mesh sequences) data. Nevertheless, robustly and accurately learning 3D articulated skeletons from point set sequences captured by a single consumer-grade depth camera remains challenging, since such data are often corrupted with substantial noise and outliers. Relatively few approaches have been proposed to tackle this problem. In this paper, we present a novel unsupervised framework to address this issue. Specifically, we first build one-to-one point correspondences among the point cloud frames in a sequence with our non-rigid point cloud registration algorithm. We then generate a skeleton with a reasonable number of joints and bones using our skeletal structure extraction algorithm. Lastly, we present an iterative Linear Blend Skinning based algorithm for accurate joint learning. As a result, our method can learn a high-quality articulated skeleton from a single 3D point sequence possibly corrupted with noise and outliers. Through qualitative and quantitative evaluations on both publicly available data and in-house Kinect-captured data, we show that our unsupervised approach soundly outperforms state-of-the-art techniques in terms of both quality (i.e., visual) and accuracy (i.e., Euclidean distance error metric). Moreover, the poses of our extracted skeletons are comparable to those produced by KinectSDK, a well-known supervised pose estimation technique; for example, our method and KinectSDK achieve similar distance errors of 0.0497 and 0.0521, respectively.
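
For readers unfamiliar with the two building blocks named in the abstract, the following is a minimal sketch of generic Linear Blend Skinning and of a Euclidean distance error between joint sets. It is not the paper's iterative joint learning algorithm. All function names, the per-bone (rotation, translation) representation, and the choice of a mean over joints for the error are assumptions made for illustration.

```python
import math


def rotation_z(angle):
    """3x3 rotation matrix about the z-axis (angle in radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def transform(rotation, translation, point):
    """Apply the rigid transform R * p + t to a 3D point."""
    return [
        sum(rotation[r][k] * point[k] for k in range(3)) + translation[r]
        for r in range(3)
    ]


def linear_blend_skinning(points, weights, bone_transforms):
    """Deform each point as a weighted blend of per-bone rigid transforms.

    points: list of [x, y, z] rest positions
    weights: weights[i][j] is the influence of bone j on point i
             (each row is assumed to sum to 1)
    bone_transforms: list of (rotation, translation) pairs, one per bone
    """
    deformed = []
    for point, point_weights in zip(points, weights):
        blended = [0.0, 0.0, 0.0]
        for w, (rotation, translation) in zip(point_weights, bone_transforms):
            moved = transform(rotation, translation, point)
            for axis in range(3):
                blended[axis] += w * moved[axis]
        deformed.append(blended)
    return deformed


def mean_joint_error(predicted, reference):
    """Mean Euclidean distance between corresponding joints (assumed metric)."""
    distances = [math.dist(p, q) for p, q in zip(predicted, reference)]
    return sum(distances) / len(distances)


# Hypothetical two-bone example: bone 0 stays fixed, bone 1 rotates 90 degrees.
identity = ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]], [0.0, 0.0, 0.0])
bend = (rotation_z(math.pi / 2), [0.0, 0.0, 0.0])
rest_points = [[1.0, 0.0, 0.0]] * 3
skin_weights = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
posed_points = linear_blend_skinning(rest_points, skin_weights, [identity, bend])
```

In this toy setup the point bound fully to bone 0 stays in place, the point bound fully to bone 1 follows the rotation, and the evenly weighted point lands halfway between the two results. That blending is what lets a skinning model tie observed point motion to joint positions.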

Performance Capture Of Interacting Characters With Handheld Kinect

Human Performance Capture Using Multiple Handheld Kinects

Free-viewpoint video of human actors using multiple handheld Kinects.

Full-body Motion Capture for Multiple Closely Interacting Persons.

3D Articulated Skeleton Extraction Using a Single Consumer-Grade Depth Camera.

Look Ma, no markers: holistic performance capture without the hassle

Leveraging depth cameras and wearable pressure sensors for full-body kinematics and dynamics capture

LiveCap: Real-time Human Performance Capture from Monocular Video

Human Motion Capture Using Wireless Inertial Sensors

Markerless motion capture of interacting characters using multi-view image segmentation

Markerless motion capture of multiple characters using multiview image segmentation

Markerless hand pose estimation system using Kinect

Leveraging Two Kinect Sensors for Accurate Full-Body Motion Capture

Human Motion Tracking by Multiple RGBD Cameras.

Animation Of 3D Characters From Single Depth Camera

Dynamic Human Body Reconstruction and Motion Tracking with Low-Cost Depth Cameras

Marker-Less 3D Human Motion Capture With Monocular Image Sequence And Height-Maps

Accurate realtime full-body motion capture using a single depth camera

Outdoor Markerless Motion Capture with Sparse Handheld Video Cameras

General Automatic Human Shape and Motion Capture Using Volumetric Contour Cues

Video Motion Capture by Silhouette Analysis and Pose Optimization