Human Motion Tracking with Less Constraint of Initial Posture from a Single RGB-D Sensor

Chen Liu, Anna Wang, Chunguang Bu, Wenhui Wang, Haijing Sun
DOI: https://doi.org/10.3390/s21093029
IF: 3.9
2021-04-26
Sensors
Abstract: High-quality and complete human motion 4D reconstruction is of great significance for immersive VR and even human operation. However, it has inevitable self-scanning constraints, and tracking under monocular settings also has strict restrictions. In this paper, we propose a human motion capture system combined with human priors and performance capture that only uses a single RGB-D sensor. To break the self-scanning constraint, we generated a complete mesh only using the front view input to initialize the geometric capture. In order to construct a correct warping field, most previous methods initialize their systems in a strict way. To maintain high fidelity while increasing the easiness of the system, we updated the model while capturing motion. Additionally, we blended in human priors in order to improve the reliability of model warping. Extensive experiments demonstrated that our method can be used more comfortably while maintaining credible geometric warping and remaining free of self-scanning constraints.
Engineering, Electrical & Electronic; Chemistry, Analytical; Instruments & Instrumentation
What problem does this paper attempt to address?
This paper aims to relax the strict initial-pose requirement and remove the self-scanning constraint in human motion capture with a single RGB-D sensor. The authors propose a method that combines human prior knowledge with performance capture to achieve high-fidelity surface reconstruction and motion tracking while making the system easier to use and more robust.

### Key problems solved in the paper:

1. **Reducing the requirements for the initial pose**: Traditional template-based or data-accumulation-based methods often require a strict initialization pose, which limits their flexibility in practice. By introducing human prior knowledge, the paper relaxes this requirement so that users can start motion capture in a more natural pose.
2. **Overcoming the self-scanning constraint**: Because a single sensor sees the body from only one viewing angle, some parts remain unscanned unless the subject first performs a self-scan. The paper removes this step by generating a complete mesh from the front-view input alone and using it to initialize geometric capture.
3. **Improving the accuracy of motion tracking**: To track motion accurately while maintaining high fidelity, the paper proposes a new optimization pipeline that combines human prior knowledge with volumetric fusion. The pipeline tracks human motion and also preserves geometric detail in newly fused regions.

### Method overview:

- **Initialization stage**:
  - Use NormalGAN to generate a complete, detailed human mesh.
  - Align the generated mesh with the current depth frame through non-linear optimization to initialize the TSDF volume.
  - Use the SMPL model and FrankMocap to initialize the human pose, so that deformation of the complete mesh stays plausible.
- **Motion capture stage**:
  - Constrain the human pose parameters with point clouds and 3D pose, using the predicted human pose as a prior.
  - Capture non-rigid deformations and achieve high-fidelity surface reconstruction by solving the surface tracking energy function.
  - Combine human prior knowledge and depth information to refine the geometric details of the model.

### Technical contributions:

- Proposed a volumetric human capture method based on human prior knowledge, which relaxes the strict requirements for the initial pose while maintaining accurate motion tracking.
- Designed a new optimization pipeline that combines human prior knowledge with volumetric fusion to overcome the self-scanning constraint.
- Generated a complete human mesh with geometric details using a data-driven implicit occupancy representation, improving the accuracy of surface reconstruction and motion tracking.

In summary, the proposed method makes human motion capture with a single RGB-D sensor easier to use and more robust: it relaxes the initialization requirements, removes the self-scanning step, and maintains credible geometric warping.
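The method overview mentions a TSDF volume that is initialized and then updated as new depth arrives. As a rough illustration of what such an update can look like, the sketch below fuses one frame of signed-distance observations into a volume with the standard weighted running average. It is not the authors' implementation: the function name `integrate_tsdf`, the 3 cm truncation distance, the weight cap, and the convention of marking unobserved voxels with NaN are all assumptions made for this example.

```python
import numpy as np

def integrate_tsdf(tsdf, weight, sdf_obs, trunc=0.03, max_weight=64.0):
    """Fuse one frame of signed-distance observations into a TSDF volume.

    tsdf, weight: current per-voxel TSDF values and fusion weights (same shape).
    sdf_obs: per-voxel signed distance to the observed surface in meters,
             NaN where the voxel was not observed (assumed convention).
    trunc, max_weight: hypothetical truncation distance and weight cap.
    """
    # Treat unobserved voxels as lying infinitely far behind the surface.
    sdf = np.where(np.isnan(sdf_obs), -np.inf, sdf_obs)
    # Only update voxels in front of the surface or within the truncation band.
    valid = sdf > -trunc
    # Normalize to [-1, 1] and truncate.
    d = np.clip(sdf / trunc, -1.0, 1.0)
    # Each valid observation contributes a weight of 1.
    new_weight = weight + valid
    # Weighted running average; the maximum() guards against division by zero
    # in voxels that have never been observed.
    fused = np.where(valid,
                     (tsdf * weight + d) / np.maximum(new_weight, 1.0),
                     tsdf)
    return fused, np.minimum(new_weight, max_weight)

# Usage: three voxels, initially empty (TSDF = 1, weight = 0).
tsdf0, w0 = np.ones(3), np.zeros(3)
tsdf1, w1 = integrate_tsdf(tsdf0, w0, np.array([0.015, np.nan, -0.1]))
```

In this toy call only the first voxel is updated: the second is unobserved and the third lies more than the truncation distance behind the surface, so both keep their previous value and weight. Capping the weight keeps the volume responsive to new observations, which matters when the model is updated while motion is being captured.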