mmHand: 3D Hand Pose Estimation Leveraging mmWave Signals

Hao Kong, Haoxin Lyu, Jiadi Yu, Linghe Kong, Junlin Yang, Yanzhi Ren, Hongbo Liu, Yi-Chao Chen
DOI: https://doi.org/10.1109/icdcs60910.2024.00102
2024-01-01
Abstract: Hand pose estimation is a key enabler for a variety of interactive applications, including user interface control, sign language understanding, and virtual reality modeling. Existing approaches mainly rely on wearable devices such as gloves or bracelets to estimate hand poses, which can incur high deployment costs and an intrusive user experience. Others rely on vision technologies, which can be degraded by complicated illumination and raise privacy-leakage concerns. In this paper, we present mmHand, a millimeter wave (mmWave) signal-based 3D hand pose estimation system that uses a mmWave radar to generate 3D hand skeletons and reconstruct 3D hand meshes. mmHand first leverages mmWave signals to sense a hand and pre-processes the signals. Then, mmHand extracts spatial features using a purpose-designed attention-based hourglass network (mmSpaceNet) and temporal features using Long Short-Term Memory (LSTM). Based on the extracted features, mmHand regresses hand joints in 3D space to generate 3D hand skeletons. Finally, 3D hand meshes that continuously describe hand poses with detailed surfaces are reconstructed through a hand Model with Articulated and Non-rigid defOrmations (MANO). Extensive experiments demonstrate that mmHand can accurately generate 3D hand skeletons, with a mean per joint position error of 18.3 mm and 95.1% of correct keypoints, which indicates the effectiveness of mmHand for hand pose estimation.
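The two figures reported in the abstract, mean per joint position error (MPJPE) and the share of correct keypoints, are standard skeleton metrics. The following is a minimal sketch of how such metrics are commonly computed, not the authors' evaluation code. The function names `mpjpe` and `pck`, the toy joint coordinates, and the 10 mm threshold are all assumptions made for illustration; the abstract does not state the threshold used.

```python
import math


def mpjpe(pred, truth):
    """Mean per joint position error: the average Euclidean distance
    between predicted and ground-truth joints, in the input units."""
    dists = [math.dist(p, t) for p, t in zip(pred, truth)]
    return sum(dists) / len(dists)


def pck(pred, truth, threshold):
    """Fraction of joints whose error is within `threshold`.
    The threshold is a free parameter (assumed here, not from the paper)."""
    dists = [math.dist(p, t) for p, t in zip(pred, truth)]
    return sum(d <= threshold for d in dists) / len(dists)


# Hypothetical 3-joint example, coordinates in millimeters.
truth = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 20.0, 0.0)]
pred = [(3.0, 4.0, 0.0), (10.0, 0.0, 0.0), (0.0, 20.0, 12.0)]

print(f"MPJPE: {mpjpe(pred, truth):.2f} mm")
print(f"PCK at 10 mm: {pck(pred, truth, 10.0):.2%}")
```

In this toy case the per-joint errors are 5 mm, 0 mm, and 12 mm, so the MPJPE is their mean and two of the three joints fall within the assumed 10 mm threshold.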