Abstract: In Human-Computer Interaction (HCI), hands are a fundamental means of communication, expression, and interaction with the physical world. In recent years, Augmented Reality (AR) has emerged as a next-generation technology that merges the digital and physical worlds by overlaying digital information onto the real world, opening up new possibilities in gaming, education, healthcare, and industrial training. In this context, accurate hand pose and shape estimation is crucial for natural and intuitive interaction within AR environments. This study proposes an approach for hand pose and shape estimation in AR applications. The method begins with a pre-trained Single Shot MultiBox Detector (SSD) model for hand detection and cropping. The cropped hand image is then converted to the HSV color model, and histogram equalization is applied to the value band. To isolate the hand, specific bounds are set for each band of the HSV color space, generating a mask. To refine the mask and reduce noise, contouring techniques are applied to it and gap-filling methods are employed. The refined mask is then combined with the original cropped image through a logical AND operation to delineate the hand boundaries, which keeps hand detection robust even in complex scenes. To extract features, the detected hand undergoes two concurrent processes. First, the Scale-Invariant Feature Transform (SIFT) algorithm identifies distinctive keypoints on the hand's outer surface.
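As a rough illustration of the masking stage described above, the following sketch equalizes the value band, thresholds each HSV band, and applies the mask to the crop with a logical AND. It is a minimal NumPy-only sketch, not the authors' implementation: it assumes the crop has already been converted to an 8-bit HSV array, it omits the contour-based refinement and gap filling, and the function names and the example bounds are hypothetical.

```python
import numpy as np


def equalize_value_band(hsv):
    """Histogram-equalize only the V channel of a uint8 HSV image (H x W x 3)."""
    out = hsv.copy()
    v = hsv[..., 2]
    hist = np.bincount(v.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]           # CDF at the darkest occupied bin
    denom = int(cdf[-1] - cdf_min)
    if denom == 0:                      # flat image: nothing to equalize
        return out
    lut = np.round((cdf - cdf_min) * 255.0 / denom)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    out[..., 2] = lut[v]                # remap V through the lookup table
    return out


def hsv_mask(hsv, lower, upper):
    """Boolean mask of pixels whose H, S and V all lie within [lower, upper]."""
    lower = np.asarray(lower)
    upper = np.asarray(upper)
    return np.all((hsv >= lower) & (hsv <= upper), axis=-1)


def apply_mask(image, mask):
    """Logical AND of the mask with the crop: pixels outside the mask become 0."""
    return np.where(mask[..., None], image, 0).astype(image.dtype)


# Tiny 2x2 example; the bounds below are illustrative, not values from the paper.
hsv_crop = np.array(
    [[[10, 100, 50], [10, 100, 100]],
     [[90, 100, 150], [10, 20, 200]]],
    dtype=np.uint8,
)
equalized = equalize_value_band(hsv_crop)
mask = hsv_mask(equalized, lower=(0, 50, 50), upper=(30, 255, 255))
hand_only = apply_mask(hsv_crop, mask)
```

In practice an image library would typically handle the color conversion and the contour step; the bounds for each band would need tuning for the camera and lighting at hand.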
Simultaneously, a pre-trained lightweight Convolutional Neural Network (CNN), namely MobileNet, extracts 3D hand landmarks, the hand's center (the middle finger metacarpophalangeal joint), and handedness information. These features (hand keypoints, landmarks, center, and handedness) are aggregated into a CSV file for further analysis. A Gated Recurrent Unit (GRU) then processes the features, capturing the dependencies between them, and predicts the 3D hand pose with high accuracy even in dynamic scenarios. The evaluation results for the proposed model are promising: the Mean Per Joint Position Error (MPJPE) in 3D between the predicted pose and the ground-truth hand landmarks is 0.0596, and the Percentage of Correct Keypoints (PCK) is 95%. Once the hand pose is predicted, a mesh representation is used to reconstruct the 3D shape of the hand. This mesh represents the hand's structure and orientation, enhancing the realism and usability of the AR application. By integrating detection, feature extraction, and predictive modeling, the method contributes to more immersive and intuitive AR experiences and a closer fusion of the digital and physical worlds.
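The two reported metrics can be illustrated with their commonly used definitions: MPJPE as the mean Euclidean distance between predicted and ground-truth joints, and PCK as the fraction of joints whose error falls within a distance threshold. The sketch below is a minimal illustration under those assumptions. The abstract does not state the unit of the MPJPE value or the PCK threshold, so the threshold and the sample coordinates here are hypothetical.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error: average Euclidean distance over joints."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


def pck(pred, gt, threshold):
    """Fraction of joints whose Euclidean error is at most `threshold`."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    dist = np.linalg.norm(pred - gt, axis=-1)
    return float((dist <= threshold).mean())


# Two joints in 3D; the first is off by a 3-4-5 triangle, the second is exact.
gt_joints = np.zeros((2, 3))
pred_joints = np.array([[3.0, 4.0, 0.0], [0.0, 0.0, 0.0]])

error = mpjpe(pred_joints, gt_joints)                 # (5 + 0) / 2 = 2.5
correct = pck(pred_joints, gt_joints, threshold=1.0)  # 1 of 2 joints = 0.5
```

Both functions accept arrays of shape (joints, 3) or (frames, joints, 3), since the distance is taken over the last axis and then averaged over everything else.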
