Dyhand: dynamic hand gesture recognition using BiLSTM and soft attention methods

Rohit Pratap Singh, Laiphrakpam Dolendro Singh
DOI: https://doi.org/10.1007/s00371-024-03307-4
IF: 2.835
2024-03-19
The Visual Computer
Abstract: Hand gesture recognition is an essential task in computer vision. Hand gestures are the most intuitive and natural medium for communicating with computers. Recently, with the advent of innovative technologies and high-performing computer systems, research on gesture recognition has surged. Traditional approaches to modelling skeletons are typically based on hand-crafted components or traversal algorithms, which limits their expressive capacity and makes generalisation difficult. In this work, we present DyHand, a novel dynamic skeleton model based on BiLSTM and soft attention that largely mitigates the challenges of intra-class and inter-class variability among gesture classes. We compare our model with state-of-the-art approaches on two benchmark data sets under various data augmentation techniques. The proposed approach yields the best results on the DHG-14/28 data set, achieving 97.14% and 96.42% recognition accuracy in the 14 and 28 gesture categories, respectively, and comparable recognition accuracy on the SHREC'17 data set: 93.98% on 14 gesture classes and 87.86% on 28 gesture classes.
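The abstract names soft attention over BiLSTM outputs but does not give its exact formulation. As a rough illustration only, the sketch below shows one common form of soft attention pooling (additive scoring followed by a softmax over time steps) in plain Python. The function name `soft_attention_pool`, the parameters `w` and `v`, and the toy hidden states are all assumptions made for this example; they stand in for learned weights and real BiLSTM outputs and are not taken from the paper.

```python
import math


def soft_attention_pool(hidden_states, w, v):
    """Pool a sequence of hidden-state vectors into one context vector.

    hidden_states: list of T vectors of length D (placeholders for the
                   per-frame outputs of a BiLSTM; hypothetical values here)
    w: D x A projection matrix given as a list of D rows (assumed parameter)
    v: scoring vector of length A (assumed parameter)
    Returns (context, weights).
    """
    scores = []
    for h in hidden_states:
        # Additive scoring: e_t = v . tanh(W^T h_t)
        proj = [
            math.tanh(sum(h[d] * w[d][a] for d in range(len(h))))
            for a in range(len(v))
        ]
        scores.append(sum(p * vi for p, vi in zip(proj, v)))

    # Numerically stable softmax over the time axis
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Context vector: attention-weighted sum of the hidden states
    dim = len(hidden_states[0])
    context = [
        sum(weights[t] * hidden_states[t][d] for t in range(len(hidden_states)))
        for d in range(dim)
    ]
    return context, weights


# Toy usage: three time steps, two features per step
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
context, weights = soft_attention_pool(states, identity, [1.0, 1.0])
```

In a full model of this kind, the context vector would typically be passed to a classifier over the 14 or 28 gesture classes. The weights sum to one, so frames that score higher contribute more to the pooled representation.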
computer science, software engineering