Abstract: Gesture recognition is an indispensable component of natural and efficient human-computer interaction, particularly in desktop-level applications, where it can significantly enhance productivity. However, the gesture recognition community currently lacks a suitable desktop-level (top-view perspective) dataset for lightweight gesture capture devices. In this study, we establish a dataset named GR4DHCI, distinguished by its naturalness, intuitiveness, and diversity. Its primary purpose is to serve as a resource for developing desktop-level portable applications. GR4DHCI comprises over 7,000 gesture samples and a total of 382,447 frames for both Stereo IR and skeletal modalities. We also address the variation in hand positioning during desktop interactions by incorporating 27 different hand positions into the dataset. Building upon GR4DHCI, we conducted a series of experiments, the results of which demonstrate that the fine-grained classification blocks proposed in this paper enhance the model's recognition accuracy. We anticipate that the dataset and experimental findings will advance desktop-level gesture recognition research.

What problem does this paper attempt to address?

This paper addresses the lack of suitable datasets for desktop-level gesture recognition. Current gesture recognition systems, especially in desktop applications, still need improvement in efficiency and naturalness. The researchers have established a large-scale multimodal gesture recognition dataset named GR4DHCI, which consists of 7,339 dynamic gesture samples performed at 27 different hand positions, totaling 382,447 frames across the infrared and skeleton modalities. The dataset emphasizes naturalness, intuitiveness, and diversity to accommodate long-term, fatigue-free use.

Based on experiments on GR4DHCI, the researchers propose a fine-grained classification block based on infrared images and skeleton motion. The experimental results show that it improves the recognition accuracy of the infrared and skeleton modalities by 2.64% and 7.75%, respectively.

The paper also compares GR4DHCI with existing gesture recognition datasets, pointing out that GR4DHCI is the first dataset designed specifically for desktop-level (overhead view) gesture recognition, and that its coverage of a variety of hand poses and angle changes increases the diversity and authenticity of the data. In addition, the paper reviews existing gesture recognition methods, including spatiotemporal networks and graph convolutional networks. In the experimental evaluation, recent techniques such as Res3D + ConvLSTM + MobileNet and TL-GCN show improved performance on GR4DHCI when combined with the fine-grained classification block, demonstrating the effectiveness of the dataset and method. The paper expects GR4DHCI to contribute to the advancement of research in desktop-level gesture recognition.

A multimodal gesture recognition dataset for desktop human-computer interaction
