Abstract: Recently, there has been a surge of interest in applying Deep Learning (DL) models to autonomously perform hand gesture recognition using surface Electromyogram (sEMG) signals. DL models are, however, mainly designed to be applied to sparse sEMG signals. Furthermore, due to their complex structure, they typically face memory constraints, require long training times and a large number of training samples, and need to resort to data augmentation and/or transfer learning. In this paper, for the first time (to the best of our knowledge), we investigate and design a Vision Transformer (ViT) based architecture to perform hand gesture recognition from High-Density sEMG (HD-sEMG) signals. Intuitively speaking, we capitalize on the recent breakthrough role of the transformer architecture in tackling different complex problems, together with its potential for greater input parallelization via its attention mechanism. The proposed Vision Transformer-based Hand Gesture Recognition (ViT-HGR) framework can overcome the aforementioned training time problems and can accurately classify a large number of hand gestures from scratch without any need for data augmentation and/or transfer learning. The efficiency of the proposed ViT-HGR framework is evaluated using a recently released HD-sEMG dataset consisting of 65 isometric hand gestures. Our experiments with a 64-sample (31.25 ms) window size yield an average test accuracy of 84.62 ± 3.07%, using only 78,210 parameters. The compact structure of the proposed ViT-HGR framework (i.e., its significantly reduced number of trainable parameters) shows great potential for practical application in prosthetic control.
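
The window figures quoted in the abstract (64 samples spanning 31.25 ms) imply a sampling rate, and the short sketch below works that arithmetic out and shows one way a recording could be cut into such windows. It is only an illustration under stated assumptions: the function name `segment_windows`, the use of non-overlapping windows, and the 4-channel toy recording are hypothetical and are not taken from the paper.

```python
# Window length and duration as stated in the abstract.
WINDOW_SAMPLES = 64
WINDOW_SECONDS = 31.25e-3

# Sampling rate implied by those two figures (an inference, not a quoted value).
implied_rate_hz = WINDOW_SAMPLES / WINDOW_SECONDS


def segment_windows(samples, window=WINDOW_SAMPLES):
    """Split a list of per-sample channel vectors into non-overlapping windows.

    Trailing samples that do not fill a whole window are dropped.
    Non-overlapping segmentation is an assumption made for this sketch.
    """
    n_windows = len(samples) // window
    return [samples[i * window:(i + 1) * window] for i in range(n_windows)]


# Hypothetical toy recording: one second at the implied rate, 4 channels,
# where every channel simply holds the sample index so windows are easy to check.
recording = [[float(t)] * 4 for t in range(int(implied_rate_hz))]
windows = segment_windows(recording)

print(f"implied sampling rate: {implied_rate_hz} Hz")
print(f"{len(windows)} windows of {len(windows[0])} samples each")
```

Each resulting window is a 64-sample by channel block, which is the kind of input a patch-based model such as a ViT would then split further; how ViT-HGR actually forms its patches is not described in the abstract.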

SVIT‐SSR: A sEMG‐based vision transformer approach for silent speech recognition

sEMG-based technology for silent voice recognition

Silent Speech Recognition Based on Surface Electromyography

Decoding silent speech from high-density surface electromyographic data using transformer

Silent Speech Decoding Using Spectrogram Features Based on Neuromuscular Activities

Hybrid Silent Speech Interface Through Fusion of Electroencephalography and Electromyography

Design and implementation of a silent speech recognition system based on sEMG signals: A neural network approach

Attention Bidirectional LSTM Networks Based Mime Speech Recognition Using Semg Data

Quality-aware Aggregated Conformal Prediction for Silent Speech Recognition

Speech neuromuscular decoding based on spectrogram images using conformal predictors with Bi-LSTM

Convolutional Neural Network applied in mime speech recognition using sEMG data

Exploration on Channel-interactive Features in Silent Speech Recognition

Encoder-Decoder Architectures for Silent Speech Recognition Based on High-density Surface Electromyogram

Silent Speech Recognition based on sEMG and EEG Signals

Extracting Spatial Muscle Activation Patterns in Facial and Neck Muscles for Silent Speech Recognition Using High-Density sEMG

SLViT: Scale-Wise Language-Guided Vision Transformer for Referring Image Segmentation

Decoding Silent Speech Based on High-Density Surface Electromyogram Using Spatiotemporal Neural Network

ViT-HGR: Vision Transformer-based Hand Gesture Recognition from High Density Surface EMG Signals

Silent Speech Recognition Based on High-Density Surface Electromyogram Using Hybrid Neural Networks

Prediction of Voice Fundamental Frequency and Intensity from Surface Electromyographic Signals of the Face and Neck

Silent Speech Recognition Based on Surface Electromyography Using a Few Electrode Sites under the Guidance from High-Density Electrode Arrays