Abstract: Sign language is a visual language and the primary language of people with speech and hearing impairments. However, its many complex expressions are difficult for the general public to understand and master. Sign language recognition algorithms can significantly ease communication between hearing-impaired and hearing people. Traditional continuous sign language recognition often uses sequence learning based on a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) network. These methods learn spatial and temporal features separately and therefore cannot capture the complex joint spatial-temporal features of sign language. LSTM also struggles to learn long-term dependencies. To alleviate these problems, this paper proposes a multi-view spatial-temporal continuous sign language recognition network. The network consists of three parts. The first is a Multi-View Spatial-Temporal Feature Extractor Network (MSTN), which directly extracts spatial-temporal features from RGB and skeleton data; the second is a Transformer-based sign language encoder network, which learns long-term dependencies; the third is a Connectionist Temporal Classification (CTC) decoder network, which predicts the meaning of the whole continuous sign language sequence. We test our algorithm on two public sign language datasets, SLR-100 and RWTH-PHOENIX-Weather 2014T. Our method performs well on both: the word error rate is 1.9% on SLR-100 and 22.8% on RWTH-PHOENIX-Weather 2014T.
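The abstract names a CTC decoder and reports results as word error rate (WER). The following is a minimal sketch of both ideas, not the authors' implementation: it assumes greedy (best-path) CTC decoding and the standard edit-distance definition of WER, neither of which the abstract specifies. The function names `ctc_greedy_collapse` and `word_error_rate`, the blank index `0`, and the integer gloss IDs are hypothetical choices made for illustration.

```python
from itertools import groupby


def ctc_greedy_collapse(frame_labels, blank=0):
    """Collapse a per-frame label path: merge repeats, then drop blanks.

    Assumption: `frame_labels` holds the most likely label per frame,
    and `blank` is the CTC blank index (0 here, chosen for illustration).
    """
    return [label for label, _ in groupby(frame_labels) if label != blank]


def word_error_rate(reference, hypothesis):
    """Edit distance between gloss sequences divided by reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    # prev[j] = edit distance between reference[:i-1] and hypothesis[:j]
    prev = list(range(len(hypothesis) + 1))
    for i, ref_tok in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_tok in enumerate(hypothesis, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(reference)


# Hypothetical per-frame predictions; the blank between the two runs of 5
# keeps them as two separate glosses after collapsing.
path = [0, 3, 3, 0, 0, 5, 5, 5, 0, 5, 2]
decoded = ctc_greedy_collapse(path)
wer = word_error_rate([3, 5, 2], decoded)
```

Here `decoded` is `[3, 5, 5, 2]`, one inserted gloss relative to the reference `[3, 5, 2]`, so `wer` is one third. A beam-search decoder would replace the greedy collapse but leave the WER computation unchanged.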

RNN-Transducer Based Chinese Sign Language Recognition

Chinese Sign Language Recognition with Sequence to Sequence Learning.

Sign Language Recognition with Long Short-Term Memory.

Video-Based Sign Language Recognition Without Temporal Segmentation

Hierarchical LSTM for Sign Language Translation.

Attention-Based 3D-Cnns for Large-Vocabulary Sign Language Recognition.

A SRN/HMM System for Signer-Independent Continuous Sign Language Recognition

Multi-View Spatial-Temporal Network for Continuous Sign Language Recognition

Chinese sign language recognition with adaptive HMM

Sign language recognition based on lightweight 3D CNNs and Transformer

Signer-Independent Continuous Sign Language Recognition Based on SRN/HMM

Natural Language-Assisted Sign Language Recognition

Sign Language Recognition Based on R(2+1)D With Spatial-Temporal-Channel Attention

Key Action and Joint CTC-Attention Based Sign Language Recognition

A Chinese Sign Language Recognition System Based on SOFM/SRN/HMM

A new system for Chinese sign language recognition

Two-Stream Network for Sign Language Recognition and Translation

A Novel Chinese Sign Language Recognition Method Based on Keyframe-Centered Clips

StepNet: Spatial-temporal Part-aware Network for Isolated Sign Language Recognition