Enhancing Signer-Independent Recognition of Isolated Sign Language through Advanced Deep Learning Techniques and Feature Fusion

Ali Akdag, Omer Kaan Baykan

DOI: https://doi.org/10.3390/electronics13071188

IF: 2.9

2024-03-25

Electronics

Abstract: Sign Language Recognition (SLR) systems are crucial bridges facilitating communication between deaf or hard-of-hearing individuals and the hearing world. Existing SLR technologies, while advancing, often grapple with challenges such as accurately capturing the dynamic and complex nature of sign language, which includes both manual and non-manual elements like facial expressions and body movements. These systems sometimes fall short in environments with different backgrounds or lighting conditions, hindering their practical applicability and robustness. This study introduces an innovative approach to isolated sign language word recognition using a novel deep learning model that combines the strengths of both residual three-dimensional (R3D) and temporally separated (R(2+1)D) convolutional blocks. The R3(2+1)D-SLR network model demonstrates a superior ability to capture the intricate spatial and temporal features crucial for accurate sign recognition. Our system combines data from the signer's body, hands, and face, extracted using the R3(2+1)D-SLR model, and employs a Support Vector Machine (SVM) for classification. It demonstrates remarkable improvements in accuracy and robustness across various backgrounds by utilizing pose data over RGB data. With this pose-based approach, our proposed system achieved 94.52% and 98.53% test accuracy in signer-independent evaluations on the BosphorusSign22k-general and LSA64 datasets, respectively.
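A signer-independent evaluation, as mentioned in the abstract, means that no signer contributes samples to both the training set and the test set. The following is a minimal sketch of such a split, assuming each sample carries a signer identifier. The tuple layout, the function name `signer_independent_split`, and the signer IDs are hypothetical and not taken from the paper.

```python
def signer_independent_split(samples, test_signers):
    """Split samples so that held-out signers never appear in training.

    `samples` is a list of (signer_id, label, features) tuples; this
    layout is an assumption made purely for illustration.
    """
    held_out = set(test_signers)
    train = [s for s in samples if s[0] not in held_out]
    test = [s for s in samples if s[0] in held_out]
    return train, test


# Toy data: three hypothetical signers, each performing two signs.
samples = [
    ("signer_a", "hello", [0.1, 0.2]),
    ("signer_a", "thanks", [0.9, 0.8]),
    ("signer_b", "hello", [0.2, 0.1]),
    ("signer_b", "thanks", [0.8, 0.9]),
    ("signer_c", "hello", [0.15, 0.25]),
    ("signer_c", "thanks", [0.85, 0.75]),
]

train, test = signer_independent_split(samples, test_signers=["signer_c"])
train_signers = {s[0] for s in train}
test_signers_seen = {s[0] for s in test}
print(len(train), len(test))
```

Splitting by signer rather than by sample prevents a classifier from scoring well simply because it has already seen the same person's signing style during training.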

Engineering, Electrical & Electronic; Computer Science, Information Systems; Physics, Applied

What problem does this paper attempt to address?

The paper aims to address several key issues in Sign Language Recognition (SLR) systems, particularly for the recognition of isolated sign language vocabulary. Specifically:

1. **Improving Accuracy and Robustness**: Although existing SLR technologies have made progress, there are still challenges in capturing the dynamic complexity of sign language, including hand movements, facial expressions, and body postures. Additionally, the practical applicability and robustness of these systems are limited under different background or lighting conditions. Therefore, one of the goals of the paper is to enhance the accuracy and robustness of isolated sign language vocabulary recognition by proposing a new deep learning model.
2. **Integrating Multiple Features**: To achieve more comprehensive sign language recognition, researchers believe it is necessary to consider factors such as facial expressions, body language, and gestures. To this end, the paper proposes a novel R3(2+1)D-SLR network model that combines body, hand, and facial features, and uses a Support Vector Machine (SVM) for classification.
3. **Addressing Background Variations**: Given the varying background conditions in the real world, the paper emphasizes the necessity of testing the proposed system under different backgrounds to verify its adaptability and reliability. By utilizing pose data instead of traditional RGB images, the system demonstrates better performance in handling background noise.
4. **Facilitating Communication Between Deaf and Hearing People**: The ultimate goal is to eliminate communication barriers between deaf and hearing people, enabling the deaf to better integrate into society. By converting sign language gestures into text or speech, this research aims to establish an efficient and accurate SLR system, thereby enhancing interaction and understanding between deaf and hearing individuals.

In summary, this paper is primarily dedicated to developing an innovative SLR method aimed at improving the accuracy and robustness of isolated sign language vocabulary recognition, especially when facing diverse background conditions. It also emphasizes the importance of non-manual features and their positive impact on sign language recognition performance.
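The combination of body, hand, and face features with an SVM classifier can be illustrated with a small sketch. It assumes feature-level fusion by simple concatenation, a linear kernel, and standardized inputs; none of these choices is confirmed by the summary above. The feature dimensions are arbitrary, and synthetic vectors stand in for the outputs of the R3(2+1)D-SLR model.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def fuse_features(body, hands, face):
    """Concatenate per-stream feature matrices along the feature axis."""
    return np.concatenate([body, hands, face], axis=1)


rng = np.random.default_rng(0)
n_classes, n_per_class = 3, 20
labels = np.repeat(np.arange(n_classes), n_per_class)


def synthetic_stream(dim):
    # Stand-in for learned features: each class is centered at
    # 5 * class_index in every dimension, plus small Gaussian noise.
    centers = 5.0 * labels[:, None] * np.ones((1, dim))
    return centers + 0.1 * rng.normal(size=(labels.size, dim))


# Hypothetical feature sizes for the three streams.
body, hands, face = synthetic_stream(8), synthetic_stream(8), synthetic_stream(4)
fused = fuse_features(body, hands, face)  # one row per sample, 8 + 8 + 4 columns

# Even-indexed samples for training, odd-indexed samples for testing.
train_idx = np.arange(0, labels.size, 2)
test_idx = np.arange(1, labels.size, 2)

clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
clf.fit(fused[train_idx], labels[train_idx])
accuracy = clf.score(fused[test_idx], labels[test_idx])
print(fused.shape, accuracy)
```

The toy split here is by sample index only. A real evaluation in the spirit of the paper would instead hold out entire signers, and the accuracy on this easily separable synthetic data says nothing about the reported results.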

Isolated sign language recognition through integrating pose data and motion history images

Sign Language Recognition Using Graph and General Deep Neural Network Based on Large Scale Dataset

Manual and non-manual sign language recognition framework using hybrid deep learning techniques

StepNet: Spatial-temporal Part-aware Network for Isolated Sign Language Recognition

Natural Language-Assisted Sign Language Recognition

Enhancing Brazilian Sign Language Recognition through Skeleton Image Representation

Korean Sign Language Alphabet Recognition Through the Integration of Handcrafted and Deep Learning-Based Two-Stream Feature Extraction Approach

Automated sign language detection and classification using reptile search algorithm with hybrid deep learning

Improving Continuous Sign Language Recognition with Consistency Constraints and Signer Removal

Isolated Video-Based Sign Language Recognition Using a Hybrid CNN-LSTM Framework Based on Attention Mechanism

Recognition of Indian Sign Language (ISL) Using Deep Learning Model

Fine-tuning of sign language recognition models: a technical report

Independent Sign Language Recognition with 3D Body, Hands, and Face Reconstruction

Spatial–temporal attention with graph and general neural network-based sign language recognition

Hear Sign Language: A Real-Time End-to-End Sign Language Recognition System

Deepsign: Sign Language Detection and Recognition Using Deep Learning

Signer-Independent Arabic Sign Language Recognition System Using Deep Learning Model

Sign Language Recognition via Skeleton-Aware Multi-Model Ensemble