An Approach Based on 1D Fully Convolutional Network for Continuous Sign Language Recognition and Labeling
Wang Fei,Li Chen,Liu Chuan-wen,Zeng Zhen,Xu Ke,Wu Jin-xiu
DOI: https://doi.org/10.1007/s00521-022-07415-x
2022-01-01
Neural Computing and Applications
Abstract: Sign language is the most important communication method for people with speech impairments, and automatic sign language recognition helps them communicate without barriers with people who do not use sign language. For portability, a device that integrates surface electromyography (sEMG) sensors and inertial measurement units (IMUs) is used to collect 1D 14-channel sign language data. However, 1D data are not readable by humans, so accurately locating the valid sign language segments needed for word-level and continuous sign language recognition requires synchronized video and substantial manual labor. In this paper, we propose an approach based on a 1D fully convolutional network (FCN), called SignD-Net, which can be used for both labeling and recognition of 1D time-series sign language data. SignD-Net treats sign language labeling as analogous to object detection and uses YOLO as its basis, assigning a bounding box to each predicted object. Continuous sign language labeling and recognition are realized with the best-performing 1D-CNN model selected through experiments. Because data are limited, the model is pre-trained on word-level sign language data and simulated sentence-level data, and in the final stage of training it is trained on real, manually labeled sign language data. In experiments on sign language test data, SignD-Net achieves a mean average precision (mAP) of 99.18% on the labeling task and a sentence-level accuracy of up to 98.74% on the recognition task.
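To make the object-detection analogy concrete, the following is a minimal sketch of how YOLO-style predictions might be post-processed when the "bounding box" is a 1D time interval rather than a 2D rectangle. It is not taken from the paper: the `Segment` type, the functions `iou_1d`, `nms_1d`, and `decode_sentence`, the 0.5 overlap threshold, and the use of non-maximum suppression at all are assumptions made for illustration only.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    """Hypothetical 1D 'bounding box' for one predicted sign."""
    start: float  # first sample index of the segment
    end: float    # last sample index of the segment
    score: float  # detection confidence
    label: str    # predicted word


def iou_1d(a: Segment, b: Segment) -> float:
    """Intersection over union of two time intervals."""
    inter = max(0.0, min(a.end, b.end) - max(a.start, b.start))
    union = (a.end - a.start) + (b.end - b.start) - inter
    return inter / union if union > 0 else 0.0


def nms_1d(segments: List[Segment], iou_threshold: float = 0.5) -> List[Segment]:
    """Greedy non-maximum suppression over 1D segments.

    Keeps the highest-scoring segment in each group of heavily
    overlapping candidates, then returns the survivors in time order.
    """
    kept: List[Segment] = []
    for seg in sorted(segments, key=lambda s: s.score, reverse=True):
        if all(iou_1d(seg, k) < iou_threshold for k in kept):
            kept.append(seg)
    return sorted(kept, key=lambda s: s.start)


def decode_sentence(segments: List[Segment]) -> List[str]:
    """Read the recognized sentence off the surviving segments."""
    return [s.label for s in nms_1d(segments)]


if __name__ == "__main__":
    # Made-up candidate detections over a continuous recording.
    candidates = [
        Segment(0, 100, 0.9, "hello"),
        Segment(10, 105, 0.6, "help"),
        Segment(120, 200, 0.8, "world"),
    ]
    print(decode_sentence(candidates))
```

In this sketch the surviving intervals serve as the labels (where each sign starts and ends), and reading their classes in time order gives the recognized sentence, which is one plausible way a single detector could cover both tasks.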