Multimodal Fusion Based on LSTM and a Couple Conditional Hidden Markov Model for Chinese Sign Language Recognition

Q. Xiao, Minying Qin, Peng Guo, Yidan Zhao
DOI: https://doi.org/10.1109/ACCESS.2019.2925654
IF: 3.9
2019-06-28
IEEE Access
Abstract: A novel multimodal fusion approach is proposed for Chinese sign language (CSL) recognition. The framework, the LSTM2+CHMM model, uses dual long short-term memory (LSTM) networks and a couple hidden Markov model (CHMM) to fuse hand and skeleton sequence information. The first contribution is a hand segmentation algorithm based on power rate transforms and RGB-D image fusion, which effectively overcomes common limitations such as complex backgrounds, inconsistent lighting, and variable skin tones. As a result, the proposed skeleton-hand fusion framework can be used for vision-based sign language recognition (SLR) of non-specific people in non-specific environments. Finally, the LSTM2+CHMM model combines probability theory with a neural network to provide a unified methodology for multiple-sequence fusion. The proposed SLR framework was tested on two CSL datasets, and the experimental results showed it to be effective.
Computer Science