Learning Long-Term Temporal Contexts Using Skip RNN for Continuous Emotion Recognition

Jian Huang, Bin Liu, Jianhua Tao
DOI: https://doi.org/10.1016/j.vrih.2020.11.005
2021-01-01
Virtual Reality & Intelligent Hardware
Abstract: Background: Continuous emotion recognition treats emotion as a function of time and assigns emotional values to every frame in a sequence. Incorporating long-term temporal context is essential for this task. Methods: At the feature level, we use a window of feature frames instead of a single frame as input to strengthen temporal modeling, and we apply frame skipping and temporal pooling to reduce the redundancy that the window introduces. At the model level, we use the skip recurrent neural network, which models long-term temporal variability by skipping trivial information. Results: Experiments on the AVEC 2017 database show that the proposed methods improve performance. Further, the skip long short-term memory (LSTM) model can focus on the critical emotional states during training, and it thereby outperforms the LSTM model and other methods.
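The feature-level step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the window width, the skip step, the pooling operator, or how sequence edges are handled, so everything below is an assumption made for illustration: the function names (`context_window`, `skip_frames`, `mean_pool`, `flatten`), the edge clamping, and the choice of mean pooling are hypothetical and are not taken from the paper.

```python
from typing import List

Frame = List[float]


def context_window(frames: List[Frame], t: int, half_width: int) -> List[Frame]:
    """Return frames t-half_width .. t+half_width around frame t.

    Indices outside the sequence are clamped to the first/last frame
    (an assumed edge-handling choice).
    """
    last = len(frames) - 1
    return [frames[min(max(i, 0), last)]
            for i in range(t - half_width, t + half_width + 1)]


def skip_frames(window: List[Frame], step: int) -> List[Frame]:
    """Frame skipping: keep every `step`-th frame of the window."""
    return window[::step]


def mean_pool(window: List[Frame], pool_size: int) -> List[Frame]:
    """Temporal pooling: average non-overlapping groups of `pool_size` frames.

    The final group may be shorter when the window length is not a multiple
    of `pool_size`.
    """
    pooled = []
    for start in range(0, len(window), pool_size):
        group = window[start:start + pool_size]
        pooled.append([sum(col) / len(group) for col in zip(*group)])
    return pooled


def flatten(window: List[Frame]) -> List[float]:
    """Concatenate the frames of a window into one input vector."""
    return [x for frame in window for x in frame]


# Toy sequence: 10 frames with one feature each, values 0.0 .. 9.0.
frames = [[float(i)] for i in range(10)]
window = context_window(frames, t=5, half_width=4)  # 9 frames: values 1.0 .. 9.0
skipped = flatten(skip_frames(window, step=2))      # 5 frames remain
pooled = flatten(mean_pool(window, pool_size=3))    # 3 pooled frames remain
```

Both variants shrink the 9-frame window before it reaches the model: skipping drops frames outright, while pooling summarizes neighboring frames. Which of the two works better, and at what settings, is an empirical question that this sketch does not address.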