Deep Learning Based Video Spatio-Temporal Modeling for Emotion Recognition

Rubén D. Fonnegra, Gloria M. Díaz
DOI: https://doi.org/10.1007/978-3-319-91238-7_32
2018-01-01
Abstract: Affective Computing is a growing research area that aims to determine users' emotional states from their conscious and unconscious actions and to use those states to adapt machine interaction. This paper investigates the discriminative ability of convolutional and recurrent neural networks to model spatio-temporal features from video sequences of the face region. In the proposed deep learning architecture, dense convolutional layers analyze changes in spatial information across frames over short time periods, while dense recurrent layers model changes in frames as temporal sequences that evolve over time. These layers are then connected to a multilayer perceptron (MLP) that performs the classification task, which consists of distinguishing between six emotion categories. Performance was evaluated in two ways: gender-independent and gender-dependent classification. Experimental results show that the proposed approach achieves an accuracy of 81.84% in the gender-independent experiment, which outperforms previous works using the same experimental data. In the gender-dependent experiment, accuracy was 80.79% and 82.75% for male and female subjects, respectively.