Emotional Head Motion Predicting from Prosodic and Linguistic Features

Minghao Yang, Jinlin Jiang, Jianhua Tao, Kaihui Mu, Hao Li
DOI: https://doi.org/10.1007/s11042-016-3405-3
IF: 2.577
2016-01-01
Multimedia Tools and Applications
Abstract: Emotional head motion plays an important role in human-computer interaction (HCI) and is one of the key factors in improving users’ experience. However, it is still not clear how head motions are influenced by speech features in different emotional states. In this study, we construct a bimodal mapping model from speech to head motions and seek to identify which prosodic and linguistic features have the most significant influence on emotional head motions. A two-layer clustering scheme is introduced to obtain reliable clusters from head motion parameters. With these clusters, an emotion-related speech-to-head-gesture mapping model is constructed using a Classification and Regression Tree (CART). Based on the statistical results of CART, a systematic statistical map of the relationship between speech features (both prosodic and linguistic) and head gestures is presented. The map reveals the features that have the most significant influence on head motions in long or short utterances. We also analyze how linguistic features contribute to different emotional expressions. The discussion in this work provides an important reference for realistic animation of speech-driven talking heads or avatars.
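The pipeline outlined in the abstract (cluster head motion parameters, then map speech features to those clusters with CART) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' method: the data are synthetic, the feature names are placeholders, the "two-layer" clustering is approximated as coarse k-means followed by k-means inside each coarse cluster, and scikit-learn's `DecisionTreeClassifier` stands in for CART.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT from the paper); column names are hypothetical.
n = 300
feature_names = ["pitch_mean", "energy_mean", "duration", "word_position"]
speech = rng.normal(size=(n, len(feature_names)))

# Hypothetical head motion parameters (three rotation angles), loosely
# driven by the first two speech features plus noise.
head = np.column_stack([
    speech[:, 0] + 0.1 * rng.normal(size=n),
    speech[:, 1] + 0.1 * rng.normal(size=n),
    0.1 * rng.normal(size=n),
])


def two_layer_clusters(x, n_coarse=2, n_fine=2, seed=0):
    """Assumed two-layer scheme: coarse k-means, then k-means per coarse cluster."""
    coarse = KMeans(n_clusters=n_coarse, n_init=10, random_state=seed).fit_predict(x)
    labels = np.zeros(len(x), dtype=int)
    for c in range(n_coarse):
        idx = np.flatnonzero(coarse == c)
        fine = KMeans(n_clusters=n_fine, n_init=10, random_state=seed).fit_predict(x[idx])
        # Give every (coarse, fine) pair its own global cluster id.
        labels[idx] = c * n_fine + fine
    return labels


# Step 1: turn continuous head motion into discrete gesture clusters.
labels = two_layer_clusters(head)

# Step 2: fit a CART-style tree that predicts the gesture cluster from speech.
tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(speech, labels)

# Step 3: read off which speech features the tree relied on most.
importances = dict(zip(feature_names, tree.feature_importances_))
for name, score in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
```

The importance scores printed at the end play the role of the statistical map mentioned in the abstract, in that they rank speech features by how much the tree used them to separate gesture clusters. The paper's actual statistics may be computed differently.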