A Deep Bidirectional Long Short-Term Memory Based Multi-Scale Approach for Music Dynamic Emotion Prediction

Xinxing Li, Haishu Xianyu, Jiashen Tian, Wenxiao Chen, Fanhang Meng, Mingxing Xu, Lianhong Cai
DOI: https://doi.org/10.1109/icassp.2016.7471734
2016-01-01
Abstract: Music dynamic emotion prediction is a challenging and significant task. In this paper, we adopt the dimensional valence-arousal (V-A) emotion model to represent the dynamic emotion in music. Considering the high contextual correlation within music feature sequences and the strength of Bidirectional Long Short-Term Memory (BLSTM) in capturing sequence information, we propose a multi-scale approach, Deep BLSTM (DBLSTM) based multi-scale regression and fusion with Extreme Learning Machine (ELM), to predict the V-A values in music. Our approach achieved the best performance among the submitted results on the database of the Emotion in Music task at MediaEval 2015. The experimental results demonstrate the effectiveness of the proposed multi-scale DBLSTM-ELM model.
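
The fusion step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no architectural details, so the per-scale DBLSTM outputs are replaced here by synthetic noisy signals, and the class name `ELMRegressor`, the hidden-layer size, and the `tanh` activation are all assumptions chosen for illustration. What the sketch does show is the general ELM idea: a randomly initialized hidden layer that stays fixed, with only the output weights fitted by least squares.

```python
import numpy as np


class ELMRegressor:
    """Hypothetical minimal ELM: fixed random hidden layer, least-squares output weights."""

    def __init__(self, n_hidden=32, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Random projection followed by a nonlinearity (tanh is an assumption).
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        n_features = X.shape[1]
        # Hidden weights are drawn once at random and never trained.
        self.W = self.rng.normal(size=(n_features, self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # Output weights: least-squares solution via the Moore-Penrose pseudo-inverse.
        self.beta = np.linalg.pinv(H) @ y
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta


# Synthetic stand-in for per-frame arousal predictions from three DBLSTM scales.
rng = np.random.default_rng(42)
n_frames, n_scales = 200, 3
target = np.sin(np.linspace(0, 6, n_frames))
scale_preds = np.stack(
    [target + rng.normal(scale=0.2, size=n_frames) for _ in range(n_scales)],
    axis=1,
)  # shape: (n_frames, n_scales)

# Fuse the per-scale predictions into a single per-frame estimate.
elm = ELMRegressor(n_hidden=32).fit(scale_preds, target)
fused = elm.predict(scale_preds)
```

In a real pipeline, each column of `scale_preds` would hold the output of a DBLSTM trained at a different sequence scale, and the ELM would be fitted on held-out data rather than on the frames it is later evaluated on.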