Speech Emotion Recognition Based on Prosodic Segment Level Features

Li Haifeng
2009-01-01
Abstract: In speech emotion recognition, the emotion features of different emotional utterances are commonly extracted at a single, fixed segment length. This ignores the fact that the prosodic segment length to which the human ear is most sensitive varies from one emotion to another. In the present system, the best segment length for recognizing each emotion was first obtained through experiments. A multi-network model, named the prosodic segment level Elman network, was then proposed to identify emotions using the sensitive prosodic segment level features and then to combine the recognition results of its sub-networks. Tests show that the recognition rate with sensitive prosodic segment level features is 67.9%, much higher than the rate obtained with fixed-length segment level features.
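The idea of per-emotion segment lengths with combined sub-network outputs can be illustrated with a minimal sketch. It is not the paper's implementation: the segment lengths, the two toy features, and the `scorers` callables are all hypothetical placeholders, and a plain scoring function stands in for each Elman sub-network. The combination rule (average the segment scores, pick the highest) is likewise an assumption, since the abstract does not state how results are combined.

```python
from statistics import mean

# Hypothetical per-emotion segment lengths, in frames. The paper obtains
# these experimentally; the values here are placeholders for illustration.
SEGMENT_LENGTHS = {"anger": 4, "sadness": 8, "joy": 6}


def split_segments(frames, length):
    """Split a frame-level sequence into non-overlapping segments.

    A trailing partial segment is dropped.
    """
    return [frames[i:i + length]
            for i in range(0, len(frames) - length + 1, length)]


def segment_features(segment):
    """Toy segment-level features: mean value and range (max - min)."""
    return (mean(segment), max(segment) - min(segment))


def recognize(frames, scorers):
    """Score each emotion at its own segment length, then combine.

    `scorers` maps an emotion name to a callable that takes a feature
    tuple and returns a score. Each callable is a stand-in for one
    sub-network; it is not an Elman network.
    """
    scores = {}
    for emotion, length in SEGMENT_LENGTHS.items():
        segments = split_segments(frames, length)
        if not segments:
            # Utterance shorter than one segment at this length.
            scores[emotion] = 0.0
            continue
        scores[emotion] = mean(
            scorers[emotion](segment_features(s)) for s in segments
        )
    # Assumed combination rule: the emotion with the highest average wins.
    return max(scores, key=scores.get), scores
```

The point of the sketch is only that the same utterance is segmented differently for each emotion before scoring, in contrast to a fixed-length scheme where every emotion shares one segment length.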