Rules Based Feature Modification for Affective Speaker Recognition

Zhaohui Wu, Dongdong Li, Yingchun Yang
DOI: https://doi.org/10.1109/ICASSP.2006.1660107
2006-05-14
Abstract: One of the largest challenges in speaker recognition applications is dealing with speaker-emotion variability. In this paper, we further investigate rule-based feature modification for robust speaker recognition with emotional speech. Specifically, we learn rules for modifying prosodic features from a small number of content-matched source-target pairs. Features carrying emotion information are then derived from the prevalent neutral features by applying the modification rules. The converted features are trained together with the neutral features to build the speaker models. The effects of individual and combined modifications of duration, pitch, and amplitude are also studied using the EPST dataset, recorded by 8 professional actors with 14 kinds of emotional expressiveness. The results demonstrate that duration modifications play the most important role and that pitch modifications are more effective than amplitude modifications. A promising result is achieved: the identification rate improves by 7.83% compared to traditional speaker recognition.
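The pipeline described in the abstract (learn rules from matched pairs, convert neutral features, train on both) can be illustrated with a minimal sketch. The abstract does not state the form of the rules, so everything here is an assumption made for illustration: the rule is taken to be one average emotional-to-neutral ratio per prosodic feature, each utterance is reduced to three utterance-level values, and the names `learn_rules`, `apply_rules`, and `build_training_set` are hypothetical.

```python
from statistics import mean

# Prosodic features named in the abstract.
FEATURES = ("duration", "pitch", "amplitude")


def learn_rules(pairs):
    """Learn one scaling ratio per feature from (neutral, emotional) pairs.

    Assumption: a rule is the mean emotional/neutral ratio over
    content-matched pairs. The paper's actual rule form may differ.
    """
    return {
        key: mean(emotional[key] / neutral[key] for neutral, emotional in pairs)
        for key in FEATURES
    }


def apply_rules(neutral, rules, which=FEATURES):
    """Scale the selected features of a neutral utterance by the learned ratios.

    `which` selects individual or combined modifications, for example
    ("duration",) or ("duration", "pitch").
    """
    return {
        key: value * rules[key] if key in which else value
        for key, value in neutral.items()
    }


def build_training_set(neutral_utterances, rules, which=FEATURES):
    """Return the neutral utterances together with their converted versions."""
    converted = [apply_rules(u, rules, which) for u in neutral_utterances]
    return list(neutral_utterances) + converted
```

Passing a subset of features through `which` mirrors the comparison of individual and combined modifications of duration, pitch, and amplitude; the speaker model itself is outside the scope of this sketch.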
Computer Science