Common Discriminative Latent Space Learning for Cross-Domain Speech Emotion Recognition

Siqi Fu, Peng Song, Hao Wang, Zhaowei Liu, Wenming Zheng
DOI: https://doi.org/10.1109/tcss.2024.3476325
2024-01-01
IEEE Transactions on Computational Social Systems
Abstract: Cross-domain speech emotion recognition (SER) has received increasing attention in recent years. Existing transfer subspace learning and regression-based SER methods have the following drawbacks. The features in the subspace are still insufficiently representative and discriminative, and direct regression would lead to information loss. To address these problems, we present a novel common discriminative latent space learning (CDLSL) method for cross-domain SER. To be specific, we first obtain a common latent space by imposing a projection matrix on the cross-domain data. Meanwhile, we impose an uncorrelated constraint on the projection matrix to ensure that the features are representative and discriminative after dimension reduction. Then, we implement a graph regularization term on the latent representations of the samples to capture the local similarity information. Furthermore, to obtain a more discriminative common latent space, we introduce the label information by aligning the latent space with the relaxed label space, while mitigating the information loss for regression. Extensive experimental results validate the superiority of the proposed method over the state-of-the-art competitors.
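As a rough illustration of the graph regularization step mentioned in the abstract, the sketch below projects samples into a latent space and scores how well neighboring samples stay close there. It is not the authors' formulation: the binary k-nearest-neighbor graph, the unnormalized Laplacian, the function names knn_graph_laplacian and graph_regularizer, and the random orthonormal projection P are all assumptions made for illustration, and the abstract does not specify any of them.

```python
import numpy as np


def knn_graph_laplacian(X, k=5):
    """Unnormalized Laplacian L = D - W of a symmetric, binary kNN graph.

    X has shape (n_samples, n_features). The binary kNN weighting is an
    assumption; the abstract does not say how the graph is built.
    """
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    # Pairwise squared Euclidean distances.
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # exclude self-neighbors
    idx = np.argsort(d2, axis=1)[:, :k]  # k nearest neighbors per row
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    W = np.maximum(W, W.T)  # symmetrize the adjacency matrix
    D = np.diag(W.sum(axis=1))
    return D - W


def graph_regularizer(X, P, L):
    """tr(Z^T L Z) with Z = X P, i.e. 0.5 * sum_ij W_ij * ||z_i - z_j||^2."""
    Z = X @ P  # latent representations, shape (n_samples, latent_dim)
    return float(np.trace(Z.T @ L @ Z))


# Toy data standing in for stacked source- and target-domain features.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 6))
L = knn_graph_laplacian(X, k=3)
# Hypothetical projection with orthonormal columns; a learned projection
# would come from optimizing the full objective, which is not shown here.
P = np.linalg.qr(rng.normal(size=(6, 2)))[0]
penalty = graph_regularizer(X, P, L)
```

A small penalty means that samples which are neighbors in the original feature space remain close after projection, which is the local similarity property the abstract refers to.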