Domain adaptive dual-relaxation regression for speech emotion recognition

Hao Wang, Peng Song, Shenjie Jiang, Run Wang, Shaokai Li, Tao Liu
DOI: https://doi.org/10.1016/j.apacoust.2024.110118
IF: 3.614
2024-06-20
Applied Acoustics
Abstract: With the development of artificial intelligence techniques, speech emotion recognition (SER) has become a significant research problem. Traditional SER methods use data from the same domain for training and testing. In practice, however, the data often come from different domains that differ in language, acquisition method, and speaker characteristics. This domain discrepancy can lower recognition performance and weaken the generalization ability of the model. To tackle this issue, we present a novel domain adaptation (DA) method, called domain adaptive dual-relaxation regression (DADR), for cross-domain SER. Specifically, our approach first introduces a non-negative relaxation matrix to relax the strict zero-one label matrix of the source domain, while learning a relaxed label matrix whose structure is similar to that of the source labels. The relaxed label matrix is then used to learn a more discriminative transformation matrix. Further, we reduce the divergence between the source and target domains by constructing similarity and dissimilarity graphs for the cross-domain data. In addition, we impose an ℓ2,1-norm on the transformation matrix to obtain more discriminative features. Experiments conducted on several cross-domain SER tasks show that our model achieves excellent recognition performance compared with several advanced algorithms.
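Two ingredients named in the abstract, label relaxation with a non-negative matrix and the ℓ2,1-norm, can be illustrated with a small NumPy sketch. The abstract does not give the DADR objective, so this is not the authors' formulation: the relaxation below assumes the common "drag labels apart" form Y + B ⊙ M (B is +1 at the true class and -1 elsewhere, M is non-negative), and the function names `relax_labels` and `l21_norm` are hypothetical.

```python
import numpy as np


def l21_norm(W):
    """l2,1-norm: the sum of the Euclidean norms of the rows of W."""
    return float(np.sum(np.linalg.norm(W, axis=1)))


def relax_labels(Y, M):
    """Relax a zero-one label matrix Y with a non-negative matrix M.

    Assumed form (not taken from the paper): Y + B * M, where B is +1
    at the true class and -1 elsewhere, so each entry moves away from
    the decision boundary while the arg-max class stays the same.
    """
    if np.any(M < 0):
        raise ValueError("relaxation matrix M must be non-negative")
    B = np.where(Y == 1, 1.0, -1.0)
    return Y + B * M


# Two samples, three emotion classes (toy data).
Y = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
M = np.full_like(Y, 0.2)   # toy relaxation amounts
R = relax_labels(Y, M)     # true-class entries grow, the others shrink

# A toy transformation matrix with one all-zero row.
W = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [5.0, 12.0]])
penalty = l21_norm(W)      # row norms are 5, 0, and 13
```

Because the ℓ2,1-norm sums row norms, penalizing it pushes entire rows of the transformation matrix toward zero, which is the usual reason it is used for feature selection. In a full method the relaxation matrix would be learned jointly with the transformation matrix rather than fixed as in this toy example.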
acoustics