SOT: Self-supervised Learning-Assisted Optimal Transport for Unsupervised Adaptive Speech Emotion Recognition

Ruiteng Zhang,Jianguo Wei,Xugang Lu,Yongwei Li,Junhai Xu,Di Jin,Jianhua Tao
DOI: https://doi.org/10.21437/interspeech.2023-1360
2023-01-01
Abstract:In cross-domain speech emotion recognition (SER), reducing the global probability distribution distance (GPDD) between different domains plays a crucial role in unsupervised domain adaptation (UDA), which can be naturally measured by optimal transport (OT). However, owing to the large intra-variations of emotion categories, samples distributed in overlap may induce negative transports. Moreover, OT only considers the GPDD and therefore cannot efficiently transport hard-discriminative samples without utilizing the local structures from intra-class distributions. We propose a self-supervised learning (SSL)assisted optimal transport (SOT) algorithm for cross-domain SER. First, we regularized OT's transport coupling to mitigate negative transports; then, we designed an SSL module to emphasize local intra-class structure to assist OT in capturing those nontransferable acknowledge. Cross-domain SER experimental results showed that SOT dramatically outperformed state-of-the-art UDAs.
What problem does this paper attempt to address?