Robust Semantic Communications for Speech Transmission

Zhenzi Weng, Zhijin Qin
2024-04-25
Abstract: In this paper, we propose a robust semantic communication system for speech transmission, named Ross-S2T, which delivers only the essential semantic information. In particular, we consider speech-to-text translation (S2TT) as the transmission goal. First, a deep semantic encoder is developed to directly convert speech in the source language into textual features associated with the target language, which facilitates end-to-end (E2E) semantic exchange for the S2TT task and reduces the amount of transmitted data without degrading performance. To mitigate the semantic impairments inherent in corrupted speech, a novel generative adversarial network (GAN)-enabled deep semantic compensator is established to estimate the semantic information lost from the speech and extract deep semantic features simultaneously, enabling robust semantic transmission for corrupted speech. Furthermore, a semantic probe-aided compensator is devised to enhance the semantic fidelity of the recovered semantic features and improve the understandability of the target text. According to simulation results, the proposed Ross-S2T achieves superior S2TT performance compared to conventional approaches and is highly robust against semantic impairments.
Audio and Speech Processing