Speech Enhancement Based on CycleGAN with Noise-informed Training

Wen-Yuan Ting, Syu-Siang Wang, Hsin-Li Chang, Borching Su, Yu Tsao
DOI: https://doi.org/10.48550/arXiv.2110.09924
2021-10-19
Audio and Speech Processing
Abstract: Cycle-consistent generative adversarial networks (CycleGAN) have been successfully applied to speech enhancement (SE) tasks with unpaired noisy-clean training data. The CycleGAN SE system adopts two generators and two discriminators trained with losses from noisy-to-clean and clean-to-noisy conversions. CycleGAN has shown promising results on numerous SE tasks. Herein, we investigate a potential limitation of the clean-to-noisy conversion part and propose a novel noise-informed training (NIT) approach to improve the performance of the original CycleGAN SE system. The main idea of the NIT approach is to incorporate target domain information into the clean-to-noisy conversion to facilitate a better training procedure. The experimental results confirm that the proposed NIT approach improves the generalization capability of the original CycleGAN SE system by a notable margin.
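To make the two conversion directions concrete, the following is a minimal toy sketch of a cycle-consistency loss in which the clean-to-noisy generator also receives target domain information. It is not the authors' implementation. The names `g_n2c`, `g_c2n`, and `noise_code` are hypothetical, the generators are fixed arithmetic stand-ins rather than neural networks, representing the target domain information as a single scalar is an assumption made purely for illustration, and the adversarial losses from the two discriminators are omitted.

```python
# Toy sketch only: fixed stand-in "generators" instead of trained networks.
# g_n2c, g_c2n and noise_code are hypothetical names, not from the paper.

def g_n2c(noisy):
    """Stand-in noisy-to-clean generator: removes a fixed offset of 0.5."""
    return [x - 0.5 for x in noisy]


def g_c2n(clean, noise_code):
    """Stand-in clean-to-noisy generator.

    Besides the clean signal it receives `noise_code`, an assumed scalar
    that plays the role of target (noisy) domain information.
    """
    return [x + noise_code for x in clean]


def l1(a, b):
    """Mean absolute difference between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_loss(clean, noisy, noise_code):
    """Cycle-consistency loss over both conversion directions."""
    # clean -> noisy -> clean
    clean_rec = g_n2c(g_c2n(clean, noise_code))
    # noisy -> clean -> noisy
    noisy_rec = g_c2n(g_n2c(noisy), noise_code)
    return l1(clean, clean_rec) + l1(noisy, noisy_rec)


clean = [0.0, 1.0, -1.0]
noisy = [0.5, 1.5]

matched = cycle_loss(clean, noisy, noise_code=0.5)     # code matches the offset
mismatched = cycle_loss(clean, noisy, noise_code=1.0)  # code does not match
print(matched, mismatched)
```

In this toy setting the cycle closes exactly only when the code given to the clean-to-noisy stand-in matches the offset that the noisy-to-clean stand-in removes, which loosely illustrates why informing that conversion about the target domain could help training.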