MetaRL-SE: a few-shot speech enhancement method based on meta-reinforcement learning
Weili Zhou, Ruijie Ji, Jinxiong Lai
DOI: https://doi.org/10.1007/s11042-023-14945-6
IF: 2.577
2023-04-26
Multimedia Tools and Applications
Abstract: The goal of speech enhancement (SE) is to suppress the noise in noisy speech and to improve the quality and intelligibility of the degraded speech. With the development of deep learning, the performance of SE has improved significantly. However, deep learning relies on massive training data, and a lack of data is an important reason why many algorithms fail or struggle. To address this problem, this paper proposes a novel meta-reinforcement learning framework that focuses on few-shot learning for speech enhancement. First, a reinforcement-learning-based meta-learner is proposed, which initializes its actions as a finite set of time-frequency (T-F) masks, and the corresponding action-value function is developed. Second, to optimize the model, the reward calculation for reinforcement learning is designed around user perception. Third, the model-agnostic meta-learning (MAML) algorithm is applied to make full use of the limited data and to improve how well the meta-learner generalizes to new tasks. The experimental results show that, in terms of subjective and objective measurements, this work achieves improvements of 1.3% to 12.5% in the 1-shot case and 3.1% to 14.3% in the 5-shot case over state-of-the-art DNN-based SE methods in challenging conditions, where the environmental noises are diverse and the signals are non-stationary.
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering
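
The abstract describes actions drawn from a finite set of T-F masks and scored by an action-value function. The following is a minimal sketch of that idea only, not the authors' implementation: the linear Q-function, the per-frame greedy selection, the random codebook, and all names (select_masks, codebook, weights) are assumptions made for illustration. The abstract does not specify the network, the reward, or the MAML update, so none of those are shown.

```python
import numpy as np

def select_masks(noisy_mag, codebook, weights):
    """Pick one candidate mask per frame by greedy action-value maximization.

    noisy_mag: (frames, bins) magnitude spectrogram of the noisy speech
    codebook:  (actions, bins) finite set of candidate T-F masks in [0, 1]
    weights:   (bins, actions) parameters of a hypothetical linear Q-function
    """
    q_values = noisy_mag @ weights            # (frames, actions) action values
    actions = np.argmax(q_values, axis=1)     # greedy action index per frame
    enhanced = noisy_mag * codebook[actions]  # apply the chosen mask to each frame
    return actions, enhanced

# Toy data standing in for a real spectrogram and a learned codebook (assumed).
rng = np.random.default_rng(0)
frames, bins, n_actions = 4, 8, 3
noisy_mag = np.abs(rng.normal(size=(frames, bins)))
codebook = rng.uniform(0.0, 1.0, size=(n_actions, bins))
weights = rng.normal(size=(bins, n_actions))

actions, enhanced = select_masks(noisy_mag, codebook, weights)
```

Because every mask value lies in [0, 1], the enhanced magnitudes never exceed the noisy ones. In a few-shot setting along the lines the abstract describes, the Q-function parameters would be the quantities adapted from a small number of examples per task.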