Abstract: Speech enhancement is an important preprocessing step in a wide range of practical applications involving speech signals, and many signal-processing methods have been proposed for it. However, the lack of a comprehensive, quantitative evaluation of enhancement performance for multi-speech makes it difficult to choose an appropriate method for a multi-speech application. This work studies the implementation of several methods for multi-speech enhancement in indoor environments with reverberation times of T60 = 0 s and T60 = 0.3 s. Two types of enhancement approaches are proposed and compared. The first type comprises basic enhancement methods: delay-and-sum beamforming (DSB), minimum variance distortionless response (MVDR), linearly constrained minimum variance (LCMV), and independent component analysis (ICA). The second type comprises robust enhancement methods: improved MVDR and LCMV realized by eigendecomposition and diagonal loading. In addition, online enhancement based on the iteration of single-frame speech signals is investigated, as is the comprehensive performance of the various enhancement methods. The experimental results show that, among the basic methods, LCMV and ICA give relatively more stable enhancement; among the improved algorithms, methods that employ diagonal loading iterations perform better. For online enhancement, DSB with frequency masking (FM) yields the best signal-to-interference ratio (SIR) and can suppress interference. The comprehensive performance test shows that LCMV and ICA yield the best results when there is no reverberation, while DSB with FM yields the best SIR when reverberation is present.
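To make two of the terms in the abstract concrete, the following is a minimal sketch of delay-and-sum beamforming and of an SIR calculation. It is not the authors' implementation. The array geometry (a four-microphone uniform linear array with 5 cm spacing), the 1 kHz test frequency, the source angles, the equal-power sources, and all function names are assumptions made for illustration only.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def steering_vector(num_mics, spacing_m, angle_deg, freq_hz):
    """Far-field steering vector for a uniform linear array (assumed geometry)."""
    # Inter-microphone delay for a plane wave arriving from angle_deg.
    delay_step = spacing_m * math.cos(math.radians(angle_deg)) / SPEED_OF_SOUND
    return [cmath.exp(-2j * math.pi * freq_hz * m * delay_step)
            for m in range(num_mics)]


def dsb_weights(steer):
    """Delay-and-sum weights: w = d / M, which aligns and averages the channels."""
    return [d / len(steer) for d in steer]


def beam_response(weights, steer):
    """Array response w^H d to a source with steering vector d."""
    return sum(w.conjugate() * d for w, d in zip(weights, steer))


def sir_db(target_power, interference_power):
    """Signal-to-interference ratio in decibels."""
    return 10.0 * math.log10(target_power / interference_power)


# Hypothetical scene: target at broadside (90 degrees), interferer at 30 degrees.
target = steering_vector(4, 0.05, 90.0, 1000.0)
interferer = steering_vector(4, 0.05, 30.0, 1000.0)

w = dsb_weights(target)
gain_target = abs(beam_response(w, target)) ** 2
gain_interf = abs(beam_response(w, interferer)) ** 2

# Output SIR at this frequency, assuming both sources have equal input power.
output_sir = sir_db(gain_target, gain_interf)
print(f"Output SIR: {output_sir:.2f} dB")
```

The target passes with unit gain because the weights match its steering vector, while the interferer is only partially attenuated. This limited suppression is one reason adaptive designs such as MVDR and LCMV are also considered.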

Speech enhancement based on Sparse Code Shrinkage employing multiple speech models

An Approach of Speech Enhancement by Sparse Code Shrinkage

Speech Enhancement with a GSC-like Structure Employing Sparse Coding

ICA-based MAP Speech Enhancement with Multiple Variable Speech Distribution Models

A Modified Speech Enhancement Algorithm Using a Universal Speaker Model

Speech enhancement based on estimating expected values of speech cepstra

Sparse representations for speech enhancement

Supervised Single Channel Dual Domains Speech Enhancement Using Sparse Non-Negative Matrix Factorization

Speech Enhancement Based on Nonparametric Bayesian Method

Exploring Conventional Enhancement and Separation Methods for Multi-speech Enhancement in Indoor Environments

Speech Enhancement for Nonstationary Noise Environments

Speech Enhancement Using Group Complementary Joint Sparse Representations in Modulation Domain

Speech Enhancement Algorithm Based on Spectral Subtraction

Densely Connected Multi-Stage Model with Channel Wise Subband Feature for Real-Time Speech Enhancement

Multiple Modules Speech Enhancement in Mixed Noise and Low SNR Environments

Supervised Monaural Speech Enhancement Using Complementary Joint Sparse Representations

A Speech Enhancement Algorithm Based on Computational Auditory Scene Analysis

Speech Enhancement with Intelligent Neural Homomorphic Synthesis

Study on Sparse Coding Speech Enhancement

Speech Enhancement by Denoising and Dereverberation Using a Generalized Sidelobe Canceller-Based Multichannel Wiener Filter

Supervised Monaural Speech Enhancement Using Two-Level Complementary Joint Sparse Representations