Abstract: In this paper, we propose a novel single-channel speech enhancement algorithm that combines dual-domain transforms, namely the dual-tree complex wavelet transform (DTCWT) and the short-time Fourier transform (STFT), with sparse non-negative matrix factorization (SNMF). In the first domain, the DTCWT is applied to the time-domain noisy signal to produce a set of subband signals while overcoming the signal distortion caused by the downsampling in the discrete wavelet packet transform (DWPT). In the second domain, the STFT is applied to each subband signal to build a complex spectrogram. Taking the absolute value of each complex spectrogram yields a nonnegative magnitude matrix, to which we apply SNMF to identify the speech components. Finally, the estimated signal is obtained through a subband binary ratio mask (SBRM), followed by the inverse STFT (ISTFT) and then the inverse DTCWT (IDTCWT). The proposed approach is evaluated on the GRID audio-visual and IEEE databases with stationary, non-stationary, and quasi-stationary noises. The experimental results show that the proposed algorithm improves objective speech quality and intelligibility at all considered signal-to-noise ratios (SNRs) compared with seven other speech enhancement methods: STFT-SNMF, STFT-SNMFSE, MLD-STFT-SNMF, STFT-GDL, STFT-CJSR, DTCWT-SNMF, and DWPT-STFT-SNMF.
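The SNMF masking stage described above can be illustrated with a minimal sketch for a single subband magnitude matrix. This is not the authors' implementation. The DTCWT and STFT stages are omitted, the Euclidean cost with an L1 penalty on the activations is an assumed SNMF variant, and the soft ratio mask stands in for the SBRM, whose exact definition the abstract does not give. All function names, parameters, and the synthetic data are hypothetical.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def sparse_nmf(V, rank, sparsity=0.1, n_iter=200, W=None, seed=0):
    """Approximate V by W @ H with an L1 penalty on H (multiplicative updates).

    Assumed variant: Euclidean cost, unit-norm basis columns.
    If W is supplied it is held fixed and only H is estimated.
    """
    rng = np.random.default_rng(seed)
    learn_basis = W is None
    if learn_basis:
        W = rng.random((V.shape[0], rank)) + EPS
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        if learn_basis:
            W = W * (V @ H.T) / (W @ H @ H.T + EPS)
            W = W / (np.linalg.norm(W, axis=0, keepdims=True) + EPS)
        # The sparsity weight in the denominator shrinks the activations.
        H = H * (W.T @ V) / (W.T @ W @ H + sparsity + EPS)
    return W, H


def ratio_mask(noisy_mag, W_speech, W_noise, sparsity=0.1):
    """Soft mask from pre-trained speech and noise bases (stand-in for SBRM)."""
    W = np.hstack([W_speech, W_noise])
    _, H = sparse_nmf(noisy_mag, W.shape[1], sparsity, W=W)
    k = W_speech.shape[1]
    speech = W_speech @ H[:k]   # speech part of the reconstruction
    noise = W_noise @ H[k:]     # noise part of the reconstruction
    return speech / (speech + noise + EPS)


# Synthetic magnitudes: "speech" occupies bins 0-7, "noise" bins 8-15.
rng = np.random.default_rng(1)
speech_train = np.zeros((16, 40))
speech_train[:8] = rng.random((8, 40))
noise_train = np.zeros((16, 40))
noise_train[8:] = rng.random((8, 40))

W_s, _ = sparse_nmf(speech_train, rank=4)  # training stage: speech basis
W_n, _ = sparse_nmf(noise_train, rank=4)   # training stage: noise basis

noisy = rng.random((16, 10))               # test mixture with both parts active
mask = ratio_mask(noisy, W_s, W_n)
enhanced = mask * noisy                    # masked magnitude for this subband
```

In a full pipeline, a mask of this kind would be computed for each DTCWT subband and applied to the subband's complex spectrogram before the ISTFT and IDTCWT. Thresholding the soft mask would give a binary variant.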

Single Channel Speech Enhancement Using Outlier Detection

Speech Enhancement for Nonstationary Noise Environments

A Novel Speech Enhancement Algorithm Based on Data Field

Supervised Single-Channel Speech Dereverberation And Denoising Using A Two-Stage Processing

Distortionless Multi-Channel Target Speech Enhancement for Overlapped Speech Recognition

Online Monaural Speech Enhancement Based on Periodicity Analysis and A Priori SNR Estimation

Speech Enhancement for Non-Stationary Noise Environments

Supervised Single-Channel Speech Dereverberation and Denoising Using a Two-Stage Model Based Sparse Representation.

Supervised Single Channel Dual Domains Speech Enhancement Using Sparse Non-Negative Matrix Factorization

Noise Reduction Using Sparsity Constrained and Regularized Iterative Thresholding Algorithm and Dictionary

Densely Connected Multi-Stage Model with Channel Wise Subband Feature for Real-Time Speech Enhancement.

Speech Enhancement Algorithm Based on Spectral Subtraction

Adaptive two-channel speech enhancement algorithm based on the modulation spectrum

Exploring the Potential of Data-Driven Spatial Audio Enhancement Using a Single-Channel Model

Speech Enhancement by Denoising and Dereverberation Using a Generalized Sidelobe Canceller-Based Multichannel Wiener Filter

Enhancement Algorithm for Low Signal to Noise Ratio Speech

Speech Enhancement Based On Dynamic Noise Estimation Within Auto-Correlation Domain

Unsupervised speech enhancement with spectral kurtosis and double deep priors

Multiframe Maximum Likelihood Distortionless Response Filter for Single-Channel Speech Enhancement

Speech Enhancement Algorithm Using Wiener Filtering Based on Improved Energy to Entropy Ratio

A Higher Order Subspace Algorithm for Multichannel Speech Enhancement