Abstract: Estimating unseen noise is a key yet challenging step in making a speech enhancement algorithm work in adverse environments. In the worst case, the only prior knowledge available about the encountered noise is that it differs from the speech involved. The noise can therefore be estimated and removed by subtracting the components that cannot be adequately represented by a well-defined speech model. Given the good performance of deep learning in signal representation, a deep auto encoder (DAE) is employed in this work to model the clean speech spectrum accurately. In the subsequent speech enhancement stage, an extra DAE is introduced to represent the residual obtained by subtracting the estimated clean speech spectrum (produced by the pre-trained DAE) from the noisy speech spectrum. By adjusting the estimated clean speech spectrum and the unknown parameters of the noise DAE, one can reach a stationary point in minimizing the total reconstruction error of the noisy speech spectrum. The enhanced speech signal is then obtained by transforming the estimated clean speech spectrum back into the time domain. The proposed technique is called the separable deep auto encoder (SDAE). Because this optimization problem is under-determined, the clean speech reconstruction is confined to the convex hull spanned by a pre-trained speech dictionary. New learning algorithms are investigated to respect the non-negativity of the parameters in the SDAE. Experimental results on TIMIT with 20 noise types at various noise levels demonstrate the superiority of the proposed method over the conventional baselines.
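
The convex-hull constraint mentioned in the abstract can be illustrated with a small NumPy sketch. This is a simplified illustration under stated assumptions, not the SDAE itself: it omits both auto encoders, treats the non-negative residual as the noise estimate, and fits the speech spectrum with projected gradient descent, which is a swapped-in solver rather than the learning algorithm of the paper. The names `project_to_simplex`, `estimate_speech_and_noise`, `D`, `y`, and `n_iter` are hypothetical.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]                      # entries in descending order
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]  # last index that stays positive
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)


def estimate_speech_and_noise(y, D, n_iter=500):
    """Fit y with a convex combination of dictionary atoms.

    y : (F,) noisy magnitude spectrum of one frame (assumed input).
    D : (F, K) speech dictionary, one atom per column (assumed to be
        pre-trained elsewhere; hypothetical here).
    """
    n_atoms = D.shape[1]
    w = np.full(n_atoms, 1.0 / n_atoms)       # start at the simplex centre
    # Step size 1/L, where L = ||D||_2^2 is the Lipschitz constant of the
    # gradient of 0.5 * ||D w - y||^2.
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)
    for _ in range(n_iter):
        grad = D.T @ (D @ w - y)
        w = project_to_simplex(w - step * grad)
    speech = D @ w                            # lies in the convex hull of the atoms
    noise = np.maximum(y - speech, 0.0)       # what the speech model cannot explain
    return speech, noise, w
```

Restricting the weights to the simplex keeps the speech estimate inside the convex hull of the dictionary atoms, so energy that no convex combination of atoms can explain stays in the residual instead of being absorbed into the speech estimate.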

Robust Sound Event Classification by Using Denoising Autoencoder

Robust Polyphonic Sound Event Detection by Using Multi Frame Size Denoising Autoencoder

Robust sound event classification using deep neural networks

A Label Noise Robust Stacked Auto-Encoder Algorithm for Inaccurate Supervised Classification Problems

DenoiSpeech: Denoising Text to Speech with Frame-Level Noise Modeling

Robust Audio Sensing with Multi-Sound Classification

MFCC combined with sparse coding for sound event classification under different noise environments

Ultrasonic signal denoising based on autoencoder

Robustness of Neural Architectures for Audio Event Detection

Robust Sound Event Classification with Bilinear Multi-Column ELM-AE and Two-Stage Ensemble Learning

Deep Learning Applied to Dereverberation and Sound Event Classification in Reverberant Environments

Multilayered convolutional neural network-based auto-CODEC for audio signal denoising using mel-frequency cepstral coefficients

DENet: a deep architecture for audio surveillance applications

Environmental Noise Reduction based on Deep Denoising Autoencoder

Denoising Auto-Encoders Toward Robust Unsupervised Feature Representation

Sparse Coding for Sound Event Classification

Unseen Noise Estimation Using Separable Deep Auto Encoder for Speech Enhancement

A Denoising Method Based on DDPM for Radar Emitter Signal Intra-Pulse Modulation Classification

Wiener filtering based speech enhancement with Weighted Denoising Auto-encoder and noise classification

Spectral Denoising for Microphone Classification

Sound Event Detection for Human Safety and Security in Noisy Environments