Abstract: Single-channel speech enhancement is a widely studied problem in speech processing and related fields, but the traditional research direction is to improve the data structure, an approach that relies heavily on training datasets. This paper proposes and tests a method for improving traditional speech enhancement algorithms based on the features of speech and hearing. The method uses the fundamental frequency (F0) and harmonic features of speech to emphasize the fundamental frequency (EFF), much as the human auditory system does through the neural lateral inhibition mechanism. It therefore does not depend on training data and scales well, so it can be easily embedded in traditional algorithms. The experiments use a Chinese speech library, an English speech library, and a variety of noise types. In testing, the method improves the speech enhancement performance of the original algorithm, especially at low SNR. Because it is an additional processing step, however, at high SNR the gain in speech intelligibility, as measured by STOI scores, is sometimes small compared with the gain in perceptual quality. Auditory perception indices such as PESQ scores are nevertheless significantly improved, and WSS scores are reduced as desired. After the algorithm is embedded into DNN-based or SNMF-based speech enhancement algorithms, the improvement in PESQ scores increases by about 5% on average and WSS scores are reduced, with less negative impact on the STOI gain at high SNR. The EFF is not tied to the training process of the model, yet it improves PESQ and STOI scores and lowers WSS scores. This suggests that the fundamental frequency is an important feature in speech processing that affects speech quality, and that it should be actively introduced as a feature in speech processing.
The proposed algorithm is evaluated on both Chinese and English speech datasets. The results show significant improvement over the traditional algorithms.
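To make the idea of emphasizing F0 and its harmonics concrete, the following is a minimal sketch of one possible post-processing step applied to a single magnitude-spectrum frame. It is not the paper's algorithm: the function names, the Gaussian weighting, and the `gain` and `width_hz` parameters are all assumptions introduced for illustration, and the F0 estimate is taken as given.

```python
import numpy as np


def harmonic_emphasis_weights(f0_hz, sample_rate, n_fft, gain=1.5, width_hz=20.0):
    """Per-bin weights that reach `gain` at multiples of f0 and decay to 1 elsewhere.

    Hypothetical helper: the Gaussian shape, `gain`, and `width_hz` are
    illustrative assumptions, not values taken from the paper.
    """
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    if f0_hz <= 0:
        # Unvoiced frame (no F0 estimate): leave the spectrum untouched.
        return np.ones_like(freqs)
    # Index of the nearest harmonic k * f0 for every bin, with k >= 1.
    k = np.maximum(np.round(freqs / f0_hz), 1.0)
    dist = np.abs(freqs - k * f0_hz)
    return 1.0 + (gain - 1.0) * np.exp(-0.5 * (dist / width_hz) ** 2)


def emphasize_f0(mag, f0_hz, sample_rate, n_fft, **kwargs):
    """Scale one magnitude-spectrum frame by the harmonic emphasis weights."""
    return mag * harmonic_emphasis_weights(f0_hz, sample_rate, n_fft, **kwargs)


# Usage: boost the harmonics of a 125 Hz voice in a flat test spectrum.
frame = np.ones(257)  # n_fft // 2 + 1 bins for n_fft = 512
enhanced = emphasize_f0(frame, f0_hz=125.0, sample_rate=16000, n_fft=512)
```

Because the weights depend only on the estimated F0 and not on any trained parameters, a step of this kind could in principle be applied to the output of a DNN-based or SNMF-based enhancer without retraining, which matches the embedding described in the abstract.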

A Refining Underlying Information Framework for Monaural Speech Enhancement

Speech Enhancement with Perceptually-motivated Optimization and Dual Transformations

Densely Connected Multi-Stage Model with Channel Wise Subband Feature for Real-Time Speech Enhancement

Improving Deep Neural Network Based Speech Enhancement in Low SNR Environments

Speech enhancement from fused features based on deep neural network and gated recurrent unit network

Speech enhancement based on emphasizing the fundamental frequency integrated with SNMF/DNN

Dual-stream Noise and Speech Information Perception based Speech Enhancement

A Lightweight and Real-Time Binaural Speech Enhancement Model with Spatial Cues Preservation

Plugin Speech Enhancement: A Universal Speech Enhancement Framework Inspired by Dynamic Neural Network

A Supervised Speech Enhancement Method for Smartphone-Based Binaural Hearing Aids

A regression approach to speech enhancement based on deep neural networks

A speech enhancement model based on noise component decomposition: Inspired by human cognitive behavior

On monoaural speech enhancement for automatic recognition of real noisy speech using mixture invariant training

Unifying Speech Enhancement and Separation with Gradient Modulation for End-to-End Noise-Robust Speech Separation

An Iterative Post-processing Approach for Speech Enhancement

Multi-Objective Learning and Mask-Based Post-Processing for Deep Neural Network Based Speech Enhancement

Audio Enhancement for Computer Audition -- An Iterative Training Paradigm Using Sample Importance

Speech Enhancement Based On Analysis Synthesis Framework With Improved Pitch Estimation And Spectral Envelope Enhancement

Deep Noise Tracking Network: A Hybrid Signal Processing/Deep Learning Approach to Speech Enhancement

Speech Enhancement Based on Analysis–Synthesis Framework with Improved Parameter Domain Enhancement

Parallel Gated Neural Network With Attention Mechanism For Speech Enhancement