Abstract: Speech activity detection aims to distinguish the speech and non-speech sections in audio data. This technology has been widely used in speech recognition, speech enhancement and speaker diarization, where most state-of-the-art systems adopt multiple-threshold methods, noise reduction, Gaussian Mixture Models (GMM) or Deep Neural Networks (DNN). Because speech activity detection and speaker localization form the front end of these applications, their precision seriously affects overall system performance. However, overcoming the interference caused by indoor reverberation and environmental noise remains the bottleneck in improving single-channel detection accuracy. A distributed microphone system consists of microphones scattered around the same room or space, each with its own device for collecting data. It can exploit the time delays of the sound source to suppress interference from non-speech signals, and it imposes no prior requirements on microphone location or synchronization, both of which are strictly constrained in a microphone array. Because of this convenience, distributed microphone systems are increasingly applied in smart homes, in-vehicle hands-free communication and monitoring. In this paper, an enhanced Long Short-Term Memory Recurrent Neural Network (LSTM-RNN) method based on distributed microphones is proposed and compared with the same method applied to a single channel. On several distributed-microphone datasets, the proposed method improves detection precision and recall by up to twenty-four percent and eighteen percent, respectively. At the same time, the accuracy of 3D-coordinate speaker localization is shown to rise by thirty percent.
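The abstract refers to the time delay with which a sound source reaches different microphones. As a rough illustration only, and not the method proposed in the paper (whose details are not given here), the sketch below estimates that delay between two channels by brute-force cross-correlation. The function name `estimate_delay` and the toy signals `mic_a` and `mic_b` are hypothetical and chosen for this example.

```python
def estimate_delay(ref, other, max_lag):
    """Return the lag (in samples) at which `other` best aligns with `ref`.

    A positive result means the sound reaches `other` later than `ref`.
    This is plain cross-correlation, assumed here for illustration only.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, x in enumerate(ref):
            m = n + lag
            if 0 <= m < len(other):
                score += x * other[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Toy signals: the same short burst arrives 3 samples later at the second microphone.
burst = [0.0, 1.0, -0.5, 0.8, -0.2, 0.0]
mic_a = burst + [0.0] * 6
mic_b = [0.0] * 3 + burst + [0.0] * 3

delay = estimate_delay(mic_a, mic_b, max_lag=5)
print(delay)  # 3
```

In a real room, reverberation and noise blur the correlation peak, which is one reason learned models such as the LSTM-RNN mentioned in the abstract are used instead of a fixed rule like this one.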

Model-based distributed node clustering and multi-speaker speech presence probability estimation in wireless acoustic sensor networks

Model-Based Voice Activity Detection in Wireless Acoustic Sensor Networks.

Distributed Spatial Correlation-Based Clustering for Approximate Data Collection in WSNs

Hierarchical Spatial Clustering in Multi-Hop Wireless Sensor Networks

Distributed Energy-Saving Speech Enhancement in Wireless Acoustic Sensor Networks

Distributed Sensor Selection for Speech Enhancement With Acoustic Sensor Networks

On Sparse Bayesian Spreading Function Estimation Based Iterative Detection in Multiple-Input Multiple-Output Underwater Acoustic Communications

A Multi-Task Scheme for Supervised DNN-Based Single-Channel Speech Enhancement by Using Speech Presence Probability As the Secondary Training Target

Wave-domain active noise control over distributed networks of multi-channel nodes

Parallel processing of distributed beamforming and multichannel linear prediction for speech denoising and dereverberation in wireless acoustic sensor networks

Distributed Speech Dereverberation Using Weighted Prediction Error

Multi-task single channel speech enhancement using speech presence probability as a secondary task training target

Speech Activity Detection and Speaker Localization Based on Distributed Microphones.

Smoothed Frame-Level SINR and Its Estimation for Sensor Selection in Distributed Acoustic Sensor Networks

Distributed, Robust Acoustic Source Localization in a Wireless Sensor Network

One-Shot Distributed Node-Specific Signal Estimation with Non-Overlapping Latent Subspaces in Acoustic Sensor Networks

Distributed speech separation in spatially unconstrained microphone arrays

Decentralized Robust Acoustic Source Localization with Wireless Sensor Networks for Heavy-Tail Distributed Observations

Distributed Marginalized Auxiliary Particle Filter for Speaker Tracking in Distributed Microphone Networks

A Joint Model and Data Driven Method for Distributed Estimation

DNN-based mask estimation for distributed speech enhancement in spatially unconstrained microphone arrays