Abstract: Speech activity detection aims to distinguish speech from non-speech segments in audio data. The technology has been widely used in speech recognition, speech enhancement and speaker diarization, where most systems adopt multi-threshold methods, noise reduction, Gaussian Mixture Models (GMM) or Deep Neural Networks (DNN) as the state of the art. Because it serves as the front end of these applications, the precision of speech activity detection and speaker localization seriously affects overall system performance. However, overcoming the interference caused by indoor reverberation and environmental noise remains the bottleneck in improving single-channel detection accuracy. A distributed microphone system consists of microphones scattered across the same room or space, each with its own device for collecting data. It can exploit the time delays of the sound source to suppress interference from non-speech signals, and it imposes no prior requirements on microphone location or synchronization, both of which are strictly constrained in a microphone array. Because of this convenience, distributed microphone systems are increasingly applied in smart homes, in-vehicle hands-free communication and monitoring. In this paper, an enhanced Long Short-Term Memory Recurrent Neural Network (LSTM-RNN) method based on distributed microphones is proposed and compared with the same method on a single channel. On several distributed-microphone datasets, the proposed method improves detection precision and recall by up to twenty-four percent and eighteen percent, respectively. In addition, the accuracy of 3D-coordinate speaker localization is shown to improve by thirty percent.
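The time-delay cue mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's method: it assumes two recordings that share a sample rate and a common time base, and it uses plain cross-correlation to estimate how many samples one microphone lags the other. The function name `cross_correlation_delay`, the synthetic white-noise source, the 7-sample delay and the 16 kHz sample rate are all hypothetical choices made for illustration.

```python
import random


def cross_correlation_delay(ref, sig, max_lag):
    """Return the lag in samples, within [-max_lag, max_lag], that maximizes
    the cross-correlation of sig against ref. Positive means sig lags ref."""
    n = min(len(ref), len(sig))
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:  # skip sample pairs that fall outside the recording
                score += ref[i] * sig[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical data: a white-noise source that reaches microphone B
# 7 samples later than microphone A.
random.seed(0)
true_delay = 7
mic_a = [random.gauss(0.0, 1.0) for _ in range(400)]
mic_b = [0.0] * true_delay + mic_a[:-true_delay]

estimated_lag = cross_correlation_delay(mic_a, mic_b, max_lag=20)
sample_rate = 16000  # assumed sample rate in Hz
delay_seconds = estimated_lag / sample_rate
```

With real distributed devices the recordings are generally not synchronized, so an estimate like this would also absorb clock offsets between devices; the sketch only shows the basic correlation step.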

Architecture Analysis and Realization of Distributed Speech Recognition System

3D Audio Rendering in Distributed Virtual Environment

Design and implementation of a speaker recognition system

Research on Speaker-Depended Isolated-Word Speech Recognition System

The Design of MDF Remote Intelligent Supervisor System with S3C44B0X

Design and Implementation of Distributed Raster Spatial Database Engine

Real-Time Speech Recognition Method for Embedded System

Design and implementation of real-time telephone speech recognition system using DSP TMS320C31

Data Acquisition and Distributed Computing System Based on Embedded Network

Embedded Stream Media Servers Building Based on SIP and Multi-Core Architecture

Research and Implementation of Intelligent Server Group

Digital Speech Recognition Based on DDBHMM

Study of speech recognition system based on private automatic branch exchange

A Noise Robust Front End Algorithm for Mandarin Speech Recognition and Performance Analysis

A Deep Analysis of Speech Separation Guided Diarization Under Realistic Conditions

Study on Hierarchical Speech Recognition

Speech Selection and Environmental Adaptation for Asynchronous Speech Recognition

On the Implementation of DMR System Based on SDR Structure

A Speech Recognition System Based on a Hybrid HMM/SVM Architecture

Speech Activity Detection and Speaker Localization Based on Distributed Microphones.

Decision Support System Based on Remote Data Collection