Abstract: Wireless acoustic sensor networks are nowadays an essential tool for monitoring and managing noise pollution in cities. The increased computing capacity of the nodes that make up the network allows the addition of processing algorithms and artificial intelligence that provide more information about the sound sources and the environment, e.g., detecting sound events or calculating loudness. Several models are available to predict sound pressure levels in cities, mainly for road, railway and air traffic noise. However, these models are mostly based on auxiliary data, e.g., vehicle flow or street geometry, and predict long-term equivalent levels. Therefore, forecasting of short-term sound levels could be a helpful tool for urban planners and managers. In this work, a Long Short-Term Memory (LSTM) deep neural network technique is proposed to model the temporal behavior of sound levels at a given location, both sound pressure level and loudness level, in order to predict near-future values. The proposed technique can be trained for and integrated into every node of a sensor network to provide novel functionalities, e.g., early warning against noise pollution and backup in case of node or network malfunction. To validate this approach, one-minute equivalent sound levels, captured during a two-month measurement campaign by a node of a deployed network of acoustic sensors, were used to train it and to obtain different forecasting models. The developed LSTM models and autoregressive integrated moving average (ARIMA) models were assessed on predicting sound levels over several time periods, from 1 to 60 min. Comparison of the results shows that the LSTM models outperform the statistics-based models. In general, the LSTM models achieve a mean square error of less than 4.3 dB for sound pressure level and less than 2 phons for loudness. Moreover, the goodness of fit of the LSTM models and the behavior pattern of the data in terms of prediction of sound levels are satisfactory.
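
As an illustration of the forecasting setup described in the abstract, the following minimal sketch shows one way a series of one-minute levels could be split into input windows and targets, and how a mean square error could be computed for a given prediction horizon. It is not the authors' implementation: the function names, the window length, and the sample values are hypothetical, and a naive persistence forecast (repeating the last observed value) stands in for the LSTM and ARIMA models, which are not reproduced here.

```python
from typing import List, Sequence, Tuple


def make_windows(
    levels: Sequence[float], n_in: int, horizon: int
) -> Tuple[List[List[float]], List[float]]:
    """Split a level series into (input window, target) pairs.

    Each input holds n_in consecutive values; the target is the value
    `horizon` steps after the last value of that input window.
    """
    inputs: List[List[float]] = []
    targets: List[float] = []
    for start in range(len(levels) - n_in - horizon + 1):
        end = start + n_in
        inputs.append(list(levels[start:end]))
        targets.append(levels[end + horizon - 1])
    return inputs, targets


def mean_squared_error(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Average of the squared differences between predictions and observations."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


def persistence_forecast(inputs: Sequence[Sequence[float]]) -> List[float]:
    """Stand-in model (an assumption): predict the last value of each window."""
    return [window[-1] for window in inputs]


# Made-up one-minute levels in dB, for illustration only.
levels = [60.0, 61.0, 63.0, 62.0, 64.0, 66.0]

# Hypothetical settings: 3 past minutes as input, predict 1 minute ahead.
X, y = make_windows(levels, n_in=3, horizon=1)
baseline = persistence_forecast(X)
mse = mean_squared_error(baseline, y)
print(f"windows: {len(X)}, persistence MSE: {mse:.2f}")
```

Changing `horizon` from 1 up to 60 would correspond to the range of prediction periods mentioned in the abstract; a trained model would replace `persistence_forecast` while the windowing and error computation stay the same.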

Recurrent Neural Networks and Acoustic Features for Frame-Level Signal-to-Noise Ratio Estimation

Frame-Level Signal-to-Noise Ratio Estimation Using Deep Learning

Frame Stacking and Retaining for Recurrent Neural Network Acoustic Model

Learning Frame-Level Recurrent Neural Networks Representations for Query-by-Example Spoken Term Detection on Mobile Devices

Improving Deep Neural Network Based Speech Enhancement in Low SNR Environments

A New Real-Time Noise Suppression Algorithm for Far-Field Speech Communication Based on Recurrent Neural Network

A Progressive Learning Approach to Adaptive Noise and Speech Estimation for Speech Enhancement and Noisy Speech Recognition

Speech Enhancement with LSTM Recurrent Neural Networks and its Application to Noise-Robust ASR

Automated Call Detection for Acoustic Surveys with Structured Calls of Varying Length

Towards Efficient Recurrent Architectures: A Deep LSTM Neural Network Applied to Speech Enhancement and Recognition

Sound Levels Forecasting in an Acoustic Sensor Network Using a Deep Neural Network

Advanced Recurrent Network-Based Hybrid Acoustic Models for Low Resource Speech Recognition

Deep Recurrent Neural Networks for Acoustic Modelling

Recurrent Neural Network Based Link Quality Prediction for Wireless Sensor Networks

Self-attending RNN for Speech Enhancement to Improve Cross-corpus Generalization

Recurrent Neural Networks with Stochastic Layers for Acoustic Novelty Detection

Deep causal speech enhancement and recognition using efficient long-short term memory Recurrent Neural Network

FRCRN: Boosting Feature Representation Using Frequency Recurrence for Monaural Speech Enhancement

Deep Long Short-Term Memory Adaptive Beamforming Networks For Multichannel Robust Speech Recognition

Deep Recurrent Convolutional Neural Network: Improving Performance For Speech Recognition

Acoustic characterization of speech rhythm: going beyond metrics with recurrent neural networks