Neural Network-based Acoustic Vehicle Counting

Slobodan Djukanović, Yash Patel, Jiří Matas, Tuomas Virtanen
DOI: https://doi.org/10.48550/arXiv.2010.11659
2021-03-27
Abstract: This paper addresses acoustic vehicle counting using one-channel audio. We predict the pass-by instants of vehicles from local minima of the clipped vehicle-to-microphone distance. This distance is predicted from audio using a two-stage (coarse-fine) regression, with both stages realised via neural networks (NNs). Experiments show that the NN-based distance regression far outperforms the previously proposed support vector regression. The $95\%$ confidence interval for the mean of the vehicle counting error is within $[0.28\%, -0.55\%]$. Besides the minima-based counting, we propose a deep learning counting approach that operates on the predicted distance without detecting local minima. Although it is less accurate than the minima-based approach, deep counting has the significant advantage that it does not depend on minima detection parameters. Results also show that removing low frequencies from the features improves the counting performance.
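The minima-based counting idea can be illustrated with a minimal sketch. It is not the authors' implementation: the distance here is synthetic rather than predicted from audio by a neural network, and it is assumed, for illustration only, to be the time to the nearest pass-by instant clipped at a fixed value. The function names and the `clip`, `threshold` and `min_gap` parameters are hypothetical.

```python
def clipped_distance(times, pass_by_instants, clip):
    """Time to the nearest pass-by instant, clipped at `clip` (assumed definition)."""
    return [min(min(abs(t - p) for p in pass_by_instants), clip) for t in times]


def count_minima(distance, threshold, min_gap):
    """Count local minima below `threshold` that lie at least `min_gap` frames apart."""
    count = 0
    last = -min_gap  # index of the last accepted minimum
    for i in range(1, len(distance) - 1):
        is_local_min = distance[i] < distance[i - 1] and distance[i] <= distance[i + 1]
        if is_local_min and distance[i] < threshold and i - last >= min_gap:
            count += 1
            last = i
    return count


# Synthetic example: three vehicles pass at 2.0 s, 5.0 s and 9.0 s; 10 frames per second.
times = [k / 10 for k in range(120)]
distance = clipped_distance(times, [2.0, 5.0, 9.0], clip=0.75)
n_vehicles = count_minima(distance, threshold=0.4, min_gap=5)
print(n_vehicles)
```

The `threshold` and `min_gap` values are examples of the minima detection parameters that such a counter depends on; the deep counting approach mentioned in the abstract is described as avoiding that dependence.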
Sound, Machine Learning, Audio and Speech Processing