Sound Event Detection Based on Mel Spectral Envelope Estimation and Regression Detection

Maocun Tian, Ruwei Li, Weidong An
DOI: https://doi.org/10.1109/icspcc59353.2023.10400273
2023-01-01
Abstract: Traditional deep learning methods for sound event detection (SED) use binary activity metrics to indicate the presence or absence of an event. However, these binary metrics inadequately characterize the nuanced states of events, which limits the performance of current detection algorithms, particularly when events overlap. Conventional SED algorithms are also slow, incurring substantial time costs. To address these problems, this paper proposes a novel sound event detection algorithm based on amplitude envelope estimation and regression detection (EERD). First, the envelope of the Mel-frequency cepstral coefficients (MFCC) of the audio signal is estimated, which yields richer information about the sound events. Second, regression-based detection is introduced into the network model, which reduces the algorithm's reliance on post-processing and improves detection speed. The algorithm is validated on the TUT sound event detection dataset. Experiments show that the proposed algorithm attains a higher F-measure than the benchmark algorithms, and that its detection is at least six times faster than the conventional segmentation-by-class approach.
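The contrast the abstract draws between binary activity labels and an envelope-style description of an event can be illustrated with a small sketch. This is not the EERD algorithm: the abstract does not specify the estimator, so the code below assumes a simple peak-hold envelope follower applied to frame-wise RMS amplitude of a synthetic waveform (rather than to MFCC features), and all function names and parameter values are hypothetical.

```python
import numpy as np

def frame_rms(signal, frame_len, hop):
    """Root-mean-square amplitude of each frame (a crude amplitude track)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] for i in range(n_frames)]
    )
    return np.sqrt(np.mean(frames ** 2, axis=1))

def envelope(track, decay=0.8):
    """Peak-hold follower (assumed estimator): rise at once, fall geometrically."""
    env = np.empty_like(track)
    level = 0.0
    for i, value in enumerate(track):
        level = max(value, level * decay)
        env[i] = level
    return env

def soft_target(env):
    """Scale the envelope to [0, 1] so it could serve as a regression target."""
    peak = env.max()
    return env / peak if peak > 0 else env

# Toy signal: silence, a 440 Hz burst that fades out, then silence again.
sr = 8000
t = np.arange(sr) / sr
gate = np.zeros(sr)
gate[2000:5000] = np.linspace(1.0, 0.2, 3000)
signal = gate * np.sin(2 * np.pi * 440 * t)

rms = frame_rms(signal, frame_len=400, hop=200)
target = soft_target(envelope(rms))    # continuous description of the event
binary = (target > 0.5).astype(int)    # the binary label discards the fade-out
```

In this toy case the continuous `target` keeps the fade-out and decay of the event, while the thresholded `binary` vector only records whether a frame is above or below 0.5.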