Abstract: Classifying infrasound events is important for improving the ability to identify types of natural disasters. Traditional infrasound classification relies mainly on machine learning algorithms applied after manual feature extraction, but guaranteeing the effectiveness of the extracted features is difficult. The current trend is to use a convolutional neural network (CNN) to extract features for classification automatically. A CNN extracts the spatial features of a signal through its convolution kernels; however, as a time series, an infrasound signal contains temporal information as well as spatial information, and the temporal features are also crucial. If only a CNN is used, the time dependence of the infrasound sequence is missed. A long short-term memory (LSTM) network can supply the missing time-series features, but it loses spatial feature information of the infrasound signal. To address these problems, this study proposes an infrasound event classification fusion model that combines a multiscale squeeze-excitation convolutional neural network with a bidirectional long short-term memory network (multiscale SE-CNN and BiLSTM). The model automatically extracts temporal and spatial features, adaptively selects features, and fuses the two types of features. Experimental results show that the classification accuracy of the model is more than 98%, verifying the effectiveness and superiority of the proposed model.
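The adaptive feature selection mentioned in the abstract is attributed to a squeeze-excitation (SE) component. As a rough illustration only, the following minimal sketch shows the generic SE idea in plain Python: average each channel (squeeze), pass the averages through a small bottleneck with ReLU and sigmoid (excitation), and rescale each channel by its gate. The function name `squeeze_excite`, the weight matrices `w1` and `w2`, and the toy input are all hypothetical; the abstract does not give the paper's layer sizes, scales, or weights, and this sketch covers neither the multiscale convolutions nor the BiLSTM branch.

```python
import math


def squeeze_excite(channels, w1, w2):
    """Generic squeeze-and-excitation reweighting (illustrative sketch).

    channels: list of C feature maps, each a list of T floats.
    w1: r x C bottleneck weights (hypothetical values).
    w2: C x r expansion weights (hypothetical values).
    Returns the rescaled channels and the per-channel gates.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    z = [sum(ch) / len(ch) for ch in channels]
    # Excitation, step 1: bottleneck linear layer followed by ReLU.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    # Excitation, step 2: expansion linear layer followed by sigmoid,
    # so every gate lies strictly between 0 and 1.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every element of a channel by that channel's gate.
    scaled = [[g * x for x in ch] for g, ch in zip(gates, channels)]
    return scaled, gates


# Toy input: 3 channels, 2 time steps each (made-up numbers).
channels = [[1.0, 3.0], [0.0, 0.0], [2.0, 2.0]]
w1 = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]      # reduces 3 channels to 2
w2 = [[1.0, 0.0], [0.0, 0.0], [0.0, -1.0]]   # expands 2 back to 3

scaled, gates = squeeze_excite(channels, w1, w2)
```

With these made-up weights the first channel receives a larger gate than the third, so it is emphasized in the rescaled output. In a trained network the weights would be learned, and the gates would reflect which channels help classification.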

A research for sound event localization and detection based on local–global adaptive fusion and temporal importance network

Specialty may be better: A decoupling multi-modal fusion network for Audio-visual event localization

Polyphonic sound event localization and detection based on Multiple Attention Fusion ResNet

Joint Spatio-Temporal-Frequency Representation Learning for Improved Sound Event Localization and Detection

Deep and CNN Fusion Method for Binaural Sound Source Localisation

Hierarchical-Concatenate Fusion TDNN for sound event classification

The Solution for Temporal Sound Localisation Task of ICCV 1st Perception Test Challenge 2023

Decoupling Temporal Convolutional Networks Model in Sound Event Detection and Localization

Sound source localization method based time-domain signal feature using deep learning

Fusion of Audio and Visual Embeddings for Sound Event Localization and Detection

Polyphonic sound event localization and detection using channel-wise FusionNet

Research on Environmental Sound Recognition Method based on Residual Network with Dual Attention Mechanism

MTF-CRNN: Multiscale Time-Frequency Convolutional Recurrent Neural Network for Sound Event Detection

Infrasound Event Classification Fusion Model Based on Multiscale SE-CNN and BiLSTM

MFF-EINV2: Multi-scale Feature Fusion across Spectral-Spatial-Temporal Domains for Sound Event Localization and Detection

A Generalized Network Based on Multi-Scale Densely Connection and Residual Attention for Sound Source Localization and Detection

A hybrid parametric-deep learning approach for sound event localization and detection

Audio-Visual Event Localization by Learning Spatial and Semantic Co-attention

Active Object Discovery and Localization Using Sound-Induced Attention

MULTI-SCALE CONVOLUTION BASED ATTENTION NETWORK FOR SEMI-SUPERVISED SOUND EVENT DETECTION Technical Report

Continuous Emotion Recognition with Audio-visual Leader-follower Attentive Fusion