Abstract: While deep learning has reduced the prevalence of manual feature extraction, transformation of data via feature engineering remains essential for improving model performance, particularly for underwater acoustic signals. The methods by which audio signals are converted into time-frequency representations and the subsequent handling of these spectrograms can significantly impact performance. This work demonstrates the performance impact of using different combinations of time-frequency features in a histogram layer time delay neural network. An optimal set of features is identified with results indicating that specific feature combinations outperform single data features.

What problem does this paper attempt to address?

The problem this paper addresses is how to improve underwater acoustic signal classification by combining different time-frequency features. Specifically, the authors explore combinations of multiple time-frequency features in the Histogram Layer Time Delay Neural Network (HLTDNN) to find an optimal feature set, thereby improving classification performance on the Underwater Acoustic Target Recognition (UATR) task.

### Problem Background

1. **Deep Learning and Feature Engineering**
   - Although deep learning reduces the need for manual feature extraction, feature engineering is still a crucial step in improving model performance in some cases, especially for underwater acoustic signals.
   - Audio signals are usually first converted into time-frequency representations (such as spectrograms) and then processed by artificial neural networks. The quality of these spectrograms and the way they are processed have a significant impact on model performance.
2. **Importance of Underwater Acoustic Classification**
   - Underwater acoustic classification techniques have a wide range of applications in the marine environment, such as biological behavior pattern analysis, search and rescue, seabed mapping, and ship traffic monitoring.
   - Existing research shows that different feature combinations can significantly affect classification performance, but there has not been a systematic study of feature combination optimization for the HLTDNN model.

### Research Objectives

1. **Explore the Effects of Different Feature Combinations**
   - Verify experimentally how different time-frequency feature combinations affect the performance of the HLTDNN model.
   - Find the optimal feature combination to improve the classification accuracy of the UATR task.
2. **Introduce New Feature Processing Methods**
   - Use an adaptive padding layer so that spectrograms of different sizes can be fed to the model in a uniform shape, avoiding information loss.
   - Capture statistical features in spectrograms through the histogram layer to enhance the model's ability to represent feature distributions.

### Main Contributions

1. **First Study on Feature Combinations on the DeepShip Dataset**
   - This study is the first to use the HLTDNN model for feature combination research on the DeepShip dataset, filling this gap in the field.
2. **Discover the Optimal Feature Combination**
   - The experimental results show that the combination of VQT, MFCC, STFT, and GFCC performs best, improving the classification accuracy by approximately 6.83% compared to a single feature (such as MFCC).
3. **Explain the Model Decision-making Process**
   - Use Explainable AI (XAI) methods, such as FullGrad Class Activation Mapping (FullCAM), to show the specific frequency bands that the model focuses on during classification, further verifying the effectiveness of the feature combination.

In summary, this paper aims to improve the performance of the HLTDNN model on the underwater acoustic target recognition task through a systematic study of different time-frequency feature combinations, and provides a useful reference for future feature selection and model optimization.
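
The two processing ideas named above, padding features of different sizes to a common shape and summarizing values with a histogram, can be sketched roughly as follows. This is a minimal illustration in plain Python, not the paper's implementation: the function names (`pad_to_shape`, `stack_features`, `soft_histogram`) are hypothetical, centered zero-padding is an assumption about how the padding might work, and the histogram here is a global one with fixed radial-basis bins, whereas a histogram layer in a network would typically operate on local windows with learnable bin centers and widths.

```python
import math

def pad_to_shape(feature, target_rows, target_cols, fill=0.0):
    """Center-pad a 2D feature map (list of rows) to the target shape.

    Assumption: zero fill and centered placement; the actual layer may differ.
    """
    rows, cols = len(feature), len(feature[0])
    if rows > target_rows or cols > target_cols:
        raise ValueError("feature is larger than the target shape")
    top = (target_rows - rows) // 2
    left = (target_cols - cols) // 2
    padded = [[fill] * target_cols for _ in range(target_rows)]
    for r, row in enumerate(feature):
        padded[top + r][left:left + cols] = row
    return padded

def stack_features(features):
    """Pad every feature map to the largest height and width, one channel each."""
    target_rows = max(len(f) for f in features)
    target_cols = max(len(f[0]) for f in features)
    return [pad_to_shape(f, target_rows, target_cols) for f in features]

def soft_histogram(feature, centers, width):
    """Mean radial-basis response of all values to each bin center.

    Each value contributes exp(-(width * (v - center))**2) to a bin, so
    values near a center count almost fully and distant values count little.
    """
    values = [v for row in feature for v in row]
    return [
        sum(math.exp(-(width ** 2) * (v - c) ** 2) for v in values) / len(values)
        for c in centers
    ]

# Toy stand-ins for two time-frequency features with different shapes
# (made-up numbers, not real STFT or MFCC output).
stft_like = [[0.0, 0.5, 1.0], [1.0, 0.5, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 0.5]]
mfcc_like = [[0.2, 0.8], [0.6, 0.4]]

stacked = stack_features([stft_like, mfcc_like])  # two channels, each 4 x 3
hist = soft_histogram(stacked[0], centers=[0.0, 0.5, 1.0], width=4.0)
```

Padding to the largest shape, rather than cropping or resizing, keeps every original value intact, which matches the stated goal of avoiding information loss. The soft binning keeps the histogram differentiable, which is what allows such a layer to be trained with the rest of a network.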

Investigation of Time-Frequency Feature Combinations with Histogram Layer Time Delay Neural Networks

Histogram Layer Time Delay Neural Networks for Passive Sonar Classification

Time-reversal detection of multidimensional signals in underwater acoustics

Comparison of Time-Frequency Representations for Environmental Sound Classification using Convolutional Neural Networks

Histogram Layers for Neural Engineered Features

An Features Extraction and Recognition Method for Underwater Acoustic Target Based on ATCNN

Time-Frequency Feature-Based Underwater Target Detection with Deep Neural Network in Shallow Sea

Time-Frequency Mask Aware Bi-directional LSTM: A Deep Learning Approach for Underwater Acoustic Signal Separation

Underwater acoustic target recognition based on convolutional neural network and multi-feature fusion

A Method for Underwater Acoustic Target Recognition Based on the Delay-Doppler Joint Feature

Drone Detection Method Based on the Time-Frequency Complementary Enhancement Model

Underwater target recognition based on adaptive multi-feature fusion network

Digital audio tampering detection based on spatio-temporal representation learning of electrical network frequency

SFCC: Data Augmentation with Stratified Fourier Coefficients Combination for Time Series Classification

Underwater Acoustic Signal Noise Reduction Based on a Fully Convolutional Encoder-Decoder Neural Network

A Dual-Path Framework with Frequency-and-Time Excited Network for Anomalous Sound Detection

Time Difference of Arrival Estimation from Frequency-Sliding Generalized Cross-Correlations Using Convolutional Neural Networks

Feature extraction and classification of deep-sea mobile underwater acoustic channels

Deep Learning Aided Time-Frequency Analysis Filter Framework for Suppressing Ionosphere Clutter

Joint Time-Frequency Scattering

A parallel convolutional neural network-transformer model for underwater target recognition based on multimodal feature learning