Abstract: In underwater acoustic recognition, machine learning methods rely on large datasets to achieve high accuracy, but the signal samples actually collected are often very scarce, which severely degrades recognition performance. This paper presents a method for recognizing underwater acoustic targets that combines data augmentation, used to expand the training samples and thereby improve recognition performance, with a residual convolutional neural network (CNN) model. As a representative residual CNN, the ResNet18 model is used for recognition. The whole process mainly includes mel-frequency cepstral coefficient (MFCC) feature extraction, data augmentation processing, and ResNet18 model recognition. Building on traditional data augmentation, this study used the deep convolutional generative adversarial network (DCGAN) model to expand the underwater acoustic samples and compared the recognition performance of the support vector machine (SVM), a common CNN, VGG19, and ResNet18. The recognition results of the MFCC, constant Q transform (CQT), and low-frequency analyzer and recorder (LOFAR) spectrum features were also analyzed and compared. Experimental results showed that, with the same recognition method, the MFCC feature achieved higher recognition accuracy than the other features, and that data augmentation clearly improved recognition performance. Moreover, ResNet18 with data augmentation outperformed the other models, owing to the combination of the sample-expanding advantage of data augmentation and the deep feature extraction ability of the residual CNN model. In addition, although the method was applied to ship recognition in this paper, it is not limited to that task; it is also applicable to other kinds of target sound recognition, such as natural sound and underwater voice biometrics.
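To make the idea of expanding scarce training samples concrete, the following is a minimal sketch of two common waveform-level augmentations: additive white Gaussian noise at a chosen signal-to-noise ratio and a circular time shift. The abstract does not say which traditional augmentations were used, so these two operations, the function names (`add_noise_at_snr`, `circular_shift`, `augment`), the 20 dB default, and the toy 50 Hz tone are all assumptions made for illustration. The sketch does not cover the DCGAN-based expansion or the MFCC and ResNet18 stages.

```python
import math
import random


def add_noise_at_snr(signal, snr_db, rng):
    """Return a copy of `signal` with white Gaussian noise added at `snr_db` dB."""
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR(dB) = 10 * log10(P_signal / P_noise), solved for the noise power.
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def circular_shift(signal, shift):
    """Rotate `signal` by `shift` samples; samples pushed off the end wrap around."""
    shift %= len(signal)
    return signal[-shift:] + signal[:-shift] if shift else list(signal)


def augment(signal, n_copies, snr_db=20.0, seed=0):
    """Produce `n_copies` randomly shifted, noise-corrupted variants of `signal`."""
    rng = random.Random(seed)  # seeded so the augmented set is reproducible
    variants = []
    for _ in range(n_copies):
        shifted = circular_shift(signal, rng.randrange(len(signal)))
        variants.append(add_noise_at_snr(shifted, snr_db, rng))
    return variants


# Hypothetical stand-in for a recorded signal: a 50 Hz tone sampled at 1 kHz.
sample_rate = 1000
tone = [math.sin(2 * math.pi * 50 * n / sample_rate) for n in range(sample_rate)]
augmented = augment(tone, n_copies=4)
```

Each variant keeps the original length, so the augmented waveforms can go through the same feature extraction step as the original recording. In practice the noise level and shift range would need to be chosen so that the class-relevant spectral structure is preserved.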

Acoustic data augmentation for small passive acoustic monitoring datasets

Data augmentation approaches for improving animal audio classification

Data augmentation for the classification of North Atlantic right whales upcalls

Investigation of Data Augmentation Techniques in Environmental Sound Recognition

Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification

Generative AI-based data augmentation for improved bioacoustic classification in noisy environments

Adaptive data augmentation for mandarin automatic speech recognition

Data Augmentation based Convolutional Neural Network for Auscultation

Data augmentation method for underwater acoustic target recognition based on underwater acoustic channel modeling and transfer learning

Classification of animal sounds in a hyperdiverse rainforest using Convolutional Neural Networks

A CNN Sound Classification Mechanism Using Data Augmentation

Metric Learning Based Data Augmentation for Environmental Sound Classification

Spectral images based environmental sound classification using CNN with meaningful data augmentation

Towards small and accurate convolutional neural networks for acoustic biodiversity monitoring

CNN-RNN and Data Augmentation Using Deep Convolutional Generative Adversarial Network for Environmental Sound Classification

Underwater Acoustic Target Recognition Based on Data Augmentation and Residual CNN

Generative Deep Learning and Signal Processing for Data Augmentation of Cardiac Auscultation Signals: Improving Model Robustness Using Synthetic Audio

Data Augmentation for Diverse Voice Conversion in Noisy Environments

Ensemble Augmentation for Deep Neural Networks Using 1-D Time Series Vibration Data

Auditory-Based Data Augmentation for End-to-End Automatic Speech Recognition

Data Augmentation of Room Classifiers using Generative Adversarial Networks