Abstract: This study investigates the potential of automated deep learning to enhance the accuracy and efficiency of multi-class classification of bird vocalizations, compared against traditional manually-designed deep learning models. Using the Western Mediterranean Wetland Birds dataset, we investigated the use of AutoKeras, an automated machine learning framework, to automate neural architecture search and hyperparameter tuning. Comparative analysis validates our hypothesis that the AutoKeras-derived model consistently outperforms traditional models like MobileNet, ResNet50 and VGG16. Our approach and findings underscore the transformative potential of automated deep learning for advancing bioacoustics research and models. In fact, the automated techniques eliminate the need for manual feature engineering and model design while improving performance. This study illuminates best practices in sampling, evaluation and reporting to enhance reproducibility in this nascent field. All the code used is available at https://github.com/giuliotosato/AutoKeras-bioacustic

Keywords: AutoKeras; automated deep learning; audio classification; Wetlands Bird dataset; comparative analysis; bioacoustics; validation dataset; multi-class classification; spectrograms.

What problem does this paper attempt to address?

This paper attempts to improve the accuracy and efficiency of multi-class bird song classification through automated deep-learning techniques, particularly in comparison with traditional manually designed deep-learning models. Specifically, the research uses the Western Mediterranean Wetland Birds dataset to explore the application of AutoKeras, an automated machine-learning framework, to neural architecture search and hyperparameter tuning.

### Research Background

- **Importance of Ecological Monitoring**: Changes in the composition of bird communities and in the abundance of specific species can serve as reliable indicators of the overall health of an ecosystem.
- **Limitations of Traditional Methods**: Bioacoustic research has historically relied on experts to identify bird species manually, but with the emergence of machine learning and large-scale audio datasets, automatic sound identification through deep learning has become feasible and popular.
- **Challenges**: Building high-precision deep-learning models faces multiple challenges, including the need for a large amount of labeled data, high model complexity, and the need to optimize many architecture design choices (such as neural network topology and hyperparameters).

### Research Objectives

- **Advantages of Automated Deep Learning**: The research aims to verify whether automated deep-learning methods (such as AutoKeras) can outperform traditional manually designed models (such as MobileNet, ResNet50, and VGG16) in multi-class bird song classification tasks.
- **Reducing Manual Feature Engineering**: Automation techniques eliminate the need for manual feature engineering while improving performance.
- **Best Practices**: The research also provides best practices regarding sampling, evaluation, and reporting to enhance reproducibility in this emerging field.

### Methods

- **Dataset**: Use the Western Mediterranean Wetland Birds dataset, which contains the songs of 20 native bird species.
- **Data Pre-processing**: Adopt a stratified sampling method to divide the dataset into training, validation, and test sets, ensuring that each category is appropriately represented in each set.
- **Model Comparison**: Compare three pre-trained models (MobileNet V2, VGG16, ResNet50) with the optimal model obtained through AutoKeras search.

### Results

- **Superiority of the AutoKeras Model**: The Xception model obtained through AutoKeras search outperforms the three baseline models on the validation and test sets.
- **Confusion Matrix Analysis**: All models misclassify certain categories, indicating that these categories may have similar characteristics or lower data quality.

### Discussion

- **Importance of Data Pre-processing**: The stratified sampling strategy takes into account the influence of different session lengths and significantly improves the generalization ability and evaluation accuracy of the model.
- **Necessity of Model Generalization**: Emphasize the importance of using an independent test set to verify the generalization ability of the model outside the training and validation data.

### Conclusions

- **Potential of Automated Deep Learning**: The research shows that the deep-learning method automated by AutoKeras performs well in multi-class bird song classification tasks, reducing the need for manual design and optimization of neural network models.
- **Crucial Role of Data Pre-processing**: A reasonable data pre-processing strategy is crucial for handling multi-class imbalanced datasets and can significantly improve model performance.
- **Transparency and Reproducibility**: It is recommended to record the sampling method in detail in the research report and provide complete confusion matrix data to enhance the transparency and reproducibility of the research.

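
The stratified division into training, validation, and test sets described under Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `stratified_split`, the 70/15/15 fractions, and the placeholder labels are assumptions made for illustration, and the sketch stratifies by class label only, without modelling the recording-session lengths that the paper's strategy reportedly accounts for.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.7, 0.15, 0.15), seed=0):
    """Split sample indices into train/val/test while keeping class proportions.

    `fractions` is an assumed 70/15/15 split, not a value taken from the paper.
    Classes with fewer than 3 samples are placed entirely in the training set.
    """
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)
        n = len(idxs)
        # Guarantee at least one validation and one test sample per class
        # whenever the class is large enough to allow it.
        n_val = max(1, round(n * fractions[1])) if n >= 3 else 0
        n_test = max(1, round(n * fractions[2])) if n >= 3 else 0
        n_train = n - n_val - n_test
        train += idxs[:n_train]
        val += idxs[n_train:n_train + n_val]
        test += idxs[n_train + n_val:]
    return train, val, test


# Hypothetical, imbalanced label list standing in for annotated audio clips.
labels = ["species_a"] * 20 + ["species_b"] * 10 + ["species_c"] * 4
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))
```

Fixing the seed and reporting the fractions alongside the results is one simple way to follow the recommendation that the sampling method be documented in detail.
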
This research not only demonstrates the potential of automatic deep learning in bioacoustic research but also provides important methodological guidance for future research.
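
To make the reporting recommendation concrete, the following is a minimal sketch of how a full confusion matrix and per-class recall could be computed from held-out test predictions. The helper names `confusion_matrix` and `per_class_recall` and the three placeholder species labels are hypothetical and are not taken from the paper or its repository.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, classes):
    """Return a matrix where rows are true classes and columns are predictions."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in classes] for t in classes]


def per_class_recall(matrix):
    """Fraction of each true class that was predicted correctly."""
    recalls = []
    for i, row in enumerate(matrix):
        total = sum(row)
        # A class with no test samples has undefined recall.
        recalls.append(row[i] / total if total else float("nan"))
    return recalls


# Hypothetical test-set labels and predictions for three placeholder species.
classes = ["species_a", "species_b", "species_c"]
y_true = ["species_a", "species_a", "species_a", "species_b", "species_b", "species_c"]
y_pred = ["species_a", "species_a", "species_b", "species_b", "species_c", "species_c"]

matrix = confusion_matrix(y_true, y_pred, classes)
for name, row, recall in zip(classes, matrix, per_class_recall(matrix)):
    print(f"{name}: {row} recall={recall:.2f}")
```

Publishing the whole matrix rather than a single accuracy figure lets readers see which species are confused with each other, which is the kind of per-category misclassification the Results section points to.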

Auto deep learning for bioacoustic signals

Bioacoustic classification of avian calls from raw sound waveforms with an open-source deep learning architecture

Deep learning for detection of bird vocalisations

Automated detection of dolphin whistles with convolutional networks and transfer learning

Deep learning in marine bioacoustics: a benchmark for baleen whale detection

Deep Learning Approach to Classification of Acoustic Signals Using Information Features

Western Mediterranean wetlands bird species classification: evaluating small-footprint deep learning approaches on a new annotated dataset

Automatic acoustic detection of birds through deep learning: the first Bird Audio Detection challenge

Towards Deep Active Learning in Avian Bioacoustics

Global birdsong embeddings enable superior transfer learning for bioacoustic classification

Advanced Framework for Animal Sound Classification With Features Optimization

Automatic bioacoustics noise reduction method based on a deep feature loss network

Automated Bioacoustic Monitoring for South African Bird Species on Unlabeled Data

Automated detection of Bornean white-bearded gibbon (Hylobates albibarbis) vocalizations using an open-source framework for deep learning

Automatic detection for bioacoustic research: a practical guide from and for biologists and computer scientists

Ensemble deep learning and anomaly detection framework for automatic audio classification: Insights into deer vocalizations

Deep Active Audio Feature Learning in Resource-Constrained Environments

An Auto Encoder For Audio Dolphin Communication

Automatic Sound Event Detection and Classification of Great Ape Calls Using Neural Networks

ANIMAL-SPOT enables animal-independent signal detection and classification using deep learning