Abstract: We present a benchmark for detecting animal vocalizations in marine passive acoustic data, comprising a published dataset, a concrete task and a set of metrics to report. Three deep learning models for bioacoustics are tested on this benchmark and their performance is reported. Future efforts in the field should aim to reduce false‐positive rates so that automated techniques become usable in practice. Passive acoustic monitoring (PAM) is commonly used to obtain year‐round continuous data on marine soundscapes, which harbor valuable information on species distributions or ecosystem dynamics. The continuously increasing amount of data requires highly efficient automated analysis techniques to exploit its full potential. Here, we propose a benchmark consisting of a public dataset, a well‐defined task and an evaluation procedure for developing and testing automated analysis techniques. This benchmark focuses on the special case of detecting animal vocalizations in a real‐world dataset from the marine realm. We believe that such a benchmark is necessary to monitor progress in the development of new detection algorithms in marine bioacoustics. We use the proposed benchmark to test three detection approaches, namely ANIMAL‐SPOT, Koogu and a simple custom sequential convolutional neural network (CNN). Performance is reported using blocked cross‐validation with 11 site‐year blocks for a multi‐species detection scenario in a large marine passive acoustic dataset. Performance was measured with three simple metrics (true classification rate, noise misclassification rate and call misclassification rate) and one combined fitness metric, which gives more weight to minimizing false positives caused by noise.
Overall, ANIMAL‐SPOT performed best with an average fitness metric of 0.6, followed by the custom CNN with 0.57 and Koogu with 0.42. The presented benchmark is an important step toward advancing the automatic processing of the continuously growing amount of PAM data collected throughout the world's oceans. For the developed algorithms to become usable in practice, future work should focus on reducing the false positives caused by noise.
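
The evaluation procedure summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the label strings and the block identifiers are hypothetical, and the three rate definitions are assumptions made for illustration (the abstract names the metrics but does not define them). The combined fitness metric is left out because its formula is not stated here.

```python
from collections import defaultdict

NOISE = "noise"  # hypothetical label for segments without a call


def blocked_folds(block_ids):
    """Yield (held_out_block, train_idx, test_idx), holding out one block per fold.

    block_ids holds one identifier per sample, e.g. a site-year string.
    """
    by_block = defaultdict(list)
    for i, block in enumerate(block_ids):
        by_block[block].append(i)
    for held_out in sorted(by_block):
        test_idx = by_block[held_out]
        train_idx = sorted(
            i for block, idx in by_block.items() if block != held_out for i in idx
        )
        yield held_out, train_idx, test_idx


def simple_rates(y_true, y_pred):
    """Compute three simple rates under ASSUMED definitions:

    - true classification rate: share of all samples labeled correctly
    - noise misclassification rate: share of noise samples labeled as a call
    - call misclassification rate: share of call samples given a wrong label
    """
    pairs = list(zip(y_true, y_pred))
    n_total = len(pairs)
    n_noise = sum(t == NOISE for t, _ in pairs)
    n_calls = n_total - n_noise
    correct = sum(t == p for t, p in pairs)
    noise_as_call = sum(t == NOISE and p != NOISE for t, p in pairs)
    call_wrong = sum(t != NOISE and p != t for t, p in pairs)
    return {
        "true_classification_rate": correct / n_total if n_total else 0.0,
        "noise_misclassification_rate": noise_as_call / n_noise if n_noise else 0.0,
        "call_misclassification_rate": call_wrong / n_calls if n_calls else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical site-year blocks; the benchmark itself uses 11 of them.
    blocks = ["A-2013", "A-2013", "B-2014", "B-2014", "C-2015"]
    for held_out, train_idx, test_idx in blocked_folds(blocks):
        print(held_out, train_idx, test_idx)
```

Holding out whole site-year blocks, rather than random samples, keeps recordings from the same site and year out of both training and test data at once, so each fold measures how a detector transfers to unseen recording conditions.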

Deep learning in marine bioacoustics: a benchmark for baleen whale detection
