Towards the topology of autoencoder of calls versus clicks of marine mammal
Vincent Roger, Maxence Ferrari, Ricard Marxer, Faicel Chamroukhi, Herve Glotin
DOI: https://doi.org/10.1121/1.5067859
2018-09-01
The Journal of the Acoustical Society of America
Abstract: The goal is to learn features and representations adapted to cetacean sound dynamics without any priors. To that end, we develop a data-driven model that generates the voiced calls and clicks of cetacean audio signals. We learn representations and features of stationary or nonstationary emissions from raw audio using neural networks. We use different types of convolution (causal, strided, dilated [1]), or gradient inversion [2]. Experiments are conducted on various kinds of humpback whale calls from the NIPS4B challenge [3] and on orca calls. We compare the topologies for transient encoding on Physeters and Inia g. For each model, we detail the resulting filters and discuss the topology. We acknowledge Région PACA and NortekMED for Roger’s PhD grant, and DGA and Région Hauts-de-France for Ferrari’s PhD grant. [1] Oord, Dieleman, Zen, Simonyan, Vinyals, Graves et al., WaveNet: A generative model for raw audio, arXiv:1609.03499, 2016. [2] Balestriero, Roger, Glotin, Baraniuk, Semi-Supervised Learning via New Deep Network Inversion, arXiv:1711.04313, 2017. [3] Glotin, LeCun, Mallat et al., Proc. 1st workshop on Neural Information Processing for Bioacoustics (NIPS4B), joint to NIPS, Alberta USA, 2013, http://sabiod.org/nips4b/challenge2.html, http://sabiod.org/NIPS4B2013_book.pdf
acoustics, audiology & speech-language pathology
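
The abstract names three convolution variants (causal, strided, dilated) without defining them. The sketch below illustrates what those terms mean for a one-dimensional raw-audio signal. It is an illustration only, not the authors' architecture: the function name `causal_conv1d`, its parameters, and the toy filter are assumptions made for this example, and NumPy stands in for whatever framework the authors used.

```python
import numpy as np

def causal_conv1d(x, w, dilation=1, stride=1):
    """Hypothetical helper: causal 1-D convolution with dilation and stride.

    The output at time t depends only on x[t], x[t - dilation],
    x[t - 2*dilation], ... and never on future samples.
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    k = len(w)
    # Left-pad with zeros so every output has a full (causal) receptive field.
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])
    out = np.empty(len(x))
    for t in range(len(x)):
        # w[0] weights the current sample; w[j] weights the sample
        # j * dilation steps in the past.
        taps = xp[t + pad - np.arange(k) * dilation]
        out[t] = np.dot(w, taps)
    # Striding keeps every `stride`-th output, which downsamples the result.
    return out[::stride]

signal = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
kernel = np.array([1.0, 1.0])  # toy filter, chosen only for illustration

dilated = causal_conv1d(signal, kernel, dilation=2)
strided = causal_conv1d(signal, kernel, dilation=2, stride=2)
print(dilated)
print(strided)
```

With dilation 2, each output sums the current sample and the one two steps earlier, so the receptive field grows without adding filter weights. Striding then subsamples that output, which is one way an encoder reduces the temporal resolution of raw audio.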