Complex data labeling with deep learning methods: Lessons from fisheries acoustics

J.M.A. Sarr, T. Brochier, P. Brehmer, Y. Perrot, A. Bah, A. Sarré, M.A. Jeyid, M. Sidibeh, S. El Ayoub
DOI: https://doi.org/10.1016/j.isatra.2020.09.018
2020-10-21
Abstract: Quantitative and qualitative analysis of acoustic backscattered signals from the seabed to the sea surface is used worldwide for fish stock assessment and marine ecosystem monitoring. Huge amounts of raw data are collected, yet they require tedious expert labeling. This paper focuses on a case study where the ground truth labels are non-obvious: echogram labeling, which is time-consuming and critical for the quality of fisheries and ecological analysis. We investigate how these tasks can benefit from supervised learning algorithms and demonstrate that convolutional neural networks trained with non-stationary datasets can be used to highlight the parts of a new dataset that need human expert correction. Further development of this approach paves the way toward a standardization of the labeling process in fisheries acoustics and is a good case study for non-obvious data labeling processes.
Machine Learning, Sound, Audio and Speech Processing
What problem does this paper attempt to address?
This paper asks how supervised learning can automate, fully or in part, the annotation of underwater echograms in fisheries acoustics. It concentrates on bottom line correction, a time-consuming task that is crucial to the quality of fish stock assessment. Specifically, the authors train convolutional neural networks (CNNs) to identify the parts of a new dataset that require correction by human experts, which reduces the time experts spend on data processing and improves the consistency and accuracy of bottom line correction. The paper also examines cross-domain training, in which datasets from different ocean surveys are mixed during training to improve the model's performance at test time. The approach helps standardize the annotation process in fisheries acoustics and serves as a case study for other annotation tasks where the ground truth is non-obvious.
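To make the idea of flagging parts of a dataset for expert review concrete, here is a minimal Python sketch. It is not the authors' method: it assumes, purely for illustration, that a trained model outputs one bottom depth per ping and that each ping already carries a bottom depth label. The function names, the `tolerance_m` parameter, and the sample depths are all hypothetical.

```python
from typing import List, Tuple


def flag_pings_for_review(
    predicted_depths: List[float],
    labeled_depths: List[float],
    tolerance_m: float = 1.0,  # hypothetical threshold, in metres
) -> List[int]:
    """Return indices of pings where the model and the existing label
    disagree by more than tolerance_m."""
    if len(predicted_depths) != len(labeled_depths):
        raise ValueError("both sequences must hold one depth per ping")
    return [
        i
        for i, (pred, label) in enumerate(zip(predicted_depths, labeled_depths))
        if abs(pred - label) > tolerance_m
    ]


def group_into_segments(indices: List[int]) -> List[Tuple[int, int]]:
    """Merge consecutive flagged ping indices into inclusive (start, end)
    ranges so an expert can review contiguous stretches of the echogram."""
    segments: List[Tuple[int, int]] = []
    for i in indices:
        if segments and i == segments[-1][1] + 1:
            # Extend the current segment by one ping.
            segments[-1] = (segments[-1][0], i)
        else:
            # Start a new segment at this ping.
            segments.append((i, i))
    return segments


# Made-up depths (metres) for six pings; not data from the paper.
predicted = [50.2, 50.4, 47.0, 46.8, 50.9, 51.0]
labeled = [50.0, 50.5, 50.6, 50.7, 50.8, 53.5]

flagged = flag_pings_for_review(predicted, labeled)
print(flagged)                       # [2, 3, 5]
print(group_into_segments(flagged))  # [(2, 3), (5, 5)]
```

Grouping the flagged pings into contiguous segments reflects the stated goal of directing expert attention to specific stretches of an echogram rather than to isolated samples. A suitable tolerance would depend on the survey and the echosounder settings.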