Teacher-Student Framework for Polyphonic Semi-supervised Sound Event Detection: Survey and Empirical Analysis

Zhor Diffallah, Hadjer Ykhlef, Hafida Bouarfa
DOI: https://doi.org/10.1145/3660641
IF: 5
2024-04-23
ACM Transactions on Intelligent Systems and Technology
Abstract: Polyphonic sound event detection refers to the task of automatically identifying sound events occurring simultaneously in an auditory scene. Due to the inherent complexity and variability of real-world auditory scenes, building robust detectors for polyphonic sound event detection poses a significant challenge. The task becomes even more challenging when there is insufficient annotated data to develop sound event detection systems under a supervised learning regime. In this paper, we explore the recent developments in polyphonic sound event detection, with a particular emphasis on the application of Teacher-Student techniques within the semi-supervised learning paradigm. Unlike previous works, we consolidate and organize the fragmented literature on Teacher-Student techniques for polyphonic sound event detection. By examining the latest research, categorizing Teacher-Student approaches, and conducting an empirical study to assess the performance of each approach, this survey offers valuable insights and practical guidance for researchers and practitioners in the field. Our findings highlight the potential benefits of utilizing multiple learners, ensuring consistent predictions, and making thoughtful choices regarding perturbation strategies.
Keywords: computer science, information systems, artificial intelligence