ICSD: An Open-source Dataset for Infant Cry and Snoring Detection

Qingyu Liu, Longfei Song, Dongxing Xu, Yanhua Long
2024-08-20
Abstract: The detection and analysis of infant cry and snoring events are crucial tasks within the field of audio signal processing. While existing datasets for general sound event detection are plentiful, they often fall short in providing sufficient, strongly labeled data specific to infant cries and snoring. To provide a benchmark dataset and thus foster the research of infant cry and snoring detection, this paper introduces the Infant Cry and Snoring Detection (ICSD) dataset, a novel, publicly available dataset specially designed for ICSD tasks. The ICSD comprises three types of subsets: a real strongly labeled subset with event-based labels annotated manually, a weakly labeled subset with only clip-level event annotations, and a synthetic subset generated and labeled with strong annotations. This paper provides a detailed description of the ICSD creation process, including the challenges encountered and the solutions adopted. We offer a comprehensive characterization of the dataset, discussing its limitations and key factors for ICSD usage. Additionally, we conduct extensive experiments on the ICSD dataset to establish baseline systems and offer insights into the main factors when using this dataset for ICSD research. Our goal is to develop a dataset that will be widely adopted by the community as a new open benchmark for future ICSD research.
Sound, Audio and Speech Processing
What problem does this paper attempt to address?
The paper addresses the lack of sufficient, publicly available datasets with detailed annotations for infant cry and snoring sound event detection. Although general sound event detection datasets are abundant, they rarely provide enough strongly labeled data for these two event types. The paper therefore introduces the Infant Cry and Snoring Detection (ICSD) dataset, a new, publicly available dataset intended to fill this gap and support research in the area.

### Main problems:

1. **Insufficient datasets**: Existing datasets are either too small or not suited to infant cry and snoring detection.
2. **Inadequate annotations**: Many existing datasets provide only weak annotations, without event timestamps, which is not enough for accurate event detection.
3. **Complex application scenarios**: Infant cries and snoring occur in complex acoustic environments, so training and testing models requires a dataset that reflects real-world conditions.

### Solutions:

- **Create the ICSD dataset**: The ICSD dataset contains three types of subsets:
  - **Strongly annotated subset**: Real recordings with manually annotated event timestamps.
  - **Weakly annotated subset**: Event annotations at the clip level only.
  - **Synthetic subset**: Generated data with strong annotations.
- **Describe the dataset in detail**: The paper describes how the ICSD dataset was created, including the challenges encountered and the solutions adopted, and discusses the dataset's limitations and the key factors to consider when using it.
- **Baseline systems**: Three baseline systems are established on the ICSD dataset. Two are based on the baseline systems of the DCASE Task 4 Challenge, and the third is a new method proposed by the authors.
- **Experiments and analysis**: Extensive experiments are carried out on the dataset, with a detailed analysis of the results and challenges to inform future research.

### Goals:

- **Provide a benchmark dataset**: The authors hope the ICSD dataset will become a new open benchmark that is widely adopted by the community and advances research on infant cry and snoring detection.
- **Promote research development**: The detailed dataset description and the baseline systems give researchers a reference point for further work in this field.

Through these measures, the paper aims to address the current shortage of suitable datasets and to support research on infant cry and snoring detection.
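
To make the difference between the strong and weak annotations described above concrete, here is a minimal Python sketch. The class name `StrongEvent`, the field names, the clip IDs, and the label strings are all hypothetical and chosen only for illustration; they are not the ICSD file format. A strong annotation records which event occurred and when (onset and offset), while a weak annotation records only which events occur somewhere in the clip.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StrongEvent:
    """One event-based (strong) annotation: a label plus onset/offset in seconds.

    Field names are assumptions for this sketch, not the ICSD schema.
    """
    clip_id: str
    label: str
    onset: float
    offset: float


def to_weak_labels(events):
    """Collapse strong annotations into clip-level (weak) tags by dropping timestamps."""
    weak = {}
    for ev in events:
        weak.setdefault(ev.clip_id, set()).add(ev.label)
    # Sort the labels so the result is deterministic.
    return {clip: sorted(labels) for clip, labels in weak.items()}


# Hypothetical annotations for two clips.
events = [
    StrongEvent("clip_001", "infant_cry", 0.8, 3.2),
    StrongEvent("clip_001", "snoring", 5.0, 6.4),
    StrongEvent("clip_002", "snoring", 1.1, 2.0),
    StrongEvent("clip_002", "snoring", 4.3, 5.5),
]

weak = to_weak_labels(events)
print(weak)
# {'clip_001': ['infant_cry', 'snoring'], 'clip_002': ['snoring']}
```

The conversion only works in one direction: weak labels can be derived from strong ones, but the timestamps cannot be recovered from clip-level tags. This is why datasets that offer only weak annotations are limiting for event detection.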