Abstract: Heart and lung sounds are crucial for healthcare monitoring. Recent improvements in stethoscope technology have made it possible to capture patient sounds with enhanced precision. In this dataset, we used a digital stethoscope to capture both heart and lung sounds, including individual and mixed recordings. To our knowledge, this is the first dataset to offer both separate and mixed cardiorespiratory sounds. The recordings were collected from a clinical manikin, a patient simulator designed to replicate human physiological conditions, generating clean heart and lung sounds at different body locations. This dataset includes both normal sounds and various abnormalities (i.e., murmur, atrial fibrillation, tachycardia, atrioventricular block, third and fourth heart sound, wheezing, crackles, rhonchi, pleural rub, and gurgling sounds). The dataset includes audio recordings of chest examinations performed at different anatomical locations, as determined by specialist nurses. Each recording has been enhanced using frequency filters to highlight specific sound types. This dataset is useful for applications in artificial intelligence, such as automated cardiopulmonary disease detection, sound classification, unsupervised separation techniques, and deep learning algorithms related to audio signal processing.
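The abstract mentions frequency filters that highlight specific sound types but does not give their parameters. As a rough illustration only, the sketch below applies a generic second-order band-pass filter (the widely used audio "cookbook" biquad form) to synthetic tones. The sampling rate `FS`, the centre frequency `HEART_CENTER_HZ`, the `q` value, and all function names are assumptions made for this example; they are not the stethoscope's actual filter settings.

```python
import math


def bandpass_biquad(samples, fs, f_center, q=0.707):
    """Second-order band-pass filter with unity gain at f_center."""
    w0 = 2.0 * math.pi * f_center / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    # Coefficients normalised by a0.
    b0, b1, b2 = alpha / a0, 0.0, -alpha / a0
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0

    out = []
    x1 = x2 = y1 = y2 = 0.0  # previous inputs and outputs
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


def rms(samples):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def tone(freq, fs, seconds=1.0):
    """Synthetic sine tone, standing in for a real recording."""
    n = int(fs * seconds)
    return [math.sin(2.0 * math.pi * freq * i / fs) for i in range(n)]


FS = 4000             # assumed sampling rate in Hz (not from the paper)
HEART_CENTER_HZ = 60  # assumed centre of a low-frequency "heart" band

low_tone = tone(60, FS)    # low-frequency component, inside the band
high_tone = tone(600, FS)  # higher-frequency component, outside the band

low_out = rms(bandpass_biquad(low_tone, FS, HEART_CENTER_HZ))
high_out = rms(bandpass_biquad(high_tone, FS, HEART_CENTER_HZ))
print(f"in-band RMS: {low_out:.3f}, out-of-band RMS: {high_out:.3f}")
```

The in-band tone passes almost unchanged while the out-of-band tone is attenuated, which is the general effect a "highlighting" filter aims for. A real pipeline would read the dataset's audio files and choose bands based on the device documentation.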

What problem does this paper attempt to address?

The paper aims to provide a high-quality, diverse cardiopulmonary sound dataset to support artificial intelligence and machine learning research in areas such as cardiopulmonary disease detection, sound classification, and signal processing. Specifically:

1. **Lack of high-quality datasets**: Few cardiopulmonary sound datasets are currently available, and most contain only separate heart or lung sounds with no mixed recordings. This limits how effectively machine-learning models can be trained and validated.
2. **Diversity and accuracy**: Existing datasets do not cover cardiopulmonary abnormalities comprehensively and so cannot fully reflect the actual clinical situation. In addition, many datasets are of limited quality, which may affect model performance.
3. **Importance of mixed recordings**: In a real clinical environment, heart and lung sounds usually occur simultaneously. Mixed recordings are therefore important for developing unsupervised separation algorithms (such as blind source separation) and for analyzing naturally overlapping cardiopulmonary sounds.

To address these problems, the authors used a digital stethoscope to record heart and lung sounds from a clinical manikin, both separately and mixed. The recordings cover a variety of normal and abnormal cardiopulmonary sound types and were collected at different anatomical locations. All recordings were made in a controlled, noise-free environment to ensure the quality and reliability of the data.

### Specific contributions:

- **Diversity**: Provides 50 heart sound recordings, 50 lung sound recordings, and 110 mixed recordings, covering a variety of normal and abnormal cardiopulmonary sound types.
- **High quality**: Recorded with an advanced digital stethoscope that offers a high sampling rate, active noise reduction, and built-in frequency filtering.
- **Mixed recordings**: Provides simultaneously recorded mixed cardiopulmonary sounds for the first time, which are valuable for studying naturally overlapping heart and lung sounds.
- **Wide application**: Applicable to a variety of machine-learning tasks, including supervised learning (such as classification) and unsupervised learning (such as clustering and blind source separation).

With this dataset, researchers can better develop and validate AI algorithms for cardiopulmonary disease detection and audio signal processing.
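Because the dataset offers both separate and mixed recordings, one natural use is to score a separation algorithm against a clean reference. The sketch below computes a basic signal-to-distortion ratio (SDR) in decibels. It assumes a time-aligned clean reference is available for the separated output, which the summary does not confirm; the function name `sdr_db` and the toy signals are hypothetical and used only for illustration.

```python
import math


def sdr_db(reference, estimate):
    """Signal-to-distortion ratio in dB between a clean reference
    and a separated estimate of the same length."""
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have equal length")
    signal_energy = sum(r * r for r in reference)
    if signal_energy == 0.0:
        raise ValueError("reference signal has zero energy")
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if error_energy == 0.0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal_energy / error_energy)


# Toy signals standing in for a clean heart sound and a separated estimate.
reference = [1.0, 0.0, -1.0, 0.0]
estimate = [0.9, 0.0, -0.9, 0.0]

score = sdr_db(reference, estimate)
print(f"SDR: {score:.2f} dB")
```

Higher values indicate that the estimate is closer to the reference. Published separation work often uses more elaborate, scale-invariant variants of this metric; the plain form here is only meant to show the idea.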

Manikin-Recorded Cardiopulmonary Sounds Dataset Using Digital Stethoscope
