Manikin-Recorded Cardiopulmonary Sounds Dataset Using Digital Stethoscope

Yasaman Torabi, Shahram Shirani, James P. Reilly
2024-10-04
Abstract: Heart and lung sounds are crucial for healthcare monitoring. Recent improvements in stethoscope technology have made it possible to capture patient sounds with enhanced precision. In this dataset, we used a digital stethoscope to capture both heart and lung sounds, including individual and mixed recordings. To our knowledge, this is the first dataset to offer both separate and mixed cardiorespiratory sounds. The recordings were collected from a clinical manikin, a patient simulator designed to replicate human physiological conditions, generating clean heart and lung sounds at different body locations. This dataset includes both normal sounds and various abnormalities (i.e., murmur, atrial fibrillation, tachycardia, atrioventricular block, third and fourth heart sound, wheezing, crackles, rhonchi, pleural rub, and gurgling sounds). The dataset includes audio recordings of chest examinations performed at different anatomical locations, as determined by specialist nurses. Each recording has been enhanced using frequency filters to highlight specific sound types. This dataset is useful for applications in artificial intelligence, such as automated cardiopulmonary disease detection, sound classification, unsupervised separation techniques, and deep learning algorithms related to audio signal processing.
Audio and Speech Processing, Artificial Intelligence, Machine Learning, Signal Processing
What problem does this paper attempt to address?
The paper aims to provide a high-quality, diverse cardiopulmonary sound dataset to support artificial intelligence and machine learning research in cardiopulmonary disease detection, sound classification, and signal processing. Specifically:

1. **Lack of high-quality datasets**: Few cardiopulmonary sound datasets are available, and most contain only separate heart or lung sounds, with no mixed recordings. This limits how effectively machine-learning models can be trained and validated.
2. **Diversity and accuracy**: Existing datasets do not cover the full range of cardiopulmonary abnormalities, so they do not fully reflect clinical practice. Many are also of low quality, which can degrade model performance.
3. **Importance of mixed recordings**: In a clinical setting, heart and lung sounds usually occur simultaneously. Mixed recordings are therefore important for developing unsupervised separation algorithms (such as blind source separation) and for analyzing naturally overlapping cardiopulmonary sounds.

To address these problems, the authors used a digital stethoscope to record heart and lung sounds from a clinical manikin, both separately and mixed. The recordings cover a variety of normal and abnormal cardiopulmonary sound types and were collected at different anatomical locations. All recordings were made in a controlled, noise-free environment to keep the data clean and reliable.

### Specific contributions:

- **Diversity**: 50 heart sound recordings, 50 lung sound recordings, and 110 mixed recordings, covering a variety of normal and abnormal cardiopulmonary sound types.
- **High quality**: Recorded with an advanced digital stethoscope that offers a high sampling rate, active noise reduction, and built-in frequency filtering.
- **Mixed recordings**: To the authors' knowledge, this is the first dataset to provide simultaneously recorded heart and lung sounds, which are valuable for studying naturally overlapping cardiopulmonary sounds.
- **Wide application**: Suitable for a range of machine-learning tasks, including supervised learning (such as classification) and unsupervised learning (such as clustering and blind source separation).

With this dataset, researchers can develop and validate AI algorithms for cardiopulmonary disease detection and audio signal processing.
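
The idea of using frequency filters to highlight one sound type within a mixed recording can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the dataset: the two pure tones standing in for heart and lung sounds, the 4000 Hz sampling rate, the 150 Hz cutoff, and the simple one-pole filters, which are stand-ins for the stethoscope's built-in filter modes, whose design is not described here.

```python
import math


def synth_tone(freq_hz, amplitude, fs, seconds):
    """Return a sine tone as a list of samples (a stand-in for a real recording)."""
    n = int(fs * seconds)
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * i / fs) for i in range(n)]


def one_pole_lowpass(signal, fs, cutoff_hz):
    """First-order low-pass filter: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / fs
    alpha = dt / (rc + dt)
    out, prev = [], 0.0
    for sample in signal:
        prev += alpha * (sample - prev)
        out.append(prev)
    return out


def complementary_highpass(signal, fs, cutoff_hz):
    """High-pass output formed as the input minus its low-pass version."""
    low = one_pole_lowpass(signal, fs, cutoff_hz)
    return [s - l for s, l in zip(signal, low)]


def correlation(a, b):
    """Pearson correlation between two equal-length sample lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Illustrative values only; none of these come from the dataset.
FS = 4000          # assumed sampling rate in Hz
CUTOFF_HZ = 150.0  # assumed boundary between the low and high bands

heart_like = synth_tone(40.0, 1.0, FS, 1.0)   # low-frequency stand-in
lung_like = synth_tone(400.0, 0.5, FS, 1.0)   # higher-frequency stand-in
mixed = [h + l for h, l in zip(heart_like, lung_like)]

heart_mode = one_pole_lowpass(mixed, FS, CUTOFF_HZ)
lung_mode = complementary_highpass(mixed, FS, CUTOFF_HZ)

print("low band vs heart-like:", round(correlation(heart_mode, heart_like), 3))
print("low band vs lung-like:", round(correlation(heart_mode, lung_like), 3))
print("high band vs lung-like:", round(correlation(lung_mode, lung_like), 3))
print("high band vs heart-like:", round(correlation(lung_mode, heart_like), 3))
```

The low band should track the low-frequency component more closely and the high band the higher-frequency one. Real heart and lung sounds overlap in frequency, so fixed filters can only emphasize a sound type, not separate it, which is why the mixed recordings are also aimed at blind source separation methods.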