Predicting Heart Activity from Speech using Data-driven and Knowledge-based features

Gasser Elbanna, Zohreh Mostaani, Mathew Magimai.-Doss
2024-06-10
Abstract: Accurately predicting heart activity and other biological signals is crucial for diagnosis and monitoring. Given that speech is an outcome of multiple physiological systems, a significant body of work studied the acoustic correlates of heart activity. Recently, self-supervised models have excelled in speech-related tasks compared to traditional acoustic methods. However, the robustness of data-driven representations in predicting heart activity remained unexplored. In this study, we demonstrate that self-supervised speech models outperform acoustic features in predicting heart activity parameters. We also emphasize the impact of individual variability on model generalizability. These findings underscore the value of data-driven representations in such tasks and the need for more speech-based physiological data to mitigate speaker-related challenges.
Sound, Artificial Intelligence, Audio and Speech Processing, Signal Processing
What problem does this paper attempt to address?
The problem this paper addresses is **the effectiveness and feasibility of predicting cardiac activity parameters (such as heart rate and heart rate variability) from speech signals**. Specifically, the authors evaluate how well self-supervised learning models (SSMs) predict cardiac activity and compare them with traditional knowledge-based acoustic features. The study also explores how inter-individual and intra-individual differences affect the generalization ability of the models.

### Specific description of the problem:

1. **Limitations of existing methods**:
   - Although previous studies have shown that speech signals can reflect changes in cardiac activity, most rely on traditional acoustic features (such as spectral features), which generalize poorly across individuals.
   - Self-supervised learning models have performed well in speech-related tasks, but their application to predicting cardiac activity has not been fully explored.
2. **Research objectives**:
   - Evaluate the performance of self-supervised learning models (especially the Hybrid BYOL-S model) in predicting cardiac activity parameters (such as BPM and HRV).
   - Compare the performance of self-supervised learning models with that of traditional acoustic features.
   - Study the impact of inter-individual and intra-individual differences on the generalization ability of the models.
   - Explore the impact of different context window lengths on prediction performance.
   - Analyze which acoustic features are most important for predicting cardiac activity.

### Research background:

- **Relationship between cardiac activity and voice**: Previous studies have shown that cardiac activity (such as heart rate and blood pressure) can affect voice features (such as the fundamental frequency F0). Predicting cardiac activity by analyzing speech signals therefore has potential application value.
- **Advantages of self-supervised learning**: Self-supervised learning models can be pre-trained on large amounts of unlabeled data, yielding more robust speech representations, which may help improve the accuracy of cardiac activity prediction.

### Main contributions:

- **Evaluate for the first time the performance of self-supervised learning models in predicting cardiac activity**.
- **Emphasize the impact of individual differences on the generalization ability of the models**, and point out that more speech-based physiological data are needed to alleviate speaker-related challenges.
- **Propose that increasing the context window length can significantly improve prediction performance**.
- **Identify the acoustic features that are most important for predicting cardiac activity**, providing directions for future research.

Through this work, the authors hope to provide new insights into predicting cardiac activity from speech signals and to promote further development in this field.
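
To make the prediction targets mentioned above (BPM and HRV) concrete, here is a minimal sketch of how they could be computed from a window of inter-beat (RR) intervals. It is an illustration under stated assumptions, not the authors' pipeline: the summary does not say which HRV measure the paper uses, so RMSSD and SDNN are chosen here only because they are common time-domain HRV summaries. The function name `heart_activity_targets` and the dictionary keys are hypothetical.

```python
import math
from statistics import mean, stdev


def heart_activity_targets(rr_intervals_s):
    """Summarize a window of inter-beat (RR) intervals given in seconds.

    Assumption: RMSSD and SDNN stand in for "HRV"; the summarized paper
    may define its HRV target differently.
    """
    if len(rr_intervals_s) < 3:
        raise ValueError("need at least three RR intervals")

    # Heart rate in beats per minute: 60 seconds divided by the mean interval.
    bpm = 60.0 / mean(rr_intervals_s)

    # RMSSD: root mean square of successive interval differences, in ms.
    diffs = [b - a for a, b in zip(rr_intervals_s, rr_intervals_s[1:])]
    rmssd_ms = 1000.0 * math.sqrt(mean([d * d for d in diffs]))

    # SDNN: sample standard deviation of the intervals, in ms.
    sdnn_ms = 1000.0 * stdev(rr_intervals_s)

    return {"bpm": bpm, "rmssd_ms": rmssd_ms, "sdnn_ms": sdnn_ms}


if __name__ == "__main__":
    # Hypothetical RR intervals (seconds) for one analysis window.
    print(heart_activity_targets([0.82, 0.79, 0.85, 0.80, 0.83]))
```

In a regression setup of the kind summarized above, values like these would serve as targets for each speech window, with a longer window covering more heartbeats and therefore giving a more stable estimate of both quantities.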