HyMNet: a Multimodal Deep Learning System for Hypertension Classification using Fundus Photographs and Cardiometabolic Risk Factors

Mohammed Baharoon, Hessa Almatar, Reema Alduhayan, Tariq Aldebasi, Badr Alahmadi, Yahya Bokhari, Mohammed Alawad, Ahmed Almazroa, Abdulrhman Aljouie
2024-03-24
Abstract: In recent years, deep learning has shown promise in predicting hypertension (HTN) from fundus images. However, most prior research has primarily focused on analyzing a single type of data, which may not capture the full complexity of HTN risk. To address this limitation, this study introduces a multimodal deep learning (MMDL) system, dubbed HyMNet, which combines fundus images and cardiometabolic risk factors, specifically age and gender, to improve hypertension detection capabilities. Our MMDL system uses RETFound, a foundation model pre-trained on 1.6 million retinal images, for the fundus path and a fully connected neural network for the age and gender path. The two paths are jointly trained by concatenating the feature vectors from each path that are then fed into a fusion network. The system was trained on 5,016 retinal images from 1,243 individuals collected from the Saudi Ministry of National Guard Health Affairs. The results show that the multimodal model that integrates fundus images along with age and gender outperforms the unimodal system trained solely on fundus photographs, with F1 scores of 0.771 [0.747, 0.796] and 0.745 [0.719, 0.772], respectively, for hypertension detection. Additionally, we studied the effect underlying diabetes mellitus has on the model's predictive ability, concluding that diabetes is used as a confounding variable for distinguishing hypertensive cases. Our code and model weights are publicly available at https://github.com/MohammedSB/HyMNet.
Image and Video Processing, Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
The paper aims to improve the detection of hypertension (HTN) by combining fundus photographs with cardiometabolic risk factors, specifically age and gender. The research team developed a multimodal deep learning system named HyMNet, which joins fundus image data with age and gender information to improve hypertension classification. Specifically, the HyMNet system employs the following methods:
1. **Model Architecture**: The HyMNet system includes a fundus path (FundusPath), based on the pre-trained foundation model RETFound, for processing fundus images, and a demographic path (DemographicPath), a fully connected neural network, for processing age and gender information. The feature vectors from these two paths are concatenated and passed to a fusion network (FusionPath) for the final hypertension prediction.
2. **Dataset**: The study used 5,016 fundus photographs from 1,243 individuals, collected from the Saudi Ministry of National Guard Health Affairs. These images were divided into training, validation, and test sets.
3. **Comparative Experiments**: The paper also compared the performance of single-modal systems (using only fundus images or only demographic features) with the multimodal system. The HyMNet system, which uses both fundus images and demographic features, performed best, achieving an F1 score of 0.771 [0.747, 0.796], compared with 0.745 [0.719, 0.772] for the model using only fundus images and 0.752 [0.727, 0.778] for the model using only demographic features.
4. **Impact of Diabetes**: The study further analyzed the impact of diabetes as a potential confounding factor on hypertension prediction. The results indicated that the system's predictive performance improved significantly among diabetic patients, suggesting that the model relies on diabetes as a confounding variable when distinguishing hypertensive cases.
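The fusion-by-concatenation design described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the layer sizes, the random untrained weights, the age scaling, the 0/1 gender encoding, and the function name `hymnet_like_forward` are all assumptions made for the example, and the fundus feature vector is a random stand-in for a RETFound embedding.

```python
import math
import random


def dense(x, weights, bias):
    """Fully connected layer: one output value per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(v):
    return [max(0.0, a) for a in v]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(rng, n_out, n_in):
    """Small random weights and zero biases (untrained, illustration only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def hymnet_like_forward(fundus_features, age, gender, params):
    """Concatenate fundus and demographic features, then classify."""
    # Demographic path: scale age to roughly [0, 1] (assumed preprocessing).
    demo_in = [age / 100.0, float(gender)]
    demo_features = relu(dense(demo_in, *params["demographic"]))
    # Fusion: concatenate the two feature vectors.
    fused = list(fundus_features) + demo_features
    hidden = relu(dense(fused, *params["fusion_hidden"]))
    logit = dense(hidden, *params["fusion_out"])[0]
    return sigmoid(logit)  # probability of hypertension


# Hypothetical sizes; a real image encoder emits a much longer vector.
FUNDUS_DIM, DEMO_DIM, HIDDEN_DIM = 16, 4, 8
rng = random.Random(0)
params = {
    "demographic": init_layer(rng, DEMO_DIM, 2),
    "fusion_hidden": init_layer(rng, HIDDEN_DIM, FUNDUS_DIM + DEMO_DIM),
    "fusion_out": init_layer(rng, 1, HIDDEN_DIM),
}

# Random stand-in for the fundus path output.
fundus_features = [rng.gauss(0.0, 1.0) for _ in range(FUNDUS_DIM)]
prob = hymnet_like_forward(fundus_features, age=57, gender=1, params=params)
print(f"Predicted HTN probability (untrained weights): {prob:.3f}")
```

In a trained system the gradients from the final prediction would flow back through both paths, which is what training the two paths jointly amounts to.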
In summary, the paper introduces a multimodal deep learning framework that improves hypertension detection over unimodal baselines by combining fundus images with demographic features. This approach could support hypertension screening and early intervention.
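The F1 scores above are reported with bracketed intervals. As a rough illustration of how such a figure can be produced, the sketch below computes F1 on toy labels and a percentile bootstrap interval around it. The bootstrap procedure, the 95% level, the resample count, and the toy data are assumptions for the example; the summary does not state how the paper's intervals were computed.

```python
import random


def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def bootstrap_ci(y_true, y_pred, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for F1 (assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(y_true)
    scores = []
    for _ in range(n_resamples):
        # Resample cases with replacement, keeping label/prediction pairs.
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(f1_score([y_true[i] for i in idx],
                               [y_pred[i] for i in idx]))
    scores.sort()
    lo_idx = max(0, round(alpha / 2 * n_resamples))
    hi_idx = min(n_resamples - 1, round((1 - alpha / 2) * n_resamples) - 1)
    return scores[lo_idx], scores[hi_idx]


# Toy labels: 4 true positives, 1 false positive, 1 false negative.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]

f1 = f1_score(y_true, y_pred)
ci_low, ci_high = bootstrap_ci(y_true, y_pred)
print(f"F1 = {f1:.3f} [{ci_low:.3f}, {ci_high:.3f}]")
```

With only ten toy cases the interval is very wide; the narrow intervals in the summary reflect a much larger test set.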