DF-DM: A foundational process model for multimodal data fusion in the artificial intelligence era

David Restrepo, Chenwei Wu, Constanza Vásquez-Venegas, Luis Filipe Nakayama, Leo Anthony Celi, Diego M López
2024-06-03
Abstract: In the big data era, integrating diverse data modalities poses significant challenges, particularly in complex fields like healthcare. This paper introduces a new process model for multimodal Data Fusion for Data Mining, integrating embeddings and the Cross-Industry Standard Process for Data Mining with the existing Data Fusion Information Group model. Our model aims to decrease computational costs, complexity, and bias while improving efficiency and reliability. We also propose "disentangled dense fusion", a novel embedding fusion method designed to optimize mutual information and facilitate dense inter-modality feature interaction, thereby minimizing redundant information. We demonstrate the model's efficacy through three use cases: predicting diabetic retinopathy using retinal images and patient metadata, domestic violence prediction employing satellite imagery, internet, and census data, and identifying clinical and demographic features from radiography images and clinical notes. The model achieved a Macro F1 score of 0.92 in diabetic retinopathy prediction, an R-squared of 0.854 and sMAPE of 24.868 in domestic violence prediction, and a macro AUC of 0.92 and 0.99 for disease prediction and sex classification, respectively, in radiological analysis. These results underscore the Data Fusion for Data Mining model's potential to significantly impact multimodal data processing, promoting its adoption in diverse, resource-constrained settings.
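For readers unfamiliar with the metrics quoted in the abstract, the following is a minimal sketch of how macro F1, R-squared, and sMAPE are commonly computed. It is not the authors' evaluation code. The function names are illustrative, and the sMAPE variant shown (percent scale, mean of absolute values in the denominator) is an assumption, since the abstract does not state which definition was used.

```python
import numpy as np

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    scores = []
    for c in np.unique(np.concatenate([y_true, y_pred])):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(scores))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def smape(y_true, y_pred):
    """Symmetric MAPE in percent (assumed variant; ranges from 0 to 200)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    denom = (np.abs(y_true) + np.abs(y_pred)) / 2.0
    safe = np.where(denom == 0, 1.0, denom)  # avoid division by zero
    ratio = np.where(denom == 0, 0.0, np.abs(y_pred - y_true) / safe)
    return float(100.0 * np.mean(ratio))
```

Macro averaging gives every class the same weight regardless of its frequency, which matters for imbalanced tasks such as disease grading.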
Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses the challenges of multimodal data fusion in the era of artificial intelligence, particularly in complex domains such as healthcare. It proposes a new process model, Data Fusion for Data Mining (DF-DM), which integrates embedding techniques and the Cross-Industry Standard Process for Data Mining (CRISP-DM) with the existing Data Fusion Information Group (DFIG) model. The goal of the DF-DM model is to reduce computational costs, complexity, and bias while improving efficiency and reliability.
The paper also introduces "Disentangled Dense Fusion," an embedding fusion method designed to optimize mutual information and promote dense feature interaction between modalities, thereby minimizing redundant information.
The paper demonstrates the effectiveness of the model through three case studies:
1. Predicting diabetic retinopathy using retinal images and patient metadata;
2. Predicting domestic violence using satellite images, internet data, and census data;
3. Identifying clinical and demographic features from radiographic images and clinical notes.
According to the authors, the results of these case studies indicate that the DF-DM model has significant potential for multimodal data processing and support its adoption in resource-constrained environments.
In summary, the main contributions of the paper include:
- Proposing a new process model for multimodal data fusion;
- Introducing the Disentangled Dense Fusion method;
- Validating the model's effectiveness and flexibility through three use cases.
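To make the idea of fusing per-modality embeddings concrete, here is a minimal sketch of the simplest baseline: standardizing each modality's embedding matrix and concatenating them feature-wise. This is plain concatenation, not the paper's Disentangled Dense Fusion, whose details are not given in this summary. The function names and array shapes are hypothetical.

```python
import numpy as np

def standardize(x, eps=1e-8):
    """Zero-mean, unit-variance scaling per feature column."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean(axis=0)) / (x.std(axis=0) + eps)

def fuse_by_concatenation(*modalities):
    """Concatenate standardized embeddings of shape (n_samples, d_i).

    Each argument holds one modality's precomputed embeddings, with
    rows aligned so that row k refers to the same sample everywhere.
    """
    blocks = [standardize(m) for m in modalities]
    if len({b.shape[0] for b in blocks}) != 1:
        raise ValueError("all modalities must have the same number of samples")
    return np.concatenate(blocks, axis=1)

# Hypothetical example: 5 samples, 8-dim image and 3-dim metadata embeddings.
rng = np.random.default_rng(0)
image_emb = rng.normal(size=(5, 8))
meta_emb = rng.normal(size=(5, 3))
fused = fuse_by_concatenation(image_emb, meta_emb)  # shape (5, 11)
```

The fused matrix can then be passed to any downstream classifier or regressor. Standardizing first keeps a modality with large-magnitude features from dominating the others.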