Sahil Sidheekh, Pranuthi Tenali, Saurabh Mathur, Erik Blasch, Kristian Kersting, Sriraam Natarajan
Abstract: We consider the problem of late multi-modal fusion for discriminative learning. Motivated by noisy, multi-source domains that require understanding the reliability of each data source, we explore the notion of credibility in the context of multi-modal fusion. We propose a combination function that uses probabilistic circuits (PCs) to combine predictive distributions over individual modalities. We also define a probabilistic measure to evaluate the credibility of each modality via inference queries over the PC. Our experimental evaluation demonstrates that our fusion method can reliably infer credibility while maintaining competitive performance with the state-of-the-art.
What problem does this paper attempt to address?
The paper addresses two linked questions in multimodal fusion: how to combine information from different data sources effectively, and how to evaluate the credibility of each source. Specifically, the authors study late multimodal fusion for discriminative learning, using Probabilistic Circuits (PCs) to combine the predictions of the individual modalities and to assess how reliable each modality is.
### Problem Background
High-risk real-world tasks such as healthcare often require decisions based on multiple data sources (for example, images and blood test results), which demands reliable learning and reasoning. However, raw data from different sources is often noisy, incomplete, and inconsistent, which complicates the fusion and analysis of multimodal data. Existing multimodal fusion methods usually assume that all data sources are equally credible, which may lead to sub-optimal performance or wrong conclusions.
### Research Motivation
The authors point out that in many applications (such as sensor fusion, medical diagnosis, and financial analysis), the quality and reliability of information sources vary greatly, and distinguishing reliable from unreliable sources is crucial for making accurate and informed decisions. Explicitly modeling the credibility of information sources is therefore a key issue.
### Solution
To solve this problem, the authors propose a late multimodal fusion method based on Probabilistic Circuits (PCs). The main contributions are:
1. **Introducing the first multimodal fusion method with strong probabilistic semantics**, which is based on Probabilistic Circuits and defines a class of Probabilistic Circuits suitable for credibility-aware multimodal fusion.
2. **Proposing two different versions of the late fusion algorithm**, each with different characteristics.
3. **Deriving a theoretically grounded credibility measure** and showing its connection with the conditional entropy of the unimodal predictive distributions, thereby enabling reliable late fusion.
4. **Experimentally verifying that Probabilistic Circuits can model complex interactions between modalities and faithfully estimate their credibility**.
### Method Overview
The authors formalize multimodal fusion as a probabilistic inference problem and use a Probabilistic Circuit to model the joint distribution. Exploiting the tractability of Probabilistic Circuits, they define a probabilistic measure for evaluating the credibility of each modality. Specifically, the credibility \(C_j\) of modality \(j\) when predicting the target \(Y\) is defined as the divergence between the conditional distribution of \(Y\) given all \(M\) unimodal predictions and the conditional distribution given all of them except modality \(j\):
\[
C_j=\delta(P(Y|\{F_{\phi_i}\}_{i = 1}^M)\|P(Y|\{F_{\phi_i}\}_{i = 1}^M\setminus\{F_{\phi_j}\}))
\]
where \(\delta\) is a divergence measure, such as the KL divergence.
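As an illustration, assuming \(Y\) is a discrete class label and \(\delta\) is chosen to be the KL divergence (one possible choice, not necessarily the only one used in the paper), the definition expands to:
\[
C_j=\sum_{y}P(y|\{F_{\phi_i}\}_{i = 1}^M)\log\frac{P(y|\{F_{\phi_i}\}_{i = 1}^M)}{P(y|\{F_{\phi_i}\}_{i = 1}^M\setminus\{F_{\phi_j}\})}
\]
Under this choice, \(C_j = 0\) exactly when removing modality \(j\) leaves the predictive distribution over \(Y\) unchanged, and \(C_j\) grows as the prediction shifts more when modality \(j\) is removed.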
To make the scores comparable across modalities, the relative credibility score \(\tilde{C}_j\) normalizes \(C_j\) by the total credibility over all \(M\) modalities:
\[
\tilde{C}_j=\frac{C_j}{\sum_{i = 1}^M C_i}
\]
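The two definitions above can be sketched numerically. The following is a minimal illustration, not the authors' implementation: it replaces the Probabilistic Circuit with a small explicit joint probability table and marginalizes by brute force, it assumes each unimodal prediction is discretized to a single value, and it uses the KL divergence for \(\delta\). The function names (`posterior_y`, `kl_divergence`, `credibility`) and the toy distribution are hypothetical.

```python
import numpy as np

def posterior_y(joint, evidence):
    """P(Y | evidence) from a joint table with axes (Y, F_1, ..., F_M).

    `evidence` maps a modality index j (1-based) to its observed value.
    Modalities missing from `evidence` are summed out. A brute-force table
    stands in here for the tractable queries a PC would answer.
    """
    table = joint
    # Walk axes from last to first so the remaining axis indices stay valid.
    for axis in range(joint.ndim - 1, 0, -1):
        if axis in evidence:
            table = np.take(table, evidence[axis], axis=axis)
        else:
            table = table.sum(axis=axis)
    return table / table.sum()

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions, clipped for numerical safety."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return max(float(np.sum(p * np.log(p / q))), 0.0)

def credibility(joint, evidence):
    """Return (C_j, relative C_j) for every observed modality j."""
    full = posterior_y(joint, evidence)
    scores = {}
    for j in evidence:
        reduced = {i: v for i, v in evidence.items() if i != j}
        scores[j] = kl_divergence(full, posterior_y(joint, reduced))
    total = sum(scores.values())
    # If no modality changes the prediction, fall back to uniform scores.
    relative = {j: (c / total if total > 0 else 1.0 / len(scores))
                for j, c in scores.items()}
    return scores, relative

# Toy joint (assumed for illustration): Y is binary, modality 1 is
# informative about Y, modality 2 is pure noise independent of Y.
p_y = np.array([0.5, 0.5])
p_f1_given_y = np.array([[0.9, 0.1], [0.1, 0.9]])
p_f2 = np.array([0.5, 0.5])
joint = np.einsum("y,yf,g->yfg", p_y, p_f1_given_y, p_f2)

scores, relative = credibility(joint, {1: 0, 2: 1})
print(scores, relative)
```

In this toy setting, removing the noisy modality 2 does not change the prediction, so its credibility is essentially zero and modality 1 receives essentially all of the relative credibility.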
In addition, the authors prove that, under certain structural conditions, Probabilistic Circuits are marginal dominant, a property that makes them suitable for credibility-aware fusion.
### Experimental Verification
The authors conducted experiments on four multimodal datasets to verify the effectiveness of the proposed method. The results show that the PC-based method remains competitive with state-of-the-art fusion methods across multiple evaluation metrics and is more robust when dealing with noisy data.
In summary, the paper addresses credibility evaluation in multimodal fusion by introducing Probabilistic Circuits, providing a theoretically grounded and effective solution.