Abstract: We propose to quantify dependence between two systems $X$ and $Y$ in a dataset $D$ based on the Bayesian comparison of two models: one, $H_0$, of statistical independence and another one, $H_1$, of dependence. In this framework, dependence between $X$ and $Y$ in $D$, denoted $B(X,Y|D)$, is quantified as $P(H_1|D)$, the posterior probability for the model of dependence given $D$, or any strictly increasing function thereof. It is therefore a measure of the evidence for dependence between $X$ and $Y$ as modeled by $H_1$ and observed in $D$. We review several statistical models and reconsider standard results in the light of $B(X,Y|D)$ as a measure of dependence. Using simulations, we focus on two specific issues: the effect of noise and the behavior of $B(X,Y|D)$ when $H_1$ has a parameter coding for the intensity of dependence. We then derive some general properties of $B(X,Y|D)$, showing that it quantifies the information contained in $D$ in favor of $H_1$ versus $H_0$. While some of these properties are typical of what is expected from a valid measure of dependence, others are novel and naturally appear as desired features for specific measures of dependence, which we call inferential. We finally put these results in perspective; in particular, we discuss the consequences of using the Bayesian framework as well as the similarities and differences between $B(X,Y|D)$ and mutual information.

What problem does this paper attempt to address?

This paper addresses the problem of quantifying the dependence between two systems $X$ and $Y$. Specifically, the authors propose a method based on Bayesian model comparison to measure the dependence between the two systems as observed in a dataset. The problem is described in detail below.

### Research Background and Problem

In scientific research, independence and dependence are key concepts used to characterize the structural relationships between systems. Suppose we have two systems, described by random variables $X$ and $Y$, with known joint probability distribution $f_{XY}(x, y)$ and marginal distributions $f_X(x)$ and $f_Y(y)$. If $X$ and $Y$ are independent, the following condition is satisfied:

\[ f_{XY}(x, y) = f_X(x) f_Y(y) \]

When $X$ and $Y$ are not independent, $f_{XY}(x, y)$ differs from $f_X(x) f_Y(y)$, and $X$ and $Y$ are said to be dependent. Quantifying this dependence, and in particular measuring how much $f_{XY}(x, y)$ differs from $f_X(x) f_Y(y)$, is an important problem.

### Proposed Method

To solve this problem, the authors propose quantifying dependence using Bayesian model comparison. Specifically, they define two models:

- $H_0$: $X$ and $Y$ are statistically independent.
- $H_1$: $X$ and $Y$ are dependent.

The dependence $B(X, Y | D)$ is measured by $P(H_1 | D)$, the posterior probability of the model $H_1$ given the dataset $D$, or by any strictly increasing function of it. $B(X, Y | D)$ can therefore be regarded as a measure of the evidence for dependence between $X$ and $Y$ as modeled by $H_1$ and observed in $D$.

### Formula Expression

According to Bayes' formula, the posterior probability can be expressed as:

\[ P(H_i | D) = \frac{P(H_i) P(D | H_i)}{P(D)}, \quad i = 0, 1 \]

where:

- $P(H_i)$ is the prior probability of the model.
- $P(D | H_i)$ is the marginal likelihood, obtained by integrating the likelihood against the prior on the parameters:

\[ P(D | H_i) = \int_{\theta(i) \in \Theta(i)} p(\theta(i) | H_i) p(D | H_i, \theta(i)) d\theta(i) \]

### Main Contributions

The authors explore two specific issues through simulation studies:

1. **The Influence of Noise**: how noise affects the dependence measure.
2. **The Behavior of the Dependence-Strength Parameter**: how $B(X, Y | D)$ behaves when $H_1$ has a parameter coding for the intensity of dependence.

In addition, the authors derive some general properties of $B(X, Y | D)$, showing that it quantifies the information contained in the data $D$ in favor of $H_1$ versus $H_0$.

### Practical Applications

Finally, the authors verify the effectiveness of this method through practical applications in neuroscience and neuroimaging. For example, in electroencephalogram (EEG) data, this method can be used to quantify the dependence relationships between different brain regions, especially in event-related protocols, to evaluate the consistency of the brain's response to external stimuli.

In summary, the main purpose of this paper is to provide a measure of dependence based on Bayesian model comparison, and to verify its effectiveness and applicability through theoretical analysis and experiments.
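As an illustration of how Bayes' formula and the marginal likelihood integral above combine into $P(H_1 | D)$, here is a minimal Python sketch for one simple case: two discrete variables summarized by a contingency table of counts. The model choice is an assumption made for this example and is not taken from the paper: under $H_0$ the row and column probabilities each get a symmetric Dirichlet prior, and under $H_1$ the joint cell probabilities get a symmetric Dirichlet prior, so both marginal likelihoods have a closed form. The function names `log_marginal` and `posterior_dependence`, and the parameters `prior_h1` and `a`, are hypothetical.

```python
from math import lgamma, log, log1p, exp


def log_marginal(counts, a=1.0):
    """Log marginal likelihood of an ordered sample of categorical draws
    with the given category counts, under a symmetric Dirichlet(a) prior.
    This is the closed form of the integral of likelihood times prior."""
    k = len(counts)
    n = sum(counts)
    return (lgamma(k * a) - lgamma(k * a + n)
            + sum(lgamma(a + c) - lgamma(a) for c in counts))


def posterior_dependence(table, prior_h1=0.5, a=1.0):
    """Return P(H1 | D) for a contingency table of counts (list of rows).

    H0 (assumed here): X and Y independent, Dirichlet priors on each marginal.
    H1 (assumed here): unconstrained joint distribution, Dirichlet prior on cells.
    """
    rows = [sum(r) for r in table]             # counts of X
    cols = [sum(c) for c in zip(*table)]       # counts of Y
    cells = [c for r in table for c in r]      # joint counts of (X, Y)

    log_m0 = log_marginal(rows, a) + log_marginal(cols, a)  # log P(D | H0)
    log_m1 = log_marginal(cells, a)                         # log P(D | H1)

    # Posterior log odds = prior log odds + log Bayes factor.
    log_odds = log(prior_h1) - log1p(-prior_h1) + log_m1 - log_m0

    # Numerically stable logistic transform of the log odds.
    if log_odds >= 0:
        return 1.0 / (1.0 + exp(-log_odds))
    z = exp(log_odds)
    return z / (1.0 + z)


if __name__ == "__main__":
    # Perfectly associated counts versus perfectly balanced counts.
    print(posterior_dependence([[50, 0], [0, 50]]))
    print(posterior_dependence([[25, 25], [25, 25]]))
```

With equal prior probabilities, the first table should give a posterior close to 1 and the second a posterior below 0.5, because the more flexible model $H_1$ is penalized when the data are compatible with independence. Working with log odds reflects the remark that any strictly increasing function of $P(H_1 | D)$ can serve as the measure.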

An inferential measure of dependence between two systems using Bayesian model comparison

A nonparametric Bayesian test of dependence

Measuring Dependence between Events

Quantifying and estimating dependence via sensitivity of conditional distributions

On Quantifying Dependence: A Framework for Developing Interpretable Measures

Towards a universal representation of statistical dependence

A Wasserstein index of dependence for random measures

A new Bayesian discrepancy measure

Measuring natural source dependence

A low variance consistent test of relative dependency

Deprank: A probabilistic measure of dependence via heterogeneous links

Rearranged dependence measures

Quantitative comparisons between finitary posterior distributions and Bayesian posterior distributions

Bayesian model comparison with the Hyvärinen score: computation and consistency

Normalized Latent Measure Factor Models

Bayesian Inference for Comparative Research

A Framework to Adjust Dependency Measure Estimates for Chance

Modeling dependent gene expression

Different coefficients for studying dependence

Bayesian Importance of Features (BIF)

Multivariate dependence and genetic networks inference