Abstract: Unveiling, modeling, and comprehending the causal mechanisms underpinning natural phenomena are fundamental endeavors across myriad scientific disciplines. Meanwhile, new knowledge emerges when causal relationships are discovered from data. Existing causal learning algorithms predominantly focus on the isolated effects of variables and overlook the intricate interplay of multiple variables and their collective behavioral patterns. Furthermore, the ubiquity of high-dimensional data exacts a substantial temporal cost for causal algorithms. In this paper, we develop a novel method called MgCSL (Multi-granularity Causal Structure Learning), which first leverages a sparse auto-encoder to explore coarse-graining strategies and causal abstractions from micro-variables to macro-ones. MgCSL then takes multi-granularity variables as inputs to train multilayer perceptrons and to delve into the causality between variables. To enhance efficacy on high-dimensional data, MgCSL introduces a simplified acyclicity constraint to adeptly search for the directed acyclic graph among variables. Experimental results show that MgCSL outperforms competitive baselines and finds explainable causal connections on fMRI datasets.
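The abstract does not spell out the form of the simplified acyclicity constraint. For orientation only, the sketch below shows the widely used NOTEARS-style penalty h(W) = tr(exp(W∘W)) − d, truncated to its first d series terms. It is zero exactly when the weighted adjacency matrix W describes a DAG. This is an assumed stand-in from the general literature, not MgCSL's own constraint, and the function names are hypothetical.

```python
from math import factorial

def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def acyclicity_penalty(w):
    """Truncated NOTEARS-style penalty: sum over k = 1..d of tr(A^k) / k!,
    where A is the element-wise square of W.

    Assumption: a standard constraint used as a stand-in here; it is NOT
    the simplified constraint proposed for MgCSL.
    """
    d = len(w)
    a = [[x * x for x in row] for row in w]  # element-wise square, entries >= 0
    power = [row[:] for row in a]            # holds A^k, starting at k = 1
    total = 0.0
    for k in range(1, d + 1):
        # tr(A^k) > 0 exactly when the graph has a closed walk of length k
        total += sum(power[i][i] for i in range(d)) / factorial(k)
        power = matmul(power, a)
    return total

# Chain 0 -> 1 -> 2: acyclic
dag = [[0.0, 1.5, 0.0],
       [0.0, 0.0, -2.0],
       [0.0, 0.0, 0.0]]
# Cycle 0 -> 1 -> 2 -> 0
cyclic = [[0.0, 1.0, 0.0],
          [0.0, 0.0, 1.0],
          [1.0, 0.0, 0.0]]

print(acyclicity_penalty(dag))     # 0.0
print(acyclicity_penalty(cyclic))  # 0.5
```

Truncating at d terms is enough for a zero test because any cycle in a graph with d nodes has length at most d. In gradient-based structure learning, a penalty of this kind is typically driven to zero alongside the data-fitting loss.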

What problem does this paper attempt to address?

The problem this paper addresses is that existing causal learning algorithms mainly focus on the isolated effects of individual variables and ignore the complex interactions among multiple variables and their collective behavior patterns. In addition, the prevalence of high-dimensional data significantly increases the time cost of causal learning. The paper therefore proposes Multi-granularity Causal Structure Learning (MgCSL), which explores coarse-graining strategies and causal abstractions from micro-variables to macro-variables, and then feeds the multi-granularity variables to multilayer perceptrons (MLPs) to uncover the causal relationships among variables. To improve efficiency on high-dimensional data, MgCSL introduces a simplified acyclicity constraint for searching directed acyclic graphs (DAGs) over the variables. Experimental results show that MgCSL performs well on multiple benchmarks and can discover interpretable causal connections in fMRI datasets.

### Specific Problem Description

1. **Limitations of Existing Causal Learning Algorithms**:
   - **Ignoring Complex Interactions**: Existing algorithms treat variables in isolation and overlook the interactions among multiple variables and their collective behavior patterns.
   - **Difficulty in Processing High-Dimensional Data**: High-dimensional data significantly increases the time cost of causal learning algorithms, which limits their practical use.
2. **Research Objectives**:
   - **Multi-Granularity Causal Structure Learning**: Develop a method that can explore coarse-graining strategies and causal abstractions from micro-variables to macro-variables.
   - **Improve the Efficiency of High-Dimensional Data Processing**: Introduce a simplified acyclicity constraint to search efficiently for DAGs among variables.

### Solutions

1. **Multi-Granularity Causal Structure Learning (MgCSL)**:
   - **Sparse Auto-Encoder (SAE)**: Automatically coarse-grains micro-variables into latent macro-variables.
   - **Multilayer Perceptron (MLP)**: One MLP is built for each micro-variable, taking both micro-variables and macro-variables as inputs, to explore potential causal mechanisms.
   - **Simplified Acyclicity Constraint**: Used to search efficiently for DAGs among variables.
2. **Experimental Verification**:
   - **Synthetic Datasets**: Random DAGs are generated with the Erdős-Rényi (ER) and Scale-Free (SF) schemes, and performance is tested for different numbers of variables (d ∈ {20, 50, 100}) at a fixed edge density (degree = 2).
   - **Real Datasets**: The Sachs dataset, which measures protein and phospholipid expression levels in human cells, is used to evaluate the recovered causal relationships.

### Experimental Results

- **Precision**: On multi-granularity synthetic datasets, the precision of MgCSL is significantly higher than that of the baseline methods.
- **Structural Hamming Distance (SHD)**: On multi-granularity synthetic datasets, the SHD of MgCSL is significantly lower than that of the baseline methods.
- **Runtime**: On multi-granularity synthetic datasets, the runtime of MgCSL is significantly shorter than that of the baseline methods.

### Conclusion

MgCSL addresses the limitations of existing causal learning algorithms by combining multi-granularity causal structure learning with a simplified acyclicity constraint, and it performs especially well on high-dimensional data. Experimental results verify the superior performance of MgCSL in multi-granularity causal structure learning.
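To make the evaluation setup above concrete, here is a minimal sketch of how an Erdős-Rényi DAG can be sampled and how precision and SHD can be computed from edge sets. Several details are assumptions rather than the paper's protocol: `degree` is read as the expected number of edges per node (so about `degree * d` edges in total), a reversed edge counts once in SHD, and all function names are hypothetical.

```python
import random

def random_er_dag(d, degree, seed=0):
    """Sample an Erdős-Rényi DAG on d >= 2 nodes as a set of (parent, child) pairs.

    Assumption: the expected number of edges is degree * d.
    """
    rng = random.Random(seed)
    order = list(range(d))
    rng.shuffle(order)                      # random topological order
    p = min(1.0, 2.0 * degree / (d - 1))    # p * d(d-1)/2 = degree * d
    edges = set()
    for i in range(d):
        for j in range(i + 1, d):
            if rng.random() < p:
                edges.add((order[i], order[j]))  # earlier -> later only, so no cycles
    return edges

def precision(true_edges, pred_edges):
    """Fraction of predicted edges that appear in the true graph."""
    if not pred_edges:
        return 0.0
    return len(true_edges & pred_edges) / len(pred_edges)

def shd(true_edges, pred_edges):
    """Structural Hamming distance: missing + extra + reversed edges."""
    reversed_edges = {(u, v) for (u, v) in pred_edges
                      if (v, u) in true_edges and (u, v) not in true_edges}
    extra = {e for e in pred_edges if e not in true_edges} - reversed_edges
    missing = {(u, v) for (u, v) in true_edges
               if (u, v) not in pred_edges and (v, u) not in pred_edges}
    return len(reversed_edges) + len(extra) + len(missing)

# Toy check: one reversed edge (1, 0) and one extra edge (0, 2)
true_edges = {(0, 1), (1, 2)}
pred_edges = {(1, 0), (1, 2), (0, 2)}
print(shd(true_edges, pred_edges))        # 2
print(precision(true_edges, pred_edges))  # one of three predicted edges is correct

# A synthetic ground-truth graph of the smallest size mentioned above
graph = random_er_dag(d=20, degree=2)
print(len(graph))
```

Lower SHD and higher precision both indicate a recovered graph closer to the ground truth. Conventions differ on whether a reversed edge costs one or two in SHD, so reported numbers are comparable only under the same convention.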

Multi-granularity Causal Structure Learning

Causal Structure Learning Supervised by Large Language Model

Towards Human-like Perception: Learning Structural Causal Model in Heterogeneous Graph

Gradient-Based Local Causal Structure Learning

Continuous Causal Structure Learning from Incremental Instances and Feature Spaces

CCSL: A Causal Structure Learning Method from Multiple Unknown Environments

Local Causal Structure Learning for Streaming Features

LeCaSiM: Learning Causal Structure via Inverse of M-Matrices with Adjustable Coefficients

A Hybrid Causal Structure Learning Algorithm for Mixed-Type Data

Causal learner: A toolbox for causal structure and Markov blanket learning

Causal Structure Learning With One-Dimensional Convolutional Neural Networks

Multi-modal Causal Structure Learning and Root Cause Analysis

On Learning Necessary and Sufficient Causal Graphs

Learning causal structures using hidden compact representation

Scalable Causal Structure Learning: Scoping Review of Traditional and Deep Learning Algorithms and New Opportunities in Biomedicine

Deep Causal Learning: Representation, Discovery and Inference

Causal Structure Learning Based on Genetic Algorithm and Markov Blanket

Learning Causal Abstractions of Linear Structural Causal Models

Learning Multiscale Non-stationary Causal Structures

Causal Structure Learning: a Combinatorial Perspective

Amortized Inference for Causal Structure Learning