Abstract: Causal analysis has become essential for understanding the underlying causes of phenomena across many fields. Despite its significance, the literature on causal discovery algorithms is fragmented in two ways. First, methodologies are inconsistent: there is no universal classification standard for existing methods. Second, evaluations are incomplete: benchmarks often fail to analyze data characteristics jointly with algorithm performance. This study addresses both gaps through an exhaustive review and empirical evaluation of causal discovery methods for numerical data, aiming to provide a clearer and more structured understanding of the field. Our research begins with a comprehensive literature review spanning more than two decades, analyzing over 200 academic articles and identifying more than 40 representative algorithms. From this analysis we develop a structured taxonomy tailored to the complexities of causal discovery, which categorizes methods into six main types. To address the lack of comprehensive evaluations, we conduct an extensive empirical assessment of 29 causal discovery algorithms on multiple synthetic and real-world datasets. We categorize the synthetic datasets by size, linearity, and noise distribution, evaluate with five metrics, and summarize top-3 algorithm recommendations as guidelines for users in various data scenarios. Our results show that dataset characteristics have a significant impact on algorithm performance. Moreover, we develop a metadata extraction strategy with an accuracy exceeding 80% to assist users in selecting algorithms for datasets whose characteristics are unknown. Based on these insights, we offer practical guidelines to help users choose the most suitable causal discovery method for their specific dataset.
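
The three dataset characteristics named in the abstract (size, linearity, noise distribution) can be illustrated with a small generator. This is only a minimal sketch under stated assumptions, not the benchmark setup used in the study: the function name `simulate_chain`, the three-variable chain X -> Y -> Z, the coefficient 0.8, the `tanh` mechanism, and the sample sizes are all hypothetical choices made for illustration.

```python
import math
import random


def simulate_chain(n_samples, linear=True, noise="gaussian", seed=0):
    """Sample rows (x, y, z) from a toy chain X -> Y -> Z (illustrative only)."""
    rng = random.Random(seed)

    def draw_noise():
        # Noise distribution is one of the varied dataset characteristics.
        if noise == "gaussian":
            return rng.gauss(0.0, 1.0)
        if noise == "uniform":
            return rng.uniform(-1.0, 1.0)
        raise ValueError(f"unknown noise type: {noise}")

    # Linearity: identity mechanism (linear) or tanh (nonlinear).
    mechanism = (lambda v: v) if linear else math.tanh

    rows = []
    for _ in range(n_samples):
        x = draw_noise()
        y = 0.8 * mechanism(x) + draw_noise()
        z = 0.8 * mechanism(y) + draw_noise()
        rows.append((x, y, z))
    return rows


# One synthetic dataset per combination of size, linearity, and noise type.
datasets = {
    (n, linear, noise): simulate_chain(n, linear, noise)
    for n in (100, 1000)
    for linear in (True, False)
    for noise in ("gaussian", "uniform")
}
```

Each key of `datasets` records the characteristics of the data it maps to, so an algorithm's score on each entry could be tabulated against those characteristics rather than averaged over them.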

A Unified View of Causal and Non-causal Feature Selection

Multi-Source Causal Feature Selection

Causality-based Feature Selection: Methods and Evaluations

Causality-based Feature Selection

Data Fusion Using Feature Selection Based Causal Network Algorithm

A Light Causal Feature Selection Approach to High-Dimensional Data

Causal Feature Selection With Dual Correction

Multi-Label Causal Feature Selection

Causally-Aware Unsupervised Feature Selection Learning

Accurate Markov Boundary Discovery for Causal Feature Selection

Fair Feature Selection: A Causal Perspective

Fair Causal Feature Selection

Bivariate Causal Discovery using Bayesian Model Selection

Causal Feature Selection via Transfer Entropy

Multilabel Feature Selection: A Local Causal Structure Learning Approach

Comprehensive Review and Empirical Evaluation of Causal Discovery Algorithms for Numerical Data

Statistical Approaches for Causal Inference

Learning Causal Bayesian Networks Based on Causality Analysis for Classification

Nonlinear Causal Discovery in Time Series

Unified Model Selection Approach Based on Minimum Description Length Principle in Granger Causality Analysis

A survey of causal discovery based on functional causal model