Software application profile: tpc and micd - R packages for causal discovery with incomplete cohort data

Ryan M Andrews, Christine W Bang, Vanessa Didelez, Janine Witte, Ronja Foraita
DOI: https://doi.org/10.1093/ije/dyae113
2024-08-14
Abstract:
Motivation: The Peter Clark (PC) algorithm is a popular causal discovery method for learning causal graphs in a data-driven way. Until recently, existing PC algorithm implementations in R had important limitations regarding missing values, temporal structure and mixed measurement scales (categorical/continuous), all of which are common features of cohort data. The new R packages presented here, micd and tpc, fill these gaps.
Implementation: Both micd and tpc are R packages.
General features: The micd package extends the existing pcalg R package with add-on functionality for handling missing values, including methods for multiple imputation that rely on the Missing At Random assumption. It also supports mixed measurement scales under the assumption of conditional Gaussianity. The tpc package efficiently exploits temporal information, which yields a more informative output that is less prone to statistical errors.
Availability: The tpc and micd packages are freely available on the Comprehensive R Archive Network (CRAN). Their source code is also available on GitHub (https://github.com/bips-hb/micd; https://github.com/bips-hb/tpc).
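The abstract's point about exploiting temporal information can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the tpc implementation or its API. It assumes that temporal information is given as an integer tier per variable (smaller means earlier), that an edge between variables in different tiers can only point from the earlier to the later one, and that conditioning sets for a pair of variables exclude anything measured after both. The function names, variable names and tier assignments are hypothetical.

```python
def orient_by_tiers(skeleton_edges, tiers):
    """Orient undirected edges that cross tiers from the earlier to the later tier.

    skeleton_edges: iterable of (a, b) pairs describing undirected adjacencies.
    tiers: dict mapping each node to an integer tier (smaller = earlier).
    Returns (directed, undirected): a set of (cause, effect) tuples and a set
    of frozensets for same-tier edges that time order alone cannot orient.
    """
    directed, undirected = set(), set()
    for a, b in skeleton_edges:
        if tiers[a] < tiers[b]:
            directed.add((a, b))
        elif tiers[b] < tiers[a]:
            directed.add((b, a))
        else:
            undirected.add(frozenset((a, b)))
    return directed, undirected


def allowed_conditioning_nodes(x, y, nodes, tiers):
    """Return the nodes that are not in the future of both x and y."""
    latest = max(tiers[x], tiers[y])
    return [n for n in nodes if n not in (x, y) and tiers[n] <= latest]


# Hypothetical cohort variables grouped into three measurement waves.
tiers = {
    "birth_weight": 1,
    "maternal_smoking": 1,
    "bmi_age5": 2,
    "bmi_age10": 3,
}
# Hypothetical skeleton (undirected adjacencies) from some earlier search step.
skeleton = [
    ("maternal_smoking", "birth_weight"),
    ("birth_weight", "bmi_age5"),
    ("bmi_age5", "bmi_age10"),
]

directed, undirected = orient_by_tiers(skeleton, tiers)
candidates = allowed_conditioning_nodes("birth_weight", "bmi_age5", list(tiers), tiers)
```

Under these assumptions, time order alone orients the two cross-wave edges and leaves only the same-wave edge undirected. It also removes later measurements from the candidate conditioning sets, which reduces the number of conditional independence tests to be run.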