Moving from Descriptive to Causal Analytics

Jack Schryver, Mallikarjun Shankar, Songhua Xu
DOI: https://doi.org/10.1145/2389707.2389709
2012-01-01
Abstract: The knowledge management community has introduced a multitude of methods for knowledge discovery on large datasets. In the context of public health intelligence, we integrated some of these methods into an analyst's workflow that proceeds from the data-centric descriptive level of analysis to the model-centric causal level of reasoning. We present several case studies of the proposed workflow as applied to the US Health Indicators Warehouse (HIW), a medium-scale public dataset of community health information collected by the US federal government. In these case studies, we demonstrate a series of visual analytics efforts targeted at the HIW, including visual analysis of correlation matrices, multivariate outlier analysis, multiple linear regression of Medicare costs, confirmatory factor analysis, and hybrid scatterplot and heatmap visualization of the distributions of a group of health indicators. We conclude by sketching a preliminary framework for examining causal dependence hypotheses in future data science research in public health.