Exploratory Analysis of Big Social Data Using MIC/MINE Statistics

Piyawat Lertvittayakumjorn, Chao Wu, Yue Liu, Hong Mi, Yike Guo
DOI: https://doi.org/10.1007/978-3-319-67256-4_41
2017-01-01
Abstract: A major goal of Exploratory Data Analysis (EDA) is to understand the main characteristics of a dataset, especially the relationships between variables, which help in building predictive models and analysing causality in social science research. This paper introduces the Maximal Information Coefficient (MIC) and its by-product statistics to social science researchers as effective EDA tools for big social data. A case study was conducted on a historical dataset of more than 3,000 country-level indicators. MIC and some of its by-product statistics provided useful information for EDA, complementing the traditional Pearson’s correlation. Moreover, they revealed several significant relationships between variables, including nonlinear ones, which are intriguing and suggest directions for further research in the social sciences.