Relative Synergy Coefficient: A Novel Way to Detect Variable Interaction in Large Dataset

Yanrui Li, Kaiyou Fu, Yuchen Zhao, Chunjie Yang
DOI: https://doi.org/10.1016/j.knosys.2023.111112
IF: 8.139
2023-01-01
Knowledge-Based Systems
Abstract: Feature interaction, also referred to as feature synergy, is the phenomenon in which interacting features jointly convey more information than their individual contributions, which makes it important in feature engineering and data mining. Most existing techniques for detecting such interactions rely on model-based methods that compute absolute synergy. However, these approaches are often ill-suited to large datasets and overlook variables with comparatively minor main effects. In response, we introduce a new metric, the Relative Synergy Coefficient (RSC), which enables fast identification and quantification of the strength of relative synergy in large datasets. The proposed indicator is a non-parametric metric based on information entropy, and its computation uses discretization and normalization. The generality, equitability, and robustness of the metric are demonstrated on simulated data. In addition, the indicator is shown to be effective and consistent with domain knowledge on two real-world datasets.
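To make the ingredients named in the abstract concrete (discretization, information entropy, normalization), a minimal Python sketch follows. It is not the RSC formula, which the abstract does not state. It computes the standard interaction-information quantity I(X1, X2; Y) - I(X1; Y) - I(X2; Y) and divides it by H(Y). The function names, the equal-width binning, the bin count, and the choice of H(Y) as the normalizer are all assumptions made for illustration.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy in bits of a 1-D array of discrete labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def joint_labels(*cols):
    """Map each row of several discrete columns to a single integer label."""
    stacked = np.stack(cols, axis=1)
    _, inverse = np.unique(stacked, axis=0, return_inverse=True)
    return np.ravel(inverse)

def mutual_info(a, b):
    """Plug-in mutual information I(A; B) = H(A) + H(B) - H(A, B)."""
    return entropy(a) + entropy(b) - entropy(joint_labels(a, b))

def discretize(x, bins=8):
    """Equal-width binning (an assumed choice, not taken from the paper)."""
    x = np.asarray(x, dtype=float)
    edges = np.linspace(x.min(), x.max(), bins + 1)
    return np.digitize(x, edges[1:-1])

def normalized_synergy(x1, x2, y, bins=8):
    """Hypothetical illustrative score: interaction information over H(Y)."""
    d1, d2, dy = (discretize(v, bins) for v in (x1, x2, y))
    joint = mutual_info(joint_labels(d1, d2), dy)
    synergy = joint - mutual_info(d1, dy) - mutual_info(d2, dy)
    h_y = entropy(dy)
    return synergy / h_y if h_y > 0 else 0.0

# Usage: XOR is a textbook case where neither input alone informs the target.
x1 = np.array([0, 0, 1, 1] * 100)
x2 = np.array([0, 1, 0, 1] * 100)
y = x1 ^ x2
print(normalized_synergy(x1, x2, y, bins=2))
```

On the balanced XOR data above, each single-variable mutual information is zero while the pair determines the target completely, so this sketch assigns the pair its maximum score. Plug-in entropy estimates are biased on small samples, so the bin count would need to be chosen with the sample size in mind.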