Practical Markov Boundary Learning Without Strong Assumptions.

Xingyu Wu, Bingbing Jiang, Tianhao Wu, Huanhuan Chen
DOI: https://doi.org/10.1609/aaai.v37i9.26236
2023-01-01
Proceedings of the AAAI Conference on Artificial Intelligence
Abstract: Theoretically, the Markov boundary (MB) is the optimal solution for feature selection. However, existing MB learning algorithms often fail to identify some critical features in real-world feature selection tasks, mainly because their strict assumptions on data distribution, variable types, or correctness of criteria cannot be satisfied in application scenarios. This paper takes further steps toward opening the door to real-world applications for MB. In particular, we contribute a practical MB learning strategy that remains feasible and effective on real-world data, where variables can be numerical or categorical with linear or nonlinear, pairwise or multivariate relationships. Specifically, the equivalence between MB and the minimal conditional covariance operator (CCO) is investigated, which inspires us to design the objective function based on the predictability evaluation of the mapping variables in a reproducing kernel Hilbert space. Based on this, a kernel MB learning algorithm is proposed, in which nonlinear multivariate dependence can be considered without extra requirements on data distribution and variable types. Extensive experiments demonstrate the efficacy of these contributions.
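To make the CCO idea more concrete, here is a minimal sketch of a conditional-covariance criterion used for feature selection. It is not the algorithm proposed in the paper: it assumes Gaussian kernels with a fixed bandwidth, roughly standardized variables, the standard regularized empirical estimate of the trace of the conditional covariance operator, and a plain greedy forward search. The names `rbf_gram`, `center`, `cco_trace`, and `greedy_select` are hypothetical helpers introduced only for illustration.

```python
import numpy as np


def rbf_gram(x, bandwidth=1.0):
    """Gaussian (RBF) Gram matrix; the fixed bandwidth is an assumption."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    sq = np.sum(x ** 2, axis=1)
    # Pairwise squared distances, clipped at zero against round-off.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * x @ x.T, 0.0)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))


def center(gram):
    """Center a Gram matrix in feature space: H K H with H = I - 11^T / n."""
    n = gram.shape[0]
    h = np.eye(n) - np.ones((n, n)) / n
    return h @ gram @ h


def cco_trace(x_subset, y, eps=1e-3):
    """Regularized empirical estimate Tr[G_Y (G_X + n*eps*I)^{-1}].

    Smaller values mean the kernel features of y are better predicted
    from the chosen subset of variables.
    """
    n = len(y)
    g_x = center(rbf_gram(x_subset))
    g_y = center(rbf_gram(y))
    # trace(G_Y A^{-1}) equals trace(A^{-1} G_Y), so solve instead of inverting.
    return float(np.trace(np.linalg.solve(g_x + n * eps * np.eye(n), g_y)))


def greedy_select(x, y, k, eps=1e-3):
    """Greedy forward search: add the feature that lowers the criterion most.

    This search strategy is an illustrative stand-in, not the paper's method.
    """
    selected = []
    remaining = list(range(x.shape[1]))
    for _ in range(k):
        scores = {j: cco_trace(x[:, selected + [j]], y, eps) for j in remaining}
        best = min(scores, key=scores.get)
        selected.append(best)
        remaining.remove(best)
    return selected
```

As a usage example, one could draw a few independent standard normal columns, set the target to the square of the first column plus small noise, and call `greedy_select(x, y, 1)`. A purely linear criterion would likely miss that dependence, whereas the kernel criterion above is designed to pick it up.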