Feature Screening Via Bergsma–Dassios Sign Correlation Learning
Daojiang He, Xinxin Hao, Kai Xu, Lei He, Youxin Liu
DOI: https://doi.org/10.4310/20-sii662
2021-01-01
Statistics and Its Interface
Abstract: The robust rank correlation screening (RRCS) procedure, which is built on Kendall's τ, was suggested by Li, Peng, Zhang and Zhu (2012) as a robust alternative to the sure independence screening (SIS) method, which is based on Pearson's correlation. However, τ may be zero even when two random variables are associated, which is a drawback in certain applications: RRCS is not omnibus and can detect only monotonic effects. In this paper, we use the Bergsma–Dassios sign correlation τ_b* (Bergsma and Dassios, 2014) to introduce a new SIS procedure, the τ_b*-SIS. We advocate using the τ_b*-SIS for three reasons. First, because τ_b* possesses the necessary and intuitive properties of a correlation index, the τ_b*-SIS has a better screening ability than the RRCS for nonlinear effects, including interactions and heterogeneity. Second, because τ_b* is a natural extension of τ, the τ_b*-SIS is conceptually simple, easy to implement, and robust to extreme values and outliers in the observations. Third, without assuming any moment condition on the response or the predictors, the τ_b*-SIS enjoys several appealing properties, such as the sure screening property, the ranking consistency property and the characteristic of minimum model size. We demonstrate the merits of the τ_b*-SIS procedure through extensive Monte Carlo experiments and illustrate the method with a real-data example.
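The screening idea in the abstract can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything here is an assumption. The sketch uses the sign-correlation definition from Bergsma and Dassios (2014), τ* = E[a(X1, X2, X3, X4) a(Y1, Y2, Y3, Y4)] with a(z1, z2, z3, z4) = sign(|z1 - z2| + |z3 - z4| - |z1 - z3| - |z2 - z4|), and estimates it by brute force over all ordered 4-tuples of distinct observations. It computes the plain τ* only; any adjustment implied by the "b" subscript in the paper is not reproduced. The names `tau_star` and `screen_top_d` are hypothetical, and ranking predictors by the raw estimate is an assumed reading of the screening step, not the authors' stated procedure.

```python
from itertools import permutations


def _sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def _a(z1, z2, z3, z4):
    """Sign kernel a(.) of the Bergsma-Dassios definition (assumed form)."""
    return _sign(abs(z1 - z2) + abs(z3 - z4) - abs(z1 - z3) - abs(z2 - z4))


def tau_star(x, y):
    """Brute-force estimate of tau*: average of a(x) * a(y) over all
    ordered 4-tuples of distinct indices. Cost is O(n^4), so this is
    only suitable for small illustrative samples."""
    n = len(x)
    if n != len(y):
        raise ValueError("x and y must have the same length")
    if n < 4:
        raise ValueError("at least four observations are required")
    total = 0
    count = 0
    for i, j, k, l in permutations(range(n), 4):
        total += _a(x[i], x[j], x[k], x[l]) * _a(y[i], y[j], y[k], y[l])
        count += 1
    return total / count


def screen_top_d(columns, response, d):
    """Rank predictor columns by their tau* estimate with the response
    (largest first) and keep the indices of the top d.
    Hypothetical helper: the ranking rule is an assumption."""
    scores = [tau_star(col, response) for col in columns]
    order = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
    return order[:d], scores


if __name__ == "__main__":
    y = [1, 2, 3, 4, 5, 6]
    columns = [
        [2, 1, 4, 3, 6, 5],      # shuffled, not monotone in y
        [v ** 3 for v in y],     # strictly monotone transform of y
        [6, 1, 5, 2, 4, 3],      # shuffled, not monotone in y
    ]
    kept, scores = screen_top_d(columns, y, d=1)
    print("kept predictor indices:", kept)
    print("tau* estimates:", scores)
```

The kernel depends only on the ordering of the observations, which is why no moment condition is needed to compute the estimate. Integer inputs are used here so that ties in the absolute differences are evaluated exactly. With floating-point data, rounding can turn an exact zero into a tiny nonzero value.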