Dimension Reduction for Data with Heterogeneous Missingness

Yurong Ling, Zijing Liu, Jing-Hao Xue
DOI: https://doi.org/10.48550/arXiv.2109.11765
IF: 5.414
2021-09-24
Machine Learning
Abstract: Dimension reduction plays a pivotal role in analysing high-dimensional data. However, observations with missing values present serious difficulties in directly applying standard dimension reduction techniques. As a large number of dimension reduction approaches are based on the Gram matrix, we first investigate the effects of missingness on dimension reduction by studying the statistical properties of the Gram matrix with or without missingness, and then we present a bias-corrected Gram matrix with nice statistical properties under heterogeneous missingness. Extensive empirical results, on both simulated and publicly available real datasets, show that the proposed unbiased Gram matrix can significantly improve a broad spectrum of representative dimension reduction approaches.
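To make the idea of a bias-corrected Gram matrix concrete, here is a minimal sketch of one standard correction, inverse-probability weighting. It is not necessarily the estimator proposed in the paper, whose exact form the abstract does not state. The sketch assumes that missing entries are zero-imputed, that each entry of feature j is observed independently with a known probability p_j (so missingness is heterogeneous across features), and that the function name `corrected_gram` is purely illustrative.

```python
import numpy as np

def corrected_gram(Y, p):
    """Inverse-probability-weighted Gram matrix (illustrative sketch).

    Y : (n, d) array with missing entries set to 0 (assumption: zero imputation).
    p : (d,) array of per-feature observation probabilities (assumed known).
    """
    Y = np.asarray(Y, dtype=float)
    p = np.asarray(p, dtype=float)
    # Off-diagonal terms: entries (i, j) and (k, j) are both observed
    # with probability p_j**2, so each product is rescaled by 1 / p_j**2.
    Z = Y / p
    G = Z @ Z.T
    # Diagonal terms: entry (i, j) is observed with probability p_j,
    # so squared entries are rescaled by 1 / p_j instead.
    np.fill_diagonal(G, (Y**2 / p).sum(axis=1))
    return G

# Tiny example: feature 0 is observed half the time, feature 1 always.
Y = np.array([[2.0, 0.0], [4.0, 3.0]])
p = np.array([0.5, 1.0])
G = corrected_gram(Y, p)
```

Under these assumptions the expectation of `G` over the missingness pattern equals the Gram matrix of the complete data, so `G` could be passed to any Gram-based method (for example classical MDS or kernel PCA with a linear kernel) in place of the naive zero-imputed Gram matrix. In practice the p_j would have to be estimated, for instance from the observed fraction of each feature.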