High-dimensional Factor Model and Its Applications to Statistical Machine Learning
Zhao Chen, Jianqing Fan, Christina Dan Wang
DOI: https://doi.org/10.1360/ssm-2020-0041
2020-01-01
Scientia Sinica Mathematica
Abstract: This paper reviews recent developments in factor models and their applications to statistical machine learning. A factor model reduces the dimensionality of the variables and gives high-dimensional covariance matrices a low-rank plus sparse structure. It has therefore attracted much attention in high-dimensional data analysis and has been widely applied across the sciences, engineering, the humanities, and the social sciences, including economics, finance, genomics, neuroscience, and machine learning. We explain how to use principal component analysis to extract latent factors and to estimate the associated factor loadings, the idiosyncratic components, and their covariance matrices. These methods have been shown to cope effectively with the challenges of big data, such as high dimensionality, strong dependence, heavy-tailed variables, and heterogeneity. We also examine the role of the factor model in high-dimensional statistical learning problems such as covariance matrix estimation, model selection, multiple testing, and prediction. Finally, we illustrate the close connections between factor models and modern machine learning problems through several applications, including network analysis, matrix completion, ranking, and mixture models.
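As a rough illustration of the principal component approach described in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes an n x p data matrix X with observations in rows, a known number of factors k, and the common normalization F'F/n = I. The function names `pca_factor_model` and `factor_covariance` are hypothetical, and the covariance step keeps only the diagonal of the residual covariance, a simplification of the sparse idiosyncratic part the abstract refers to.

```python
import numpy as np


def pca_factor_model(X, k):
    """Estimate a k-factor model X ~ F B' + U by principal components.

    X : (n, p) array, rows are observations (assumed layout).
    Returns factors F (n, k), loadings B (p, k), residuals U (n, p).
    """
    n, _ = X.shape
    Xc = X - X.mean(axis=0)                 # center each variable
    # Eigenvectors of the n x n Gram matrix; eigh returns ascending order.
    _, eigvecs = np.linalg.eigh(Xc @ Xc.T)
    top = eigvecs[:, ::-1][:, :k]           # k leading eigenvectors
    F = np.sqrt(n) * top                    # normalization F'F / n = I_k
    B = Xc.T @ F / n                        # least-squares loadings given F
    U = Xc - F @ B.T                        # idiosyncratic components
    return F, B, U


def factor_covariance(B, U):
    """Low-rank plus diagonal covariance estimate: B B' + diag(var(U)).

    Assumption: the idiosyncratic covariance is treated as diagonal here,
    a simplification of a general sparse residual covariance.
    """
    return B @ B.T + np.diag(U.var(axis=0))
```

With this normalization the estimated factors have identity sample covariance, so the low-rank part of the covariance estimate is simply B B'. In practice the number of factors k would have to be chosen from the data, which this sketch does not attempt.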