Generalized Principal Component Analysis for Large-dimensional Matrix Factor Model

Yong He, Yujie Hou, Haixia Liu, Yalin Wang
2024-11-10
Abstract: Matrix factor models have been growing popular dimension reduction tools for large-dimensional matrix time series. However, the heteroscedasticity of the idiosyncratic components has barely received any attention. Starting from the pseudo likelihood function, this paper introduces a Generalized Principal Component Analysis (GPCA) method for matrix factor model which takes the heteroscedasticity into account. Theoretically, we first derive the asymptotic distribution of the GPCA estimators by assuming the separable covariance matrices are known in advance. We then propose adaptive thresholding estimators for the separable covariance matrices and show that this would not alter the asymptotic distribution of the GPCA estimators under certain regular sparsity conditions in the high-dimensional covariance matrix estimation literature. The GPCA estimators are shown to be more efficient than the state-of-the-art methods under certain heteroscedasticity conditions. Thorough numerical studies are conducted to demonstrate the superiority of our method over the existing approaches. Analysis of a financial portfolio dataset illustrates the empirical usefulness of the proposed method.
Statistics Theory
What problem does this paper attempt to address?
The paper addresses a gap in existing matrix factor model methods: they pay little attention to heteroscedasticity in the idiosyncratic components. Existing methods usually treat these components as homoscedastic, whereas in practice they may be markedly heteroscedastic. Under heteroscedasticity, the ordinary least squares (OLS) estimator is no longer the best linear unbiased estimator (BLUE), which reduces the accuracy of parameter estimation.

To solve this problem, the authors propose the Generalized Principal Component Analysis (GPCA) method. GPCA starts from a pseudo-likelihood function and accounts for the heteroscedasticity of the idiosyncratic components, which improves the efficiency and accuracy of estimation. The authors also derive the asymptotic distribution of the GPCA estimators and show that, under certain heteroscedasticity conditions, GPCA is more efficient than existing methods.

### Main contributions

1. **Propose the Oracle GPCA method for the first time**: This method handles heteroscedasticity in the matrix factor model, assuming the separable covariance matrices are known in advance.
2. **Theoretical derivation**: Derive the asymptotic distributions of the estimated loading matrices, factors, and common components under the Oracle GPCA method.
3. **Adaptive threshold estimation**: Propose adaptive thresholding estimators for the unknown separable covariance matrices \( U \) and \( V \), and prove that, under certain sparsity conditions, using them does not change the asymptotic distribution of the GPCA estimators.
### Model setting

The matrix factor model discussed in the paper can be expressed as:

\[ X_t = R F_t C^\top + E_t \]

where:

- \( X_t \in \mathbb{R}^{p_1 \times p_2} \) is the observation matrix,
- \( R \in \mathbb{R}^{p_1 \times k_1} \) is the row-factor loading matrix,
- \( C \in \mathbb{R}^{p_2 \times k_2} \) is the column-factor loading matrix,
- \( F_t \in \mathbb{R}^{k_1 \times k_2} \) is the common factor matrix,
- \( E_t \in \mathbb{R}^{p_1 \times p_2} \) is the idiosyncratic component.

To deal with heteroscedasticity, the covariance matrix of the idiosyncratic component is assumed to have a separable structure \( V \otimes U \), where \( U \in \mathbb{R}^{p_1 \times p_1} \) and \( V \in \mathbb{R}^{p_2 \times p_2} \) capture the dependencies in the row and column directions respectively.

### Method overview

The GPCA method estimates the factors and loading matrices by maximizing the pseudo-likelihood function:

\[ 2L(R, F_t, C) = -\left( p_1 p_2 \ln(2\pi) + p_1 \ln|V| + p_2 \ln|U| + \text{Tr}\left[V^{-1}(X_t - R F_t C^\top)^\top U^{-1}(X_t - R F_t C^\top)\right] \right) \]

When \( U \) and \( V \) are unknown, the authors propose adaptive thresholding estimators for these covariance matrices and show that plugging them in does not alter the asymptotic distribution of the GPCA estimators.

### Experimental verification

Through numerical experiments and an analysis of a real financial portfolio dataset, the authors demonstrate the advantage of the GPCA method under heteroscedasticity and illustrate its usefulness in practice.
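
### Numerical sketch of the pseudo-likelihood

The pseudo-likelihood in the method overview can be checked numerically. The sketch below is not the authors' code and does not implement the GPCA estimator. It simulates one observation \( X_t \) from the model and evaluates \( L(R, F_t, C) \) for known \( U \) and \( V \). The function name `pseudo_loglik`, the dimensions, and the choice of diagonal \( U \) and \( V \) are assumptions made for illustration. As a sanity check, the result is compared with the Gaussian log-density of \( \mathrm{vec}(X_t - R F_t C^\top) \) under covariance \( V \otimes U \).

```python
import numpy as np


def pseudo_loglik(X, R, F, C, U, V):
    """Evaluate L(R, F_t, C) for one observation X_t with known U and V.

    Hypothetical helper for illustration; not taken from the paper.
    """
    p1, p2 = X.shape
    resid = X - R @ F @ C.T
    # Tr[V^{-1} resid^T U^{-1} resid], using solves instead of explicit inverses
    weighted = np.trace(np.linalg.solve(V, resid.T) @ np.linalg.solve(U, resid))
    _, logdet_U = np.linalg.slogdet(U)
    _, logdet_V = np.linalg.slogdet(V)
    two_L = -(p1 * p2 * np.log(2 * np.pi) + p1 * logdet_V + p2 * logdet_U + weighted)
    return 0.5 * two_L


rng = np.random.default_rng(0)
p1, p2, k1, k2 = 6, 5, 2, 2  # illustrative sizes (assumed)

R = rng.normal(size=(p1, k1))  # row-factor loadings
C = rng.normal(size=(p2, k2))  # column-factor loadings
F = rng.normal(size=(k1, k2))  # common factor matrix F_t

# Assumed diagonal U and V: unequal variances across rows and columns
u = rng.uniform(0.5, 3.0, size=p1)
v = rng.uniform(0.5, 3.0, size=p2)
U, V = np.diag(u), np.diag(v)

# E_t = U^{1/2} Z V^{1/2}, so that Cov(vec(E_t)) = V kron U
Z = rng.normal(size=(p1, p2))
E = np.sqrt(u)[:, None] * Z * np.sqrt(v)[None, :]
X = R @ F @ C.T + E

ll_matrix = pseudo_loglik(X, R, F, C, U, V)

# Cross-check with the vectorized form (column-major vec)
r = (X - R @ F @ C.T).flatten(order="F")
Sigma = np.kron(V, U)
_, logdet_Sigma = np.linalg.slogdet(Sigma)
ll_vec = -0.5 * (p1 * p2 * np.log(2 * np.pi) + logdet_Sigma
                 + r @ np.linalg.solve(Sigma, r))

print(ll_matrix, ll_vec)
```

The two values agree because \( \ln|V \otimes U| = p_1 \ln|V| + p_2 \ln|U| \) and the quadratic form in \( \mathrm{vec}(\cdot) \) equals the trace term. With \( U \) and \( V \) set to identity matrices, the trace term reduces to the squared Frobenius norm of the residual, the unweighted criterion that treats all entries as equally noisy.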