From explained variance of correlated components to PCA without orthogonality constraints

Marie Chavent, Guy Chavent
2024-02-07
Abstract: Block Principal Component Analysis (Block PCA) of a data matrix A, where loadings Z are determined by maximization of \(\|AZ\|_F^2\) over unit norm orthogonal loadings, is difficult to use for the design of sparse PCA by \(\ell_1\) regularization, due to the difficulty of taking care of both the orthogonality constraint on loadings and the non-differentiable \(\ell_1\) penalty. Our objective in this paper is to relax the orthogonality constraint on loadings by introducing new objective functions expvar(Y) which measure the part of the variance of the data matrix A explained by correlated components Y = AZ. So we propose first a comprehensive study of mathematical and numerical properties of expvar(Y) for two existing definitions (Zou et al. [2006], Shen and Huang [2008]) and four new definitions. Then we show that only two of these explained variances are fit to use as objective function in block PCA formulations for A rid of orthogonality constraints.
Machine Learning
What problem does this paper attempt to address?
The problem this paper addresses is how to define and compute explained variance in principal component analysis (PCA) when the columns of the loading matrix are not orthogonal. Traditional PCA requires orthogonal loading vectors, a constraint that is hard to combine with sparsity-inducing techniques such as L1 regularization. The paper therefore introduces objective functions expvar(Y) that measure the part of the variance of the data matrix A explained by the correlated components Y = AZ, which makes it possible to relax the orthogonality constraint on the loading matrix Z.

### Specific Problem Description

1. **Limitations of Traditional PCA**:
   - In traditional PCA, the column vectors of the loading matrix Z are orthogonal, so the loadings can be found by maximizing \(\|AZ\|_F^2\).
   - When sparsity is introduced (for example through L1 regularization), the loading vectors no longer remain orthogonal, which makes the explained variance difficult to evaluate.
2. **Deficiencies of Existing Definitions**:
   - The existing literature proposes two definitions of explained variance: the adjusted variance of Zou et al. (2006) and the total variance of Shen and Huang (2008).
   - These definitions fail to meet the expected conditions in some cases. For example, with non-orthogonal components, the explained variance may be overestimated or underestimated.
3. **Research Objectives**:
   - Introduce new definitions of explained variance and study their mathematical and numerical properties.
   - Identify the definitions of explained variance that are suitable as objective functions for block PCA, especially those that can be optimized without imposing orthogonality constraints.

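
The overestimation issue described above can be illustrated with a small numerical sketch. It is not taken from the paper: the toy matrix `A`, the loadings `Z_orth` and `Z_corr`, and the helper `naive_explained_variance` are hypothetical names chosen for illustration, and the total variance is assumed to be \(\|A\|_F^2\), in line with the objective \(\|AZ\|_F^2\). With orthonormal loadings, \(\|AZ\|_F^2\) stays below the total variance; with unit-norm but nearly collinear loadings, it counts the same direction twice and exceeds it.

```python
import numpy as np

def naive_explained_variance(A, Z):
    """Return ||A Z||_F^2, i.e. the block PCA objective evaluated at loadings Z."""
    return float(np.linalg.norm(A @ Z, "fro") ** 2)

# Toy data matrix with singular values 3, 1 and 0.5 (illustrative only).
A = np.diag([3.0, 1.0, 0.5])
total_variance = float(np.linalg.norm(A, "fro") ** 2)  # 9 + 1 + 0.25

# Orthonormal loadings: the first two canonical directions.
Z_orth = np.eye(3)[:, :2]

# Unit-norm but nearly collinear loadings: both columns point close to e1.
z2 = np.array([1.0, 0.1, 0.0])
Z_corr = np.column_stack([np.eye(3)[:, 0], z2 / np.linalg.norm(z2)])

ev_orth = naive_explained_variance(A, Z_orth)
ev_corr = naive_explained_variance(A, Z_corr)

print(f"total variance        : {total_variance:.4f}")
print(f"orthonormal loadings  : {ev_orth:.4f}")
print(f"correlated loadings   : {ev_corr:.4f}")
```

In this toy case the correlated loadings yield a value larger than the total variance of `A`, which is why \(\|AZ\|_F^2\) alone cannot serve as a measure of explained variance once orthogonality is dropped.
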
### Solutions

The authors propose several new definitions of explained variance and study each of them in detail:

- **Subspace Explained Variance**:
  \[ \text{expvar}_{\text{subsp}}(Y) = \|AP_Z\|_F^2 = \text{tr}\big(Y^T Y (Z^T Z)^{-1}\big) \]
  where \(P_Z = Z(Z^T Z)^{-1}Z^T\) is the projection matrix onto the column space of Z.
- **Normalized Explained Variances**:
  - **QR Normalized Variance**:
    \[ \text{expvar}_{\text{QR-norm}}(Y) = \sum_{j = 1}^m \frac{1}{\|t_j\|^2} \]
  - **UP Normalized Variance**:
    \[ \text{expvar}_{\text{UP-norm}}(Y) = \sum_{j = 1}^m \frac{1}{\|t_j\|^2} \]
- **Projected Explained Variances**:
  - **QR Projected Explained Variance**:
    \[ \text{expvar}_{\text{QR-proj}}(Y) = \sum_{j = 1}^m \langle y_j, x_j \rangle^2 \]
  - **UP Projected Explained Variance**:
    \[ \text{expvar}_{\text{UP-proj}}(Y) = \sum_{j = 1}^m \langle y_j, x_j \rangle^2 \]
- **Optimal Projected Explained Variance**: \[ \text{ex