Abstract: Block Principal Component Analysis (Block PCA) of a data matrix A, where loadings Z are determined by maximization of \(\|AZ\|_F^2\) over unit norm orthogonal loadings, is difficult to use for the design of sparse PCA by \(\ell_1\) regularization, due to the difficulty of taking care of both the orthogonality constraint on loadings and the non-differentiable \(\ell_1\) penalty. Our objective in this paper is to relax the orthogonality constraint on loadings by introducing new objective functions expvar(Y) which measure the part of the variance of the data matrix A explained by correlated components Y = AZ. We first propose a comprehensive study of the mathematical and numerical properties of expvar(Y) for two existing definitions, Zou et al. [2006] and Shen and Huang [2008], and four new definitions. We then show that only two of these explained variances are fit for use as objective functions in block PCA formulations for A that are free of orthogonality constraints.

What problem does this paper attempt to address?

The paper addresses how to define and compute explained variance in principal component analysis (PCA) when the loading vectors are not orthogonal. Block PCA constrains the loadings to be orthogonal, and that constraint is hard to combine with sparsity-inducing techniques such as \(\ell_1\) regularization. The paper therefore introduces objective functions expvar(Y) that measure the part of the variance of the data matrix A explained by correlated components Y = AZ, which relaxes the orthogonality constraint on the loading matrix Z.

### Specific Problem Description

1. **Limitations of Traditional PCA**:
   - In traditional PCA, the columns of the loading matrix Z are orthogonal, so the loadings can be found by maximizing \(\|AZ\|_F^2\).
   - Once sparsity is introduced (for example through \(\ell_1\) regularization), the loading vectors no longer remain orthogonal, and the explained variance becomes difficult to evaluate.
2. **Deficiencies of Existing Definitions**:
   - The existing literature proposes two definitions of explained variance: the adjusted variance of Zou et al. (2006) and the total variance of Shen and Huang (2008).
   - These definitions fail to meet the expected conditions in some cases. For example, with non-orthogonal components the explained variance may be over-estimated or under-estimated.
3. **Research Objectives**:
   - Introduce new definitions of explained variance and study their mathematical and numerical properties.
   - Identify the definitions of explained variance that are suitable as objective functions for Block PCA, especially those that can be optimized without imposing orthogonality constraints.
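The over-estimation mentioned above can be illustrated with a small numerical sketch. It is not taken from the paper: the random data, the perturbation size `0.1`, and all variable names are assumptions chosen for illustration. It compares \(\|AZ\|_F^2\) for orthonormal loadings with the same quantity for two unit-norm but strongly correlated loadings.

```python
import numpy as np

# Assumed toy data: 50 observations, 5 centered variables.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 5))
A -= A.mean(axis=0)

total_var = np.linalg.norm(A, "fro") ** 2

# Orthonormal loadings: the two leading right singular vectors of A.
_, s, Vt = np.linalg.svd(A, full_matrices=False)
Z_orth = Vt[:2].T
var_orth = np.linalg.norm(A @ Z_orth, "fro") ** 2  # s[0]**2 + s[1]**2

# Unit-norm but non-orthogonal loadings: the second loading is a
# slightly perturbed copy of the first one.
z1 = Vt[0]
z2 = Vt[0] + 0.1 * Vt[1]
z2 /= np.linalg.norm(z2)
Z_corr = np.column_stack([z1, z2])

# Summing the squared norms of correlated components counts the
# variance along the first direction almost twice.
var_naive = np.linalg.norm(A @ Z_corr, "fro") ** 2

print(f"total variance          : {total_var:.3f}")
print(f"orthonormal loadings    : {var_orth:.3f}")
print(f"correlated loadings     : {var_naive:.3f}")
```

Because no two-dimensional subspace can explain more than `s[0]**2 + s[1]**2`, a value of `var_naive` above `var_orth` signals double counting rather than a better fit. This is why \(\|AZ\|_F^2\) alone is not a usable measure once orthogonality is dropped.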
### Solutions

The paper proposes several new definitions of explained variance and studies them in detail:

- **Subspace Explained Variance**:
  \[ \text{expvar}_{\text{subsp}}(Y)=\|AP_Z\|_F^2 = \text{tr}(Y^T Y (Z^T Z)^{-1}) \]
  where \(P_Z = Z(Z^T Z)^{-1}Z^T\) is the projection matrix.
- **Normalized Explained Variances**:
  - **QR Normalized Variance**:
    \[ \text{expvar}_{\text{QR-norm}}(Y)=\sum_{j = 1}^m\frac{1}{\|t_j\|^2} \]
  - **UP Normalized Variance**:
    \[ \text{expvar}_{\text{UP-norm}}(Y)=\sum_{j = 1}^m\frac{1}{\|t_j\|^2} \]
- **Projected Explained Variances**:
  - **QR Projected Explained Variance**:
    \[ \text{expvar}_{\text{QR-proj}}(Y)=\sum_{j = 1}^m\langle y_j,x_j\rangle^2 \]
  - **UP Projected Explained Variance**:
    \[ \text{expvar}_{\text{UP-proj}}(Y)=\sum_{j = 1}^m\langle y_j,x_j\rangle^2 \]
  - **Optimal Projected Explained Variance**: \[ \text{ex
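As a numerical check of the subspace explained variance formula given above, the following sketch evaluates both sides of \(\|AP_Z\|_F^2 = \text{tr}(Y^T Y (Z^T Z)^{-1})\) on random data. The function names `expvar_subspace` and `expvar_projection` are hypothetical helpers introduced here, not names from the paper, and the data are an assumption for illustration.

```python
import numpy as np


def expvar_subspace(A, Z):
    """Trace form: tr(Y^T Y (Z^T Z)^{-1}) with Y = A Z."""
    Y = A @ Z
    gram = Z.T @ Z
    # tr(Y^T Y G^{-1}) = tr(G^{-1} Y^T Y) by cyclicity of the trace;
    # solve() avoids forming the explicit inverse.
    return np.trace(np.linalg.solve(gram, Y.T @ Y))


def expvar_projection(A, Z):
    """Projection form: ||A P_Z||_F^2 with P_Z = Z (Z^T Z)^{-1} Z^T."""
    P = Z @ np.linalg.solve(Z.T @ Z, Z.T)
    return np.linalg.norm(A @ P, "fro") ** 2


# Assumed toy data and unit-norm, non-orthogonal loadings.
rng = np.random.default_rng(1)
A = rng.standard_normal((40, 5))
A -= A.mean(axis=0)
Z = rng.standard_normal((5, 2))
Z /= np.linalg.norm(Z, axis=0)

# An orthonormal basis Q of the same subspace spanned by Z.
Q, _ = np.linalg.qr(Z)

print(expvar_subspace(A, Z), expvar_projection(A, Z))
print(expvar_subspace(A, Q), np.linalg.norm(A @ Q, "fro") ** 2)
```

The second printed pair shows that the quantity depends only on the subspace spanned by the loadings: replacing Z by an orthonormal basis Q of the same span leaves it unchanged, and for Q it reduces to \(\|AQ\|_F^2\).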

From explained variance of correlated components to PCA without orthogonality constraints
