Abstract: Linear latent variable models such as principal component analysis (PCA), independent component analysis (ICA), canonical correlation analysis (CCA), and factor analysis (FA) identify latent directions (or loadings), which may be ordered or unordered. The data are then projected onto the latent directions to obtain their projected representations (or scores). For example, PCA solvers usually rank the principal directions from most to least explained variance, while ICA solvers usually return independent directions unordered, often with a single source spread across multiple directions as multiple sub-sources, which severely limits their usability and interpretability. This paper proposes a general framework that enhances latent space representations to improve the interpretability of linear latent spaces. Although the concepts in this paper are language agnostic, the framework is written in Python. The framework automates the clustering and ranking of latent vectors to enhance both the latent information per latent vector and the interpretation of latent vectors. Several innovative enhancements are incorporated, including latent ranking (LR), latent scaling (LS), latent clustering (LC), and latent condensing (LCON). For a specified linear latent variable model, LR ranks latent directions according to a specified metric, LS scales latent directions according to a specified metric, LC automatically clusters latent directions into a specified number of clusters, and LCON automatically determines an appropriate number of clusters into which to condense the latent directions for a given metric. Additional functionality includes support for single-channel and multi-channel data sources and data preprocessing strategies such as Hankelisation, which extend the applicability of linear latent variable models (LLVMs) to a wider variety of data. The effectiveness of LR, LS, and LCON is showcased on two crafted foundational problems with two applied latent variable models, namely PCA and ICA.
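
As a rough illustration of two ideas named in the abstract, the sketch below Hankelises a single-channel signal and then ranks a set of unordered latent directions by a metric computed on their scores. It is not the LS-PIE API: the function names `hankelise`, `project`, and `rank_directions` are hypothetical, the choice of score variance as the ranking metric is an assumption, and the two directions are hand-picked for the example, not produced by a PCA or ICA solver.

```python
from statistics import pvariance


def hankelise(signal, window):
    """Stack overlapping windows of a 1-D signal as rows of a Hankel matrix."""
    if not 1 <= window <= len(signal):
        raise ValueError("window must be between 1 and len(signal)")
    return [signal[i:i + window] for i in range(len(signal) - window + 1)]


def project(rows, direction):
    """Scores: dot product of each data row with one latent direction."""
    return [sum(x * w for x, w in zip(row, direction)) for row in rows]


def rank_directions(rows, directions, metric=pvariance):
    """Return direction indices sorted from highest to lowest metric value.

    The metric is evaluated on the scores of each direction; population
    variance is used here only as an assumed default.
    """
    values = [metric(project(rows, d)) for d in directions]
    order = sorted(range(len(directions)), key=lambda k: values[k], reverse=True)
    return order, values


# Toy single-channel signal (illustrative data, not from the paper).
signal = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
H = hankelise(signal, window=3)

# Two hand-picked, unordered latent directions (hypothetical).
directions = [
    [1.0, 1.0, 1.0],   # sums each window
    [1.0, 0.0, -1.0],  # differences the two ends of each window
]
order, values = rank_directions(H, directions)
```

Passing a different callable as `metric` changes the ranking criterion without touching the rest of the sketch, which mirrors the abstract's description of ranking "according to a specified metric".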

Interpreting Latent Variables in Factor Models via Convex Optimization

Enhancing Interpretability in Factor Analysis by Means of Mathematical Optimization

Optimal Estimation of Large-Dimensional Nonlinear Factor Models

Latent variable graphical model selection via convex optimization

Interpretable Sparse Proximate Factors for Large Dimensions

A unified framework of principal component analysis and factor analysis

Improving the Interpretability of the Variances of Latent Variables by Uniform and Factor-Specific Standardizations of Loadings

Discussion: Latent variable graphical model selection via convex optimization

Exact Exploratory Bi-factor Analysis: A Constraint-based Optimisation Approach

A latent factor approach for prediction from multiple assays

Statistical Inference for Covariate-Adjusted and Interpretable Generalized Factor Model with Application to Testing Fairness

Statistical Analysis of Factor Models of High Dimension

Latent Space Perspicacity and Interpretation Enhancement (LS-PIE) Framework

Identifiable and interpretable nonparametric factor analysis

Factor modelling for high-dimensional functional time series

Factor models and variable selection in high-dimensional regression analysis

Identifying Observed Factors in Approximate Factor Models: Estimation and Hypothesis Testing

Huber Principal Component Analysis for Large-dimensional Factor Models

Latent Factor Decomposition Model: Applications for Questionnaire Data

Inferential Theory for Factor Models of Large Dimensions