Interpretable principal component analysis for multilevel multivariate functional data

Jun Zhang, Greg J Siegle, Tao Sun, Wendy D’Andrea, Robert T Krafty
DOI: https://doi.org/10.1093/biostatistics/kxab018
IF: 5.2789
2021-09-21
Biostatistics
Abstract: Many studies collect functional data from multiple subjects that have both multilevel and multivariate structures. An example of such data comes from popular neuroscience experiments where participants’ brain activity is recorded using modalities such as electroencephalography and summarized as power within multiple time-varying frequency bands within multiple electrodes, or brain regions. Summarizing the joint variation across multiple frequency bands for both whole-brain variability between subjects, as well as location–variation within subjects, can help to explain neural reactions to stimuli. This article introduces a novel approach to conducting interpretable principal components analysis on multilevel multivariate functional data that decomposes total variation into subject-level and replicate-within-subject-level (i.e., electrode-level) variation and provides interpretable components that can be both sparse among variates (e.g., frequency bands) and have localized support over time within each frequency band. Smoothness is achieved through a roughness penalty, while sparsity and localization of components are achieved by solving an innovative rank-one based convex optimization problem with block Frobenius and matrix $L_1$-norm-based penalties. The method is used to analyze data from a study to better understand reactions to emotional information in individuals with histories of trauma and the symptom of dissociation, revealing new neurophysiological insights into how subject- and electrode-level brain activity are associated with these phenomena. Supplementary materials for this article are available online.
statistics & probability,mathematical & computational biology