Bootstrap Confidence Regions for Learned Feature Embeddings

Kris Sankaran
DOI: https://doi.org/10.48550/arXiv.2202.00180
2022-02-01
Abstract: Algorithmic feature learners provide high-dimensional vector representations for non-matrix structured signals, like images, audio, text, and graphs. Low-dimensional projections derived from these representations can be used to explore variation across collections of these data. However, it is not clear how to assess the uncertainty associated with these projections. We adapt methods developed for bootstrapping principal components analysis to the setting where features are learned from non-matrix data. We empirically compare the derived confidence regions in simulations, varying factors that influence both feature learning and the bootstrap. Approaches are illustrated on spatial proteomic data. Code, data, and trained models are released as an R compendium.
Computation