A Tutorial on Principal Component Analysis

Jonathon Shlens
DOI: https://doi.org/10.48550/arXiv.1404.1100
IF: 5.414
2014-04-03
Machine Learning
Abstract:Principal component analysis (PCA) is a mainstay of modern data analysis - a black box that is widely used but (sometimes) poorly understood. The goal of this paper is to dispel the magic behind this black box. This manuscript focuses on building a solid intuition for how and why principal component analysis works. This manuscript crystallizes this knowledge by deriving from simple intuitions, the mathematics behind PCA. This tutorial does not shy away from explaining the ideas informally, nor does it shy away from the mathematics. The hope is that by addressing both aspects, readers of all levels will be able to gain a better understanding of PCA as well as the when, the how and the why of applying this technique.
What problem does this paper attempt to address?