Evaluating the Stability of Deep Learning Latent Feature Spaces

Ademide O. Mabadeje,Michael J. Pyrcz
2024-08-21
Abstract:High-dimensional datasets present substantial challenges in statistical modeling across various disciplines, necessitating effective dimensionality reduction methods. Deep learning approaches, notable for their capacity to distill essential features from complex data, facilitate modeling, visualization, and compression through reduced dimensionality latent feature spaces, have wide applications from bioinformatics to earth sciences. This study introduces a novel workflow to evaluate the stability of these latent spaces, ensuring consistency and reliability in subsequent analyses. Stability, defined as the invariance of latent spaces to minor data, training realizations, and parameter perturbations, is crucial yet often overlooked. Our proposed methodology delineates three stability types, sample, structural, and inferential, within latent spaces, and introduces a suite of metrics for comprehensive evaluation. We implement this workflow across 500 autoencoder realizations and three datasets, encompassing both synthetic and real-world scenarios to explain latent space dynamics. Employing k-means clustering and the modified Jonker-Volgenant algorithm for class alignment, alongside anisotropy metrics and convex hull analysis, we introduce adjusted stress and Jaccard dissimilarity as novel stability indicators. Our findings highlight inherent instabilities in latent feature spaces and demonstrate the workflow's efficacy in quantifying and interpreting these instabilities. This work advances the understanding of latent feature spaces, promoting improved model interpretability and quality control for more informed decision-making for diverse analytical workflows that leverage deep learning.
Machine Learning
What problem does this paper attempt to address?