Abstract: We find that different Deep Neural Networks (DNNs) trained with the same dataset share a common principal subspace in their latent spaces, regardless of the architecture in which the DNNs were built (e.g., Convolutional Neural Networks (CNNs), Multi-Layer Perceptrons (MLPs) and Autoencoders (AEs)) or even whether labels were used in training (e.g., supervised, unsupervised, and self-supervised learning). Specifically, we design a new metric, the $\mathcal{P}$-vector, to represent the principal subspace of deep features learned in a DNN, and propose to measure angles between principal subspaces using $\mathcal{P}$-vectors. Small angles (with cosine close to 1.0) have been found in comparisons between any two DNNs trained with different algorithms/architectures. Furthermore, during training from scratch, the angle decreases from a larger one (usually 70°–80°) to a small one, which coincides with the progress of feature space learning from scratch to convergence. Then, we carry out case studies to measure the angle between the $\mathcal{P}$-vector and the principal subspace of the training dataset, and connect this angle with generalization performance. Extensive experiments with widely used MLPs, AEs and CNNs for classification, image reconstruction, and self-supervised learning tasks on the MNIST, CIFAR-10 and CIFAR-100 datasets support our claims with solid evidence.
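To make the angle measurement concrete, here is a minimal sketch in Python with NumPy. It rests on an assumption the abstract does not spell out: that the $\mathcal{P}$-vector is the top left singular vector of an (n_samples × n_features) feature matrix, which lets networks with different feature widths be compared on the same samples. The function names `p_vector` and `angle_degrees` are hypothetical, and the synthetic matrices stand in for real deep features.

```python
import numpy as np

def p_vector(features):
    """Top left singular vector of an (n_samples, n_features) matrix.

    Assumption: this is one plausible reading of the P-vector; the
    abstract itself does not give the definition.
    """
    u, _, _ = np.linalg.svd(features, full_matrices=False)
    return u[:, 0]

def angle_degrees(p, q):
    """Angle between two directions, ignoring the sign ambiguity of SVD."""
    cos = abs(np.dot(p, q)) / (np.linalg.norm(p) * np.linalg.norm(q))
    return float(np.degrees(np.arccos(np.clip(cos, 0.0, 1.0))))

rng = np.random.default_rng(0)
n = 200  # number of samples shared by all "models"

# Synthetic stand-ins for deep features: two models with different widths
# (64 and 128) that both encode the same dominant per-sample structure.
shared = rng.normal(size=(n, 1))
feats_a = shared @ rng.normal(size=(1, 64)) + 0.1 * rng.normal(size=(n, 64))
feats_b = shared @ rng.normal(size=(1, 128)) + 0.1 * rng.normal(size=(n, 128))

# A third feature matrix with no shared structure, for contrast.
feats_rand = rng.normal(size=(n, 32))

angle_shared = angle_degrees(p_vector(feats_a), p_vector(feats_b))
angle_random = angle_degrees(p_vector(feats_a), p_vector(feats_rand))
print(f"shared structure: {angle_shared:.1f} degrees")
print(f"unrelated features: {angle_random:.1f} degrees")
```

In this toy setting the two matrices built on the same structure give a small angle, while the unrelated matrix gives an angle close to orthogonal. This mirrors, only by analogy, the contrast the abstract draws between converged and untrained networks.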

Empirical Studies on the Convergence of Feature Spaces in Deep Learning

Exploring the common principal subspace of deep features in neural networks

Consistent Representation Learning for High Dimensional Data Analysis

Subdomain contraction in deep networks for robust representation learning

Convergence Analysis for Deep Sparse Coding via Convolutional Neural Networks

Learning a Deep Structural Subspace Across Hyperspectral Scenes with Cross-Domain VAE

Learning in Feature Spaces via Coupled Covariances: Asymmetric Kernel SVD and Nyström method

Local Feature Discriminant Projection

Deep CovDenseSNN: A Hierarchical Event-Driven Dynamic Framework with Spiking Neurons in Noisy Environment

Convergent Learning: Do different neural networks learn the same representations?

A Unified Feature Representation and Learning Framework for 3D Shape

Sparse Autoencoders Reveal Universal Feature Spaces Across Large Language Models

Evaluating the Stability of Deep Learning Latent Feature Spaces

Deciphering the Feature Representation of Deep Neural Networks for High-Performance AI

Network Comparison Study of Deep Activation Feature Discriminability with Novel Objects

Visualising Feature Learning in Deep Neural Networks by Diagonalizing the Forward Feature Map

Sparse Approximation to the Eigensubspace for Discrimination

The Local Dimension of Deep Manifold.

Face Recognition with Convolutional Neural Networks and subspace learning

A data-driven study of image feature extraction and fusion

Robust dimensionality reduction via feature space to feature space distance metric learning.