Abstract: We present a novel framework for learning system design with neural feature extractors. First, we introduce the feature geometry, which unifies statistical dependence and feature representations in a function space equipped with inner products. This connection defines function-space concepts on statistical dependence, such as norms, orthogonal projection, and spectral decomposition, exhibiting clear operational meanings. In particular, we associate each learning setting with a dependence component and formulate learning tasks as finding corresponding feature approximations. We propose a nesting technique, which provides systematic algorithm designs for learning the optimal features from data samples with off-the-shelf network architectures and optimizers. We further demonstrate multivariate learning applications, including conditional inference and multimodal learning, where we present the optimal features and reveal their connections to classical approaches.

What problem does this paper attempt to address?

The paper primarily aims to address the following issues:

1. **Framework Design**: Establishes a framework for designing learning systems that include neural feature extractors. The framework separates feature learning from feature usage (such as building inference models), so that features learned from data samples can be assembled into different inference models without retraining.
2. **Statistical Dependence Representation**: Introduces the "feature geometry," which unifies statistical dependence and feature representations in a function space equipped with inner products. Statistical dependence can then be described with function-space operations (norms, orthogonal projection, spectral decomposition), and feature learning becomes the problem of approximating a dependence component.
3. **Optimal Feature Learning**: Proposes a nesting technique that decomposes statistical dependence and learns the associated feature representations. The technique gives a systematic way to construct training objectives and learning algorithms, so that off-the-shelf network architectures and optimizers can be used to learn the optimal features.
4. **Multivariate Learning Applications**: Demonstrates applications to multivariate learning problems, including conditional inference and multimodal learning. The paper presents the optimal features in these settings and reveals their connections to classical approaches.

In summary, the paper addresses how to learn useful feature representations from data through theoretical and algorithmic contributions, especially for complex data structures and dependencies. The learned features capture the statistical dependence in the data and can be flexibly reused across inference tasks.
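To make the idea of treating statistical dependence as an object in a function space more concrete, here is a minimal sketch for two discrete variables. It is an illustration under assumptions that the summary does not state: the dependence function is taken to be the density-ratio form i(x, y) = P(x, y) / (P(x) P(y)) - 1, inner products are weighted by the marginals, and the spectral decomposition is computed with a plain SVD. The function name `dependence_spectrum` is hypothetical, and no neural network or nesting technique is involved.

```python
import numpy as np


def dependence_spectrum(joint):
    """Spectral decomposition of the dependence between discrete X and Y.

    joint[x, y] = P(X=x, Y=y); all marginal probabilities are assumed positive.
    Returns the dependence function, its singular values, and feature
    matrices f (for x) and g (for y), one feature per column.
    """
    joint = np.asarray(joint, dtype=float)
    px = joint.sum(axis=1)  # marginal of X
    py = joint.sum(axis=0)  # marginal of Y

    # Assumed dependence function: zero everywhere iff X and Y are independent.
    dep = joint / np.outer(px, py) - 1.0

    # Scale by sqrt-marginals so the Euclidean SVD matches inner products
    # weighted by P_X and P_Y.
    b = np.sqrt(px)[:, None] * dep * np.sqrt(py)[None, :]
    u, s, vt = np.linalg.svd(b, full_matrices=False)

    # Undo the scaling: columns of f and g are orthonormal under P_X and P_Y.
    f = u / np.sqrt(px)[:, None]
    g = vt.T / np.sqrt(py)[:, None]
    return dep, s, f, g


# Toy joint distribution (made-up numbers, for illustration only).
joint = np.array([[0.3, 0.1],
                  [0.1, 0.5]])
dep, s, f, g = dependence_spectrum(joint)

# Best rank-1 approximation of the dependence using one feature pair.
rank1 = s[0] * np.outer(f[:, 0], g[:, 0])
```

In this sketch the squared norm of the dependence function (weighted by the product of marginals) equals the sum of squared singular values, and truncating the sum to the top k terms gives the best k-dimensional feature approximation in that norm. In the paper's setting the features are learned from samples with neural networks rather than computed from a known joint distribution.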

Neural Feature Learning in Function Space

Half-Space Feature Learning in Neural Networks

Feature space learning model

Universal Neural Functionals

Feature Expansion for Graph Neural Networks

Neural Operator: Learning Maps Between Function Spaces

Neural Set Function Extensions: Learning with Discrete Functions in High Dimensions

Function-space Parameterization of Neural Networks for Sequential Learning

Subspace Structural Constraint-Based Discriminative Feature Learning Via Nonnegative Low Rank Representation

Latent Functional Maps: a spectral framework for representation alignment

Goal-oriented Feature Extraction: a novel approach for enhancing data-driven surrogate model

Neural Hilbert Ladders: Multi-Layer Neural Networks in Function Space

Learning and Memory of Spatial Relationship by a Neural Network with Sparse Features

Feature learning in feature-sample networks using multi-objective optimization

Deep Graphical Feature Learning for the Feature Matching Problem

A Multiobjective Sparse Feature Learning Model for Deep Neural Networks

Critical feature learning in deep neural networks

Neural Feature Search: A Neural Architecture for Automated Feature Engineering

Dependence Induced Representations

Algorithm for Orthogonal Matrix Nearness and Its Application to Feature Representation

Neural Subspaces for Light Fields