Neural Feature Learning in Function Space

Xiangxiang Xu, Lizhong Zheng
2024-05-27
Abstract: We present a novel framework for learning system design with neural feature extractors. First, we introduce the feature geometry, which unifies statistical dependence and feature representations in a function space equipped with inner products. This connection defines function-space concepts on statistical dependence, such as norms, orthogonal projection, and spectral decomposition, exhibiting clear operational meanings. In particular, we associate each learning setting with a dependence component and formulate learning tasks as finding corresponding feature approximations. We propose a nesting technique, which provides systematic algorithm designs for learning the optimal features from data samples with off-the-shelf network architectures and optimizers. We further demonstrate multivariate learning applications, including conditional inference and multimodal learning, where we present the optimal features and reveal their connections to classical approaches.
Machine Learning
What problem does this paper attempt to address?
The paper aims to address the following issues:

1. **Framework Design**: It establishes a framework for designing learning systems built on neural feature extractors. The framework separates feature learning from feature usage (such as building inference models), so that features learned from data samples can be assembled into different inference models without retraining.
2. **Statistical Dependence Representation**: By introducing the "feature geometry," it unifies statistical dependence and feature representations in a function space equipped with inner products. Statistical dependence can then be handled through function-space operations such as norms, orthogonal projection, and spectral decomposition, and feature learning becomes the problem of approximating a dependence component.
3. **Optimal Feature Learning**: It proposes a nesting technique to decompose statistical dependence and learn the associated feature representations. The technique gives a systematic way to construct training objectives and learning algorithms, so that optimal features can be learned with off-the-shelf network architectures and optimizers.
4. **Multivariate Learning Applications**: It demonstrates applications to multivariate learning problems, including conditional inference and multimodal learning, presents the optimal features in these settings, and reveals their connections to classical approaches.

In summary, the paper addresses how to learn useful feature representations from data through theoretical and algorithmic contributions, especially when the data involve complex structures and dependencies. The learned features capture the statistical dependence in the data and can be flexibly reused across inference tasks.
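The idea of treating statistical dependence as an object in a function space, and of learning features as an approximation of it, can be illustrated in a small discrete case. The sketch below is not the paper's algorithm and involves no neural networks. It assumes, for illustration only, that dependence between X and Y is represented by the density-ratio kernel i(x, y) = P(x, y) / (P(x) P(y)) - 1 and that the inner product on features is the expectation under the marginals. The function names `dependence_kernel` and `top_k_features` are hypothetical. Under these assumptions, a weighted singular value decomposition gives feature pairs whose rank-k product best approximates the kernel.

```python
import numpy as np


def dependence_kernel(p_xy):
    """Return (kernel, p_x, p_y) for a discrete joint pmf with positive marginals.

    kernel[x, y] = P(x, y) / (P(x) P(y)) - 1 (an illustrative choice of
    dependence function; it is identically zero when X and Y are independent).
    """
    p_xy = np.asarray(p_xy, dtype=float)
    p_x = p_xy.sum(axis=1)  # marginal of X
    p_y = p_xy.sum(axis=0)  # marginal of Y
    kernel = p_xy / np.outer(p_x, p_y) - 1.0
    return kernel, p_x, p_y


def top_k_features(p_xy, k):
    """Return (f, s, g) with kernel ~= f @ diag(s) @ g.T, truncated to rank k.

    f has shape (|X|, k) and g has shape (|Y|, k). Columns of f are
    orthonormal under <a, b> = E[a(X) b(X)], and likewise for g under P_Y.
    """
    kernel, p_x, p_y = dependence_kernel(p_xy)
    # Scale by sqrt of the marginals so that the ordinary Euclidean SVD
    # corresponds to the expectation-weighted inner products.
    b = np.sqrt(p_x)[:, None] * kernel * np.sqrt(p_y)[None, :]
    u, s, vt = np.linalg.svd(b)
    f = u[:, :k] / np.sqrt(p_x)[:, None]
    g = vt[:k, :].T / np.sqrt(p_y)[:, None]
    return f, s[:k], g
```

As a usage example, calling `top_k_features(p_xy, 2)` on a 3-by-3 joint pmf recovers the kernel exactly, because the kernel of a 3-by-3 pmf has rank at most 2. Smaller `k` yields the best lower-rank approximation in the weighted norm. Learning such features from samples with neural networks, as the paper does, replaces the explicit SVD with a training objective.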