What problem does this paper attempt to address?

The core problem this paper addresses is how to learn unnormalized distributions through Noise-Contrastive Estimation (NCE). Specifically, the main contributions of the paper include:

1. **Unified Perspective**: The paper provides a unified perspective on the family of NCE-based estimators, which had been proposed and studied independently by different research communities. This unified perspective offers new insights into existing estimators.
2. **Proposal of New Variants**: The paper proposes α-Centered NCE (α-CentNCE) and f-Conditional NCE (f-CondNCE). These two variants help clarify previously unrecognized, and in some cases misleadingly described, connections among the various estimators for learning unnormalized distributions.
3. **Theoretical Analysis**: For exponential family distributions, the paper establishes finite-sample convergence rates under a set of regularity assumptions. For f-CondNCE in particular, the paper points out that its behavior differs from what was claimed in the original literature: in the small-noise regime, the variance of f-CondNCE diverges.
4. **Finite-Sample Guarantees**: As a specific result of the above connections, the paper establishes finite-sample convergence guarantees for the proposed estimators when learning bounded exponential family distributions. To the best of the authors' knowledge, this is the first time such guarantees have been given for almost all of the NCE estimators considered.

### Specific Problem Summary

#### 1. Research Background

Unnormalized distributions, such as energy-based models, are widely used in fields such as generative modeling, density estimation, and reinforcement learning. However, because the normalization constant is computationally expensive to evaluate, parameter estimation is challenging.

#### 2. Main Methods

The paper focuses on the family of NCE-based estimators and introduces two new variants:

- **α-Centered NCE**: The given parametric model is normalized through the α-centering transformation, which leads to a new NCE variant.
- **f-Conditional NCE**: A generalized conditional NCE objective is introduced through a conditional distribution π(y|x); it contrasts the joint distribution q_d(x)π(y|x) with q_d(y)π(x|y).

#### 3. Theoretical Contributions

- **Unifying Existing Methods**: The paper reveals the connections among multiple estimators for learning unnormalized distributions, including Maximum Likelihood Estimation (MLE), Monte Carlo Maximum Likelihood Estimation (MC-MLE), and Global GISO.
- **Finite-Sample Analysis**: The paper establishes finite-sample convergence rates of NCE estimators for exponential family distributions, which puts the estimators on a firm theoretical footing.

#### 4. Experimental Verification

Through theoretical analysis and experimental verification, the paper demonstrates the effectiveness and advantages of the newly proposed methods in practical applications.

In conclusion, by introducing new NCE variants and establishing a rigorous theoretical analysis, this paper provides a unified and effective framework for learning unnormalized distributions.
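To make the basic idea behind NCE concrete, the following is a minimal sketch of the classic logistic (binary-classification) form of NCE, not of α-CentNCE or f-CondNCE and not of the paper's own implementation. It assumes a one-dimensional, zero-mean Gaussian written as an exponential family with an unknown precision and a learned offset standing in for the log normalizing constant, a standard normal noise distribution, equal numbers of data and noise samples, and plain gradient descent. All function names, sample sizes, and step sizes are hypothetical choices made for illustration.

```python
import numpy as np


def log_model(x, theta):
    # Unnormalized log-density of a zero-mean Gaussian with precision theta[0].
    # theta[1] is a learned offset that plays the role of -log(normalizing constant).
    return -0.5 * theta[0] * x**2 + theta[1]


def log_noise(x, sigma):
    # Exact log-density of the N(0, sigma^2) noise distribution (assumed known).
    return -0.5 * (x / sigma) ** 2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)


def sigmoid(z):
    # Numerically stable logistic function.
    return np.exp(-np.logaddexp(0.0, -z))


def features(x):
    # Gradient of log_model with respect to theta: columns (-x^2 / 2, 1).
    return np.stack([-0.5 * x**2, np.ones_like(x)], axis=1)


def nce_loss_and_grad(theta, x_data, x_noise, sigma):
    # Log-ratio between the model and the noise density, used as a classifier logit.
    g_data = log_model(x_data, theta) - log_noise(x_data, sigma)
    g_noise = log_model(x_noise, theta) - log_noise(x_noise, sigma)
    # Logistic loss: data samples are labeled "real", noise samples "fake".
    loss = np.mean(np.logaddexp(0.0, -g_data)) + np.mean(np.logaddexp(0.0, g_noise))
    grad = (
        -(sigmoid(-g_data)[:, None] * features(x_data)).mean(axis=0)
        + (sigmoid(g_noise)[:, None] * features(x_noise)).mean(axis=0)
    )
    return loss, grad


def fit_nce(x_data, x_noise, sigma, lr=1.0, steps=5000):
    # Plain gradient descent; the objective is convex in theta for this model.
    theta = np.array([1.0, 0.0])
    for _ in range(steps):
        _, grad = nce_loss_and_grad(theta, x_data, x_noise, sigma)
        theta -= lr * grad
    return theta


rng = np.random.default_rng(0)
true_prec = 4.0
x_data = rng.normal(0.0, 1.0 / np.sqrt(true_prec), size=20000)
x_noise = rng.normal(0.0, 1.0, size=20000)

theta_hat = fit_nce(x_data, x_noise, sigma=1.0)
true_log_norm = 0.5 * np.log(true_prec / (2 * np.pi))
print("estimated precision:", theta_hat[0], "true:", true_prec)
print("estimated offset:", theta_hat[1], "true:", true_log_norm)
```

The point of the sketch is that the normalizing constant is never computed: it is estimated as an ordinary parameter alongside the precision, which is what makes NCE attractive for unnormalized models.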

A Unified View on Learning Unnormalized Distributions via Noise-Contrastive Estimation
