Abstract: Score matching is an estimation procedure that has been developed for statistical models whose probability density function is known up to proportionality but whose normalizing constant is intractable, so that maximum likelihood is difficult or impossible to implement. To date, applications of score matching have focused mainly on continuous IID models. Motivated by various data modelling problems, this article proposes a unified asymptotic theory of generalized score matching developed under the independence assumption, covering both continuous and discrete response data, thereby giving a sound basis for score-matching-based inference. Real data analyses and simulation studies provide convincing evidence of strong practical performance of the proposed methods.

What problem does this paper attempt to address?

This paper addresses parameter estimation in statistical models whose probability density function is known only up to proportionality, because the normalizing constant is intractable. Specifically, the paper proposes a Generalized Score Matching method that handles both continuous and discrete response data and rests on a unified asymptotic theory developed under the independence assumption. The paper also introduces score matching-based inference methods for testing independence in certain data-dependent models.

### Main Contributions:

1. **Extension of Score Matching to Ordinal Data**: The paper proposes a new score matching method for ordinal data, applicable to both univariate and multivariate ordinal data.
2. **Unified Asymptotic Theoretical Framework**: A unified asymptotic theoretical framework is developed for the generalized score matching estimator and its related hypothesis tests.
3. **Score Matching-Based Independence Test**: A score matching-based method is proposed to test independence in specific data-dependent models, particularly exponential family autoregressive models.

### Application Background:

- **Continuous Data**: Traditional score matching methods are primarily applied to continuous independent and identically distributed (IID) models.
- **Discrete Data**: The paper extends the score matching method to handle discrete data, especially ordinal data.
- **Complex Data Structures**: One example is compositional data vectors in geochemical datasets, whose components are non-negative and sum to 1 and which are observed at different spatial locations.

### Methodological Innovations:

- **Forward Difference Operator**: A new linear operator, the forward difference operator, is proposed to address the score matching problem for discrete data.
- **Square Root Transformation**: Compositional data vectors are mapped onto the sphere so that the von Mises-Fisher autoregressive model can be applied.

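The two building blocks named under Methodological Innovations can be illustrated in isolation. The sketch below is a minimal illustration, not the paper's implementation: the function names `forward_difference` and `sqrt_transform` and the example composition are assumptions made here, and the way the paper combines the forward difference operator with the score matching objective is not reproduced. It only shows that the forward difference of a function on an integer grid is f(x + 1) - f(x), and that taking elementwise square roots of a composition yields a point on the unit sphere, since the squared norm equals the sum of the parts.

```python
import math


def forward_difference(f, x):
    """Forward difference (Delta f)(x) = f(x + 1) - f(x) on an integer grid."""
    return f(x + 1) - f(x)


def sqrt_transform(composition, tol=1e-9):
    """Map a composition (non-negative parts summing to 1) to the unit sphere.

    The elementwise square root u_i = sqrt(x_i) satisfies
    sum(u_i ** 2) = sum(x_i) = 1, so u lies on the unit sphere.
    """
    if any(part < 0 for part in composition):
        raise ValueError("parts must be non-negative")
    if abs(sum(composition) - 1.0) > tol:
        raise ValueError("parts must sum to 1")
    return [math.sqrt(part) for part in composition]


# Hypothetical three-part composition, made up for illustration.
composition = [0.2, 0.3, 0.5]
point = sqrt_transform(composition)
squared_norm = sum(u * u for u in point)

# Forward difference of k -> k**2 at k = 3 is 4**2 - 3**2.
diff_at_3 = forward_difference(lambda k: k * k, 3)
```

Because the square roots are non-negative, the transformed points fall in the positive orthant of the sphere, which is where a directional model such as the von Mises-Fisher distribution would then be fitted.
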
### Experimental Validation:

- **Empirical Analysis**: The proposed methods are validated through real data analyses and simulation studies, which demonstrate strong practical performance.
- **Numerical Comparison**: The performance of score matching-based Wald statistics and modified score matching statistics is compared.

### Theoretical Support:

- **Asymptotic Normality**: The asymptotic normality of the score matching estimator is proven.
- **Hypothesis Testing**: Score matching-based Wald tests and modified score matching tests are proposed, and their asymptotic distributions are provided.

In summary, this paper addresses parameter estimation for complex data structures when the normalizing constant is difficult to compute. It does so by proposing a generalized score matching method, supported by a unified theoretical framework and validated in practical applications.
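For readers unfamiliar with the underlying idea, the following sketch shows classical (Hyvärinen) score matching for a one-dimensional Gaussian, which is the continuous IID setting the paper generalizes. It is not the paper's generalized estimator. With model score psi(x) = -lam * (x - mu) and derivative psi'(x) = -lam, the empirical objective is the sample mean of 0.5 * psi(x)^2 + psi'(x), which never involves the normalizing constant and has a closed-form minimizer. The function names, the simulated data, and the parameter values are assumptions chosen for illustration.

```python
import random


def score_matching_gaussian(xs):
    """Closed-form minimizer of the empirical score matching objective.

    Model: Gaussian with mean mu and precision lam (variance 1 / lam).
    Objective: J(mu, lam) = 0.5 * lam**2 * mean((x - mu)**2) - lam,
    minimized at mu = sample mean and lam = 1 / (sample variance with
    divisor n).
    """
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    return mu, 1.0 / var


def objective(xs, mu, lam):
    """Empirical score matching objective: mean of 0.5 * psi**2 + psi'."""
    return sum(0.5 * (lam * (x - mu)) ** 2 - lam for x in xs) / len(xs)


# Simulated data; the true mean 2.0 and standard deviation 1.5 are made up.
rng = random.Random(0)
data = [rng.gauss(2.0, 1.5) for _ in range(5000)]

mu_hat, lam_hat = score_matching_gaussian(data)
j_min = objective(data, mu_hat, lam_hat)
```

In this Gaussian case the score matching estimates coincide with the maximum likelihood estimates, so the example only illustrates the mechanics; the method is of practical interest for models where the normalizing constant cannot be computed.
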
