Generalized Score Matching

Jiazhen Xu, Janice L. Scealy, Andrew T. A. Wood, Tao Zou
2024-04-21
Abstract: Score matching is an estimation procedure that has been developed for statistical models whose probability density function is known up to proportionality but whose normalizing constant is intractable, so that maximum likelihood is difficult or impossible to implement. To date, applications of score matching have focused more on continuous IID models. Motivated by various data modelling problems, this article proposes a unified asymptotic theory of generalized score matching developed under the independence assumption, covering both continuous and discrete response data, thereby giving a sound basis for score-matching-based inference. Real data analyses and simulation studies provide convincing evidence of strong practical performance of the proposed methods.
Methodology, Statistics Theory, Applications, Computation
What problem does this paper attempt to address?
This paper addresses parameter estimation in statistical models whose probability density function is known only up to proportionality, so that the normalizing constant is difficult or impossible to compute. Specifically, it proposes a generalized score matching method that handles both continuous and discrete response data, with a unified asymptotic theory developed under the independence assumption. It also introduces score-matching-based inference methods for testing independence in certain models for dependent data.

### Main Contributions:

1. **Extension of Score Matching to Ordinal Data**: The paper proposes a new score matching method for ordinal data, applicable to both univariate and multivariate ordinal data.
2. **Unified Asymptotic Theoretical Framework**: A unified asymptotic theory is developed for the generalized score matching estimator and its related hypothesis tests.
3. **Score Matching-Based Independence Test**: A score-matching-based method is proposed for testing independence in specific models for dependent data, particularly exponential family autoregressive models.

### Application Background:

- **Continuous Data**: Traditional score matching methods are primarily applied to continuous independent and identically distributed (IID) models.
- **Discrete Data**: The paper extends score matching to discrete data, especially ordinal data.
- **Complex Data Structures**: One example is compositional data vectors in geochemical datasets, whose components are non-negative and sum to 1, observed at different spatial locations.

### Methodological Innovations:

- **Forward Difference Operator**: A new linear operator, the forward difference operator, is proposed to address the score matching problem for discrete data.
- **Square Root Transformation**: Compositional data vectors are mapped onto the sphere so that the von Mises-Fisher autoregressive model can be applied.
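The two devices named under Methodological Innovations can be illustrated with a minimal Python sketch. The paper's exact definitions are not reproduced in this summary, so the sketch assumes the standard forward difference, (Δf)(x) = f(x + 1) − f(x), and the elementwise square root of a composition. The function names and the example composition are hypothetical and chosen only for illustration.

```python
import math


def forward_difference(f):
    """Return the function x -> f(x + 1) - f(x) (assumed standard definition)."""
    def delta_f(x):
        return f(x + 1) - f(x)
    return delta_f


def sqrt_transform(composition):
    """Map a composition (non-negative parts summing to 1) to a unit vector."""
    if any(part < 0 for part in composition):
        raise ValueError("parts must be non-negative")
    if not math.isclose(sum(composition), 1.0, rel_tol=1e-9):
        raise ValueError("parts must sum to 1")
    return [math.sqrt(part) for part in composition]


# Forward difference of f(x) = x**2 on the integers: (x + 1)**2 - x**2 = 2x + 1.
delta_square = forward_difference(lambda x: x * x)
differences = [delta_square(x) for x in range(4)]

# A hypothetical three-part composition (illustrative values, not from the paper).
point = sqrt_transform([0.2, 0.3, 0.5])
squared_norm = sum(coord * coord for coord in point)
```

Because the parts of a composition sum to 1, the squared norm of the transformed vector is also 1, which is why the transformed data lie on the unit sphere and a directional model such as the von Mises-Fisher distribution becomes applicable.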
### Experimental Validation:

- **Empirical Analysis**: The proposed methods are validated through real data analyses and simulation studies, which demonstrate strong practical performance.
- **Numerical Comparison**: The performance of score-matching-based Wald statistics is compared with that of modified score matching statistics.

### Theoretical Support:

- **Asymptotic Normality**: The asymptotic normality of the score matching estimator is proven.
- **Hypothesis Testing**: Score-matching-based Wald tests and modified score matching tests are proposed, and their asymptotic distributions are provided.

In summary, this paper addresses parameter estimation for complex data structures and for models whose normalizing constant is difficult to compute. It does so by proposing a generalized score matching method, supported by a unified theoretical framework and validated in practical applications.
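To show why score matching avoids the normalizing constant, here is a minimal Python sketch of the classical continuous score matching objective for a univariate Gaussian. It is not the generalized estimator proposed in the paper. The objective averages ψ'(x) + ½ψ(x)², where ψ is the derivative of the log density in x, so any constant factor in the density drops out. The data values and function names are hypothetical.

```python
def score_matching_objective(data, mu, var):
    """Empirical classical score matching objective for a Gaussian model.

    For an unnormalized density exp(-(x - mu)**2 / (2 * var)), the score in x
    is psi(x) = -(x - mu) / var and its derivative is psi'(x) = -1 / var.
    """
    total = 0.0
    for x in data:
        score = -(x - mu) / var
        score_derivative = -1.0 / var
        total += score_derivative + 0.5 * score * score
    return total / len(data)


def fit_gaussian_by_score_matching(data):
    """Closed-form minimizer of the objective above: sample mean and variance."""
    n = len(data)
    mu_hat = sum(data) / n
    var_hat = sum((x - mu_hat) ** 2 for x in data) / n
    return mu_hat, var_hat


# Hypothetical observations, used only to exercise the functions.
data = [1.2, 0.7, 2.9, 1.8, 2.4]
mu_hat, var_hat = fit_gaussian_by_score_matching(data)
best = score_matching_objective(data, mu_hat, var_hat)
```

In this simple case the minimizer coincides with the maximum likelihood estimates. The practical interest of the approach lies in models where the likelihood cannot be evaluated because the normalizing constant is intractable.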