A statistical physics approach to learning curves for the Inverse Ising problem

Ludovica Bachschmid-Romano, Manfred Opper
DOI: https://doi.org/10.1088/1742-5468/aa727d
2017-05-16
Abstract: Using methods of statistical physics, we analyse the error of learning couplings in large Ising models from independent data (the inverse Ising problem). We concentrate on learning based on local cost functions, such as the pseudo-likelihood method for which the couplings are inferred independently for each spin. Assuming that the data are generated from a true Ising model, we compute the reconstruction error of the couplings using a combination of the replica method with the cavity approach for densely connected systems. We show that an explicit estimator based on a quadratic cost function achieves minimal reconstruction error, but requires the length of the true coupling vector as prior knowledge. A simple mean field estimator of the couplings which does not need such knowledge is asymptotically optimal, i.e. when the number of observations is much larger than the number of spins. Comparison of the theory with numerical simulations shows excellent agreement for data generated from two models with random couplings in the high temperature region: a model with independent couplings (Sherrington-Kirkpatrick model), and a model where the matrix of couplings has a Wishart distribution.
Disordered Systems and Neural Networks, Machine Learning
What problem does this paper attempt to address?
The paper analyses the error made when learning the couplings of large Ising models from independent data. It focuses on learning methods based on local cost functions, such as the pseudo-likelihood method, in which the couplings of each spin are inferred independently. The authors assume that the data are generated from a true Ising model and combine the replica method with the cavity approach to compute the reconstruction error for densely connected systems. The main objectives of the paper are:

1. **Evaluating the performance of different estimators**: Use the reconstruction error to compare estimators, such as the explicit estimator based on a quadratic cost function and the simple mean-field estimator.
2. **Comparison between theory and simulation**: Compare the theoretical predictions with numerical simulations to check the validity of the theory.
3. **Performance analysis in the high-dimensional limit**: Analyze the typical performance of the algorithms when the number of spins \( N \) tends to infinity and the number of observations \( M \) grows in proportion to \( N \).

### Main contributions

- **Theoretical framework**: Propose a statistical physics framework that combines the replica method and the cavity approach to analyze learning performance in large Ising models.
- **Optimal estimator**: Show that an explicit estimator based on a quadratic cost function achieves the minimal reconstruction error, but requires the length of the true coupling vector as prior knowledge.
- **Mean-field estimator**: Show that a simple mean-field estimator, which does not need this prior knowledge, is asymptotically optimal, i.e. when the number of observations is much larger than the number of spins.
- **Numerical verification**: Confirm the theoretical predictions with numerical simulations for two models with random couplings in the high-temperature region: the Sherrington-Kirkpatrick model (independent couplings) and a model whose coupling matrix has a Wishart distribution.

### Key formulas

- **Probability distribution of the Ising model**:
  \[ P(\sigma | J, H)=\frac{1}{Z_{\text{Ising}}} \exp\left[\beta \sum_{i < j} J_{ij} \sigma_i \sigma_j+\beta \sum_i H_i \sigma_i\right] \]
- **Cost function of maximum-likelihood estimation**, summed over the \( M \) observed configurations \( \sigma_k \):
  \[ E_{\text{ML}}(J, H)=-\sum_{k = 1}^M \ln P(\sigma_k | J, H) \]
- **Pseudo-likelihood cost function** for a single spin \( \sigma_0 \) given all other spins:
  \[ E(W; \sigma)=-\ln P(\sigma_0 | \sigma_{\backslash 0}, W)=-\beta \sigma_0 \sum_{j \neq 0} \frac{W_j \sigma_j}{\sqrt{N}}+\ln \left[2 \cosh \left(\beta \sum_{j \neq 0} \frac{W_j \sigma_j}{\sqrt{N}}\right)\right] \]
- **Reconstruction error**, with \( W^* \) the true coupling vector and \( W \) its estimate:
  \[ \epsilon=\frac{1}{N}(W^* - W)^2 = Y - 2\rho+Q \]
  where \( Y=\frac{1}{N}(W^*)^2 \), \( Q=\frac{1}{N}(W)^2 \), \( \rho=\frac{1}{N}W^* \cdot W \).

Together, these results give a theoretical basis for understanding and comparing coupling estimators in the inverse Ising problem.
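
The pseudo-likelihood cost and the reconstruction error listed under "Key formulas" can be evaluated numerically. The following is a minimal sketch in Python with NumPy, not the authors' code. The function names (`local_field`, `pseudo_likelihood_cost`, `reconstruction_error`) are hypothetical, \( N \) is taken to be the length of the coupling vector, and the toy data at the end are uniformly random spins rather than samples from an Ising model.

```python
import numpy as np


def local_field(W, sigma_rest):
    """Scaled field sum_j W_j sigma_j / sqrt(N) acting on spin 0.

    W          : shape (N,), couplings into spin 0 (N = len(W) is an assumption)
    sigma_rest : shape (M, N), the other spins in each of M observations
    """
    N = W.shape[0]
    return sigma_rest @ W / np.sqrt(N)


def pseudo_likelihood_cost(W, sigma0, sigma_rest, beta=1.0):
    """Sum over observations of -beta*sigma0*h + ln(2 cosh(beta*h))."""
    h = local_field(W, sigma_rest)
    # ln(2 cosh x) = ln(e^x + e^-x), computed stably with logaddexp
    log_2cosh = np.logaddexp(beta * h, -beta * h)
    return np.sum(-beta * sigma0 * h + log_2cosh)


def reconstruction_error(W_true, W_est):
    """Return epsilon = Y - 2*rho + Q together with (Y, rho, Q)."""
    N = W_true.shape[0]
    Y = W_true @ W_true / N
    Q = W_est @ W_est / N
    rho = W_true @ W_est / N
    return Y - 2.0 * rho + Q, (Y, rho, Q)


# Toy usage with made-up numbers (illustrative only)
rng = np.random.default_rng(0)
N, M = 50, 200
W_true = rng.normal(size=N)
W_est = W_true + 0.1 * rng.normal(size=N)
sigma_rest = rng.choice([-1, 1], size=(M, N))
sigma0 = rng.choice([-1, 1], size=M)

eps, (Y, rho, Q) = reconstruction_error(W_true, W_est)
cost = pseudo_likelihood_cost(W_est, sigma0, sigma_rest, beta=0.5)
print("epsilon:", eps, "cost:", cost)
```

Writing \( \ln(2\cosh x) \) through `np.logaddexp` avoids overflow for large fields. Because the cost is a sum of negative log-probabilities, it is non-negative, and it equals \( M \ln 2 \) when all couplings are zero.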