Composite Likelihood Estimation for Restricted Boltzmann machines

Muneki Yasuda, Shun Kataoka, Yuji Waizumi, Kazuyuki Tanaka
DOI: https://doi.org/10.48550/arXiv.1406.6176
2014-06-24
Abstract: Learning the parameters of graphical models using the maximum likelihood estimation is generally hard which requires an approximation. Maximum composite likelihood estimations are statistical approximations of the maximum likelihood estimation which are higher-order generalizations of the maximum pseudo-likelihood estimation. In this paper, we propose a composite likelihood method and investigate its property. Furthermore, we apply our composite likelihood method to restricted Boltzmann machines.
Machine Learning
### What problem does this paper attempt to solve?

This paper addresses the computational difficulty of learning the parameters of graphical models by maximum likelihood (ML) estimation. ML estimation usually requires computing the normalization constant and its gradient, which are often intractable in practice. To get around this, the authors introduce a composite likelihood (CL) estimation method and study its properties.

#### Main problems:

1. **Computational complexity**: ML estimation is expensive. For high-dimensional data and complex models in particular, computing the normalization constant and its gradient is very difficult.
2. **Limitations of existing methods**: Pseudo-likelihood (PL) estimation is fast to compute, but its estimates are less accurate. A method is needed that stays computationally efficient while improving estimation accuracy.

#### Solutions:

- **Composite likelihood (CL) estimation**: CL is a statistical approximation that can be regarded as a higher-order generalization of PL. It divides the variable set into several "blocks" and performs likelihood estimation on these blocks, which lets it trade off computational cost against estimation accuracy.
- **Systematic block selection**: The authors propose a systematic way of choosing blocks so that, as the block size increases, CL gradually approaches the true log-likelihood function.
- **Application to restricted Boltzmann machines (RBMs)**: The authors apply the CL method to RBM learning and verify its effectiveness through numerical experiments.
#### Formula representation:

- True log-likelihood function:
  \[ L_{\text{ML}}(\theta)=\sum_x Q(x)\ln P(x|\theta) \]
- Composite likelihood function:
  \[ L_F(\theta)=\Lambda_F\sum_{c\in F}\sum_x Q(x)\ln P(x_c|x_{\bar{c}},\theta) \]

where \(\Lambda_F = |F|^{-1}\), \(F\) is the set of all blocks, \(c\in F\) is a block, and \(x_c\) and \(x_{\bar{c}}\) are the variables inside and outside the block, respectively.

#### Experimental results:

In numerical experiments, the authors show that higher-order CL estimation has advantages in both convergence speed and estimation accuracy. For example, after 50,000 iterations, the true log-likelihood values reached by CL estimation of different orders are:

- Exact ML estimation: -1.741
- First-order CL estimation: -1.796
- Second-order CL estimation: -1.742
- Third-order CL estimation: -1.741

These results show that the estimation accuracy approaches that of exact ML estimation as the CL order increases.

In conclusion, by introducing the CL method, this paper provides a way to improve estimation accuracy while keeping computation efficient, which is especially relevant for learning tasks with high-dimensional data and complex models.
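To make the two formulas above concrete, here is a minimal Python sketch that evaluates \(L_{\text{ML}}\) and \(L_F\) by brute force. It rests on several assumptions that are not taken from the paper: the model is a tiny fully visible Boltzmann machine with four ±1 variables rather than an RBM, the block set \(F\) for order `k` is all subsets of size `k`, \(Q\) is read as a data distribution over states (the summary does not define it), and the parameters and \(Q\) are random placeholders. The helper names are hypothetical.

```python
import itertools

import numpy as np


def log_joint_table(W, b):
    """Enumerate all 2^n states (entries +/-1) and their exact log-probabilities."""
    n = len(b)
    states = np.array(list(itertools.product([-1, 1], repeat=n)), dtype=float)
    # Unnormalized log-probability: b.x + sum_{i<j} W_ij x_i x_j
    # (W is symmetric with zero diagonal, hence the factor 0.5).
    score = states @ b + 0.5 * np.einsum("si,ij,sj->s", states, W, states)
    log_z = np.logaddexp.reduce(score)  # brute-force log normalization constant
    return states, score - log_z


def conditional_log_prob(states, log_p, block):
    """ln P(x_c | x_rest) for every state, via brute-force marginalization."""
    n = states.shape[1]
    rest = np.array([i for i in range(n) if i not in block], dtype=int)
    # States that agree outside the block share the same marginal P(x_rest).
    log_marginal = {}
    for s, lp in zip(states, log_p):
        key = tuple(s[rest])
        log_marginal[key] = np.logaddexp(log_marginal.get(key, -np.inf), lp)
    return np.array(
        [lp - log_marginal[tuple(s[rest])] for s, lp in zip(states, log_p)]
    )


def composite_log_likelihood(Q, states, log_p, k):
    """L_F with F = all size-k blocks (an assumed choice) and Lambda_F = 1/|F|."""
    n = states.shape[1]
    blocks = list(itertools.combinations(range(n), k))
    total = sum(Q @ conditional_log_prob(states, log_p, c) for c in blocks)
    return total / len(blocks)


rng = np.random.default_rng(0)
n = 4
W = np.triu(rng.normal(scale=0.5, size=(n, n)), 1)
W = W + W.T  # symmetric couplings, zero diagonal
b = rng.normal(scale=0.3, size=n)
states, log_p = log_joint_table(W, b)

Q = rng.dirichlet(np.ones(2 ** n))  # placeholder data distribution (assumption)

l_ml = float(Q @ log_p)
l_cl = {k: float(composite_log_likelihood(Q, states, log_p, k)) for k in range(1, n + 1)}

print("L_ML =", l_ml)
for k, value in l_cl.items():
    print(f"order {k}: L_F = {value}")
```

With `k = 1` each block is a single variable, which is the pseudo-likelihood case. With `k = n` the single block contains every variable, so the conditional is the joint distribution and \(L_F\) coincides with \(L_{\text{ML}}\). Because a conditional probability is never smaller than the corresponding joint probability, every \(L_F\) in this sketch is at least \(L_{\text{ML}}\). The enumeration over all \(2^n\) states is only feasible for toy sizes, which is exactly the cost that motivates the approximation.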