Abstract: Restricted Boltzmann Machines (RBMs) are probabilistic generative models that can be trained by maximum likelihood in principle, but are usually trained by an approximate algorithm called Contrastive Divergence (CD) in practice. In general, a CD-k algorithm estimates an average with respect to the model distribution using a sample obtained from a k-step Markov chain Monte Carlo algorithm (e.g., block Gibbs sampling) starting from some initial configuration. Choices of k typically vary from 1 to 100. This technical report explores whether it is possible to leverage a simple approximate sampling algorithm with a modified version of CD in order to train an RBM with k=0. As usual, the method is illustrated on MNIST.
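To make the role of k concrete, here is a minimal NumPy sketch of a textbook CD-k gradient estimate for an RBM with ±1 units. It is not the report's modified CD or its approximate sampler, neither of which is specified here. The energy convention E(v, h) = -a·v - b·h - vᵀWh, the choice to start the chain at the data, and the names `sample_pm1` and `cd_k_gradient` are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_pm1(field, rng):
    """Sample +/-1 units with local field `field`: P(+1) = sigmoid(2 * field)."""
    return np.where(rng.random(field.shape) < sigmoid(2.0 * field), 1.0, -1.0)

def cd_k_gradient(v_data, W, a, b, k, rng):
    """Textbook CD-k estimate of the log-likelihood gradient (illustrative only).

    Assumes E(v, h) = -a.v - b.h - v^T W h with v, h in {-1, +1},
    and a Gibbs chain initialised at the data.
    """
    # Positive phase: hidden means given the data, E[h | v] = tanh(b + v W).
    h_mean_data = np.tanh(b + v_data @ W)

    # Negative phase: k steps of block Gibbs sampling starting from the data.
    v = v_data.copy()
    for _ in range(k):
        h = sample_pm1(b + v @ W, rng)
        v = sample_pm1(a + h @ W.T, rng)
    h_mean_model = np.tanh(b + v @ W)

    n = v_data.shape[0]
    dW = (v_data.T @ h_mean_data - v.T @ h_mean_model) / n
    da = (v_data - v).mean(axis=0)
    db = (h_mean_data - h_mean_model).mean(axis=0)
    return dW, da, db

rng = np.random.default_rng(0)
v_data = np.where(rng.random((8, 6)) < 0.5, 1.0, -1.0)  # toy +/-1 "data"
W = 0.1 * rng.normal(size=(6, 4))
a, b = np.zeros(6), np.zeros(4)

dW0, da0, db0 = cd_k_gradient(v_data, W, a, b, k=0, rng=rng)
dW1, da1, db1 = cd_k_gradient(v_data, W, a, b, k=1, rng=rng)
```

In this textbook form, setting k=0 leaves the chain at the data, so the positive and negative phases cancel and the estimated gradient is exactly zero. This suggests why a k=0 scheme would need both a different source of negative samples and a modified update, as the abstract indicates.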

What problem does this paper attempt to address?

This paper asks whether zero-step contrastive divergence (CD-0) can be used to train restricted Boltzmann machines (RBMs). In principle, RBMs can be trained by maximum likelihood, but in practice they are usually trained with an approximate algorithm, contrastive divergence (CD). A CD-k algorithm estimates an average with respect to the model distribution using a sample obtained from a k-step Markov chain Monte Carlo (MCMC) algorithm, such as block Gibbs sampling, started from some initial configuration. Typical values of k range from 1 to 100. This technical report instead combines a simple approximate sampling algorithm with a modified version of CD and attempts to train RBMs without performing any MCMC steps, that is, with k = 0.

The author studies RBMs with discrete visible and hidden units and uses Ising-type neurons (taking values ±1) rather than Bernoulli units (taking values 0 or 1), noting that the two model types are theoretically equivalent and can be converted into each other by a simple linear transformation. The aim is to understand whether training really requires a high-quality approximation of the model distribution, or whether RBMs can be trained effectively with even a very rough one.

To test this, the author trains RBMs on the binary MNIST dataset using CD-0. The results show that, although samples generated by RBMs trained with CD-0 are not of very high quality, the models learn to produce faithful reconstructions because the observed samples are encoded in deep valleys of the energy landscape. In addition, the author proposes a simple algorithm called "belief generation" for generating approximate samples from RBMs, which may significantly speed up training and allow RBMs to scale to problem sizes that were previously intractable.
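The equivalence between Bernoulli (0/1) and Ising-type (±1) units can be checked numerically. The sketch below assumes the common energy convention E(v, h) = -a·v - b·h - vᵀWh for both models and the substitution s = 2x - 1. The helper names `bernoulli_to_ising` and `energy` are hypothetical, and the report may parameterise the transformation differently.

```python
import itertools
import numpy as np

def bernoulli_to_ising(W, a, b):
    """Map 0/1-unit RBM parameters to +/-1-unit parameters via x = (s + 1) / 2."""
    W_s = W / 4.0
    a_s = a / 2.0 + W.sum(axis=1) / 4.0  # visible fields absorb W summed over hidden units
    b_s = b / 2.0 + W.sum(axis=0) / 4.0  # hidden fields absorb W summed over visible units
    return W_s, a_s, b_s

def energy(v, h, W, a, b):
    """Assumed RBM energy: E(v, h) = -a.v - b.h - v^T W h."""
    return -(a @ v) - (b @ h) - v @ W @ h

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 2))
a = rng.normal(size=3)
b = rng.normal(size=2)
W_s, a_s, b_s = bernoulli_to_ising(W, a, b)

# Enumerate all 2^5 joint states and compare the two energies state by state.
gaps = []
for bits in itertools.product([0.0, 1.0], repeat=5):
    v01, h01 = np.array(bits[:3]), np.array(bits[3:])
    v_pm, h_pm = 2.0 * v01 - 1.0, 2.0 * h01 - 1.0
    gaps.append(energy(v01, h01, W, a, b) - energy(v_pm, h_pm, W_s, a_s, b_s))
gaps = np.array(gaps)

# The gap is the same for every state, so both models define the same distribution.
expected_gap = -(a.sum() / 2.0 + b.sum() / 2.0 + W.sum() / 4.0)
```

Because the two energies differ only by a state-independent constant, the Boltzmann distributions coincide after normalisation, which is the sense in which the two unit types are interchangeable.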

Can RBMs be trained with zero step contrastive divergence?

Average Contrastive Divergence for Training Restricted Boltzmann Machines.

Training Restricted Boltzmann Machines with Binary Synapses Using the Bayesian Learning Rule

A Cyclic Contrastive Divergence Learning Algorithm for High-Order RBMs

Research on RBM Training Algorithm with Dynamic Gibbs Sampling

On the convergence properties of contrastive divergence

A Neighbourhood-Based Stopping Criterion for Contrastive Divergence Learning

A Precise Method for RBMs Training Using Phased Curricula

A Novel Restricted Boltzmann Machine Training Algorithm with Dynamic Tempering Chains.

Generative and Discriminative Infinite Restricted Boltzmann Machine Training

Data normalization in the learning of restricted Boltzmann machines

Fast training and sampling of Restricted Boltzmann Machines

Hyperparameters Adaptation for Restricted Boltzmann Machines Based on Free Energy

Adversarial Training Methods for Boltzmann Machines

Contrastive Divergence Learning of Restricted Boltzmann Machine

Training Restricted Boltzmann Machines on Word Observations

A Novel Restricted Boltzmann Machine Training Algorithm with Fast Gibbs Sampling Policy

End-to-end Training of Deep Boltzmann Machines by Unbiased Contrastive Divergence with Local Mode Initialization

Training Restricted Boltzmann Machine Using Gradient Fixing Based Algorithm

Training and Classification using a Restricted Boltzmann Machine on the D-Wave 2000Q

An automatic setting for training restricted boltzmann machine