Mean-Field Assisted Deep Boltzmann Learning with Probabilistic Computers

Shuvro Chowdhury, Shaila Niazi, Kerem Y. Camsari

2024-01-04

Abstract: Despite their appeal as physics-inspired, energy-based, and generative models, general Boltzmann Machines (BMs) are considered intractable to train. This belief led to simplified models of BMs with restricted intralayer connections or layer-by-layer training of deep BMs. Recent developments in domain-specific hardware, specifically probabilistic computers (p-computers) with probabilistic bits (p-bits), may change established wisdom on the tractability of deep BMs. In this paper, we show that deep and unrestricted BMs can be trained using p-computers generating hundreds of billions of Markov Chain Monte Carlo (MCMC) samples per second, on sparse networks developed originally for use in D-Wave's annealers. To maximize the learning efficiency of the p-computer, we introduce two families of Mean-Field Theory assisted learning algorithms, or xMFTs (x = Naive and Hierarchical). The xMFTs are used to estimate the averages and correlations during the positive phase of the contrastive divergence (CD) algorithm, and our custom-designed p-computer is used to estimate the averages and correlations in the negative phase. A custom Field-Programmable Gate Array (FPGA) emulation of the p-computer architecture takes up to 45 billion flips per second, allowing the implementation of CD-$n$ where $n$ can be of the order of millions, unlike RBMs where $n$ is typically 1 or 2. Experiments on the full MNIST dataset with the combined algorithm show that the positive phase can be efficiently computed by xMFTs without much degradation when the negative phase is computed by the p-computer. Our algorithm can be used in other scalable Ising machines, and its variants can be used to train BMs previously thought to be intractable.
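The negative phase described in the abstract relies on Gibbs sampling with p-bits. The following is a minimal software sketch of that idea, not the authors' FPGA design. It assumes bipolar states m_i in {-1, +1}, a symmetric coupling matrix `J` with zero diagonal, and the p-bit update rule commonly used in the p-bit literature, m_i = sgn(tanh(beta * I_i) - r) with r uniform in (-1, 1). The function name `gibbs_sweeps` and all parameter names are hypothetical, and the update order is plain sequential rather than the parallel scheme a sparse hardware graph would allow.

```python
import numpy as np

def gibbs_sweeps(J, h, n_sweeps, beta=1.0, rng=None):
    """Sequential Gibbs sampling over bipolar p-bits m_i in {-1, +1}.

    J: symmetric (N, N) coupling matrix with zero diagonal (assumption).
    h: (N,) bias vector.
    Returns sample estimates of <m_i> and <m_i m_j>.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = len(h)
    m = rng.choice([-1.0, 1.0], size=n)  # random initial state
    mean = np.zeros(n)
    corr = np.zeros((n, n))
    for _ in range(n_sweeps):
        for i in range(n):
            # Local input to p-bit i from its neighbors and its bias
            I_i = beta * (J[i] @ m + h[i])
            # p-bit update: m_i = sgn(tanh(I_i) - r), r uniform in (-1, 1)
            m[i] = 1.0 if np.tanh(I_i) > rng.uniform(-1.0, 1.0) else -1.0
        # Accumulate statistics once per full sweep
        mean += m
        corr += np.outer(m, m)
    return mean / n_sweeps, corr / n_sweeps
```

With uncoupled p-bits the estimated averages should approach tanh(beta * h_i), which gives a quick sanity check of the update rule.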

Emerging Technologies, Hardware Architecture, Machine Learning, Neural and Evolutionary Computing

What problem does this paper attempt to address?

The paper addresses how probabilistic computers (p-computers) can be used to train Boltzmann Machines (BMs). General BMs have traditionally been considered intractable because of their costly training process, but recently developed specialized hardware such as p-computers opens new possibilities. The paper proposes a training algorithm for deep BMs that uses Mean-Field Theory (MFT) for the positive phase and uses p-computers to perform negative-phase sampling efficiently by generating a large number of Markov Chain Monte Carlo (MCMC) samples. Specifically, the contributions of the paper include:

1. Designing a fast digital MCMC sampler on an FPGA that emulates physical p-computers and performs up to 45 billion Gibbs samples (flips) per second. This sampler is used to train deep unrestricted BMs with 2560 nodes and 17984 parameters, and it can handle the complete MNIST dataset.

2. Introducing a hybrid MFT-assisted Contrastive Divergence (CD) algorithm that simplifies the positive-phase computation for unrestricted and deep BMs. Hierarchical MFT (HMFT) is proposed to improve correlation estimation, although it requires more computational resources.

3. Demonstrating that using MFT for the positive phase does not significantly degrade performance: positive-phase correlations are comparatively easy to estimate with MFT, while the negative phase relies on Gibbs sampling on the p-computer.

The paper demonstrates the approach experimentally on the MNIST dataset, showing that it is both effective and efficient. It also suggests that the approach can be applied to other scalable Ising machines and to BMs that were previously considered too difficult to train.
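The hybrid scheme summarized above can be sketched in a few lines: naive MFT supplies the positive-phase statistics with the visible units clamped to a data vector, and sampled statistics supply the negative phase. This is an illustrative sketch under stated assumptions, not the paper's implementation. It assumes bipolar units, the standard naive mean-field fixed point m_i = tanh(beta * (sum_j J_ij m_j + h_i)), a single training vector rather than a batch, and a boolean `edge_mask` marking which couplings exist in the sparse network. The names `naive_mft`, `cd_update`, `edge_mask`, and `lr` are hypothetical.

```python
import numpy as np

def naive_mft(J, h, clamp_idx, clamp_val, beta=1.0, n_iter=200, tol=1e-6):
    """Naive mean-field estimate of <m_i> with some units clamped.

    Iterates m_i = tanh(beta * (sum_j J_ij m_j + h_i)) on the free units
    until the largest change falls below tol or n_iter is reached.
    """
    n = len(h)
    m = np.zeros(n)
    m[clamp_idx] = clamp_val  # visible units held at the data values
    free = np.setdiff1d(np.arange(n), clamp_idx)
    for _ in range(n_iter):
        m_old = m.copy()
        for i in free:
            m[i] = np.tanh(beta * (J[i] @ m + h[i]))
        if np.max(np.abs(m - m_old)) < tol:
            break
    return m

def cd_update(J, h, pos_m, neg_mean, neg_corr, edge_mask, lr=0.01):
    """One contrastive-divergence style step on couplings and biases."""
    # Naive MFT factorizes correlations: <m_i m_j> is approximated by <m_i><m_j>
    pos_corr = np.outer(pos_m, pos_m)
    # Only couplings present in the sparse graph are updated
    J_new = J + lr * (pos_corr - neg_corr) * edge_mask
    h_new = h + lr * (pos_m - neg_mean)
    return J_new, h_new
```

In a full training loop, `neg_mean` and `neg_corr` would come from a Gibbs sampler run on the current weights, and the positive statistics would be averaged over a batch of training vectors before each update.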
