Abstract: Restricted Boltzmann Machines (RBMs) are effective tools for modeling complex systems and deriving insights from data. However, training these models on highly structured data is challenging because Markov Chain Monte Carlo (MCMC) processes mix slowly. In this study, we build upon recent theoretical advances in RBM training to significantly reduce the computational cost of training on highly clustered datasets and, more generally, of evaluating and sampling RBMs. The learning process is analogous to the continuous thermodynamic phase transitions observed in ferromagnetic models, where new modes in the probability measure emerge continuously. Such continuous transitions are associated with critical slowing down, which degrades the accuracy of gradient estimates, particularly during the initial stages of training on clustered data. To mitigate this issue, we propose a pre-training phase that encodes the principal components into a low-rank RBM through a convex optimization process. This approach enables efficient static Monte Carlo sampling and accurate computation of the partition function. We exploit the continuous and smooth nature of the parameter annealing trajectory to obtain reliable and computationally efficient log-likelihood estimates, enabling online assessment during training, and we propose a novel sampling strategy, parallel trajectory tempering (PTT), which outperforms previously optimized MCMC methods. Our results show that this training strategy enables RBMs to handle highly structured datasets that conventional methods struggle with. We also provide evidence that, in controlled scenarios, our log-likelihood estimation is more accurate than traditional, more computationally intensive approaches. The PTT algorithm significantly accelerates MCMC processes compared to existing and conventional methods.

Hyperparameters Adaptation for Restricted Boltzmann Machines Based on Free Energy

A Method of Adaptive Hyperparameter Optimization for Deep Generative Models

Training Restricted Boltzmann Machines with Binary Synapses Using the Bayesian Learning Rule

Pre-training the Deep Generative Models with Adaptive Hyperparameter Optimization

Restricted Boltzmann machine based algorithm for multi-objective optimization

Adaptive hyperparameter updating for training restricted Boltzmann machines on quantum annealers

Efficient Hyperparameter Optimization of Deep Learning Algorithms Using Deterministic RBF Surrogates

Fast training and sampling of Restricted Boltzmann Machines

Efficient Hyperparameter Optimization for Deep Learning Algorithms Using Deterministic RBF Surrogates

Using Hierarchical Dirichlet Processes to Regulate Weight Parameters of Restricted Boltzmann Machines

An Adaptive Deep Belief Network With Sparse Restricted Boltzmann Machines

Monotone deep Boltzmann machines

Efficient Hyper-parameter Optimization for NLP Applications

Hyperparameter Optimization for Machine Learning Models Based on Bayesian Optimization

A Metaheuristic-Driven Approach to Fine-Tune Deep Boltzmann Machines

Restricted Boltzmann Machine with Adaptive Local Hidden Units

Hyper-Tune: Towards Efficient Hyper-parameter Tuning at Scale

Properties and Bayesian fitting of restricted Boltzmann machines

Multi-level Training and Bayesian Optimization for Economical Hyperparameter Optimization

Two-step hyperparameter optimization method: Accelerating hyperparameter search by using a fraction of a training dataset

Bayesian Hyperparameter Optimization with BoTorch, GPyTorch and Ax