On the equivalence of Hopfield Networks and Boltzmann Machines

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

A specific type of neural network, the Restricted Boltzmann Machine (RBM), is widely used for classification and feature detection in machine learning. An RBM is characterized by separate layers of visible and hidden units, which can efficiently learn a generative model of the observed data. We study a “hybrid” version of RBMs, in which the hidden units are analog and the visible units are binary, and we show that the thermodynamics of the visible units is equivalent to that of a Hopfield network, in which the N visible units are the neurons and the P hidden units are the learned patterns. We apply the method of stochastic stability to derive the thermodynamics of the model, considering a formal extension of this technique to the case of multiple sets of stored patterns, which may serve as a benchmark for the study of correlated sets. Our results imply that simulating the dynamics of a Hopfield network, which requires updating N neurons and storing N(N−1)/2 synapses, can be accomplished by a hybrid Boltzmann Machine, which requires updating N+P neurons but storing only NP synapses. In addition, the well-known glass transition of the Hopfield network has a counterpart in the Boltzmann Machine: it corresponds to an optimality criterion for selecting the relative sizes of the hidden and visible layers, resolving the trade-off between the flexibility and the generality of the model. The low-storage phase of the Hopfield model corresponds to few hidden units and hence an overly constrained RBM, while the spin-glass phase (too many hidden units) corresponds to an unconstrained RBM prone to overfitting the observed data.
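The claimed trade-off can be illustrated with a small block-Gibbs simulation of the hybrid machine: sampling the analog hidden layer exactly at its Gaussian equilibrium and then Glauber-updating the binary visible layer reproduces Hopfield-style pattern retrieval while storing only the NP pattern weights. This is a minimal sketch, not code from the paper; the sizes, inverse temperature, and the unnormalized Hebbian scaling are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, P, beta = 50, 5, 2.0  # layer sizes and inverse temperature (illustrative values)
xi = rng.choice([-1.0, 1.0], size=(P, N))  # random binary patterns = hidden-visible weights

def hybrid_gibbs_step(sigma):
    """One block-Gibbs sweep of the hybrid machine.

    Hidden layer: Gaussian given the visible layer (mean xi @ sigma, variance
    1/beta), sampled exactly instead of running the fast hidden dynamics.
    Visible layer: Glauber update in the field produced by the hidden units.
    """
    z = xi @ sigma + rng.normal(scale=1.0 / np.sqrt(beta), size=P)  # analog hidden units
    h = xi.T @ z                                                    # local fields on visible units
    p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * h))                    # Glauber flip probabilities
    return np.where(rng.random(N) < p_up, 1.0, -1.0)

# Start at pattern 0; with only N*P = 250 stored weights (vs N(N-1)/2 = 1225
# Hopfield synapses) the sampler should remain close to that pattern.
sigma = xi[0].copy()
for _ in range(200):
    sigma = hybrid_gibbs_step(sigma)
overlap = abs(sigma @ xi[0]) / N  # Mattis overlap with the retrieved pattern
```

In this low-load regime (P/N = 0.1) the overlap stays close to 1, mimicking retrieval in the corresponding Hopfield network.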


💡 Research Summary

The paper introduces a “hybrid” Restricted Boltzmann Machine (HBM) in which the visible layer consists of binary units (σ_i = ±1) while the hidden layer consists of continuous analog units (z_μ). The two layers form a complete bipartite graph: each visible unit is connected to every hidden unit with a symmetric weight ξ_i^μ, and there are no intra-layer connections. The hidden units evolve according to an Ornstein–Uhlenbeck stochastic differential equation on a fast time scale T, whereas the visible units follow discrete-time Glauber dynamics. Because the hidden dynamics are much faster than the visible updates, the authors treat the hidden variables as being in thermal equilibrium for any fixed visible configuration. In this equilibrium the hidden variables are Gaussian with mean ξ·σ and variance β⁻¹, where β is the inverse temperature (noise level).
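This equilibrium statement is easy to verify numerically. The sketch below integrates an Ornstein–Uhlenbeck equation with Euler–Maruyama and checks that the stationary mean and variance match the stated Gaussian; the drift target m stands in for one component of ξ·σ, and the noise amplitude √(2/(βτ)) is an assumed choice that makes the stationary law N(m, β⁻¹) — these specifics are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
beta, tau = 2.0, 1.0   # inverse temperature and relaxation time (illustrative values)
m = 0.7                # drift target, standing in for one component of xi . sigma
dt, steps = 0.01, 200_000

# Euler-Maruyama integration of the Ornstein-Uhlenbeck equation
#   dz = -(z - m)/tau dt + sqrt(2/(beta*tau)) dW,
# whose stationary distribution is Gaussian with mean m and variance 1/beta.
z = 0.0
samples = []
for t in range(steps):
    z += -(z - m) / tau * dt + np.sqrt(2.0 * dt / (beta * tau)) * rng.normal()
    if t > steps // 2:  # discard the transient before measuring
        samples.append(z)

samples = np.asarray(samples)
emp_mean, emp_var = samples.mean(), samples.var()  # should approach m and 1/beta
```

With these parameters the empirical mean and variance converge to m = 0.7 and β⁻¹ = 0.5 up to sampling error.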

From the joint distribution P(σ, z) ∝ exp(−(β/2) Σ_μ z_μ² + β Σ_{i,μ} ξ_i^μ σ_i z_μ), integrating out the hidden variables yields the marginal distribution of the visible layer, which is exactly the Gibbs distribution of a Hopfield network with Hebbian couplings J_ij = Σ_μ ξ_i^μ ξ_j^μ: the P hidden units play the role of the P stored patterns.
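Marginalizing over the analog hidden units is a standard Gaussian integral; a sketch in the notation above, with overall normalization constants omitted:

```latex
\int \prod_{\mu=1}^{P} \mathrm{d}z_\mu \,
\exp\!\Big( -\tfrac{\beta}{2}\sum_{\mu} z_\mu^2
            + \beta \sum_{i,\mu} \xi_i^\mu \sigma_i z_\mu \Big)
\;\propto\;
\exp\!\Big( \tfrac{\beta}{2}\sum_{\mu}\Big(\sum_i \xi_i^\mu \sigma_i\Big)^{\!2} \Big)
\;=\; \exp\big( -\beta H(\sigma) \big),
\qquad
H(\sigma) = -\tfrac{1}{2}\sum_{i,j} J_{ij}\,\sigma_i\sigma_j,
\quad
J_{ij} = \sum_{\mu=1}^{P} \xi_i^\mu \xi_j^\mu .
```

Completing the square in each z_μ produces the factor exp(β(Σ_i ξ_i^μ σ_i)²/2), so the visible units are governed by the Hopfield Hamiltonian H(σ) with the Hebbian couplings J_ij.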

