Efficient Training of Boltzmann Generators Using Off-Policy Log-Dispersion Regularization
Sampling from unnormalized probability densities is a central challenge in computational science. Boltzmann generators are generative models that enable independent sampling from the Boltzmann distribution of physical systems at a given temperature. However, their practical success depends on data-efficient training, as both simulation data and target energy evaluations are costly. To this end, we propose off-policy log-dispersion regularization (LDR), a novel regularization framework that builds on a generalization of the log-variance objective. We apply LDR in the off-policy setting in combination with standard data-based training objectives, without requiring additional on-policy samples. LDR acts as a shape regularizer of the energy landscape by leveraging additional information in the form of target energy labels. The proposed regularization framework is broadly applicable, supporting unbiased or biased simulation datasets as well as purely variational training without access to target samples. Across all benchmarks, LDR improves both final performance and data efficiency, with sample efficiency gains of up to one order of magnitude.
💡 Research Summary
The paper addresses the critical challenge of data‑efficient training for Boltzmann generators, which are invertible generative models capable of producing independent samples from the Boltzmann distribution of physical systems. Traditional training relies either on equilibrium samples (data‑based objectives) or on variational objectives that require repeated costly energy evaluations. Both approaches suffer from high sample complexity, especially when energy evaluations are expensive.
To overcome this limitation, the authors introduce Off‑policy Log‑Dispersion Regularization (LDR), a novel regularization framework that leverages target energy labels already present in simulation datasets without requiring additional on‑policy samples. The core idea is to define a family of log‑dispersion objectives that measure the dispersion of the quantity
$$f_{\theta}(x) = -\log q_{\theta}(x) - \frac{E(x)}{k_{B}T}$$

around its mean under an arbitrary reference distribution $r(x)$. The generalized objective is

$$\mathcal{L}_{\mathrm{LD}}(\theta) = \mathbb{E}_{x \sim r}\!\left[\, d\!\left( f_{\theta}(x) - \mathbb{E}_{x' \sim r}\!\left[ f_{\theta}(x') \right] \right) \right],$$

where $d(\cdot)$ is a dispersion function; the choice $d(u) = u^{2}$ recovers the standard log-variance objective.
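To make the objective concrete, here is a minimal NumPy sketch of how a log-dispersion loss could be evaluated on a batch of off-policy samples. The function names (`log_dispersion_loss`) and the squared-deviation default for the dispersion function are illustrative assumptions, not the paper's implementation; the inputs are the model's log-density $\log q_{\theta}(x)$ and the target energies $E(x)$ already stored with the dataset.

```python
import numpy as np

def log_dispersion_loss(log_q, energy, beta=1.0, d=np.square):
    """Illustrative log-dispersion loss on a batch of off-policy samples.

    log_q  : array of model log-densities log q_theta(x) at the samples
    energy : array of target energies E(x) (the labels already in the dataset)
    beta   : inverse temperature 1 / (k_B * T)
    d      : dispersion function; d(u) = u**2 gives the log-variance objective
    """
    # f_theta(x) = -log q_theta(x) - beta * E(x)
    f = -log_q - beta * energy
    # dispersion of f around its mean under the empirical reference distribution
    return float(np.mean(d(f - np.mean(f))))

# If q_theta matches the Boltzmann density up to a constant, f is constant
# across samples and the dispersion vanishes.
perfect = log_dispersion_loss(np.array([1.0, 3.0]), np.array([-1.0, -3.0]))
# Otherwise the loss equals the chosen dispersion of f, e.g. its variance.
mismatch = log_dispersion_loss(np.array([0.0, 0.0]), np.array([-1.0, -3.0]))
```

Note that the loss is invariant to adding a constant to `f`, so the unknown normalizing constant of the Boltzmann distribution drops out, which is what makes a purely label-based (off-policy) evaluation possible.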