Drifting to Boltzmann: Million-Fold Acceleration in Boltzmann Sampling with Force-Guided Drifting
Sampling molecular conformations from the Boltzmann distribution is essential for computational chemistry, but iterative diffusion methods are prohibitively slow. Drifting Models offer one-step generation, yet their equilibrium matches the training distribution, which may deviate from the true Boltzmann distribution due to sampling bias. We introduce Drifting Models to molecular conformation generation for the first time, establishing a theoretical bridge via the Drifting Score Identity: for Gaussian kernels, the drifting field’s attraction equals a kernel-weighted average of any distribution’s score function. Substituting molecular force labels, which directly encode the Boltzmann score, yields the Drifting Force Identity and decomposes the field into standard drift plus a Boltzmann correction. We further discover a striking phenomenon unique to molecular systems: force incorporation’s effectiveness reverses across representations. In coordinate space, Force-Interpolated Drifting (FI) dominates by blending physical force directions with data displacements. In distance feature space, Force-Aligned Kernel (FK) achieves superior accuracy by modifying only kernel weights, thereby preserving the manifold of geometrically valid molecules. On MD17 Ethanol, both approaches achieve one-step generation with over 1000x speedup relative to recent score-matching methods with Boltzmann guiding, providing more than million-fold acceleration over traditional molecular dynamics, while ensuring perfect structural validity and distributional accuracy rivaling multi-step methods.
Research Summary
Background and Motivation
Sampling molecular conformations from the Boltzmann distribution p_Boltz(x) ∝ exp(−E(x)/k_BT) is a cornerstone of computational chemistry, enabling the prediction of thermodynamic observables, free energies, and material properties. Classical approaches such as molecular dynamics (MD) and Monte Carlo (MC) can, in principle, generate exact Boltzmann samples, but they suffer from long autocorrelation times and difficulty escaping metastable basins, making them prohibitively expensive for realistic systems. Recent deep generative models—normalizing flows, diffusion models, and score‑matching methods—have demonstrated impressive ability to produce diverse molecular structures in a few steps. However, these models are trained to match the empirical data distribution p_data, which often deviates from the true Boltzmann distribution due to finite sampling, enhanced‑sampling bias, or non‑equilibrium data collection. Consequently, a key challenge is to inject physical energy or force information directly into the generative process so that the learned model targets p_Boltz instead of p_data.
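The connection between force labels and the target distribution can be made explicit with one short step. The notation here is an assumption for illustration and is not taken from the source: F(x) = −∇E(x) denotes the force, and \mathcal{Z} denotes the partition function that normalizes p_Boltz. Because \mathcal{Z} does not depend on x, it vanishes under the gradient, so the score of the Boltzmann distribution is the force scaled by 1/(k_B T):

```latex
\nabla_x \log p_{\mathrm{Boltz}}(x)
  = \nabla_x \left[ -\frac{E(x)}{k_B T} - \log \mathcal{Z} \right]
  = -\frac{\nabla_x E(x)}{k_B T}
  = \frac{F(x)}{k_B T}
```

This is why force labels can stand in for the Boltzmann score even when the training samples themselves are biased.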
Drifting Models Overview
Drifting models provide a one‑step generation framework. A generator f_θ(ε) maps a noise vector ε ∼ N(0,I) to a sample x, inducing a push‑forward distribution q_θ. Training is guided by a kernel‑based drifting field V_{p,q}(x) defined as
V_{p,q}(x) = (1/Z_p Z_q) E_{y⁺∼p, y⁻∼q}
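As a rough numerical illustration of a kernel-based drifting field, the sketch below estimates one from finite samples. The integrand is not spelled out above, so the form used here is an assumption: a Gaussian-kernel-weighted attraction toward samples of p minus the same quantity for samples of q, with Z_p and Z_q taken to be the mean kernel weights at x. Under that assumption the difference can be rewritten with a single 1/(Z_p Z_q) prefactor. The names `gaussian_kernel`, `drifting_field`, and `sigma` are hypothetical and do not come from the paper.

```python
import numpy as np


def gaussian_kernel(x, ys, sigma):
    """Gaussian weights k(x, y) = exp(-||x - y||^2 / (2 sigma^2)) for each row y of ys."""
    sq_dist = np.sum((ys - x) ** 2, axis=1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def drifting_field(x, y_pos, y_neg, sigma=1.0):
    """Monte Carlo estimate of an ASSUMED drifting field at point x.

    y_pos: samples from the target p, shape (n, d)
    y_neg: samples from the generator distribution q, shape (m, d)
    Returns a vector of shape (d,).
    """
    w_pos = gaussian_kernel(x, y_pos, sigma)
    w_neg = gaussian_kernel(x, y_neg, sigma)
    # Normalized kernel-weighted mean displacement toward each sample set.
    attract = (w_pos[:, None] * (y_pos - x)).sum(axis=0) / w_pos.sum()
    repel = (w_neg[:, None] * (y_neg - x)).sum(axis=0) / w_neg.sum()
    # Assumed form: attraction to p minus attraction to q.
    return attract - repel


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    samples = rng.normal(size=(50, 3))
    x = np.zeros(3)
    # Identical sample sets for p and q give a vanishing field.
    print(drifting_field(x, samples, samples))
```

In this sketch the estimate is exactly zero whenever the two sample sets coincide, which mirrors the equilibrium property described in the abstract: the drift stops once q matches the distribution that supplied the positive samples.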