Physics-inspired transformer quantum states via latent imaginary-time evolution
Neural quantum states (NQS) are powerful ansätze in the variational Monte Carlo framework, yet their architectures are often treated as black boxes. We propose a physically transparent framework in which NQS are treated as neural approximations to latent imaginary-time evolution. This viewpoint suggests that standard Transformer-based NQS (TQS) architectures correspond to physically unmotivated effective Hamiltonians dependent on imaginary time in a latent space. Building on this interpretation, we introduce physics-inspired transformer quantum states (PITQS), which enforce a static effective Hamiltonian by sharing weights across layers and improve propagation accuracy via Trotter-Suzuki decompositions without increasing the number of variational parameters. For the frustrated $J_1$-$J_2$ Heisenberg model, our ansätze achieve accuracies comparable to or exceeding state-of-the-art TQS while using substantially fewer variational parameters. This study demonstrates that reinterpreting the deep network structure as a latent cooling process enables a more physically grounded, systematic, and compact design, thereby bridging the gap between black-box expressivity and physically transparent construction.
💡 Research Summary
The paper revisits neural quantum states (NQS) from a physical perspective, interpreting them as neural approximations to a latent imaginary‑time evolution (LITE) process rather than as opaque machine‑learning models. In this view, the standard transformer‑based NQS (TQS) architecture implicitly defines a sequence of effective Hamiltonians that vary with layer depth, which corresponds to a time‑dependent Hamiltonian in the latent space. This layer‑wise variation is physically redundant because the true imaginary‑time cooling of a quantum system is governed by a single, time‑independent Hamiltonian. Consequently, conventional TQS are heavily over‑parameterized: the number of variational parameters grows linearly with the number of layers while providing no systematic accuracy gain.
To address this, the authors propose Physics‑Inspired Transformer Quantum States (PITQS). They enforce a static effective Hamiltonian by sharing all weights across layers, thereby reducing the parameter count by roughly a factor of the number of layers. The latent evolution operator is then realized as $\hat U_\theta(\beta)=e^{-\beta \hat H_\theta}$ with a single Hamiltonian $\hat H_\theta = -(\hat V_\theta + \hat K_\theta)$, where $\hat K_\theta$ and $\hat V_\theta$ are the non‑local (attention) and local (feed‑forward) components, respectively.
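The parameter saving from weight sharing can be illustrated with a minimal sketch. This is our own toy illustration, not the paper's architecture: `latent_step` is a hypothetical stand-in for one transformer block, with `W_attn` and `W_ffn` standing in for the non-local (attention) and local (feed-forward) parts.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n_layers = 16, 6  # latent dimension and depth (toy values)


def make_layer_params(d):
    # one weight set per "latent step" (stand-ins, not the real blocks)
    return {"W_attn": rng.standard_normal((d, d)) * 0.1,
            "W_ffn": rng.standard_normal((d, d)) * 0.1}


def latent_step(x, p):
    # non-local mixing followed by a local update, applied residually
    x = x + np.tanh(x @ p["W_attn"])
    return x + np.tanh(x @ p["W_ffn"])


x = rng.standard_normal((4, d))  # four latent tokens

# TQS-style: independent parameters per layer -> count grows with depth
tqs_params = [make_layer_params(d) for _ in range(n_layers)]
y_tqs = x
for p in tqs_params:
    y_tqs = latent_step(y_tqs, p)

# PITQS-style: one shared set applied repeatedly, like (e^{-dtau H})^L
shared = make_layer_params(d)
y_pitqs = x
for _ in range(n_layers):
    y_pitqs = latent_step(y_pitqs, shared)

count = lambda ps: sum(w.size for p in ps for w in p.values())
print(count(tqs_params), count([shared]))  # 3072 512
```

The shared variant keeps a depth-independent parameter count, mirroring how a single static $\hat H_\theta$ generates the whole evolution.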
Beyond weight sharing, PITQS improves the accuracy of the imaginary‑time propagation by employing higher‑order Trotter‑Suzuki decompositions (second‑order Strang, fourth‑order Suzuki, and fourth‑order Blanes‑Moan schemes). These decompositions reduce the local Trotter error from $\mathcal{O}(\Delta\tau^2)$ to $\mathcal{O}(\Delta\tau^{m+1})$ without introducing additional variational parameters, because the coefficients of the decomposition are fixed analytically.
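The accuracy gain from a higher-order splitting can be checked numerically. Below is a small sketch of our own (not the paper's code) comparing the first-order Lie-Trotter product with the second-order Strang splitting for $e^{-\beta(\hat K+\hat V)}$, using two random Hermitian matrices as stand-ins for the latent operators:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)


def random_hermitian(n):
    a = rng.standard_normal((n, n))
    return (a + a.T) / 2


K, V = random_hermitian(4), random_hermitian(4)
beta, n_steps = 1.0, 16
dt = beta / n_steps

exact = expm(-beta * (K + V))

# first-order Lie-Trotter step: e^{-dt K} e^{-dt V}
lie = np.linalg.matrix_power(expm(-dt * K) @ expm(-dt * V), n_steps)

# second-order Strang step: e^{-dt K/2} e^{-dt V} e^{-dt K/2}
strang_step = expm(-dt * K / 2) @ expm(-dt * V) @ expm(-dt * K / 2)
strang = np.linalg.matrix_power(strang_step, n_steps)

err_lie = np.linalg.norm(lie - exact)
err_strang = np.linalg.norm(strang - exact)
# the symmetric Strang product should be noticeably more accurate
print(err_lie, err_strang)
```

The fourth-order Suzuki and Blanes-Moan schemes used in the paper extend the same idea with fixed analytic coefficients, so none of the splittings add variational parameters.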
The authors benchmark PITQS on the frustrated $J_1$–$J_2$ Heisenberg model on a $10\times10$ square lattice with periodic boundary conditions and $J_2/J_1=0.5$. They compare several PITQS variants (Lie–Trotter, Strang, Suzuki, Blanes–Moan) against standard TQS baselines at total imaginary times $\beta=0.5$ and $\beta=2.0$. With a constant parameter budget of 44,890, PITQS achieves energies comparable to or better than TQS that uses up to 303,260 parameters. For example, at $\beta=2.0$ PITQS with a fourth‑order Suzuki decomposition reaches an energy per site of $-0.49697$, surpassing the best TQS result of $-0.49652$ obtained with 155,620 parameters. Moreover, increasing $\beta$ (i.e., adding more layers) improves PITQS accuracy without increasing the number of parameters, whereas TQS shows no systematic improvement despite a larger parameter count.
The paper also clarifies the role of latent tokens: the encoder maps a spin configuration to a set of latent tokens, providing auxiliary degrees of freedom analogous to auxiliary fields in AF‑QMC, but learned end‑to‑end within the variational Monte Carlo (VMC) framework. This design retains sign‑problem‑free VMC while allowing expressive non‑local correlations.
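The encoder idea can be sketched in a few lines. This is a hypothetical illustration of our own (names like `W_embed` are not from the paper): a spin configuration is mapped to a small set of latent tokens, each mixing information from all spins, analogous to auxiliary degrees of freedom.

```python
import numpy as np

rng = np.random.default_rng(2)
n_spins, n_tokens, d = 8, 4, 16  # toy sizes

spins = rng.choice([-1, 1], size=n_spins)           # one spin configuration
W_embed = rng.standard_normal((n_tokens, n_spins)) * 0.1
b = rng.standard_normal((n_tokens, d)) * 0.1

# each latent token is a learned, non-local function of the whole configuration
tokens = np.tanh((W_embed @ spins)[:, None] + b)    # shape (n_tokens, d)
print(tokens.shape)  # (4, 16)
```

In the VMC setting these tokens are produced deterministically from the sampled configuration, so the sampling itself remains sign-problem-free.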
In summary, the work makes three key contributions: (1) it identifies the physical redundancy in conventional TQS arising from layer‑dependent effective Hamiltonians; (2) it introduces a unified LITE framework that casts NQS as a transparent imaginary‑time cooling process; (3) it demonstrates that weight sharing and higher‑order Trotter‑Suzuki decompositions yield a compact, physically motivated ansatz (PITQS) that matches or exceeds state‑of‑the‑art TQS accuracy with far fewer variational parameters. This bridges the gap between black‑box neural network expressivity and physically grounded quantum state construction, offering a systematic pathway for designing efficient, interpretable NQS architectures.