Physics-Informed Neural Networks for Modeling Galactic Gravitational Potentials
We introduce a physics-informed neural framework for modeling static and time-dependent galactic gravitational potentials. The method combines data-driven learning with embedded physical constraints to capture complex, small-scale features while preserving global physical consistency. We quantify predictive uncertainty through a Bayesian framework and model time evolution with a neural ODE approach. Applied to mock systems of varying complexity, the model achieves sub-percent reconstruction errors ($0.14\%$ mean acceleration error) and improves dynamical consistency over analytic baselines. The approach complements existing analytic methods, allowing physics-informed baseline potentials to be combined with neural residual fields to obtain potential models that are both interpretable and accurate.
💡 Research Summary
This paper presents a novel framework that combines physics‑informed neural networks (PINNs), Bayesian inference, and neural ordinary differential equations (NODEs) to model both static and time‑dependent galactic gravitational potentials with high precision and physical consistency. The authors begin by highlighting the central role of the gravitational potential in linking observed stellar and gas motions to the underlying mass distribution, especially the dark matter component. Traditional analytic models (e.g., NFW halos) are interpretable but struggle with substructures and non‑axisymmetric features, while basis‑function expansions (BFEs) improve flexibility at the cost of many modes, potential unphysical terms, and over‑fitting noise.
To overcome these limitations, the authors design a six‑component model:

1. A physics‑informed loss that directly penalizes violations of the fundamental relation a = −∇ϕ, combining absolute and relative acceleration errors.
2. A compactified 5‑dimensional spherical coordinate system (two radial coordinates r_i, r_e and three angular coordinates s, t, u) that maps the infinite galactic domain to a bounded interval, improving numerical conditioning.
3. A radial‑scaling function n(x) that extracts the dominant large‑scale trend from the potential, allowing the network to learn only a scaled residual ϕ̃_NN.
4. Analytic fusing, where a known analytic baseline potential ϕ_AB (e.g., a low‑order BFE or NFW halo) is added to the learned residual, preserving interpretability while delegating fine‑grained structure to the neural network.
5. A Bayesian neural network (BNN) implementation in NumPyro, with truncated‑normal priors on analytic parameters and Gaussian priors on network weights, optimized via stochastic variational inference (SVI) with a diagonal Gaussian guide (AutoNormal). Training proceeds in two stages: first a narrow weight prior lets the analytic baseline dominate, then a relaxed prior captures the residuals, ensuring stable posterior evolution.
6. A NODE formulation for time dependence, where the scaled residual potential evolves as ϕ̃_NN(x, t) = ϕ̃_NN,0(x) + ∫₀ᵗ f_NN(x, τ) dτ, with f_NN learned by a separate network and integrated numerically via Gauss‑Legendre quadrature. This respects causality and yields a smooth temporal trajectory, unlike naïve concatenation of time as an input feature.
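The core physics‑informed loss can be sketched in a few lines. The code below is a simplified NumPy stand‑in (the paper uses JAX autodifferentiation, and the exact weighting between the absolute and relative error terms is not specified here, so `w_rel` is an assumption); the finite‑difference gradient substitutes for autodiff, and the toy check uses a Keplerian point‑mass potential:

```python
import numpy as np

def compactify(r, r_s=1.0):
    """Map r in [0, inf) to a bounded coordinate in [0, 1).
    Illustrative choice only; the paper's exact mapping may differ."""
    return r / (r + r_s)

def grad_phi(phi, x, eps=1e-5):
    """Central finite-difference gradient of a scalar potential phi
    at points x of shape (N, 3); stand-in for JAX autodiff."""
    g = np.zeros_like(x)
    for k in range(x.shape[1]):
        dx = np.zeros(x.shape[1]); dx[k] = eps
        g[:, k] = (phi(x + dx) - phi(x - dx)) / (2 * eps)
    return g

def pinn_loss(phi, x, a_true, w_rel=1.0, eps=1e-12):
    """Penalize violations of a = -grad(phi), mixing absolute and
    relative acceleration errors (the weight w_rel is an assumption)."""
    a_pred = -grad_phi(phi, x)
    err2 = np.sum((a_pred - a_true) ** 2, axis=1)
    abs_term = np.mean(err2)
    rel_term = np.mean(err2 / (np.sum(a_true ** 2, axis=1) + eps))
    return abs_term + w_rel * rel_term

# Toy check: Keplerian point-mass potential phi = -1/r,
# whose exact acceleration is a = -x / r^3.
phi = lambda x: -1.0 / np.linalg.norm(x, axis=-1)
x = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
r = np.linalg.norm(x, axis=1, keepdims=True)
a_true = -x / r**3
loss = pinn_loss(phi, x, a_true)  # near zero for the exact potential
```

For the exact potential the loss vanishes up to finite‑difference error, which is the behavior the physics‑informed term is designed to enforce on the learned residual.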
Implementation details: the network is built in JAX with Optax, trained using Adam (initial LR = 3×10⁻³, halved every 1 000 epochs). All experiments run on a standard Apple M2 CPU, with training times of a few minutes. Code and data are publicly released on GitHub and Zenodo.
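The stated schedule (initial LR 3×10⁻³, halved every 1 000 epochs) is a staircase exponential decay. A minimal sketch, with the hypothetical Optax equivalent noted in a comment:

```python
def step_decay_lr(epoch, init_lr=3e-3, drop=0.5, every=1000):
    """Learning rate halved every `every` epochs, matching the
    schedule described in the text.

    In Optax this would correspond to something like
    optax.exponential_decay(init_value=3e-3, transition_steps=1000,
                            decay_rate=0.5, staircase=True),
    though the paper's exact Optax configuration is not given here.
    """
    return init_lr * drop ** (epoch // every)
```

Halving by a power of two is exact in floating point, so the schedule is reproducible across platforms.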
Performance is evaluated on a suite of mock tests, the most demanding being a Milky Way–Large Magellanic Cloud (MW–LMC) system. For the static case, a 4‑layer network with 128 neurons per layer is trained for 10 000 epochs on 4 096 acceleration samples drawn from a density‑weighted rejection sampler. The model achieves a mean acceleration error of 0.14 % and reconstructs the LMC‑induced perturbation and residual substructures (e.g., bulge, nucleus) with sub‑kiloparsec accuracy. Dynamical consistency is assessed by integrating test‑particle orbits in the learned potential; the posterior‑mean orbit deviates by at most 1 kpc over 1 Gyr, a dramatic improvement over a near‑truth analytic LMC model that deviates by ~20 kpc. Energy drift along the reconstructed orbit remains below 0.2 % throughout.
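A density‑weighted rejection sampler of the kind used to draw the 4 096 training points can be sketched as follows. The Plummer‑like profile here is a hypothetical stand‑in for the paper's MW–LMC mock density; only the sampling pattern is the point:

```python
import numpy as np

rng = np.random.default_rng(0)

def density(r):
    """Toy spherical density (Plummer-like); a stand-in for the
    mock MW-LMC density used in the paper."""
    return (1.0 + r**2) ** (-2.5)

def rejection_sample_radii(n, r_max=10.0):
    """Draw radii with probability proportional to r^2 * rho(r)
    (the radial mass weight) via simple rejection sampling."""
    # Envelope: uniform proposals in [0, r_max], accepted against
    # the maximum of the weight on a fine grid.
    r_grid = np.linspace(0.0, r_max, 4096)
    w_max = (r_grid**2 * density(r_grid)).max()
    out = []
    while len(out) < n:
        r = rng.uniform(0.0, r_max, size=n)
        u = rng.uniform(0.0, w_max, size=n)
        out.extend(r[u < r**2 * density(r)].tolist())
    return np.array(out[:n])

samples = rejection_sample_radii(4096)
```

Weighting proposals by density concentrates training points where the potential carries the most structure, which is why the sampler is density‑weighted rather than uniform in volume.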
Bayesian inference recovers the LMC’s mass, scale radius, and Galactocentric distance to within 1.5 % of the true values, with all true parameters lying inside the 2‑σ posterior intervals. Posterior samples are generated by reconstructing the full potential (analytic baseline plus neural residual) and fitting a standard MW‑LMC parametric model while holding MW parameters fixed.
For the time‑dependent case, six temporal snapshots (each with 1 024 samples) are used, and the NODE learns the evolution backwards from the present. Reconstruction errors stay below 1 % across radii and epochs, with interpolation (within training times) more accurate than extrapolation.
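The NODE time integral ϕ̃_NN(x, t) = ϕ̃_NN,0(x) + ∫₀ᵗ f_NN(x, τ) dτ evaluated via Gauss–Legendre quadrature can be sketched directly; `f` below is a hypothetical stand‑in for the trained network f_NN, checked against a known analytic integral:

```python
import numpy as np

def integrate_node(f, x, t, phi0, order=16):
    """Evaluate phi(x, t) = phi0(x) + int_0^t f(x, tau) d tau with
    Gauss-Legendre quadrature, mirroring the NODE formulation.
    `f` stands in for the trained network f_NN."""
    nodes, weights = np.polynomial.legendre.leggauss(order)
    tau = 0.5 * t * (nodes + 1.0)          # map [-1, 1] -> [0, t]
    return phi0(x) + 0.5 * t * np.sum(weights * f(x, tau))

# Toy check with a known time derivative f = cos(tau), so the
# integral from 0 to t is sin(t).
phi0 = lambda x: -1.0 / np.linalg.norm(x)
f = lambda x, tau: np.cos(tau)
x = np.array([1.0, 0.0, 0.0])
val = integrate_node(f, x, 2.0, phi0)      # phi0(x) + sin(2)
```

Because the potential at time t is always the initial field plus an integrated rate of change, the trajectory is smooth in t by construction, which is the causality property the summary contrasts with naïvely feeding t as an extra input feature.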
A systematic study of network size shows that even shallow, narrow networks (e.g., 2 layers, 32 neurons) achieve sub‑percent MAE and sub‑kiloparsec mean orbit deviation (MOD), with training converging in ~3 minutes. Larger networks marginally improve accuracy but increase training time, confirming that physics‑informed constraints keep the model compact.
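The two headline metrics, mean acceleration error (MAE) and mean orbit deviation (MOD), reduce to simple norm statistics. A minimal sketch, assuming MAE is a norm‑wise relative error and MOD is the time‑averaged 3D separation between learned and true orbits (the paper's exact definitions may differ in detail):

```python
import numpy as np

def mean_acceleration_error(a_pred, a_true):
    """Mean relative acceleration error over N samples of shape (N, 3);
    the quantity quoted as 0.14% in the static MW-LMC test."""
    num = np.linalg.norm(a_pred - a_true, axis=1)
    den = np.linalg.norm(a_true, axis=1)
    return np.mean(num / den)

def mean_orbit_deviation(orbit_pred, orbit_true):
    """Mean over time of the 3D separation (e.g., in kpc) between the
    orbit integrated in the learned potential and the true orbit."""
    return np.mean(np.linalg.norm(orbit_pred - orbit_true, axis=1))

# Toy usage: a uniform 0.1% acceleration bias and a 1-unit orbit offset.
a_true = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
a_pred = 1.001 * a_true
mae = mean_acceleration_error(a_pred, a_true)       # ~0.001
orbit_true = np.zeros((5, 3))
orbit_pred = orbit_true + np.array([0.0, 0.0, 1.0])
mod = mean_orbit_deviation(orbit_pred, orbit_true)  # 1.0
```

Reporting both metrics matters: a small pointwise MAE does not by itself guarantee small orbit deviations, since acceleration errors accumulate along integrated trajectories.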
In the discussion, the authors acknowledge that the current tests use idealized, noise‑free mock data. Future work will involve applying the framework to realistic, noisy cosmological simulations and to actual observational data (e.g., Gaia stellar accelerations). They also suggest extending the loss to density‑based terms, since stellar density is more directly observable than acceleration, which would broaden applicability to surveys lacking precise acceleration measurements.
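The suggested density‑based extension would penalize violations of Poisson's equation, ∇²ϕ = 4πGρ, instead of (or alongside) the acceleration relation. A minimal sketch of such a loss term on a grid, using a finite‑difference Laplacian (this is an illustration of the proposed direction, not the authors' implementation):

```python
import numpy as np

def laplacian_fd(phi_grid, h):
    """Seven-point finite-difference Laplacian on a 3D grid with
    spacing h; returns values on the interior points only."""
    return (-6.0 * phi_grid[1:-1, 1:-1, 1:-1]
            + phi_grid[2:, 1:-1, 1:-1] + phi_grid[:-2, 1:-1, 1:-1]
            + phi_grid[1:-1, 2:, 1:-1] + phi_grid[1:-1, :-2, 1:-1]
            + phi_grid[1:-1, 1:-1, 2:] + phi_grid[1:-1, 1:-1, :-2]) / h**2

def poisson_residual(phi_grid, rho_grid, h, G=1.0):
    """Mean-squared residual of Poisson's equation nabla^2 phi = 4 pi G rho,
    a candidate density-based loss term."""
    res = laplacian_fd(phi_grid, h) - 4.0 * np.pi * G * rho_grid[1:-1, 1:-1, 1:-1]
    return np.mean(res**2)

# Toy check: phi = x^2 + y^2 + z^2 has nabla^2 phi = 6 everywhere,
# so the matching density is rho = 6 / (4 pi G).
xs = np.linspace(-1.0, 1.0, 9)
h = xs[1] - xs[0]
X, Y, Z = np.meshgrid(xs, xs, xs, indexing="ij")
phi = X**2 + Y**2 + Z**2
rho = np.full_like(phi, 6.0 / (4.0 * np.pi))
loss = poisson_residual(phi, rho, h)  # ~0 for this consistent pair
```

Because stellar density is more directly observable than acceleration, a term of this form would let the same framework train on surveys that lack precise acceleration measurements.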
Overall, the paper delivers a comprehensive, physically grounded machine‑learning pipeline that simultaneously offers interpretability (through analytic baselines), high‑fidelity reconstruction (via neural residuals), and quantified uncertainty (via Bayesian inference). Its successful application to the MW–LMC system demonstrates the potential to revolutionize galactic dynamics studies, enabling precise mapping of dark‑matter distributions and time‑varying gravitational fields in the era of large‑scale astrometric surveys.