Probabilistic Interpolation of Sagittarius A*'s Multi-Wavelength Light Curves Using Diffusion Models
Understanding the variability of Sagittarius A* (Sgr A*) requires coordinated, multi-wavelength observations that span the electromagnetic spectrum. In this work, we focus on data from four key observatories: Chandra in the X-ray (2-8 keV), GRAVITY on the Very Large Telescope in the near-infrared (2.2 microns), Spitzer in the infrared (4.5 microns), and ALMA in the submillimeter (340 GHz). These multi-band observations are essential for probing the physics of accretion and emission near the black hole’s event horizon, yet they suffer from irregular sampling, band-dependent noise, and substantial data gaps. These limitations complicate efforts to robustly identify flares and measure cross-band time lags, key diagnostics of the physical processes driving variability. To address this challenge, we introduce a diffusion-based generative model for interpolating sparse, multivariate astrophysical time series. This represents the first application of score-based diffusion models to astronomical time series. We also present the first transformer-based model for light curve reconstruction that includes calibrated uncertainty estimates. The models are trained on simulated light curves constructed to match the statistical and observational characteristics of real Sgr A* data. These simulations capture correlated multi-band variability, realistic observation cadences, and wavelength-specific noise. We compare our models against a multi-output Gaussian Process. The diffusion model achieves superior accuracy and competitive calibration across both simulated and real datasets, demonstrating the promise of diffusion models for high-fidelity, uncertainty-aware reconstruction of multi-wavelength variability in Sgr A*.
💡 Research Summary
This paper tackles the longstanding challenge of reconstructing sparse, irregularly sampled, multi‑wavelength light curves of Sagittarius A* (Sgr A*), the supermassive black hole at the center of our Galaxy. The authors focus on four key observatories—Chandra (X‑ray, 2–8 keV), GRAVITY (near‑infrared, 2.2 µm), Spitzer (infrared, 4.5 µm), and ALMA (sub‑millimeter, 340 GHz)—which together provide a comprehensive view of Sgr A*’s variability but suffer from differing cadences, band‑specific noise, and substantial data gaps.
To create a controlled training environment, the team generates 16,350 synthetic 24‑hour light curves at 1‑minute cadence using a semi‑empirical radiative model that captures two correlated stochastic processes (a fast component and a slow component). These simulations reproduce the observed power spectra, flux distributions, and inter‑band time lags of Sgr A*. Realistic masking mimics the actual observational schedules: sub‑mm and NIR bands receive block‑wise sampling with temporal shifts, while IR and X‑ray bands undergo random subsampling. The dataset is split into training (60 %), validation (30 %), and test (10 %) subsets.
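The paper's masking code is not given here; the sketch below illustrates the two masking styles described above under simple assumptions (a 24-hour series at 1-minute cadence, block-wise observing windows with a temporal shift for the sub-mm/NIR bands, random subsampling for the IR/X-ray bands). The function names, block lengths, and keep fractions are illustrative, not the paper's values.

```python
import numpy as np

T = 1440  # 24 hours at 1-minute cadence

def blockwise_mask(T, block_len, shift):
    """Keep contiguous observing blocks separated by equal-length gaps,
    offset by a temporal shift (sub-mm / NIR style sampling).
    block_len and the gap structure are illustrative assumptions."""
    mask = np.zeros(T, dtype=bool)
    start = shift
    while start < T:
        mask[start:start + block_len] = True
        start += 2 * block_len  # one-block gap between observing windows
    return mask

def random_mask(T, keep_frac, rng):
    """Randomly subsample individual time stamps (IR / X-ray style)."""
    return rng.random(T) < keep_frac

rng = np.random.default_rng(0)
submm_mask = blockwise_mask(T, block_len=120, shift=int(rng.integers(0, 60)))
xray_mask = random_mask(T, keep_frac=0.3, rng=rng)
```

Masks of this kind, applied to the fully sampled synthetic light curves, yield (observation, target) pairs in which the model must reconstruct the masked-out portions of the signal.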
The methodological core introduces two novel probabilistic interpolation frameworks. First, a score‑based diffusion model progressively adds Gaussian noise to the observed, masked time series and learns to reverse this process conditioned on the mask and band information. This yields a full posterior distribution over the continuous signal, providing both mean reconstructions and calibrated uncertainty estimates at every time point. Second, a transformer architecture is adapted for multivariate irregular time series, incorporating a Bayesian calibration head to output predictive variances. Both models ingest the binary mask as an additional channel, allowing information sharing across wavelengths.
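One concrete way to realize the "mask as an additional channel" conditioning described above is to zero-fill the unobserved entries and stack the binary masks alongside the flux values, so the network sees both the data and where it is missing. This is a minimal sketch of that input construction, not the paper's actual implementation; the array layout and function name are assumptions.

```python
import numpy as np

def build_conditioning_input(flux, mask):
    """Stack zero-filled fluxes with their binary observation masks.

    flux : (bands, T) array, NaN at unobserved time stamps
    mask : (bands, T) boolean array, True where observed
    Returns a (2 * bands, T) array: flux channels first, then mask
    channels, ready to feed a score network or transformer encoder.
    """
    filled = np.where(mask, flux, 0.0)  # zero out unobserved entries
    return np.concatenate([filled, mask.astype(float)], axis=0)
```

Because every band's mask is visible to the model, observed points in one wavelength can inform the reconstruction of gaps in another.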
Performance is benchmarked against a multi‑output Gaussian Process (MOGP), a standard probabilistic baseline in astronomy. On the simulated test set, the diffusion model reduces mean‑squared error by roughly 15 % relative to the MOGP and achieves 92 % empirical coverage of the nominal 95 % confidence interval, indicating well‑calibrated uncertainties. The transformer model offers comparable accuracy with significantly lower computational cost, especially for long sequences. When applied to the real July 2019 campaign data, both models successfully recover known flares, reproduce the measured NIR‑to‑X‑ray lag, and generate sensible uncertainty bands over gaps, outperforming the MOGP in regions with sparse coverage.
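The coverage statistic quoted above can be computed directly: count the fraction of held-out true values that fall inside each model's nominal 95 % predictive interval. The helper below is a generic sketch assuming Gaussian predictive distributions (mean ± 1.96 σ), which is one common convention; the paper may define its intervals differently.

```python
import numpy as np

def empirical_coverage(y_true, y_mean, y_std, z=1.96):
    """Fraction of true values inside the nominal interval
    mean ± z * std of a Gaussian predictive distribution.
    z = 1.96 corresponds to a 95% interval."""
    lower = y_mean - z * y_std
    upper = y_mean + z * y_std
    return float(np.mean((y_true >= lower) & (y_true <= upper)))
```

A perfectly calibrated model would score 0.95 here; the reported 92 % indicates intervals that are only slightly too narrow.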
Key contributions include: (1) the first application of score‑based diffusion to astronomical time‑series interpolation, demonstrating high‑fidelity, uncertainty‑aware reconstructions; (2) a transformer‑based approach that combines efficient sequence modeling with calibrated predictive uncertainties; (3) a realistic simulation pipeline that mirrors multi‑band observational constraints, enabling rigorous training and evaluation. The authors acknowledge limitations: the current models are tuned to minute‑scale data and would require scaling for week‑ or month‑long variability studies; extending to additional bands (e.g., radio, gamma‑ray) will need new physical priors and masking schemes; and diffusion sampling remains computationally intensive for real‑time applications. Nonetheless, the work establishes diffusion and transformer models as powerful tools for multi‑wavelength variability analysis, with broad implications for black‑hole accretion physics, quasar variability studies, and time‑domain astronomy at large.