A joint diffusion approach to multi-modal inference in inertial confinement fusion

Notice: This research summary and analysis were generated automatically using AI. For authoritative details, consult the original arXiv paper.

A combination of physics-based simulation and experiments has been critical to achieving ignition in inertial confinement fusion (ICF). Simulation and experiment both produce a mixture of scalar and image outputs; however, only a subset of the simulated data is available experimentally. We introduce a generative framework, called JointDiff, which enables prediction of conditional simulation input and output distributions from partial, multi-modal observations. The model leverages joint diffusion to unify forward surrogate modeling, inverse inference, and output imputation in one architecture. We train our model on a large ensemble of three-dimensional Multi-Rocket Piston simulations and demonstrate high accuracy, statistical robustness, and transferability to experiments performed at the National Ignition Facility (NIF). This work establishes JointDiff as a flexible generative surrogate for multi-modal scientific tasks, with implications for understanding diagnostic constraints, aligning simulation to experiment, and accelerating ICF design.


💡 Research Summary

The paper introduces JointDiff, a joint diffusion generative framework designed to bridge the gap between high‑fidelity inertial confinement fusion (ICF) simulations and limited experimental diagnostics. Traditional multi‑modal models either embed heterogeneous data into a shared latent space or align pre‑trained representations, which can miss subtle inter‑modal dependencies. JointDiff directly learns the full joint distribution of scalar parameters and image‑based diagnostics (primary and down‑scattered neutron images) using a single diffusion process.

The architecture is a multi‑modal U‑Net: convolutional encoders/decoders handle image channels, while fully‑connected networks process scalar inputs and outputs. Random binary masks are applied to both inputs and outputs during training, enabling the model to learn three tasks simultaneously: (1) forward surrogate modeling (inputs → full outputs), (2) inverse inference (outputs → inputs), and (3) imputation of missing outputs given partial observations. Diffusion time is encoded with sinusoidal positional embeddings, and mask embeddings are concatenated at each convolutional layer, allowing the network to condition on whatever data are available at inference time.
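The masking scheme described above can be illustrated with a minimal numpy sketch. The scalar counts (28 inputs, 12 outputs) come from the paper; the independent Bernoulli mask-sampling distribution and the function names are assumptions for illustration, and the same idea extends to masking image channels:

```python
import numpy as np

N_INPUTS, N_OUTPUTS = 28, 12  # scalar counts from the RP dataset

def sample_conditioning_masks(rng, n_inputs=N_INPUTS, n_outputs=N_OUTPUTS):
    """Draw random binary masks; 1 = observed (conditioned on), 0 = to be generated.
    The actual mask distribution used in training is not specified here and is assumed."""
    m_in = rng.integers(0, 2, size=n_inputs)
    m_out = rng.integers(0, 2, size=n_outputs)
    return m_in, m_out

# The three tasks are special cases of the mask pattern:
forward = (np.ones(N_INPUTS), np.zeros(N_OUTPUTS))   # inputs observed -> generate outputs
inverse = (np.zeros(N_INPUTS), np.ones(N_OUTPUTS))   # outputs observed -> infer inputs
rng = np.random.default_rng(0)
m_in, m_out = sample_conditioning_masks(rng)          # imputation: arbitrary partial observations
```

At inference time, the same mask embedding mechanism lets a single trained network serve all three roles simply by setting the mask to match whatever data are in hand.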

Training data consist of 443,000 three‑dimensional Multi‑Rocket Piston (RP) simulations. Each sample includes 28 input scalars (e.g., initial pressure, adiabat, drive symmetry modes), 12 output scalars (e.g., total neutron yield, bang‑time, areal density ρR, residual kinetic energy), and three line‑of‑sight images, each with primary and down‑scattered neutron channels. This rich dataset provides the necessary diversity for the model to capture complex physical correlations.
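The shape of a single training sample can be sketched as follows. The scalar counts and the 3 views × 2 channels layout come from the paper; the 64×64 image resolution is an assumption made only for this illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# One hypothetical RP sample with random placeholder values.
sample = {
    "inputs":  rng.normal(size=28),            # 28 input scalars (pressure, adiabat, drive modes, ...)
    "outputs": rng.normal(size=12),            # 12 output scalars (yield, bang-time, rho-R, ...)
    "images":  rng.normal(size=(3, 2, 64, 64)) # 3 lines of sight x {primary, down-scattered} channels
}                                              # 64x64 resolution is assumed, not stated in the summary
```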

Evaluation shows that JointDiff achieves very high forward prediction performance: R² values range from 0.935 to 0.999 across scalars, and 92.8 % of ground‑truth scalar values fall within two standard deviations of the predicted distribution. Image generation is also accurate; mean absolute error is low, and visual inspection confirms that the model reproduces intensity profiles across a three‑order‑of‑magnitude yield range. Inverse modeling yields comparable accuracy, with R² between 0.967 and 0.999 and 93.7 % of true inputs captured within two sigma. Imputation experiments, where selected scalars or image channels are masked, demonstrate that the model can recover missing information as well as—or slightly better than—the forward surrogate, thanks to strong output‑output correlations.
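The two headline metrics above, R² and two-sigma coverage, can be computed as in this short sketch (standard definitions; the synthetic data below is only to exercise the functions, not from the paper):

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual variance / total variance."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def two_sigma_coverage(y_true, pred_mean, pred_std):
    """Fraction of ground-truth values within two std of the predicted distribution."""
    return float(np.mean(np.abs(y_true - pred_mean) <= 2.0 * pred_std))

# Synthetic check: Gaussian errors with a well-calibrated reported std
# should give roughly 95% two-sigma coverage.
rng = np.random.default_rng(2)
y = rng.normal(size=10_000)
pred_mean = y + 0.1 * rng.normal(size=y.size)
pred_std = np.full(y.size, 0.1)
cov = two_sigma_coverage(y, pred_mean, pred_std)
r2 = r_squared(y, pred_mean)
```

A coverage near the nominal 95% level, as in this synthetic case, is the signature of calibrated uncertainty; the reported 92.8% is modestly below that, suggesting slightly overconfident predictions.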

A detailed sensitivity analysis reveals which diagnostics most influence specific physical parameters. Removing view 3 dramatically degrades prediction of the l = 2, m = 1 symmetry mode, while removing down‑scattered images harms adiabat and P2 swing estimates, highlighting the diagnostic value of each view and channel. This insight can guide future diagnostic design and inform uncertainty quantification when experimental data are incomplete.
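This kind of leave-one-diagnostic-out sensitivity analysis can be sketched generically: re-run conditional prediction with each view masked in turn and record the error increase. The toy `predict` below is a stand-in for the model's masked conditional generation, and the scenario where one view carries most of the signal mirrors the view-3 / symmetry-mode finding:

```python
import numpy as np

def ablation_sensitivity(predict, views, y_true):
    """Drop one diagnostic view at a time; return the error increase per dropped view."""
    n = len(views)
    base_err = abs(predict(views, mask=np.ones(n)) - y_true)
    deltas = {}
    for v in range(n):
        mask = np.ones(n)
        mask[v] = 0.0
        deltas[v] = abs(predict(views, mask=mask) - y_true) - base_err
    return deltas

# Toy predictor: masked mean of the views. View index 2 dominates the target,
# mimicking a mode that is only visible along one line of sight.
views = np.array([0.1, 0.2, 3.0])
predict = lambda x, mask: float(np.sum(x * mask) / max(mask.sum(), 1.0))
sens = ablation_sensitivity(predict, views, y_true=float(np.mean(views)))
```

In this toy case, dropping view index 2 yields the largest error increase, identifying it as the most informative diagnostic for that target.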

Crucially, JointDiff transfers to real NIF experiments without any fine‑tuning. Using only the partial set of scalars and images available from a shot, the model reconstructs the missing observables with high fidelity, and round‑trip predictions (forward then inverse) remain self‑consistent. Discrepancies observed in certain scalars (e.g., systematic under‑prediction of bang‑time) are attributed to limitations of the underlying RP physics model rather than the generative framework itself.
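The round-trip self-consistency check mentioned above can be expressed as a simple residual: push observations through the forward mode, then the inverse mode, and measure how far you land from where you started. The toy invertible pair below merely stands in for the model's forward and inverse conditional samplers:

```python
import numpy as np

def round_trip_residual(forward, inverse, x):
    """Max absolute forward-then-inverse reconstruction error; small = self-consistent."""
    return float(np.abs(inverse(forward(x)) - x).max())

# Toy stand-ins for the model's forward and inverse modes (an exactly invertible pair).
forward = lambda x: 2.0 * x + 1.0
inverse = lambda y: (y - 1.0) / 2.0
res = round_trip_residual(forward, inverse, np.linspace(-1.0, 1.0, 5))
```

For the real model the residual is stochastic rather than exactly zero, but a small, stable value plays the same role: evidence that the forward and inverse modes describe one consistent joint distribution.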

Overall, JointDiff provides a unified, physics‑aware surrogate that can predict forward outcomes, infer hidden inputs, and impute missing diagnostics, all while delivering calibrated uncertainty estimates. Its ability to operate directly on partial multi‑modal data makes it a powerful tool for accelerating ICF design cycles, optimizing diagnostic suites, and deepening our understanding of the complex parameter space governing fusion implosions. Future work may extend the approach to incorporate additional diagnostic modalities, explore active learning for experimental planning, and refine the physics model to further close the simulation‑experiment gap.

