A Random Matrix Theory Perspective on the Consistency of Diffusion Models

Notice: This research summary and analysis were automatically generated using AI technology. For full accuracy, please refer to the original arXiv source.

Diffusion models trained on different, non-overlapping subsets of a dataset often produce strikingly similar outputs when given the same noise seed. We trace this consistency to a simple linear effect: the shared Gaussian statistics across splits already predict much of the generated images. To formalize this, we develop a random matrix theory (RMT) framework that quantifies how finite datasets shape the expectation and variance of the learned denoiser and sampling map in the linear setting. For expectations, sampling variability acts as a renormalization of the noise level through a self-consistent relation $\sigma^2 \mapsto \kappa(\sigma^2)$, explaining why limited data overshrinks low-variance directions and pulls samples toward the dataset mean. For fluctuations, our variance formulas reveal three key factors behind cross-split disagreement: *anisotropy* across eigenmodes, *inhomogeneity* across inputs, and overall scaling with dataset size. Extending deterministic-equivalence tools to fractional matrix powers further allows us to analyze entire sampling trajectories. The theory sharply predicts the behavior of linear diffusion models, and we validate its predictions on UNet and DiT architectures in their non-memorization regime, identifying where and how samples deviate across training-data splits. This provides a principled baseline for reproducibility in diffusion training, linking spectral properties of data to the stability of generative outputs.


💡 Research Summary

This paper investigates a striking phenomenon observed in modern diffusion models: when two models are trained on disjoint subsets of the same dataset, they nevertheless produce almost identical images when fed the same random seed and sampled with a deterministic ODE solver. The authors argue that this “consistency” can already be explained by the shared first‑ and second‑order statistics of the data, i.e., the mean and covariance, and that a simple linear Gaussian denoiser captures the bulk of the effect.
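The linear Gaussian denoiser mentioned above has a closed form: for data with mean $\mu$ and covariance $\Sigma$ observed under isotropic noise of variance $\sigma^2$, the posterior mean is $\mu + \Sigma(\Sigma + \sigma^2 I)^{-1}(x - \mu)$. A minimal numpy sketch (the function name and toy dimensions are illustrative, not the paper's code):

```python
import numpy as np

def linear_gaussian_denoiser(x, mu, Sigma, sigma2):
    """Posterior-mean denoiser for x = clean + N(0, sigma2 * I),
    assuming the clean data is Gaussian with mean mu and covariance Sigma."""
    d = mu.shape[0]
    # E[clean | x] = mu + Sigma (Sigma + sigma2 I)^{-1} (x - mu)
    W = Sigma @ np.linalg.inv(Sigma + sigma2 * np.eye(d))
    return mu + W @ (x - mu)

# Toy check: at very high noise the denoiser collapses to the data mean,
# which is exactly the "pull toward the dataset mean" described above.
rng = np.random.default_rng(0)
d = 4
A = rng.standard_normal((d, d))
Sigma = A @ A.T          # a random symmetric positive-definite covariance
mu = np.ones(d)
x = rng.standard_normal(d)
print(np.allclose(linear_gaussian_denoiser(x, mu, Sigma, 1e8), mu, atol=1e-5))
```

Because the denoiser depends on the data only through $\mu$ and $\Sigma$, two disjoint training splits with similar first- and second-order statistics yield nearly the same map, which is the heart of the consistency argument.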

The theoretical contribution is a random matrix theory (RMT) analysis of how finite‑sample fluctuations in the empirical covariance matrix affect both the expected denoiser and its variability across independent training splits. The key technical tool is deterministic equivalence (DE), which allows a random empirical covariance $\hat\Sigma$ to be replaced, in the high‑dimensional limit $d, n \to \infty$ with $d/n \to \gamma$, by a deterministic surrogate involving the population covariance $\Sigma$ and a scalar "renormalized noise" function $\kappa(\lambda)$. The scalar satisfies a self‑consistent equation of the form
$$\kappa(\lambda) \;=\; \lambda \;+\; \gamma\,\kappa(\lambda)\,\frac{1}{d}\operatorname{tr}\!\left(\Sigma\,\bigl(\Sigma + \kappa(\lambda) I\bigr)^{-1}\right).$$
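Such self-consistent equations are typically solved by fixed-point iteration. A minimal sketch, assuming the normalization $\kappa(\lambda) = \lambda + \gamma\,\kappa\,\tfrac{1}{d}\operatorname{tr}(\Sigma(\Sigma + \kappa I)^{-1})$ with $\gamma = d/n$ (the paper's exact convention may differ; `solve_kappa` is a hypothetical helper):

```python
import numpy as np

def solve_kappa(lmbda, Sigma, gamma, iters=200):
    """Fixed-point iteration for the self-consistent renormalized noise:
        kappa = lmbda + gamma * kappa * (1/d) tr(Sigma (Sigma + kappa I)^{-1}).
    Since (1/d) tr(Sigma (Sigma + k I)^{-1}) = mean(eigs / (eigs + k)),
    only the eigenvalues of Sigma are needed."""
    eigs = np.linalg.eigvalsh(Sigma)
    k = lmbda  # initialize at the bare noise level
    for _ in range(iters):
        k = lmbda + gamma * k * np.mean(eigs / (eigs + k))
    return k

# Sanity check: gamma -> 0 (infinite data per dimension) gives no
# renormalization, i.e. kappa(lmbda) = lmbda.
Sigma = np.diag([1.0, 2.0, 3.0])
print(solve_kappa(0.5, Sigma, gamma=0.0))  # -> 0.5
```

For $\gamma > 0$ the iteration yields $\kappa(\lambda) > \lambda$: finite data inflates the effective noise level, which is the mechanism behind the overshrinking of low-variance directions described in the abstract.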

