U-DAVI: Uncertainty-Aware Diffusion-Prior-Based Amortized Variational Inference for Image Reconstruction
Ill-posed imaging inverse problems remain challenging because the mapping from degraded observations to clean images is ambiguous. Diffusion-based generative priors have recently shown promise, but they typically rely on computationally intensive iterative sampling or per-instance optimization. Amortized variational inference frameworks address this inefficiency by learning a direct mapping from measurements to posteriors, enabling fast posterior sampling without optimizing a new posterior for every new set of measurements. However, they still struggle to reconstruct fine details and complex textures. To address this, we extend the amortized framework by injecting spatially adaptive, uncertainty-guided perturbations into the measurements during training, emphasizing learning in the most uncertain regions. Experiments on deblurring and super-resolution demonstrate that our method matches or surpasses previous diffusion-based approaches, delivering more realistic reconstructions without the computational cost of iterative refinement.
💡 Research Summary
U‑DAVI (Uncertainty‑Aware Diffusion‑Prior‑Based Amortized Variational Inference) tackles two persistent challenges in diffusion‑based image reconstruction: (i) the high computational cost of iterative sampling or per‑instance optimization, and (ii) the difficulty of faithfully restoring fine‑grained details in regions where the measurement provides little information. Building on the DA‑VI framework, which learns a single neural generator Iφ that maps a degraded measurement y and a random Gaussian code z to a posterior sample x̂₀ in one forward pass, the authors introduce a lightweight, training‑time uncertainty estimator and a curriculum‑style perturbation scheme that together focus the model’s capacity on the most ambiguous pixels.
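The one‑forward‑pass posterior sampling described above can be illustrated with a toy stand‑in for Iφ. The class name, dimensions, and the linear‑plus‑tanh architecture below are illustrative assumptions, not the paper's actual network:

```python
import numpy as np

class AmortizedGenerator:
    """Toy stand-in for the amortized generator I_phi (architecture is assumed).

    Maps a measurement y and a Gaussian code z to a posterior sample
    in a single forward pass; fresh z values yield fresh samples.
    """

    def __init__(self, dim, seed=0):
        rng = np.random.default_rng(seed)
        # single linear layer over the concatenated (y, z) input
        self.W = rng.standard_normal((dim, 2 * dim)) * 0.1

    def sample(self, y, z):
        h = np.concatenate([y, z])
        return np.tanh(self.W @ h)  # posterior sample x_hat_0

rng = np.random.default_rng(1)
gen = AmortizedGenerator(dim=8)
y = rng.standard_normal(8)
# drawing several posterior samples is just several cheap forward passes
samples = [gen.sample(y, rng.standard_normal(8)) for _ in range(3)]
```

Because each sample is one forward pass, exploring the posterior for a new measurement requires no per‑instance optimization, which is the efficiency the amortized framework trades on.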
The uncertainty estimator exploits temporal inconsistency: for each training image, a persistent reconstruction memory x̄ is maintained via an exponential moving average (EMA) of past generator outputs. The per‑pixel uncertainty u(p) is defined as the normalized L1 distance between the current output x̂₀ (scaled to
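A minimal sketch of the EMA memory, the temporal‑inconsistency uncertainty map, and its use to modulate measurement perturbations. The function names, EMA decay, max‑normalization, and Gaussian noise model are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def ema_update(x_bar, x_hat, beta=0.9):
    """Update the persistent reconstruction memory x_bar with the new output x_hat."""
    return beta * x_bar + (1.0 - beta) * x_hat

def pixel_uncertainty(x_hat, x_bar, eps=1e-8):
    """Per-pixel uncertainty: normalized L1 distance to the EMA memory, in [0, 1]."""
    d = np.abs(x_hat - x_bar)
    return d / (d.max() + eps)

def perturb_measurement(y, u, sigma=0.1, rng=None):
    """Spatially adaptive perturbation: noise amplitude grows with uncertainty u."""
    rng = np.random.default_rng() if rng is None else rng
    return y + sigma * u * rng.standard_normal(y.shape)
```

Applying an image‑space uncertainty map directly to y as above implicitly assumes y and x share a pixel grid (as in deblurring); for super‑resolution the map would need to be resampled to the measurement resolution.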