Plug-and-Play Diffusion Meets ADMM: Dual-Variable Coupling for Robust Medical Image Reconstruction
Plug-and-Play diffusion prior (PnPDP) frameworks have emerged as a powerful paradigm for solving imaging inverse problems by treating pretrained generative models as modular priors. However, we identify a critical flaw in prevailing PnP solvers (e.g., based on HQS or Proximal Gradient): they function as memoryless operators, updating estimates solely based on instantaneous gradients. This lack of historical tracking inevitably leads to non-vanishing steady-state bias, where the reconstruction fails to strictly satisfy physical measurements under heavy corruption. To resolve this, we propose Dual-Coupled PnP Diffusion, which restores the classical dual variable to provide integral feedback, theoretically guaranteeing asymptotic convergence to the exact data manifold. However, this rigorous geometric coupling introduces a secondary challenge: the accumulated dual residuals exhibit spectrally colored, structured artifacts that violate the Additive White Gaussian Noise (AWGN) assumption of diffusion priors, causing severe hallucinations. To bridge this gap, we introduce Spectral Homogenization (SH), a frequency-domain adaptation mechanism that modulates these structured residuals into statistically compliant pseudo-AWGN inputs. This effectively aligns the solver’s rigorous optimization trajectory with the denoiser’s valid statistical manifold. Extensive experiments on CT and MRI reconstruction demonstrate that our approach resolves the bias-hallucination trade-off, achieving state-of-the-art fidelity with significantly accelerated convergence.
💡 Research Summary
This paper addresses two critical shortcomings of existing Plug‑and‑Play Diffusion Prior (PnP‑DP) methods for medical image reconstruction: (1) a steady‑state bias caused by memory‑less updates that only use the instantaneous data‑fidelity gradient, and (2) severe hallucinations that arise when the dual variable accumulated by ADMM introduces structured, spectrally colored residuals that violate the Additive White Gaussian Noise (AWGN) assumption of pretrained diffusion models.
The authors first identify that most PnP‑DP solvers based on Half‑Quadratic Splitting (HQS) or Proximal Gradient effectively set the scaled dual variable (u) to zero, thereby discarding the integral action that would correct accumulated constraint violations. From a control‑theoretic perspective, such proportional‑only controllers cannot eliminate steady‑state error, especially under heavy measurement corruption or ill‑conditioned forward operators, leading to reconstructions that compromise physical fidelity.
To eliminate this bias, the paper re‑introduces the ADMM dual variable (u) into the iterative loop. The dual update (u^{k+1}=u^{k}+ (x^{k+1}-z^{k+1})) acts as an integrator, guaranteeing asymptotic consensus between the primal variables (x) (data‑consistent estimate) and (z) (prior‑guided estimate). Theoretically, with a positive penalty parameter (\rho) and appropriate step‑size scheduling, the algorithm converges to the exact solution of the original variational problem.
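The integral action described above can be sketched on a toy problem. This is a minimal illustration, not the authors' implementation: it assumes a quadratic prior whose closed-form proximal step stands in for the diffusion denoiser, and the names `run_splitting`, `lam`, `rho`, and `use_dual` are hypothetical. With `use_dual=False` the loop keeps (u=0), as in the HQS-style solvers; with `use_dual=True` it applies the dual update (u^{k+1}=u^{k}+(x^{k+1}-z^{k+1})).

```python
import numpy as np


def run_splitting(A, y, lam, rho, n_iter, use_dual):
    """Toy splitting loop for min_x 0.5*||Ax - y||^2 + 0.5*lam*||x||^2.

    Assumption: the quadratic prior is only a stand-in for a learned denoiser.
    Returns the data-consistent estimate x and the prior-guided estimate z.
    """
    n = A.shape[1]
    x = np.zeros(n)
    z = np.zeros(n)
    u = np.zeros(n)  # scaled dual variable; stays zero when use_dual is False
    lhs = A.T @ A + rho * np.eye(n)
    Aty = A.T @ y
    for _ in range(n_iter):
        # x-update: data-fidelity step pulled toward z - u
        x = np.linalg.solve(lhs, Aty + rho * (z - u))
        # dual-shifted input that would be passed to the denoiser
        v = x + u
        # z-update: proximal step of the quadratic stand-in prior
        z = rho * v / (rho + lam)
        if use_dual:
            # dual update: accumulates the consensus violation x - z
            u = u + x - z
    return x, z


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((20, 10))
    y = rng.standard_normal(20)
    lam, rho = 0.5, 1.0
    x_ref = np.linalg.solve(A.T @ A + lam * np.eye(10), A.T @ y)
    for use_dual in (False, True):
        x, z = run_splitting(A, y, lam, rho, 300, use_dual)
        print(use_dual, np.linalg.norm(x - z), np.linalg.norm(x - x_ref))
```

In this toy setting the dual-free loop settles at a point where (x) and (z) disagree, with an effective regularization weight of (\rho\lambda/(\rho+\lambda)) instead of (\lambda). The dual-coupled loop drives (x-z) toward zero and recovers the closed-form solution of the original problem.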
However, feeding the dual‑shifted input (v^{k+1}=x^{k+1}+u^{k}) directly into a diffusion denoiser (D_\sigma) is problematic because (u^{k}) contains structured artifacts (e.g., streaks in CT, coherent aliasing in MRI) that are far from the i.i.d. Gaussian noise distribution assumed during diffusion model training. This distributional mismatch pushes the denoiser into an out‑of‑distribution regime, causing it to interpret the structured residuals as semantic content and produce hallucinated features.
The core contribution is a Spectral Homogenization (SH) module that bridges this statistical gap. SH operates in three stages: (1) Diagnosis – it estimates the power spectral density (PSD) of the residual (r^{k+1}=v^{k+1}-z^{k}) using a smoothed Fourier magnitude; (2) Synthesis – it computes the spectral deficit (\Delta S(\omega)=\max\{\epsilon,\ \sigma^2/H_W - \hat S_r(\omega)\}) and generates a complementary noise field (\xi^{k+1}) whose amplitude matches this deficit while its phase is taken from a standard white Gaussian sample, ensuring orthogonality to the structured components; (3) Fusion – it adds (\xi^{k+1}) to the dual‑shifted input, yielding (\tilde v^{k+1}=v^{k+1}+\xi^{k+1}).
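The three SH stages can be sketched in NumPy as follows. This is a hedged illustration under stated assumptions, not the paper's code: it reads (\sigma^2/H_W) as (\sigma^2) divided by the pixel count (H \cdot W), which is the white-noise level under a forward-normalized FFT; it uses a small periodic box filter as the PSD smoother; and it takes the square root of the deficit as the Fourier amplitude. The names `spectral_homogenize`, `box_smooth_periodic`, `radius`, and `eps` are hypothetical.

```python
import numpy as np


def box_smooth_periodic(a, radius=1):
    """Average over a (2*radius+1)^2 window with wrap-around indexing."""
    out = np.zeros_like(a)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            out += np.roll(a, shift=(dy, dx), axis=(0, 1))
    return out / (2 * radius + 1) ** 2


def spectral_homogenize(v, z_prev, sigma, eps=1e-12, radius=1, rng=None):
    """Return (v_tilde, xi, deficit) for a 2-D real image v.

    Assumption: PSD is |FFT|^2 with norm="forward", so white noise of
    variance sigma^2 has an expected PSD of sigma^2 / (H * W).
    """
    rng = np.random.default_rng() if rng is None else rng
    H, W = v.shape

    # (1) Diagnosis: smoothed PSD estimate of the residual r = v - z_prev
    r = v - z_prev
    R = np.fft.fft2(r, norm="forward")
    psd = box_smooth_periodic(np.abs(R) ** 2, radius)

    # (2) Synthesis: spectral deficit relative to the flat white-noise level,
    # floored at eps, paired with the phase of a white Gaussian sample
    deficit = np.maximum(eps, sigma**2 / (H * W) - psd)
    G = np.fft.fft2(rng.standard_normal((H, W)))
    phase = G / np.maximum(np.abs(G), 1e-30)
    xi = np.fft.ifft2(np.sqrt(deficit) * phase, norm="forward").real

    # (3) Fusion: add the complementary noise to the dual-shifted input
    return v + xi, xi, deficit
```

Because the residual is real and the smoothing window is symmetric, the deficit is symmetric in frequency, so the synthesized field is real up to rounding and taking `.real` discards only numerical noise. When the residual is zero, the synthesized field carries mean power (\sigma^2) under these conventions; where the residual already dominates a frequency band, the deficit falls to the floor (\epsilon).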
Proposition 3.1 proves that, under the assumption of uniformly distributed phases and independence between the residual and the synthetic noise, the expected PSD of the effective noise (n_{\text{eff}}=r+\xi) becomes flat: (\mathbb{E}