Split, Skip and Play: Variance-Reduced ProxSkip for Tomography Reconstruction is Extremely Fast

Notice: This research summary and analysis were automatically generated using AI technology. For full accuracy, please refer to the original arXiv source.

Many modern iterative solvers for large-scale tomographic reconstruction incur two major computational costs per iteration: expensive forward/adjoint projections to update the data-fidelity term, and costly proximal computations for the regulariser, often performed via inner iterations. This paper presents the first study of methods that couple randomised skipping of the proximal step with variance-reduced, subset-based optimisation of the data-fidelity term, simultaneously reducing both costs in challenging tomographic reconstruction tasks. We provide a series of experiments on both synthetic and real data, demonstrating striking speed-ups on the order of 5x–20x over the non-skipped counterparts that have so far been the standard approach for efficiently solving these problems. Our work lays the groundwork for broader adoption of these methods in inverse problems.


💡 Research Summary

The paper addresses the two dominant computational bottlenecks in large‑scale tomographic reconstruction: (i) the cost of forward and adjoint projections required to evaluate the data‑fidelity term, and (ii) the cost of proximal operations for the regulariser, which often involve inner iterative solvers. Classical composite optimisation methods such as ISTA and FISTA evaluate both the full gradient of the data‑fit term and the proximal operator of the regulariser at every iteration, leading to prohibitive runtimes for 3‑D or high‑resolution problems.
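To make the cost structure concrete, here is a minimal ISTA sketch for an L1-regularised least-squares problem (the paper itself uses TV regularisation, whose proximal operator requires inner iterations; soft-thresholding stands in here because it has a closed form). Every iteration performs one full forward/adjoint pass and one proximal step, which is exactly the per-iteration workload the paper seeks to reduce. All names and parameters below are illustrative, not taken from the paper's code.

```python
import numpy as np

def ista(A, b, lam, step, n_iter=100):
    """Minimal ISTA sketch for min_x 0.5*||Ax - b||^2 + lam*||x||_1.

    Illustrative only: in tomography, A @ x and A.T @ r are the
    expensive forward and adjoint projections, and the prox of a TV
    regulariser would itself need an inner iterative solver.
    """
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)              # full forward + adjoint pass
        z = x - step * grad                   # gradient step on the data fit
        # proximal step (soft-thresholding), applied at EVERY iteration
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return x
```

Both costly operations (full gradient and prox) occur once per iteration; ProxSkip and data splitting attack each of them separately.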

To mitigate these costs, the authors combine two recent ideas. The first is ProxSkip, a stochastic skipping scheme that applies the proximal operator only with probability p and uses a control variate h to keep the iterates stable. When p = 1, ProxSkip reduces to standard ISTA; for p < 1 the proximal step is performed less frequently, reducing the per‑iteration workload. The second idea is data splitting: the data‑fidelity term is expressed as a finite sum over N subsets (e.g., groups of projection angles). Stochastic gradients are computed on a single subset, which is cheap but noisy. To control the variance, the authors integrate variance‑reduction (VR) techniques—SAGA, SVRG, and Loopless SVRG—into the ProxSkip framework, yielding a family of algorithms they call ProxSAGASkip, ProxSVRGSkip, and ProxLSVRGSkip. All methods follow a unified template: compute an unbiased stochastic gradient Gₖ, decide whether to perform a proximal update via a Bernoulli trial θₖ ∼ Bernoulli(p), and update the control variate hₖ accordingly.
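The unified template described above can be sketched as follows. The code follows the standard ProxSkip update (gradient step corrected by the control variate h, a Bernoulli(p) trial deciding whether to apply the prox with an enlarged step γ/p, and an h-update after each prox); the `stoch_grad` callback stands in for any of the plain, SAGA, SVRG, or Loopless-SVRG gradient estimators. Function names and signatures are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def prox_skip_vr(stoch_grad, prox, x0, step, p, n_iter=200):
    """Sketch of the ProxSkip-style template with a pluggable gradient.

    stoch_grad(x, k) returns an unbiased (possibly variance-reduced)
    gradient estimate G_k; prox(v, t) is the proximal operator of the
    regulariser with step t. h is the control variate that keeps the
    iterates stable between the (rare) proximal updates.
    """
    x, h = x0.copy(), np.zeros_like(x0)
    for k in range(n_iter):
        G = stoch_grad(x, k)
        x_hat = x - step * (G - h)            # drift-corrected gradient step
        if rng.random() < p:                  # theta_k ~ Bernoulli(p)
            x = prox(x_hat - (step / p) * h, step / p)
            h = h + (p / step) * (x - x_hat)  # control-variate update
        else:
            x = x_hat                          # skip the expensive prox
    return x
```

With p = 1 the prox fires every iteration and the scheme reduces to (stochastic) ISTA, matching the special case noted above; for p < 1 the proximal cost is paid only on roughly a p-fraction of iterations.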

The experimental evaluation consists of two setups. (1) Real‑world chemical imaging data reconstructed with isotropic total variation (TV) regularisation. The authors vary the skipping probability p ∈ {0.01,0.05,0.1,0.3,0.5} and the number of subsets N ∈ {10,50,100,200,400}. They measure wall‑clock time required to reach a relative error of 10⁻⁵ or to complete a fixed budget of 200 data passes. Results show that ProxSVRGSkip with p = 0.05 and N = 100 attains the target accuracy roughly 3.1 times faster than its non‑skipped counterpart and up to 20 times faster than deterministic ISTA. Increasing N reduces the cost of each stochastic gradient but raises variance; the VR mechanisms keep convergence robust even for large N. (2) Simulated cylindrical foam phantom reconstructed in a plug‑and‑play (PnP) framework using the BM3D denoiser. All algorithms are given a strict 3‑minute CPU budget. Skipped variants (ProxSVRGSkip, ProxLSVRGSkip) achieve PSNR improvements of about 2 dB over non‑skipped versions, delivering higher SSIM within the same time budget.

The authors conclude that ProxSkip‑VR offers a powerful, flexible tool for inverse problems where both forward‑model evaluations and proximal steps are expensive. By decoupling the control of these two costs, practitioners can tune p and N to balance computational effort against stochastic variance, achieving substantial speed‑ups without sacrificing reconstruction quality. Importantly, convergence guarantees hold without requiring strong convexity, making the approach applicable to realistic noisy and non‑convex tomography settings. Future work is suggested in the direction of GPU acceleration, incorporation of non‑linear forward models, and integration with learned regularisers (e.g., deep priors) to further push towards real‑time tomographic imaging.

