Linear-cost unbiased posterior estimates for crossed effects and matrix factorization models via couplings
We design and analyze unbiased Markov chain Monte Carlo (MCMC) schemes based on couplings of blocked Gibbs samplers (BGSs), whose total computational costs scale linearly with the number of parameters and data points. Our methodology is designed for and applicable to high-dimensional BGS with conditionally independent blocks, which are often encountered in Bayesian modeling. We provide bounds on the expected number of iterations needed for coalescence for Gaussian targets, as well as on the tails of the coalescence times distribution. These imply that practical two-step coupling strategies achieve coalescence times that match the relaxation times of the original BGS scheme up to logarithmic factors. To illustrate the practical relevance of our methodology, we apply it to high-dimensional crossed random effect and probabilistic matrix factorization models, for which we develop a novel BGS scheme with improved convergence speed. Our methodology provides unbiased posterior estimates at linear cost (usually requiring only a few BGS iterations for problems with thousands of parameters), matching state-of-the-art procedures for both frequentist and Bayesian estimation of those models.
💡 Research Summary
The paper develops and analyzes unbiased Markov chain Monte Carlo (MCMC) estimators that exploit couplings of blocked Gibbs samplers (BGS) while preserving a linear computational cost in both the number of parameters and data points. The authors focus on high‑dimensional Bayesian models where the parameter vector can be partitioned into conditionally independent blocks—a structure common in crossed random‑effects models, generalized linear mixed models (GLMMs) with crossed effects, and probabilistic matrix factorization (PMF).
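To make the "conditionally independent blocks" structure concrete, the sketch below implements one linear-cost blocked Gibbs sweep for a simple two-factor crossed random effects model, y_n ~ N(mu + a[row(n)] + b[col(n)], 1/tau_e), with Gaussian priors on the effects. This is a generic illustration of the kind of BGS the paper targets, not the improved scheme developed in the paper; the model parameterization and all names are illustrative assumptions.

```python
import numpy as np

def blocked_gibbs_sweep(y, rows, cols, a, b, mu, tau_a, tau_b, tau_e, rng):
    """One blocked Gibbs sweep for y[n] ~ N(mu + a[rows[n]] + b[cols[n]], 1/tau_e),
    with priors a_i ~ N(0, 1/tau_a) and b_j ~ N(0, 1/tau_b) (illustrative model).
    Given (mu, b) the a_i are conditionally independent, so the whole 'a' block
    is sampled jointly in O(N) time; the 'b' block is handled symmetrically."""
    # Block update of a | b, mu: per-coordinate Gaussian full conditionals.
    resid = y - mu - b[cols]
    prec_a = tau_a + tau_e * np.bincount(rows, minlength=a.size)
    mean_a = tau_e * np.bincount(rows, weights=resid, minlength=a.size) / prec_a
    a = mean_a + rng.standard_normal(a.size) / np.sqrt(prec_a)
    # Block update of b | a, mu.
    resid = y - mu - a[rows]
    prec_b = tau_b + tau_e * np.bincount(cols, minlength=b.size)
    mean_b = tau_e * np.bincount(cols, weights=resid, minlength=b.size) / prec_b
    b = mean_b + rng.standard_normal(b.size) / np.sqrt(prec_b)
    return a, b
```

Here `rows` and `cols` are integer arrays giving the factor level of each observation; the cost per sweep is dominated by a few passes over the data, hence linear in the number of observations and parameters.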
The methodological core is a two‑step coupling strategy. First, a maximal coupling is used to align the two chains at the block level, maximizing the probability that the corresponding block updates are identical. Second, because each block’s conditional distribution is Gaussian (or can be approximated as such), a Wasserstein‑2 optimal coupling is employed to jointly sample the two updates, guaranteeing the smallest expected squared distance between the coupled states. This construction yields a coupled kernel \(\bar{P}\) that satisfies the standard meeting‑time requirements: the chains meet almost surely and stay together thereafter.
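For univariate Gaussian conditionals, the two coupling primitives invoked above can be written down explicitly, as in the sketch below: a standard maximal coupling, which returns identical draws with the largest possible probability, and the Wasserstein‑2 optimal (monotone) coupling, which keeps the two draws as close as possible in mean squared distance. How these primitives are scheduled within the two‑step strategy, and their multivariate block versions, are specified in the paper; the univariate code here is only an illustrative assumption.

```python
import numpy as np
from scipy.stats import norm

def maximal_coupling_normal(mu1, s1, mu2, s2, rng):
    """Sample (x, y) with x ~ N(mu1, s1^2), y ~ N(mu2, s2^2) and
    P(x == y) equal to the overlap of the two densities (maximal coupling)."""
    x = rng.normal(mu1, s1)
    if rng.uniform() * norm.pdf(x, mu1, s1) <= norm.pdf(x, mu2, s2):
        return x, x          # coupled draws coincide: local coalescence
    while True:              # otherwise sample y from the residual measure
        y = rng.normal(mu2, s2)
        if rng.uniform() * norm.pdf(y, mu2, s2) > norm.pdf(y, mu1, s1):
            return x, y

def w2_coupling_normal(mu1, s1, mu2, s2, rng):
    """Wasserstein-2 optimal (monotone) coupling of two univariate Gaussians:
    the same standard normal draw is pushed through both location-scale maps."""
    z = rng.standard_normal()
    return mu1 + s1 * z, mu2 + s2 * z
```

The W2 coupling contracts the distance between the two chains; once they are close, the maximal coupling has a high probability of making the corresponding blocks exactly equal, which is what triggers coalescence.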
The authors derive rigorous bounds for Gaussian target distributions. They prove that the expected meeting time \(\mathbb{E}[\tau]\) of the coupled chains matches the relaxation time of the original BGS up to logarithmic factors, and they also bound the tails of the meeting‑time distribution. As a consequence, only a small number of coupled BGS sweeps is typically needed before coalescence, so unbiased posterior estimates are obtained at a total cost that remains linear in the number of parameters and data points.
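For context, coupling-based unbiased MCMC schemes of this kind typically debias ergodic averages with the time-averaged estimator of Jacob, O'Leary and Atchadé (2020); whether the paper uses exactly this form is an assumption, but it shows why a short meeting time \(\tau\) translates into a cheap bias correction:

\[
H_{k:m}(X,Y) \;=\; \frac{1}{m-k+1}\sum_{l=k}^{m} h(X_l) \;+\; \sum_{l=k+1}^{\tau-1} \min\!\Bigl(1, \tfrac{l-k}{m-k+1}\Bigr)\,\bigl(h(X_l) - h(Y_{l-1})\bigr),
\]

where \((X_t, Y_t)\) is the lag-one coupled pair of chains, \(\tau\) their meeting time, and \(\mathbb{E}[H_{k:m}] = \mathbb{E}_\pi[h]\) under standard conditions. When \(\tau\) is of the order of the BGS relaxation time, the bias-correction sum is short with high probability, so the estimator's cost stays close to that of the underlying sampler.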