Efficient Bayesian Inference for Generalized Bradley-Terry Models
The Bradley-Terry model is a popular approach to describing the probabilities of the possible outcomes when elements of a set are repeatedly compared with one another in pairs. It has found many applications, including animal behaviour, chess ranking and multiclass classification. Numerous extensions of the basic model have also been proposed in the literature, including models with ties, multiple comparisons, group comparisons and random graphs. From a computational point of view, Hunter (2004) has proposed efficient iterative MM (minorization-maximization) algorithms to perform maximum likelihood estimation for these generalized Bradley-Terry models, whereas Bayesian inference is typically performed using MCMC (Markov chain Monte Carlo) algorithms based on tailored Metropolis-Hastings (M-H) proposals. We show here that these MM algorithms can be reinterpreted as special instances of Expectation-Maximization (EM) algorithms associated with suitable sets of latent variables, and we propose some original extensions. These latent variables allow us to derive simple Gibbs samplers for Bayesian inference. We demonstrate experimentally the efficiency of these algorithms on a variety of applications.
💡 Research Summary
The paper addresses the problem of performing Bayesian inference for the Bradley‑Terry (BT) model and its many extensions, which are widely used for modeling pairwise comparison data in fields such as sports ranking, animal behavior, and multiclass classification. While Hunter (2004) introduced efficient minorization‑maximization (MM) algorithms for maximum‑likelihood (ML) and maximum‑a‑posteriori (MAP) estimation, Bayesian inference has traditionally relied on Metropolis‑Hastings (M‑H) Markov chain Monte Carlo (MCMC) schemes that require carefully crafted proposal distributions and can be slow to converge, especially in high‑dimensional settings.
The authors’ first major contribution is to reinterpret the MM updates as instances of the Expectation‑Maximization (EM) algorithm by introducing a set of latent variables. For the basic BT model, they define for each pair (i, j) a latent variable Z_{ij} that follows a Gamma distribution with shape n_{ij} (the number of comparisons between i and j) and rate λ_i + λ_j, where λ_i denotes the skill parameter of item i. This latent variable can be interpreted as the sum of the minimum arrival times in an exponential race between i and j, and it yields a complete‑data log‑likelihood that is linear in λ and Z. In the EM framework, the E‑step computes the expected value of Z_{ij} given the current λ, which is simply n_{ij}/(λ_i + λ_j), and the M‑step yields a closed‑form update for each λ_i:

λ_i ← w_i / Σ_{j≠i} n_{ij}/(λ_i + λ_j),

where w_i denotes the total number of wins of item i. This coincides exactly with Hunter’s (2004) MM update for the Bradley‑Terry model.
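The E- and M-steps above can be sketched in a few lines of NumPy. This is a minimal illustration of the basic EM (equivalently MM) iteration, not the authors' implementation; the function name, the fixed iteration count, and the normalization convention (skills summing to 1, since the BT model is invariant to rescaling) are choices made here for the example:

```python
import numpy as np

def bt_em(n, w, num_iters=100):
    """EM / MM iteration for the basic Bradley-Terry model.

    n : (K, K) symmetric array, n[i, j] = number of comparisons between i and j
    w : (K,) array, w[i] = total number of wins of item i
    Returns estimated skill parameters lambda, normalized to sum to 1.
    """
    K = len(w)
    lam = np.ones(K) / K
    for _ in range(num_iters):
        # E-step: E[Z_ij | lambda] = n_ij / (lambda_i + lambda_j)
        z = n / (lam[:, None] + lam[None, :])
        np.fill_diagonal(z, 0.0)  # no self-comparisons
        # M-step: lambda_i = w_i / sum_{j != i} E[Z_ij]
        lam = w / z.sum(axis=1)
        lam /= lam.sum()  # fix the scale (BT likelihood is scale-invariant)
    return lam
```

For instance, with two items compared 10 times where item 1 wins 7 of the comparisons, the normalized skills converge to (0.7, 0.3), matching the closed-form ML solution for the two-item case.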