Bayesian inference for the automultinomial model with an application to landcover data
Multicategory lattice data arise in a wide variety of disciplines such as image analysis, biology, and forestry. We consider modeling such data with the automultinomial model, which can be viewed as a natural extension of the autologistic model to multicategory responses, or equivalently as an extension of the Potts model that incorporates covariate information into a pure-intercept model. The automultinomial model has the advantage of having a single parameter that controls the spatial correlation. However, the model's likelihood involves an intractable normalizing function of the model parameters that poses serious computational problems for likelihood-based inference. We address this difficulty by performing Bayesian inference through the double Metropolis-Hastings algorithm, and we implement diagnostics to assess convergence to the target posterior distribution. Through simulation studies and an application to land cover data, we find that the automultinomial model is flexible across a wide range of spatial correlations while maintaining a relatively simple specification. For large data sets we find it also has advantages over spatial generalized linear mixed models. To make this model practical for scientists, we provide recommendations for its specification and computational implementation.
💡 Research Summary
This paper introduces the automultinomial model as a natural extension of the autologistic and Potts models for multicategory lattice data, and develops a Bayesian inference framework that overcomes the intractable normalizing constant inherent in such models. The automultinomial model incorporates covariate information through category‑specific regression coefficients while retaining a single spatial dependence parameter β that controls the strength of interaction between neighboring sites. The joint probability mass function is proportional to the exponential of a linear predictor plus β times the count of neighboring pairs sharing the same category, but the normalizing constant Z(Θ,β) requires summation over all possible lattice configurations, rendering direct maximum‑likelihood estimation infeasible.
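The verbal description of the probability mass function above can be written out as follows. This is a sketch rather than the paper's exact parameterization: the symbols x_i (category at site i), c_i (covariate vector at site i), θ_k (coefficient vector for category k, the rows of Θ), K (number of categories), n (number of sites), and i ∼ j (neighboring sites) are notation assumed here for illustration.

```latex
p(x \mid \Theta, \beta)
  = \frac{1}{Z(\Theta,\beta)}
    \exp\Big( \sum_{i} c_i^{\top}\theta_{x_i}
      + \beta \sum_{i \sim j} \mathbf{1}\{x_i = x_j\} \Big),
\qquad
Z(\Theta,\beta)
  = \sum_{x' \in \{1,\dots,K\}^{n}}
    \exp\Big( \sum_{i} c_i^{\top}\theta_{x'_i}
      + \beta \sum_{i \sim j} \mathbf{1}\{x'_i = x'_j\} \Big).
```

The sum defining Z(Θ,β) has K^n terms, so even a modest lattice makes it impossible to evaluate directly, which is why the likelihood cannot be computed for given parameter values.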
To perform Bayesian inference, the authors adopt the Double-Metropolis-Hastings (DMH) algorithm, an auxiliary-variable MCMC technique. DMH introduces an auxiliary lattice configuration z and proceeds in two nested steps: an outer Metropolis-Hastings update proposes new values for the parameters (Θ,β), and an inner Metropolis-Hastings chain of length m, run at the proposed parameters, generates z. The final state of the inner chain is treated as an approximate draw from the model at the proposed parameters, standing in for the exact draw that an exact auxiliary-variable sampler would require; with this auxiliary draw, the intractable normalizing constants cancel in the acceptance ratio. The choice of the inner-sampler length m is critical; the authors explore values ranging from 5 to 20 and employ the Approximate Curvature Diagnostic (ACD) of Kang et al. (2024) to assess whether the resulting Markov chain adequately approximates the target posterior distribution.
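The two nested steps described above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`log_h`, `gibbs_sweep`, `dmh`) are hypothetical, the lattice uses a 4-neighborhood with free boundaries, category 0 is a reference category with coefficients fixed at zero, the inner sampler is a checkerboard Gibbs sweep (a special case of Metropolis-Hastings) started from the observed lattice, the proposal is a symmetric Gaussian random walk, and all parameters receive independent normal priors.

```python
import numpy as np

def neighbor_counts(x, K):
    """For each site, count 4-neighbors in each category (free boundary)."""
    onehot = (x[..., None] == np.arange(K)).astype(float)
    cnt = np.zeros_like(onehot)
    cnt[1:] += onehot[:-1]
    cnt[:-1] += onehot[1:]
    cnt[:, 1:] += onehot[:, :-1]
    cnt[:, :-1] += onehot[:, 1:]
    return cnt

def log_h(x, C, theta, beta):
    """Unnormalized log-probability: covariate term + beta * matching pairs."""
    lin = np.einsum("ijp,kp->ijk", C, theta)          # linear predictor per category
    site = np.take_along_axis(lin, x[..., None], axis=-1).sum()
    pairs = (x[1:] == x[:-1]).sum() + (x[:, 1:] == x[:, :-1]).sum()
    return site + beta * pairs

def gibbs_sweep(x, C, theta, beta, rng):
    """One checkerboard Gibbs sweep; same-color sites are not neighbors."""
    K = theta.shape[0]
    lin = np.einsum("ijp,kp->ijk", C, theta)
    ii, jj = np.indices(x.shape)
    for color in (0, 1):
        logits = lin + beta * neighbor_counts(x, K)
        # Gumbel-max trick: argmax(logits + Gumbel noise) ~ softmax(logits)
        new = np.argmax(logits + rng.gumbel(size=logits.shape), axis=-1)
        x = np.where((ii + jj) % 2 == color, new, x)
    return x

def log_prior(theta, beta, sd):
    """Independent N(0, sd^2) priors (an assumption made for this sketch)."""
    return -0.5 * (np.sum(theta ** 2) + beta ** 2) / sd ** 2

def dmh(x_obs, C, K, n_iter, m, step, rng, prior_sd=10.0):
    """Double Metropolis-Hastings; returns draws of (theta[1:], beta)."""
    p = C.shape[-1]
    theta, beta = np.zeros((K, p)), 0.0
    draws = []
    for _ in range(n_iter):
        # Outer step: symmetric random-walk proposal (row 0 stays at zero).
        theta_new = theta.copy()
        theta_new[1:] += step * rng.normal(size=(K - 1, p))
        beta_new = beta + step * rng.normal()
        # Inner step: m sweeps at the proposed parameters, started at x_obs.
        z = x_obs.copy()
        for _ in range(m):
            z = gibbs_sweep(z, C, theta_new, beta_new, rng)
        # Acceptance ratio: Z(theta, beta) and Z(theta_new, beta_new) cancel.
        log_r = (log_prior(theta_new, beta_new, prior_sd)
                 - log_prior(theta, beta, prior_sd)
                 + log_h(x_obs, C, theta_new, beta_new) - log_h(x_obs, C, theta, beta)
                 + log_h(z, C, theta, beta) - log_h(z, C, theta_new, beta_new))
        if np.log(rng.uniform()) < log_r:
            theta, beta = theta_new, beta_new
        draws.append(np.append(theta[1:].ravel(), beta))
    return np.array(draws)
```

A call such as `dmh(x_obs, C, K=3, n_iter=5000, m=10, step=0.05, rng=np.random.default_rng(1))` returns an array with one row per iteration, where `x_obs` is an integer lattice of shape (rows, cols) and `C` has shape (rows, cols, p). Only unnormalized log-probabilities appear in `log_r`, which is the point of the auxiliary draw z; the cost is that a short inner chain makes z only an approximate draw, so the resulting posterior is itself approximate.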
Two simulation studies are conducted. In a weak‑spatial‑correlation scenario (β≈0.2), the DMH posterior means closely match those obtained from pseudolikelihood estimation, confirming that the algorithm does not introduce bias when the model is well‑behaved. In a strong‑correlation scenario (β≈1.5), the automultinomial model remains stable, whereas the pure‑intercept Potts model suffers from phase‑transition phenomena that lead to poor fit and identifiability issues. The simulations also demonstrate that covariate effects (θ_k) are estimated independently of β, facilitating straightforward interpretation.
The methodology is applied to a large land‑cover dataset covering the Asian continent, comprising over 100,000 lattice cells classified into five land‑cover types (forest, cropland, urban, water, other) and eight environmental covariates (elevation, soil type, climate variables, etc.). Posterior inference reveals meaningful estimates for the category proportions and a positive β, indicating substantial spatial clustering of similar land‑cover types. Predictive performance is evaluated via ten‑fold cross‑validation, yielding an average classification accuracy of 0.84, which exceeds the 0.78 achieved by a spatial generalized linear mixed model (SGLMM) fitted to the same data. Computationally, the SGLMM requires sampling of a high‑dimensional latent Gaussian random field (≈200,000 latent variables), leading to long runtimes and memory bottlenecks, whereas the automultinomial model, with only a few hundred parameters, converges in roughly one‑quarter of the time.
The authors conclude with practical recommendations for model specification and implementation: (1) use a 4‑ or 8‑neighborhood structure appropriate to the spatial resolution; (2) assign weakly informative normal or half‑normal priors to β and multivariate normal priors to the regression coefficients; (3) set the inner‑sampler length m between 10 and 20, adjusting adaptively based on ACD diagnostics; (4) monitor convergence using trace plots, Gelman‑Rubin R̂ statistics, and the ACD. These guidelines make the automultinomial model a viable, computationally efficient alternative for analysts dealing with large, multicategory lattice datasets, offering both interpretability and flexibility that are often lacking in hierarchical spatial models.
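Of the convergence checks listed in recommendation (4), the Gelman-Rubin statistic is simple to compute from several independent chains. The sketch below implements the classic (non-split) form for a single scalar parameter; the function name `gelman_rubin` is hypothetical, and the ACD is not reproduced here because the summary does not give enough detail to implement it.

```python
import numpy as np

def gelman_rubin(chains):
    """Classic Gelman-Rubin R-hat for one scalar parameter.

    chains: array of shape (M, N), M independent chains of N draws each.
    """
    chains = np.asarray(chains, dtype=float)
    n_draws = chains.shape[1]
    # W: average within-chain variance.
    within = chains.var(axis=1, ddof=1).mean()
    # B: between-chain variance of the chain means, scaled by N.
    between = n_draws * chains.mean(axis=1).var(ddof=1)
    # Pooled estimate of the posterior variance.
    var_hat = (n_draws - 1) / n_draws * within + between / n_draws
    return np.sqrt(var_hat / within)
```

Applied to, say, the β column of several DMH runs started from dispersed initial values, values close to 1 are consistent with the chains having mixed, while values well above 1 indicate that the chains still disagree. A value near 1 does not show that the DMH approximation itself is adequate, which is the separate question the ACD is meant to address.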