Learning to Guide Local Search for MPE Inference in Probabilistic Graphical Models


Most Probable Explanation (MPE) inference in Probabilistic Graphical Models (PGMs) is a fundamental yet computationally challenging problem arising in domains such as diagnosis, planning, and structured prediction. In many practical settings, the graphical model remains fixed while inference must be performed repeatedly for varying evidence patterns. Stochastic Local Search (SLS) algorithms scale to large models but rely on a myopic best-improvement rule that prioritizes immediate likelihood gains, and they often stagnate in poor local optima. Heuristics such as Guided Local Search (GLS+) partially alleviate this limitation by modifying the search landscape, but their guidance cannot be reused effectively across multiple inference queries on the same model. We propose a neural amortization framework for improving local search in this repeated-query regime. Exploiting the fixed graph structure, we train an attention-based network to score local moves by predicting their ability to reduce Hamming distance to a near-optimal solution. Our approach integrates seamlessly with existing local search procedures, using this signal to balance short-term likelihood gains with long-term promise during neighbor selection. We provide theoretical intuition linking distance-reducing move selection to improved convergence behavior, and empirically demonstrate consistent improvements over SLS and GLS+ on challenging high-treewidth benchmarks in the amortized inference setting.


💡 Research Summary

The paper addresses the computationally demanding task of Most Probable Explanation (MPE) inference in probabilistic graphical models (PGMs) when the same model must be queried repeatedly with different evidence. Traditional stochastic local search (SLS) methods rely on a greedy best‑improvement rule that selects the neighbor offering the largest immediate increase in log‑likelihood. While this works well locally, it often becomes trapped in poor local optima, especially in high‑dimensional, high‑treewidth graphs. Guided Local Search (GLS+) adds adaptive penalties to discourage revisiting states, but these penalties are query‑specific and cannot be reused across multiple queries on the same fixed graph, limiting amortization of past experience.
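The greedy best-improvement rule described above can be sketched as follows. This is a minimal illustration, not the paper's implementation; the `score` and `domains` arguments are hypothetical stand-ins for the model's log-likelihood and variable domains.

```python
def best_improvement_step(x, score, domains):
    """One greedy step over the 1-flip neighborhood.

    x: dict mapping variable -> value (a complete assignment).
    score(x): log-likelihood of assignment x (illustrative callable).
    domains: dict mapping variable -> list of allowed values.
    Returns the (variable, value) move with the largest positive gain,
    or None if no neighbor improves the score (a local optimum).
    """
    best_gain, best_move = 0.0, None
    base = score(x)
    for v, dom in domains.items():
        for val in dom:
            if val == x[v]:
                continue
            y = dict(x)
            y[v] = val
            gain = score(y) - base
            if gain > best_gain:
                best_gain, best_move = gain, (v, val)
    return best_move

def local_search(x, score, domains, max_steps=1000):
    """Repeat best-improvement steps until stuck or out of budget."""
    for _ in range(max_steps):
        move = best_improvement_step(x, score, domains)
        if move is None:
            break  # trapped in a local optimum, as the paper notes
        x = dict(x)
        x[move[0]] = move[1]
    return x
```

The `if move is None: break` line is exactly where vanilla SLS stalls in poor local optima, which motivates both GLS+ penalties and the learned guidance.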

The authors propose an amortized look‑ahead framework that learns to guide neighbor selection using a neural network. The key insight is that, if the optimal assignment x* were known, the most efficient move from any state x would be a 1‑flip that reduces the Hamming distance d_H(x, x*) by one. Although x* is unavailable at inference time, a high‑quality approximate solution x̂ can be obtained offline using an anytime MPE solver (e.g., DAOOPT or Toulbar2) within a modest time budget. The authors generate training data by (1) sampling random complete assignments, (2) fixing a random subset of variables as evidence, (3) solving the resulting query with the anytime solver to obtain x̂, (4) running a standard local search for a limited number of steps to collect a diverse set of intermediate states, and (5) labeling each 1‑flip neighbor of a state as positive if it reduces the Hamming distance to x̂ and negative otherwise. This yields a supervised dataset of (state, neighbor, label) triples that reflects the distribution encountered during actual inference.
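The five-step pipeline can be sketched in code. The `solve_anytime` and `local_search_steps` callables are hypothetical stand-ins for the anytime solver and the budgeted local search; neither name comes from the paper, and binary variables are assumed for simplicity.

```python
import random

def generate_training_data(variables, solve_anytime, local_search_steps,
                           n_queries=100, evidence_frac=0.5):
    """Sketch of the data-generation pipeline (steps 1-5 in the text).

    variables: list of variable names (binary domains assumed here).
    solve_anytime(evidence) -> near-optimal reference assignment x_hat.
    local_search_steps(evidence) -> iterable of intermediate states.
    Both callables are illustrative assumptions, not a real API.
    """
    def hamming(a, b):
        return sum(a[v] != b[v] for v in a)

    data = []
    for _ in range(n_queries):
        # (1)-(2): sample an assignment, freeze a random subset as evidence
        evid_vars = random.sample(variables, int(evidence_frac * len(variables)))
        evidence = {v: random.choice([0, 1]) for v in evid_vars}
        # (3): solve the query offline to get a reference solution x_hat
        x_hat = solve_anytime(evidence)
        # (4): collect diverse intermediate states from a budgeted search
        for state in local_search_steps(evidence):
            d = hamming(state, x_hat)
            # (5): label each 1-flip neighbor by whether it reduces d_H
            for v in variables:
                if v in evidence:
                    continue
                neighbor = dict(state)
                neighbor[v] = 1 - neighbor[v]
                label = int(hamming(neighbor, x_hat) < d)
                data.append((state, v, label))
    return data
```

Labeling against x̂ rather than the exact optimum is what makes the pipeline practical: the anytime solver only needs to produce a good reference within its time budget.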

A neural scoring model processes the current assignment and each candidate flip through embeddings and a self‑attention mechanism that captures global graph structure. The model outputs a probability p̂↓(x, x′) estimating the likelihood that flipping variable x′ will move the current state closer to the reference solution. At test time, the search combines this learned score with the traditional log‑likelihood gain, either by weighted sum or by ranking solely on p̂↓. The resulting policy balances short‑term objective improvement with long‑term progress toward the optimum.
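The weighted-sum variant of neighbor selection can be sketched in a few lines. The signatures of `ll_gain` and `p_down`, and the single mixing weight, are assumptions for illustration rather than the paper's exact formulation.

```python
def select_neighbor(x, candidates, ll_gain, p_down, weight=0.5):
    """Pick the flip maximizing a blend of the two signals.

    ll_gain(x, v): log-likelihood change from flipping variable v.
    p_down(x, v):  learned probability that flipping v reduces the
                   Hamming distance to the reference solution.
    weight=1.0 recovers pure best-improvement; weight=0.0 ranks
    solely on the learned look-ahead score.
    """
    def combined(v):
        return weight * ll_gain(x, v) + (1 - weight) * p_down(x, v)
    return max(candidates, key=combined)
```

Exposing the weight as a knob makes the trade-off explicit: a high weight trusts immediate likelihood gains, a low weight trusts the learned long-term signal.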

Theoretical analysis provides a convergence guarantee. Theorem 1 shows that if, at every non‑terminal step, the probability α of selecting a distance‑reducing move exceeds ½, the expected hitting time to the optimum is bounded by h₀/(2α − 1), where h₀ is the initial Hamming distance. Thus, as long as the learned scorer ranks distance‑reducing moves above non‑reducing ones often enough (empirically α ≈ 0.68), the search converges in expected time linear in the initial Hamming distance.
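The bound can be checked on the idealized process it describes: a walk on the Hamming distance that drops by 1 with probability α and rises by 1 otherwise. This abstracts away the actual PGM and search; it only illustrates the h₀/(2α − 1) bound from Theorem 1.

```python
import random

def expected_hitting_time(h0, alpha, trials=20000, seed=0):
    """Monte-Carlo estimate of the mean hitting time of distance 0.

    Models each step as: distance decreases by 1 with probability
    alpha, otherwise increases by 1 (a biased random walk). For this
    idealized walk the bound h0 / (2*alpha - 1) is tight.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        h, steps = h0, 0
        while h > 0:
            h += -1 if rng.random() < alpha else 1
            steps += 1
        total += steps
    return total / trials
```

With the paper's empirical α ≈ 0.68 and h₀ = 5, the estimate lands close to the bound 5/(2 · 0.68 − 1) ≈ 13.9, since the drift argument behind Theorem 1 is exact for this simple walk.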

Empirical evaluation spans ten benchmark PGMs, including high‑treewidth Bayesian networks (Alarm, Barley, Linkage) and large Markov random fields (grid‑64, Munin). The authors compare three configurations: vanilla SLS, GLS+, and the proposed amortized look‑ahead (integrated with both SLS and GLS+). All methods run under identical time and iteration budgets. Results consistently demonstrate that the learned guidance improves final log‑likelihood by 2–5 % and reduces the number of iterations required for convergence by 30–45 %. The benefit is especially pronounced when the evidence ratio is high (i.e., fewer query variables), confirming that the model effectively reuses structural knowledge across queries.

In summary, the paper makes three major contributions: (1) a novel supervised learning formulation for neighbor selection in MPE local search that explicitly incorporates long‑term distance‑reduction objectives, (2) a practical data‑generation pipeline that leverages anytime solvers to obtain high‑quality supervision without requiring exact optima, and (3) theoretical and empirical evidence that the learned look‑ahead policy yields faster and higher‑quality MPE solutions while amortizing computation across repeated queries on a fixed graphical model. The work opens avenues for extending the approach to multi‑flip neighborhoods, hypergraph structures, and reinforcement‑learning‑based dynamic adaptation of the selection probability α.

