Annealed MAP
Maximum a Posteriori assignment (MAP) is the problem of finding the most probable instantiation of a set of variables given partial evidence on the remaining variables in a Bayesian network. MAP has been shown to be an NP-hard problem [22], even for constrained networks such as polytrees [18]. Hence, previous approaches often fail to yield any results for MAP problems in large, complex Bayesian networks. To address this problem, we propose the AnnealedMAP algorithm, a simulated-annealing-based MAP algorithm. AnnealedMAP simulates a non-homogeneous Markov chain whose invariant distribution is a probability density that concentrates itself on the modes of the target density. We tested this algorithm on several real Bayesian networks. The results show that, while maintaining good quality of the MAP solutions, the AnnealedMAP algorithm is also able to solve many problems that are beyond the reach of previous approaches.
💡 Research Summary
The paper addresses the Maximum a Posteriori (MAP) problem in Bayesian networks, which seeks the most probable joint assignment of a set of query variables X given partial evidence E. Unlike the Most Probable Explanation (MPE) problem, MAP involves both maximization (over X) and summation (over the remaining variables Y), making exact inference NP‑hard even for restricted structures such as polytrees. Existing approaches—including genetic algorithms, mini‑bucket approximation, local search, and branch‑and‑bound depth‑first search—suffer from exponential search spaces or memory blow‑up, or become ineffective on large, densely connected networks.
To overcome these limitations, the authors propose the AnnealedMAP algorithm, a hybrid of Simulated Annealing (SA) and Markov Chain Monte Carlo (MCMC) sampling. The key idea is to construct a non‑homogeneous Markov chain whose stationary distribution is a temperature‑scaled version of the target posterior:
p_i(X | E) ∝ p(X | E)^{1/T_i}
where the temperature T_i is gradually reduced toward zero. At T = 1 the chain reduces to ordinary Gibbs sampling; as T → 0 the distribution concentrates on the modes of p(X | E), i.e., the MAP solutions. The transition kernel selects a single variable x_j from X, keeps all other variables fixed, and samples a new value x*_j from the conditional distribution p^{1/T_i}(x_j | x_{-j}, E). The Metropolis‑Hastings acceptance probability simplifies to a ratio of two single‑variable posteriors, allowing the algorithm to operate with only local inference.
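The tempering and the simplified acceptance ratio can be checked numerically. The following sketch uses a hypothetical three-state conditional; the distribution `p` and both helper functions are illustrative, not taken from the paper:

```python
def tempered(p, T):
    """Anneal a discrete distribution: raise each entry to 1/T, renormalize."""
    q = [pi ** (1.0 / T) for pi in p]
    z = sum(q)
    return [qi / z for qi in q]

def acceptance(p, cur, cand, T):
    """Metropolis-Hastings acceptance when the candidate is proposed from
    the untempered conditional p but the target is p^{1/T}: the proposal
    terms cancel and the ratio simplifies to (p[cand]/p[cur])^(1/T - 1),
    i.e., a ratio of two single-variable posteriors."""
    return min(1.0, (p[cand] / p[cur]) ** (1.0 / T - 1.0))

p = [0.5, 0.3, 0.2]              # hypothetical conditional p(x_j | x_{-j}, E)
print(tempered(p, 1.0))          # T = 1: unchanged, i.e., plain Gibbs sampling
print(tempered(p, 0.1))          # T -> 0: mass concentrates on the mode
print(acceptance(p, 0, 1, 0.5))  # a move away from the mode is penalized
```

At T = 1 the exponent 1/T − 1 is zero, so every move is accepted and the kernel is an ordinary Gibbs step, matching the text above.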
Exact inference for the required conditionals is performed using a junction‑tree (clustering) algorithm when feasible; otherwise, approximate inference such as Loopy Belief Propagation is employed. The algorithm begins with a sequential initialization that sets each MAP variable to its most probable state conditioned on the evidence and previously initialized MAP variables, providing a good starting point for the annealing process.
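The sequential initialization described above can be sketched as follows. Here `conditional(var, fixed)` is a hypothetical stand-in for the exact (junction-tree) or approximate (loopy BP) inference call, and `toy_conditional` is an invented two-variable example:

```python
def sequential_init(map_vars, evidence, conditional):
    """Greedy initialization: fix each MAP variable, in order, to its most
    probable state given the evidence and the MAP variables already set.
    `conditional(var, fixed)` must return a dict state -> probability."""
    assignment = dict(evidence)
    for v in map_vars:
        dist = conditional(v, assignment)
        assignment[v] = max(dist, key=dist.get)
    return {v: assignment[v] for v in map_vars}

def toy_conditional(var, fixed):
    """Invented two-variable example: B prefers to match A's state."""
    if var == "A":
        return {0: 0.3, 1: 0.7}
    return {0: 0.8, 1: 0.2} if fixed.get("A") == 0 else {0: 0.2, 1: 0.8}

print(sequential_init(["A", "B"], {}, toy_conditional))  # {'A': 1, 'B': 1}
```

Because B is initialized after A, its most probable state is chosen conditioned on A's already-fixed value, which is what makes this a reasonable starting point for the chain.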
The annealing schedule is geometric cooling: T_{i+1} = α T_i with 0 < α < 1 (typically 0.8–0.99). To avoid premature convergence to local optima, the authors augment cooling with a reheating strategy based on the specific heat C_H(T) = σ^2(T)/T^2, where σ^2(T) is the variance of the cost (negative log‑posterior) at temperature T. When the specific heat peaks, the system is reheated to a temperature K·C_b + T(C_H^max), where C_b is the current best cost, T(C_H^max) is the temperature at which the specific heat reached its maximum, and K is a tunable factor. This dynamic adjustment helps the chain escape shallow basins and explore broader regions of the solution space.
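A minimal sketch of this schedule follows. The peak test (specific heat dropping after a rise) and the default value of K are simplifying assumptions of this sketch, not choices stated in the paper:

```python
import statistics

def specific_heat(costs, T):
    """C_H(T) = sigma^2(T) / T^2, with sigma^2(T) the variance of the
    negative log-posterior costs sampled at temperature T."""
    return statistics.pvariance(costs) / (T * T)

def next_temperature(T, alpha, C_now, C_prev, best_cost, T_prev, K=0.02):
    """Geometric cooling T <- alpha*T; when the specific heat has just
    passed a peak (crude test: C_now < C_prev), reheat to
    K * C_b + T(C_H^max), with C_b the current best cost."""
    if C_now < C_prev:                 # specific heat just peaked
        return K * best_cost + T_prev  # T_prev ~ temperature at the peak
    return alpha * T
```

For example, with costs of variance 1.0 sampled at T = 2, the specific heat is 1.0 / 4 = 0.25; while the specific heat keeps rising the schedule simply multiplies T by α.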
Algorithmic steps:
- Initialize X^(0), set T_0, i = 0.
- While stopping criteria are not met:
  - a. For each MAP variable x_j:
    - i. Sample a candidate x*_j from p(x_j | x_{-j}, E).
    - ii. Compute the acceptance probability min{1, (p(x*_j | x_{-j}, E) / p(x_j | x_{-j}, E))^{1/T_i − 1}}.
    - iii. Accept x*_j with that probability; otherwise keep x_j.
  - b. Record the best assignment found so far.
  - c. Update the temperature (T_{i+1} = α T_i, reheating on a specific-heat peak) and set i ← i + 1.
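The steps above can be sketched end to end on a toy discrete joint distribution. Evidence handling and reheating are omitted, and brute-force conditionals stand in for the junction-tree or loopy-BP inference calls:

```python
import random

def annealed_map(joint, var_states, T0=1.0, alpha=0.9, n_sweeps=200, seed=0):
    """Sketch of the AnnealedMAP loop. `joint` maps full assignment tuples
    to probabilities; variable i takes states from var_states[i].
    Conditionals are computed by brute force over `joint`, standing in
    for exact or approximate inference in a real network."""
    rng = random.Random(seed)
    n = len(var_states)
    x = tuple(rng.choice(var_states[i]) for i in range(n))
    best, best_p = x, joint[x]
    T = T0
    for _ in range(n_sweeps):
        for j in range(n):
            # untempered single-variable conditional p(x_j | x_{-j})
            weights = {s: joint[x[:j] + (s,) + x[j + 1:]] for s in var_states[j]}
            z = sum(weights.values())
            probs = {s: w / z for s, w in weights.items()}
            cand = rng.choices(list(probs), weights=list(probs.values()))[0]
            # Metropolis-Hastings correction toward the tempered target p^{1/T}
            a = min(1.0, (probs[cand] / probs[x[j]]) ** (1.0 / T - 1.0))
            if rng.random() < a:
                x = x[:j] + (cand,) + x[j + 1:]
        if joint[x] > best_p:
            best, best_p = x, joint[x]
        T = max(alpha * T, 1e-3)  # geometric cooling, floored for stability
    return best
```

On a joint whose mode is (1, 1), e.g. `{(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.6}`, the chain behaves like plain Gibbs sampling early on and concentrates on the mode as T falls, with the best assignment seen so far returned at the end.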