A Recursive Theory of Variational State Estimation: The Dynamic Programming Approach

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

In this article, variational state estimation is examined from the dynamic programming perspective. This leads to two different value functional recursions depending on whether backward or forward dynamic programming is employed. The result is a theory of variational state estimation that corresponds to the classical theory of Bayesian state estimation. More specifically, in the backward method, the value functional corresponds to a likelihood that is upper bounded by the state likelihood from the Bayesian backward recursion. In the forward method, the value functional corresponds to an unnormalized density that is upper bounded by the unnormalized filtering density. Both methods can be combined to arrive at a variational two-filter formula. Additionally, it is noted that optimal variational filtering is generally of quadratic time-complexity in the sequence length. This motivates the notion of sub-optimal variational filtering, which also lower bounds the evidence but is of linear time-complexity. Another problem is the fact that the value functional recursions are generally intractable. This is briefly discussed and a simple approximation is suggested that retrieves the filter proposed by Courts et al. (2021). The methodology is examined in (i) a jump Gauss-Markov system under a certain factored Markov process approximation, and (ii) in a Gauss-Markov model with log-polynomial likelihoods under a Gauss–Markov constraint on the variational approximation. It is demonstrated that the value functional recursions are tractable in both cases. The resulting estimators are examined in simulation studies and are found to be of adequate quality in comparison to sensible baselines.


💡 Research Summary

This paper develops a recursive theory of variational state estimation by casting the problem into a dynamic‑programming framework. Starting from the standard Bayesian formulation of hidden‑Markov models, the authors introduce a variational family of distributions that factorise in time (forward Markov factorisation) and define the evidence lower bound (ELBO) as a functional of the unnormalised posterior and the variational distribution. By decomposing the ELBO into a sum of a “local” term and a future‑dependent term, they recognise a Bellman‑type optimal‑control problem: the time‑marginal of the variational posterior plays the role of a state, while the transition kernels act as controls.
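The ELBO's defining property — that it lower-bounds the log evidence for *any* variational distribution over state paths — can be checked numerically by exact enumeration in a small discrete model. The following is a minimal sketch (toy two-state HMM with made-up parameters, not the paper's setting) using a forward-Markov factorised variational family:

```python
import itertools
import math

# Toy 2-state HMM over T = 3 steps: small enough that every state path
# can be enumerated, so the ELBO and the log-evidence compare exactly.
T, S = 3, 2
p0 = [0.6, 0.4]                  # initial state distribution
A = [[0.7, 0.3], [0.2, 0.8]]     # A[i][j] = p(x_t = j | x_{t-1} = i)
B = [[0.9, 0.1], [0.3, 0.7]]     # B[i][y] = p(y | x = i)
y = [0, 1, 1]                    # observed sequence

def log_joint(x):
    """log p(x_{0:T-1}, y_{0:T-1}) for one state path x."""
    lp = math.log(p0[x[0]]) + math.log(B[x[0]][y[0]])
    for t in range(1, T):
        lp += math.log(A[x[t - 1]][x[t]]) + math.log(B[x[t]][y[t]])
    return lp

paths = list(itertools.product(range(S), repeat=T))
log_evidence = math.log(sum(math.exp(log_joint(x)) for x in paths))

# A deliberately crude forward-Markov variational family:
# q(x) = q0(x_0) * prod_t Q(x_t | x_{t-1}), with fixed (hypothetical) tables.
q0 = [0.5, 0.5]
Q = [[0.6, 0.4], [0.4, 0.6]]

def log_q(x):
    lq = math.log(q0[x[0]])
    for t in range(1, T):
        lq += math.log(Q[x[t - 1]][x[t]])
    return lq

# ELBO = E_q[log p(x, y) - log q(x)], computed by exact enumeration.
elbo = sum(math.exp(log_q(x)) * (log_joint(x) - log_q(x)) for x in paths)

print(f"log evidence = {log_evidence:.4f}, ELBO = {elbo:.4f}")
assert elbo <= log_evidence   # the ELBO never exceeds the log evidence
```

Because this crude `q` is not the true posterior, the gap between the two printed numbers is exactly the KL divergence the variational optimisation tries to close.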

Two distinct value‑function recursions are derived. The backward (or “variational backward”) value functional βₜ is obtained by maximising the future‑dependent part of the ELBO with respect to the conditional variational kernels qₜ₊₁:T|t. The authors prove that βₜ is upper‑bounded by the Bayesian backward likelihood hₜ₊₁:T|t, establishing a variational analogue of the classic smoothing recursion (Theorem 3, Corollary 2). Conversely, the forward (or “variational forward”) value functional αₜ is obtained by maximising the past‑dependent part of the ELBO with respect to the preceding kernels q₀:t‑1|t. αₜ is shown to be upper‑bounded by the unnormalised Bayesian filtering density π̄₀:t (Theorem 4, Corollary 3).
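The Bayesian quantities that upper-bound the two value functionals are the familiar forward and backward recursions of hidden-Markov inference. In a discrete-state model (where the integrals become sums), both are a few lines; the sketch below uses hypothetical toy numbers and checks that the two passes recover the same evidence:

```python
# Discrete-state sketch of the two classical recursions that bound the
# variational value functionals:
#   backward likelihood      h_t(i)    = p(y_{t+1:T-1} | x_t = i)
#   unnormalised filter      abar_t(i) = p(x_t = i, y_{0:t})
# Toy parameters; not the paper's variational recursions.
S = 2
p0 = [0.6, 0.4]
A = [[0.7, 0.3], [0.2, 0.8]]     # A[i][j] = p(x_t = j | x_{t-1} = i)
B = [[0.9, 0.1], [0.3, 0.7]]     # B[i][y] = p(y | x = i)
y = [0, 1, 1, 0]
T = len(y)

# Backward recursion: h_{T-1} = 1, then
# h_t(i) = sum_j A[i][j] * B[j][y_{t+1}] * h_{t+1}(j).
h = [[1.0] * S for _ in range(T)]
for t in range(T - 2, -1, -1):
    for i in range(S):
        h[t][i] = sum(A[i][j] * B[j][y[t + 1]] * h[t + 1][j] for j in range(S))

# Forward recursion: abar_0(i) = p0(i) * B[i][y_0], then
# abar_t(j) = B[j][y_t] * sum_i abar_{t-1}(i) * A[i][j].
abar = [[0.0] * S for _ in range(T)]
for i in range(S):
    abar[0][i] = p0[i] * B[i][y[0]]
for t in range(1, T):
    for j in range(S):
        abar[t][j] = B[j][y[t]] * sum(abar[t - 1][i] * A[i][j] for i in range(S))

# Both passes recover the same evidence p(y_{0:T-1}).
ev_forward = sum(abar[T - 1])
ev_backward = sum(p0[i] * B[i][y[0]] * h[0][i] for i in range(S))
print(ev_forward, ev_backward)
assert abs(ev_forward - ev_backward) < 1e-12
```

The paper's βₜ and αₜ are the variational counterparts of `h` and `abar`, and the theorems cited above state that they never exceed these Bayesian quantities pointwise.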

By combining the two value functions, the authors recover a variational version of the two‑filter formula: the product αₜβₜ, after normalisation, yields the marginal of the variational posterior that is optimal in relative‑entropy sense (Corollary 5). This mirrors the classical forward‑backward smoothing identity but now holds for any variational family that respects the Markov factorisation.
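The classical identity being mirrored here is that the smoothing marginal is proportional to the product of the forward and backward quantities, p(xₜ | y₀:T) ∝ ᾱₜ·hₜ. A self-contained discrete sketch (toy numbers, verified against brute-force path enumeration):

```python
import itertools

# Classical two-filter identity in a discrete toy HMM:
#   p(x_t | y_{0:T-1})  ∝  abar_t(x_t) * h_t(x_t).
# The variational formula in the text mirrors this with the value
# functionals alpha_t, beta_t in place of abar_t, h_t.
S = 2
p0 = [0.6, 0.4]
A = [[0.7, 0.3], [0.2, 0.8]]
B = [[0.9, 0.1], [0.3, 0.7]]
y = [0, 1, 1, 0]
T = len(y)

# Forward pass (unnormalised filter) and backward pass (likelihood).
abar = [[p0[i] * B[i][y[0]] for i in range(S)]]
for t in range(1, T):
    abar.append([B[j][y[t]] * sum(abar[t - 1][i] * A[i][j] for i in range(S))
                 for j in range(S)])
h = [[1.0] * S]
for t in range(T - 2, -1, -1):
    h.insert(0, [sum(A[i][j] * B[j][y[t + 1]] * h[0][j] for j in range(S))
                 for i in range(S)])

def smooth(t):
    """Smoothing marginal via the two-filter product, normalised."""
    w = [abar[t][i] * h[t][i] for i in range(S)]
    z = sum(w)
    return [wi / z for wi in w]

def smooth_brute(t):
    """Same marginal by summing the joint over all state paths."""
    w = [0.0] * S
    for x in itertools.product(range(S), repeat=T):
        p = p0[x[0]] * B[x[0]][y[0]]
        for s in range(1, T):
            p *= A[x[s - 1]][x[s]] * B[x[s]][y[s]]
        w[x[t]] += p
    z = sum(w)
    return [wi / z for wi in w]

for t in range(T):
    assert all(abs(a - b) < 1e-12 for a, b in zip(smooth(t), smooth_brute(t)))
```

Corollary 5's contribution is that the analogous product αₜβₜ, after normalisation, gives the relative-entropy-optimal marginal within the chosen variational family, not just in the exact Bayesian case sketched here.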

Complexity analysis reveals that the exact optimal variational filter requires O(T²) operations because each time step must consider the full future (or past) contribution. To obtain a practical algorithm, the paper introduces a sub‑optimal linear‑time variant that either discards the backward value function (using only αₜ) or the forward one (using only βₜ). This linear‑time filter still provides a valid lower bound on the evidence and is therefore guaranteed not to over‑estimate the marginal likelihood.

The authors acknowledge that the value‑function recursions are generally intractable due to high‑dimensional integrals. They propose a simple approximation that treats αₜ as an unnormalised density and applies a Gaussian variational assumption, which exactly reproduces the filter of Courts et al. (2021). This connection demonstrates that the proposed framework subsumes existing assumed‑density methods and provides a principled justification for them.

Two case studies illustrate the theory. In the first, a jump‑Gaussian‑Markov system is approximated by assuming independence between the jump process and the continuous state, leading to tractable forward and backward recursions for each component. In the second, a Gaussian‑Markov model with log‑polynomial observation likelihoods is constrained to a Gaussian variational family; under this constraint both αₜ and βₜ remain Gaussian, yielding closed‑form updates. Simulation experiments on both models compare the optimal variational filter, the linear‑time sub‑optimal filter, and standard Bayesian filters (Kalman, particle). Results show that the optimal variational filter matches Bayesian performance closely, while the linear‑time variant incurs only modest degradation yet runs in real‑time.
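For context on the baselines, the Kalman filter the experiments compare against is the exact Bayesian filter in the linear-Gaussian special case. A minimal scalar sketch (made-up parameters `a`, `Qv`, `Rv`, not the paper's simulation setup), cross-checked against brute-force Bayes on a grid:

```python
import math

# Scalar Kalman filter for  x_t = a * x_{t-1} + w,  y_t = x_t + v,
# with w ~ N(0, Qv) and v ~ N(0, Rv) -- the standard Bayesian baseline
# in linear-Gaussian models (toy parameters, not the paper's).
a, Qv, Rv = 0.9, 0.5, 1.0
m, P = 0.0, 1.0                  # prior mean / variance for x_0
ys = [1.2, 0.4, -0.3, 0.8]

means = []
for k, yk in enumerate(ys):
    if k > 0:                    # predict step
        m, P = a * m, a * a * P + Qv
    K = P / (P + Rv)             # update step (Kalman gain)
    m, P = m + K * (yk - m), (1.0 - K) * P
    means.append(m)

def npdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# Cross-check the first update against numerical Bayes on a dense grid.
grid = [i * 0.01 - 10.0 for i in range(2001)]
w = [npdf(x, 0.0, 1.0) * npdf(ys[0], x, Rv) for x in grid]
m_grid = sum(x * wi for x, wi in zip(grid, w)) / sum(w)
assert abs(means[0] - m_grid) < 1e-3
```

In the log-polynomial-likelihood case study the exact posterior is no longer Gaussian, which is precisely why the Gauss–Markov-constrained variational recursions (and particle filters) enter as the relevant comparisons.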

In summary, the paper establishes a rigorous bridge between variational inference and classical Bayesian state estimation via dynamic programming. It provides both theoretical insights (value‑function bounds, two‑filter identity) and practical algorithms (optimal O(T²) and linear‑time O(T) filters). The work opens avenues for extending variational filtering to non‑Markov families, high‑dimensional systems with sparsity structures, and joint optimisation of control and estimation in reinforcement‑learning settings.

