Generalized Information Gathering Under Dynamics Uncertainty

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the original arXiv source.

An agent operating in an unknown dynamical system must learn its dynamics from observations. Active information gathering accelerates this learning, but existing methods derive bespoke costs for specific modeling choices: dynamics models, belief update procedures, observation models, and planners. We present a unifying framework that decouples these choices from the information-gathering cost by explicitly exposing the causal dependencies between parameters, beliefs, and controls. Using this framework, we derive a general information-gathering cost based on Massey’s directed information that assumes only Markov dynamics with additive noise and is otherwise agnostic to modeling choices. We prove that the mutual information cost used in existing literature is a special case of our cost. Then, we leverage our framework to establish an explicit connection between the mutual information cost and information gain in linearized Bayesian estimation, thereby providing theoretical justification for mutual information-based active learning approaches. Finally, we illustrate the practical utility of our framework through experiments spanning linear, nonlinear, and multi-agent systems.


💡 Research Summary

The paper tackles the problem of active information gathering for agents that must learn unknown dynamics parameters while operating in a stochastic Markov system. Existing approaches typically derive bespoke information‑gathering costs (most often mutual information between parameters and observations) that are tightly coupled to specific choices of dynamics models, belief‑update mechanisms, observation models, and planners. This coupling forces researchers to re‑derive cost functions whenever any component changes, limiting scalability and generality.

To overcome this limitation, the authors propose a modular decision‑making framework that makes the causal dependencies among parameters θ, states x, observations o, controls u, and parameter beliefs ϑ explicit (see Figure 1). The framework assumes only that the system evolves according to Markov dynamics with additive Gaussian noise (eq. 1–2) and that observations are deterministic functions of the state (eq. 3). The learning process ℓ (Algorithm 1) propagates the true parameters and states forward, generates observations, and updates the belief using any admissible updater (gradient descent, Kalman filter, particle filter, etc.). Planning is formulated as an optimal‑control problem (eq. 7) that predicts future controls, states, observations, and beliefs using a possibly different belief‑dynamics model ℓ̂.
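To make the learning process ℓ concrete, the following is a minimal sketch (not the paper's implementation) for a scalar linear instance: the true dynamics x_{t+1} = θx_t + u_t + w with additive Gaussian noise w, a deterministic observation o = x, and a Gaussian belief over θ updated by a scalar Kalman/recursive-least-squares step. The system, noise level, initial belief, and excitation signal are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

theta_true = 0.8   # unknown dynamics parameter (illustrative)
sigma_w = 0.1      # std of the additive Gaussian process noise (assumed)

# Gaussian belief over theta: mean m, variance P (hypothetical prior)
m, P = 0.0, 1.0

x = 1.0
for t in range(50):
    u = 0.5 * np.sin(0.3 * t)  # placeholder excitation input
    # True Markov dynamics with additive noise (cf. eq. 1-2 in the summary)
    x_next = theta_true * x + u + sigma_w * rng.standard_normal()
    o = x_next                 # deterministic observation of the state (cf. eq. 3)
    # Belief update: scalar Kalman / recursive-least-squares step on theta,
    # treating (o - u) = theta * x + w as a linear measurement of theta
    H = x                          # regressor
    S = H * P * H + sigma_w**2     # innovation variance
    K = P * H / S                  # Kalman gain
    m = m + K * (o - u - H * m)    # posterior mean
    P = P - K * H * P              # posterior variance
    x = x_next

print(f"estimated theta = {m:.3f}, posterior variance = {P:.2e}")
```

Any admissible updater could replace the Kalman step here (gradient descent, particle filter, etc.); the loop structure — propagate, observe, update belief — is what Algorithm 1 fixes.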

The central contribution is a general information‑gathering cost based on Massey’s directed information. While mutual information I(θ; o) treats the two random sequences symmetrically, directed information I(ϑ̂ → ô ∥ u) respects temporal causality: future beliefs cannot influence past observations. Formally, directed information is the difference between the ordinary entropy of the observation sequence and its causally conditioned entropy (eq. 11–12). The proposed cost is

 J_info(ô, u, ϑ̂) = −I(ϑ̂ → ô ∥ u) = −[H(ô ∥ u) − H(ô ∥ ϑ̂, u)],

where H(ô ∥ u) is the entropy of the predicted observation sequence causally conditioned on the controls, and H(ô ∥ ϑ̂, u) additionally conditions causally on the predicted beliefs. Minimizing J_info therefore selects controls under which the belief trajectory is maximally informative about the observations.
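To make the asymmetry of directed information concrete, here is a self-contained numerical sketch (not from the paper) for a jointly Gaussian toy system in which X causally drives Y via Y_t = X_{t−1} + noise. It evaluates Massey's directed information I(X^n → Y^n) = Σ_t [h(Y_t | Y^{t−1}) − h(Y_t | Y^{t−1}, X^t)] using Gaussian conditional variances: the forward direction is large while the reverse is zero, whereas mutual information would be symmetric. Dimensions and noise level are illustrative assumptions.

```python
import numpy as np

def cond_var(Sigma, i, cond):
    """Conditional variance of variable i given index set cond (jointly Gaussian)."""
    if not cond:
        return Sigma[i, i]
    c = np.asarray(cond)
    S_cc = Sigma[np.ix_(c, c)]
    S_ic = Sigma[i, c]
    return Sigma[i, i] - S_ic @ np.linalg.solve(S_cc, S_ic)

def directed_info(Sigma, xs, ys):
    """Massey's directed information I(X^n -> Y^n) for jointly Gaussian variables.

    Uses I(X^n -> Y^n) = sum_t 0.5 * log( var(Y_t|Y^{t-1}) / var(Y_t|Y^{t-1}, X^t) ).
    """
    total = 0.0
    for t in range(len(ys)):
        v1 = cond_var(Sigma, ys[t], ys[:t])
        v2 = cond_var(Sigma, ys[t], ys[:t] + xs[:t + 1])
        total += 0.5 * np.log(v1 / v2)
    return total

# Toy causal system: X_t iid N(0,1); Y_t = X_{t-1} + noise, so X drives Y.
n = 4
idx_x = list(range(n))         # indices of X_1..X_n in Sigma
idx_y = list(range(n, 2 * n))  # indices of Y_1..Y_n in Sigma
noise_var = 0.1
Sigma = np.eye(2 * n)
for t in range(n):
    Sigma[idx_y[t], idx_y[t]] = (1.0 if t > 0 else 0.0) + noise_var
    if t > 0:
        Sigma[idx_x[t - 1], idx_y[t]] = Sigma[idx_y[t], idx_x[t - 1]] = 1.0

I_xy = directed_info(Sigma, idx_x, idx_y)  # causal direction: large
I_yx = directed_info(Sigma, idx_y, idx_x)  # reverse direction: zero
print(f"I(X->Y) = {I_xy:.3f} nats, I(Y->X) = {I_yx:.3f} nats")
```

This is exactly the asymmetry the cost exploits: the causally conditioned quantity credits only information that flows forward in time from beliefs to observations.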

