Curiosity is Knowledge: Self-Consistent Learning and No-Regret Optimization with Active Inference
Active inference (AIF) unifies exploration and exploitation by minimizing the Expected Free Energy (EFE), balancing epistemic value (information gain) and pragmatic value (task performance) through a curiosity coefficient. Yet it has been unclear when this balance yields both coherent learning and efficient decision-making: insufficient curiosity can drive myopic exploitation and prevent uncertainty resolution, while excessive curiosity can induce unnecessary exploration and regret. We establish the first theoretical guarantee for EFE-minimizing agents, showing that a single requirement, sufficient curiosity, simultaneously ensures self-consistent learning (Bayesian posterior consistency) and no-regret optimization (bounded cumulative regret). Our analysis characterizes how this mechanism depends on initial uncertainty, identifiability, and objective alignment, thereby connecting AIF to classical Bayesian experimental design and Bayesian optimization within one theoretical framework. We further translate these results into practical design guidelines for tuning the epistemic-pragmatic trade-off in hybrid learning-optimization problems, validated through real-world experiments.
💡 Research Summary
The paper investigates the theoretical foundations of Active Inference (AIF) when the decision-making rule is to minimize Expected Free Energy (EFE). EFE decomposes into an epistemic term (expected information gain) and a pragmatic term (expected regret), weighted by a curiosity coefficient β_t. The central question is under what conditions minimizing EFE can guarantee both reliable learning (posterior consistency) and efficient decision-making (bounded cumulative regret).
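The decomposition can be made concrete with a minimal sketch. Everything below is hypothetical illustration, not the paper's construction: the discrete parameter space, the observation model `p_obs`, the loss table standing in for expected regret, and the fixed coefficient `beta` are all assumed for the example. The epistemic term is computed as the mutual information between the parameter and the next observation under the current posterior.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 3 latent parameters, 4 candidate actions, 5 observation
# outcomes. p_obs[a, s, o] = P(o | action a, parameter s).
n_params, n_actions, n_obs = 3, 4, 5
p_obs = rng.dirichlet(np.ones(n_obs), size=(n_actions, n_params))
prior = np.full(n_params, 1.0 / n_params)       # current posterior over parameters
loss = rng.uniform(size=(n_actions, n_params))  # pragmatic loss of action a if s is true
beta = 0.5                                      # curiosity coefficient beta_t (assumed fixed)

def expected_free_energy(a, q, beta):
    """EFE(a) = expected pragmatic loss - beta * expected information gain."""
    pred = q @ p_obs[a]                         # predictive distribution over observations
    joint = q[:, None] * p_obs[a]               # joint P(s, o | a), shape (n_params, n_obs)
    # Epistemic value: mutual information I(s; o | a) = sum joint * log(joint / (P(s) P(o))).
    ratio = np.where(joint > 0, joint / (q[:, None] * pred[None, :]), 1.0)
    info_gain = np.sum(joint * np.log(ratio))
    pragmatic = q @ loss[a]                     # expected loss under the posterior q
    return pragmatic - beta * info_gain

efe = np.array([expected_free_energy(a, prior, beta) for a in range(n_actions)])
best_action = int(np.argmin(efe))               # AIF picks the EFE-minimizing action
```

Because the information-gain term is nonnegative, raising `beta` can only lower an action's EFE, which is how the coefficient shifts the minimizer toward informative actions.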
The authors first formalize the problem setting, contrasting Bayesian Optimization (BO), which focuses on goal-directed exploitation with limited exploration, with Bayesian Experimental Design (BED), which pursues pure information acquisition. AIF is presented as a unifying framework that accounts for both objectives simultaneously through a single variational objective.
The core contributions are threefold:
- Posterior Consistency (Theorem 5.1). Assuming a discrete latent parameter space S, finite prior entropy, and observational distinguishability (the true parameter generates statistically distinguishable outcomes under the current energy function h_t), the authors prove that posterior consistency holds whenever the curiosity coefficient β_t satisfies a sufficient-curiosity condition (the precise bound is given in the paper).
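The mechanism behind posterior consistency can be illustrated with a toy Bayesian update, again purely hypothetical rather than the paper's proof construction: a discrete parameter space of three biased coins whose heads probabilities differ (the distinguishability assumption), a uniform prior (finite entropy), and repeated observations sampled from the true parameter.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical discrete parameter space S = {0, 1, 2}: parameter s indexes a
# biased coin with heads probability p_heads[s]. Distinguishability holds
# because the three probabilities differ.
p_heads = np.array([0.2, 0.5, 0.8])
true_s = 2
posterior = np.full(3, 1.0 / 3.0)          # uniform prior: finite entropy

for _ in range(500):
    obs = rng.random() < p_heads[true_s]   # sample an observation from the truth
    lik = p_heads if obs else 1.0 - p_heads
    posterior = posterior * lik            # Bayes rule: posterior ∝ prior × likelihood
    posterior /= posterior.sum()

# After enough distinguishable observations, the posterior mass concentrates
# on the true parameter.
```

If two parameters induced identical observation distributions, the posterior could never separate them, which is why the distinguishability assumption is needed.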