Featurized Occupation Measures for Structured Global Search in Numerical Optimal Control
Numerical optimal control is commonly divided between globally structured but dimensionally intractable Hamilton-Jacobi-Bellman (HJB) methods and scalable but local trajectory optimization. We introduce the Featurized Occupation Measure (FOM), a fini…
Authors: Qi Wei, Jianfeng Tao, Haoyang Tan
Qi Wei, Jianfeng Tao, Haoyang Tan, Hongyu Nie

Abstract: Numerical optimal control is commonly divided between globally structured but dimensionally intractable Hamilton-Jacobi-Bellman (HJB) methods and scalable but local trajectory optimization. We introduce the Featurized Occupation Measure (FOM), a finite-dimensional primal-dual interface for the occupation-measure formulation that unifies trajectory search and global HJB-type certification. FOM is broad yet numerically tractable, covering both explicit weak-form schemes and implicit simulator- or rollout-based sampling methods. Within this framework, approximate HJB subsolutions serve as intrinsic numerical certificates to directly evaluate and guide the primal search. We prove asymptotic consistency with the exact infinite-dimensional occupation-measure problem, and show that for block-organized feasible certificates, finite-dimensional approximation preserves certified lower bounds with blockwise error and complexity control. We also establish persistence of these lower bounds under time shifts and bounded model perturbations. Consequently, these structural properties turn global certificates into flexible, reusable computational objects, establishing a systematic basis for certificate-guided optimization in nonlinear control.

Index Terms: Hamilton-Jacobi-Bellman equations, occupation measures, optimal control, sampling methods, trajectory optimization.

I. INTRODUCTION

Optimal control is central to many applications in robotics and autonomous systems, where one must make decisions over long horizons under nonlinear dynamics, complex constraints, and contact or task-induced combinatorial structure [1]–[4], as well as in dynamic-game interactions [5]–[7]. The main challenge is to compute solutions to such highly nonconvex problems while preserving time consistency during execution.
Numerical optimal control has developed along two main lines. The first seeks a solution of the Hamilton–Jacobi–Bellman (HJB) equation [8], [9], whose global value function, when available, induces an optimal or near-optimal feedback law and hence closed-loop control [10]. Its key advantage is the direct maintenance of global value structure, while its main limitation is the curse of dimensionality [11]: global state-space discretization typically leads to complexity growing exponentially with dimension. Representative advances mitigate this difficulty through structured value representations or equivalent reformulations, including sparse or adaptive representations [12]–[14], low-rank tensor formats [15], [16], and pointwise evaluation via Hopf formulas or characteristics [17], [18].

This work was supported by the National Key R&D Program of China under Grant 2024YFF0505303. Qi Wei (v7sjtu@sjtu.edu.cn), Jianfeng Tao (corresponding author), Haoyang Tan, and Hongyu Nie are with the School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China (e-mail: jftao@sjtu.edu.cn).

The second line performs trajectory-wise local optimization instead of solving the global HJB equation directly. It includes first-order methods based on the Pontryagin maximum principle [19] (or function-space KKT conditions [20]), and second-order schemes such as DDP [21]–[23], iLQR [24], and SQP [1]. These methods often achieve high local efficiency, but they typically return only a nominal trajectory or a local feedback approximation near it, rather than a globally valid value certificate. Consequently, under strong nonconvexity, disconnected local minima, or information updates, such trajectory objects often have limited reusability; in the presence of uncertainty or dynamic-game structure, issues of time consistency become especially acute, as is classical in dynamic game theory [5].
Local optimization, however, need not proceed along a single direction at each step. In difficult nonconvex problems, it is often more effective to update over a structured set of admissible candidates. Many sample-based MPC and path-integral methods [25]–[28] can be interpreted in this way: each iteration samples, filters, and reweights candidate corrections from a set shaped by a search distribution, constraints, or local models. This raises a broader question. Existing ideas from heuristic search [29], relaxed dynamic programming [30], and admissible-heuristic construction for kinodynamic planning [31] all suggest that globally defined value surrogates can guide local search. This motivates a framework that combines trajectory-wise efficiency with a persistent global value structure.

The occupation-measure (OM) formulation provides a canonical language for this issue [32]–[35], with related parallels in information-relaxation approaches to stochastic dynamic programming [36]. It lifts nonlinear optimal control to an infinite-dimensional linear program whose classical dual is an HJB-type inequality problem with HJB subsolutions as dual-feasible variables [32], [34], [37]. This yields a unified primal–dual view: occupation measures encode the control process, while feasible HJB subsolutions provide globally valid lower bounds, as already explicit in the LP duality viewpoint of Vinter [32] and in later moment-based developments [34], [37], [38]. Under the OM structure, HJB subsolutions naturally serve to filter, prune, or prioritize primal updates, acting as an intrinsic mechanism rather than an external heuristic [30]–[32], [34], [36], [37], [39]–[41]. The numerical question is then not only how to search trajectories faster, but also how to maintain a finitely realizable certificate with global HJB semantics, as pursued from different angles in [30], [31], [35], [36], [40], [41].
We introduce featurized occupation measures (FOM), a finite-dimensional framework for retaining OM primal–dual structure in computation. A FOM realization consists of a realized global primal pair, a finite-dimensional certificate class, and a residual mechanism linking them. This yields two canonical realizations: explicit FOM, based on finite weak-form enforcement (with moment-SOS hierarchies [34] as a special case), and implicit FOM, in which primal objects are induced by rollout segments generated by existing ODE solvers or simulators.

The significance of FOM is that rollout- and sample-based updates (e.g., MPPI [25]–[28], RL-style policy search [42]–[45]) can be reinterpreted in an OM primal–dual language with an explicit HJB-type certificate. This integrates pruning, guidance, warm starts, and cross-algorithm comparison directly into the numerical framework. Valid approximate certificates remain quantitatively useful for these roles; see also [46]. This paper introduces a certificate-first formulation for structured optimal control, in which one maintains not only realized trajectories but also a globally meaningful certificate that guides primal updates and admits finite-dimensional approximation. FOM provides the corresponding unified framework.

Contributions:
1) We derive a common residual-aware certificate interface for FOM. Proposition 2 together with (13)–(14) evaluates realized primal objects through slack and residual terms, thereby placing explicit and implicit realizations under one OM-based certificate interface.
2) We prove asymptotic consistency of both realizations with the OM problem: explicit FOM by Theorem 1, and implicit FOM (segmented rollout-based FOM) by Theorem 2.
3) We establish structural efficiency and reuse properties of FOM certificates.
Theorem 3 and Corollary 1 show that additive block-organized certificates admit structure-preserving approximations that remain certifying, up to explicit degradation in the lower bound. Propositions 3 and 4, along with Corollary 2 and Corollary 3, show that such approximate certificates remain reusable across time shifts, perturbations, and realizations for pruning, guidance, and warm starts, without altering the convergence target of the realized primal update.

The paper is organized as follows. Section II reviews the exact occupation-measure primal–dual structure that motivates the certificate viewpoint. Section III formulates FOM as a finite-dimensional interface preserving this structure across explicit and implicit realizations. Section IV then develops the main computational consequences of this viewpoint, showing how structured certificates support tractable approximation, quantitative lower bounds, and reusable updates.

II. PRELIMINARIES

This section recalls the occupation-measure (OM) lifting of Bolza optimal control and the associated HJB–Liouville primal–dual structure used throughout the paper.

A. From Bolza optimal control to occupation measures

Let the admissible control set be $U$ and the state space be $X$. Let the support set of the occupation measure be $Z := [t_0, T] \times X \times U$. Denote the dual pairing by $\langle \phi, \rho \rangle := \int \phi \, d\rho$. Consider an admissible state–control pair $(x(\cdot), u(\cdot))$ satisfying
\[ \dot{x}(t) = f(t, x(t), u(t)), \quad t \in [t_0, T]. \]
Its induced occupation measure $\mu \in \mathcal{M}_+(Z)$ is defined by
\[ \langle \varphi, \mu \rangle := \int_{t_0}^{T} \varphi\big(t, x(t), u(t)\big) \, dt, \quad \forall \varphi \in C(Z). \tag{1} \]
Equivalently, for every Borel set $A \subset Z$,
\[ \mu(A) := \int_{t_0}^{T} \mathbf{1}_A\big(t, x(t), u(t)\big) \, dt, \tag{2} \]
that is, $\mu$ is the pushforward of the Lebesgue measure on $[t_0, T]$ under the map $t \mapsto (t, x(t), u(t))$. Let $\mu_0 \in \mathcal{M}_+(\{t_0\} \times X)$ and $\mu_T \in \mathcal{M}_+(\{T\} \times X)$ be the initial and terminal boundary measures.
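As a small numerical illustration of definition (1), the pairing $\langle \varphi, \mu \rangle$ can be approximated by quadrature along a sampled trajectory. The sketch below is only an assumption-laden example: the scalar dynamics $\dot{x} = -x + u$ with $u \equiv 0$, the horizon $[0, 2]$, the trapezoid rule, and the helper name `occupation_pairing` are all hypothetical choices, not taken from the paper.

```python
import math

def occupation_pairing(phi, ts, xs, us):
    """Trapezoid approximation of <phi, mu> = int phi(t, x(t), u(t)) dt."""
    vals = [phi(t, x, u) for t, x, u in zip(ts, xs, us)]
    return sum(0.5 * (vals[i] + vals[i + 1]) * (ts[i + 1] - ts[i])
               for i in range(len(ts) - 1))

# Hypothetical trajectory: x(t) = exp(-t) solves xdot = -x + u with u = 0, x(0) = 1.
n = 1000
t0, T = 0.0, 2.0
ts = [t0 + (T - t0) * i / n for i in range(n + 1)]
xs = [math.exp(-t) for t in ts]
us = [0.0] * (n + 1)

# Pairing with phi = 1 gives the total mass mu(Z), which should equal T - t0.
mass = occupation_pairing(lambda t, x, u: 1.0, ts, xs, us)
# Pairing with a running cost l = x^2 + u^2 gives the integral cost <l, mu>.
cost = occupation_pairing(lambda t, x, u: x * x + u * u, ts, xs, us)
```

Here `mass` recovers the horizon length, and `cost` approximates $\int_0^2 e^{-2t}\,dt$, which shows how a nonlinear trajectory cost becomes a linear functional of $\mu$.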
For shorthand, write $\mu_0(X) := \mu_0(\{t_0\} \times X)$ and $\mu_T(X) := \mu_T(\{T\} \times X)$ for their total masses. In the single deterministic trajectory case, $\mu_0 = \delta_{(t_0, x_0)}$ and $\mu_T = \delta_{(T, x(T))}$. With these measures, the Bolza cost becomes linear:
\[ \int_{t_0}^{T} \ell\big(t, x(t), u(t)\big) \, dt + g\big(x(T)\big) = \langle \ell, \mu \rangle + \langle g, \mu_T \rangle. \tag{3} \]
Introduce the space–time transport operator (equivalently, the Lie derivative along the extended vector field $(1, f)$)
\[ L_f : C^1([t_0, T] \times X) \to C(Z), \quad (L_f v)(t, x, u) := \partial_t v(t, x) + \nabla_x v(t, x)^\top f(t, x, u), \]
and its adjoint $L_f^*$ on measures, defined through $\langle L_f v, \mu \rangle = \langle v, L_f^* \mu \rangle$. Then the trajectory dynamics lift to the weak Liouville relation
\[ \langle v, \mu_T - \mu_0 \rangle = \langle L_f v, \mu \rangle, \quad \forall v \in C^1([t_0, T] \times X), \tag{4} \]
or, equivalently, $L_f^* \mu = \mu_T - \mu_0$.

Thus, the nonlinear trajectory dynamics are replaced by a linear constraint on nonnegative measures. Under standard assumptions ensuring exactness of the deterministic occupation-measure lifting (or, more generally, under relaxed controls in the convexified setting), the Bolza problem admits an equivalent or relaxed linear formulation in measure space.

When $\mu_0 = \delta_{(t_0, x_0)}$ and $\mu_T = \delta_{(T, x(T))}$, the weak Liouville identity reduces to the familiar trajectory-wise transport relation
\[ v(T, x(T)) - v(t_0, x_0) = \int_{t_0}^{T} L_f v\big(t, x(t), u(t)\big) \, dt. \tag{5} \]

B. Primal LP, dual HJB inequality, and global certificates

The occupation-measure formulation is the infinite-dimensional linear program
\[ P := \inf_{\mu \in \mathcal{M}_+(Z),\ \mu_T \in \mathcal{M}_+(\{T\} \times X)} \langle \ell, \mu \rangle + \langle g, \mu_T \rangle \quad \text{s.t.} \quad L_f^* \mu = \mu_T - \mu_0. \tag{6} \]
Its dual variable is a global function $v(t, x) \in C^1([t_0, T] \times X)$, and the dual problem reads
\[ \begin{aligned} P^\star := \sup_{v \in C^1([t_0, T] \times X)} \ & \langle v, \mu_0 \rangle \\ \text{s.t.} \ & (L_f v)(t, x, u) + \ell(t, x, u) \ge 0, \quad \forall (t, x, u) \in Z, \\ & v(T, x) \le g(x), \quad \forall x \in X. \end{aligned} \tag{7} \]
The dual feasibility conditions are precisely the HJB inequalities. Equivalently,
\[ -\partial_t v(t, x) \le \inf_{u \in U} \big\{ \ell(t, x, u) + \nabla_x v(t, x)^\top f(t, x, u) \big\}, \quad v(T, x) \le g(x), \]
so any dual-feasible $v$ is a smooth HJB subsolution.

Under the standard compactness, continuity, and no-gap assumptions for the deterministic OM LP (as in [34]), one has strong duality, $P = P^\star$. We do not, however, assume a priori that the optimal dual is attained by a $C^1$ function, nor do we unconditionally identify an optimal dual element with the value function. The role of the dual problem is more robust: it optimizes over smooth subsolutions, and in regular settings where the value function admits a $C^1$ representative satisfying the HJB equation, that representative is compatible with an optimal dual certificate.

For any primal-feasible $(\mu, \mu_T)$ and any dual-feasible $v$, weak duality yields the certified lower bound
\[ \langle v, \mu_0 \rangle \le \langle \ell, \mu \rangle + \langle g, \mu_T \rangle, \]
and the corresponding primal–dual gap
\[ \operatorname{gap}(\mu, \mu_T; v) := \langle \ell, \mu \rangle + \langle g, \mu_T \rangle - \langle v, \mu_0 \rangle = \langle \ell + L_f v, \mu \rangle + \langle g - v(T, \cdot), \mu_T \rangle \ge 0. \tag{8} \]
Hence dual feasibility is not merely a formal dual constraint: it gives an absolute, globally interpretable lower certificate on the Bolza cost.

Remark 1 (From transport probe to certificate). The ambient dual object in OM is the same smooth scalar field $v \in \mathcal{V} := C^1([t_0, T] \times X)$. Through the weak Liouville relation
\[ \langle v, \mu_T - \mu_0 \rangle = \langle L_f v, \mu \rangle, \]
$v$ acts solely as a test function. In that role, $v$ is only a probe of transport. In particular, when $\mu_0 = \delta_{(t_0, x_0)}$, the weak identity becomes (5), which is simply the transport identity induced by the dynamics. The same object acquires a different semantics once the dual inequalities are enforced:
\[ \ell + L_f v \ge 0, \quad g - v(T, \cdot) \ge 0. \]
Then, along every admissible trajectory,
\[ v(t_0, x_0) \le \int_{t_0}^{T} \ell\big(t, x(t), u(t)\big) \, dt + g\big(x(T)\big), \]
so $v(t_0, x_0)$ becomes a certified global lower bound on the total Bolza cost.

C. Approximate dual feasibility and relaxed certificate bounds

In finite-dimensional FOM realizations, the maintained certificate is often only approximately dual-feasible. For $\varepsilon, \varepsilon_T \ge 0$, we say that $v$ is $(\varepsilon, \varepsilon_T)$-feasible if
\[ (L_f v)(t, x, u) + \ell(t, x, u) \ge -\varepsilon, \quad \forall (t, x, u) \in Z, \qquad v(T, x) \le g(x) + \varepsilon_T, \quad \forall x \in X_T. \tag{9} \]
Then every primal-feasible pair $(\mu, \mu_T)$ satisfying the constraint in (6) obeys
\[ \langle v, \mu_0 \rangle - \varepsilon \, \mu(Z) - \varepsilon_T \, \mu_T(X_T) \le \langle \ell, \mu \rangle + \langle g, \mu_T \rangle. \tag{10} \]
Thus, approximate dual feasibility still yields a certified lower bound, with a degradation quantified explicitly by the violation levels $(\varepsilon, \varepsilon_T)$ and the total masses of $(\mu, \mu_T)$.

In the deterministic finite-horizon setting, these masses are explicit. Testing (4) with $v \equiv 1$ gives $\mu_T(X_T) = \mu_0(X)$, while testing with $v(t, x) = T - t$ yields $\mu(Z) = (T - t_0) \, \mu_0(X)$. Hence the relaxation term in (10) is completely explicit. In particular, when $\mu_0 \in \mathcal{P}(X)$, one has
\[ \mu_T(X_T) = 1, \quad \mu(Z) = T - t_0, \]
so the lower-bound degradation reduces to $\varepsilon (T - t_0) + \varepsilon_T$.

Proposition 1 (Approximate certificates provide OM lower bounds). Assume the deterministic fixed-horizon setting, so that every exact OM-feasible pair $(\mu, \mu_T)$ satisfies
\[ \mu(Z) = (T - t_0) \, \mu_0(X), \quad \mu_T(X) = \mu_0(X). \]
Let $v$ be $(\varepsilon, \varepsilon_T)$-feasible and define
\[ P(v) := \langle v(t_0, \cdot), \mu_0 \rangle - (T - t_0) \, \mu_0(X) \, \varepsilon - \mu_0(X) \, \varepsilon_T. \]
Then $P(v) \le P$.

Proof. By (10), every exact OM-feasible pair $(\mu, \mu_T)$ satisfies
\[ \langle v(t_0, \cdot), \mu_0 \rangle - \varepsilon \, \mu(Z) - \varepsilon_T \, \mu_T(X) \le \langle \ell, \mu \rangle + \langle g, \mu_T \rangle. \]
Using $\mu(Z) = (T - t_0) \, \mu_0(X)$ and $\mu_T(X) = \mu_0(X)$, this becomes
\[ P(v) \le \langle \ell, \mu \rangle + \langle g, \mu_T \rangle. \]
Taking the infimum over all exact OM-feasible pairs proves the claim.
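The bound of Proposition 1 can be illustrated on a scalar toy problem. The sketch below is a hypothetical example, not part of the paper: it assumes dynamics $\dot{x} = u$, running cost $\ell = x^2 + u^2$, $g = 0$, $X = [-1, 1]$, $U = [-2, 2]$, horizon $[0, 1]$, $\mu_0 = \delta_{(0, 1)}$, and quadratic certificates $v(t, x) = p(t) x^2$. The violation level $\varepsilon$ is only estimated on a grid, so this is a numerical sanity check rather than a rigorous certificate.

```python
import math

T0, T = 0.0, 1.0  # horizon [t0, T] (illustrative assumption)

def running_slack(t, x, u, p, dp):
    """l + L_f v for v(t, x) = p(t) x^2, with xdot = u and l = x^2 + u^2."""
    return x * x + u * u + dp(t) * x * x + 2.0 * p(t) * x * u

def estimate_eps(p, dp, n=41):
    """Grid estimate of the violation level eps on [T0, T] x [-1, 1] x [-2, 2]."""
    worst = 0.0
    for i in range(n):
        t = T0 + (T - T0) * i / (n - 1)
        for j in range(n):
            x = -1.0 + 2.0 * j / (n - 1)
            for k in range(n):
                u = -2.0 + 4.0 * k / (n - 1)
                worst = min(worst, running_slack(t, x, u, p, dp))
    return -worst

def relaxed_lower_bound(v0, eps, eps_T, mass0=1.0):
    """P(v) = <v(t0,.), mu0> - (T - t0) mu0(X) eps - mu0(X) eps_T, mu0 = mass0 * Dirac."""
    return mass0 * v0 - (T - T0) * mass0 * eps - mass0 * eps_T

x0 = 1.0
# Crude certificate v = (T - t) x^2: zero terminal slack, negative running slack.
eps_crude = estimate_eps(lambda t: T - t, lambda t: -1.0)
bound_crude = relaxed_lower_bound((T - T0) * x0 ** 2, eps_crude, 0.0)
# Riccati certificate v = tanh(T - t) x^2: running slack equals (u + p x)^2 >= 0.
eps_exact = estimate_eps(lambda t: math.tanh(T - t),
                         lambda t: math.tanh(T - t) ** 2 - 1.0)
bound_exact = relaxed_lower_bound(math.tanh(T - T0) * x0 ** 2, eps_exact, 0.0)
# Known optimal cost of this toy LQ problem from x0 at t0.
optimal_cost = math.tanh(T - T0) * x0 ** 2
```

In this toy setting the crude certificate gives a valid but loose bound after the $\varepsilon (T - t_0)$ correction, while the Riccati-based certificate is exactly feasible and its bound matches the optimal cost.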
Remark 2 (Bounded dual infeasibility preserves the HJB lower-bound role). The role of $(\varepsilon, \varepsilon_T)$-feasibility is not to ignore dual infeasibility, but to quantify it in a one-sided Bellman form. Indeed, (9) is equivalent to
\[ (\ell + \varepsilon) + L_f v \ge 0, \quad (g + \varepsilon_T) - v(T, \cdot) \ge 0. \]
Proposition 1 shows that the lower-bound role is stable under bounded dual infeasibility: $v$ is an exact HJB subsolution for the perturbed cost pair $(\ell + \varepsilon, g + \varepsilon_T)$. A certificate with bounded dual infeasibility therefore yields a rigorous corrected OM lower bound with an explicit error budget.

III. FEATURIZED OCCUPATION MEASURES

This section studies finite-dimensional realizations of the exact OM LP. Under the certificate-first viewpoint, the goal is to retain the OM primal–dual semantics at finite resolution. We focus on two such realizations. In an explicit FOM, one parameterizes a global primal pair and represents the Liouville residual through finitely many test functions. This is the default realization at the framework level, though it may be computationally expensive and is not always used in practice. In an implicit FOM, one instead uses existing ODE integrators to generate segmentwise or full rollouts, as in multiple or single shooting, and interprets the resulting weak residual through local Dirac-type representations. The resulting finite models are different, but both will be shown to be asymptotically consistent with the exact OM LP. Concretely, let
\[ Z := [t_0, T] \times X \times U, \quad \mathcal{M} := \mathcal{M}_+(Z) \times \mathcal{M}_+(\{T\} \times X), \quad \mathcal{V} := C^1([t_0, T] \times X). \]
We now fix notation and formalize this common structure.

A. General FOM components

We do not identify a featurized occupation measure (FOM) with a particular weak discretization of the Liouville equation.
Instead, a FOM is any finite-dimensional realization that (i) produces a global primal pair, (ii) keeps a finite-capacity certificate class explicit, and (iii) equips the realized primal pair with a residual mechanism measuring departure from exact Liouville feasibility. Formally, we fix a finite-capacity trial family with capacity index $M$ and parameterization $\theta$ on the primal side, and a finite-dimensional certificate family with parameterization $\psi$ on the dual side.

Definition 1 (FOM framework). A general FOM realization consists of the following objects.

First, a trial realization family $\Theta^{(M)}$, whose elements $\theta$ are finite-dimensional parameters. In deterministic optimal control, typical examples include an open-loop control $u_\theta : [t_0, T] \to U$ and a feedback law $\kappa_\theta : [t_0, T] \times X \to U$. More generally, $\theta$ need only parameterize a realizable control mechanism that induces a global primal pair.

Second, an induced global primal pair
\[ \theta \mapsto (\mu_\theta, \mu_{T,\theta}) \in \mathcal{M}, \]
where $\mu_\theta$ plays the role of a realized occupation-type measure and $\mu_{T,\theta}$ the role of the realized terminal measure. The realized cost is then evaluated by
\[ J(\theta) := \langle \ell, \mu_\theta \rangle + \langle g, \mu_{T,\theta} \rangle. \]

Third, a parameterized finite-capacity certificate class
\[ \mathcal{V}_\Psi = \{ v_\psi : \psi \in \Psi \} \subset C^1([t_0, T] \times X), \]
which preserves the dual side of the OM formulation at finite dimension.

Fourth, for each realized primal pair $(\mu_\theta, \mu_{T,\theta})$, we associate its Liouville residual (or primal-feasibility residual) functional
\[ \mathcal{R}_\theta(v) := \langle v, \mu_{T,\theta} - \mu_0 \rangle - \langle L_f v, \mu_\theta \rangle, \quad v \in C^1([t_0, T] \times X). \tag{11} \]
This is the canonical weak residual attached to the realized primal pair. Exact Liouville feasibility is equivalent to
\[ \mathcal{R}_\theta(v) = 0 \quad \forall v \in C^1([t_0, T] \times X). \]

Here $\theta$ need only generate a global primal pair. The weak residual $\mathcal{R}_\theta$ is an abstract object, and its form depends on the realization.

B. Certificate evaluation and residual-aware bounds

In the FOM setting, we evaluate the realized trial pairs
\[ (\mu_\theta, \mu_{T,\theta}), \quad \theta \in \Theta^{(M)}, \]
which may fail exact Liouville feasibility. The additional object needed for this passage is precisely the realization residual $\mathcal{R}_\theta(v_\psi)$. This is also the point where the structural FOM architecture becomes an optimization interface: the certificate can now be assessed against a realized primal pair through a common residual-aware gap decomposition.

For a certificate $v_\psi$, define the running and terminal slacks
\[ s_\psi(t, x, u) := \ell(t, x, u) + (L_f v_\psi)(t, x, u), \quad s_{T,\psi}(x) := g(x) - v_\psi(T, x). \tag{12} \]
By (9), the certificate $v_\psi$ is $(\varepsilon, \varepsilon_T)$-feasible if and only if
\[ s_\psi \ge -\varepsilon, \quad s_{T,\psi} \ge -\varepsilon_T. \]
For a realized pair $(\mu_\theta, \mu_{T,\theta})$, define the residual-corrected interface lower bound
\[ J(\theta, \psi) := \langle v_\psi, \mu_0 \rangle - \varepsilon \, \mu_\theta(Z) - \varepsilon_T \, \mu_{T,\theta}(X) - |\mathcal{R}_\theta(v_\psi)| \]
and the corresponding certificate-relative gap
\[ \operatorname{Gap}(\theta, \psi) := J(\theta) - J(\theta, \psi). \]

Proposition 2 (Certificate evaluation identity and realization-relative gap). Let $v_\psi$ be $(\varepsilon, \varepsilon_T)$-feasible. Then, for every trial parameter $\theta$ from any finite-dimensional realization class $\Theta^{(M)}$,
\[ J(\theta) = \langle v_\psi, \mu_0 \rangle + \langle s_\psi, \mu_\theta \rangle + \langle s_{T,\psi}, \mu_{T,\theta} \rangle + \mathcal{R}_\theta(v_\psi), \tag{13} \]
hence
\[ J(\theta, \psi) \le J(\theta), \]
and
\[ \operatorname{Gap}(\theta, \psi) = \langle s_\psi + \varepsilon, \mu_\theta \rangle + \langle s_{T,\psi} + \varepsilon_T, \mu_{T,\theta} \rangle + 2 [\mathcal{R}_\theta(v_\psi)]_+ \ge 0, \tag{14} \]
where $[a]_+ := \max\{a, 0\}$.

Proof. By definition, $J(\theta) = \langle \ell, \mu_\theta \rangle + \langle g, \mu_{T,\theta} \rangle$. Adding and subtracting $\langle L_f v_\psi, \mu_\theta \rangle$ and $\langle v_\psi(T, \cdot), \mu_{T,\theta} \rangle$, and using the definition of $\mathcal{R}_\theta(v_\psi)$, gives (13). Since $v_\psi$ is $(\varepsilon, \varepsilon_T)$-feasible,
\[ \langle s_\psi, \mu_\theta \rangle \ge -\varepsilon \, \mu_\theta(Z), \quad \langle s_{T,\psi}, \mu_{T,\theta} \rangle \ge -\varepsilon_T \, \mu_{T,\theta}(X). \]
Substituting these bounds into (13), together with $\mathcal{R}_\theta(v_\psi) \ge -|\mathcal{R}_\theta(v_\psi)|$, yields $J(\theta, \psi) \le J(\theta)$.
Subtracting $J(\theta, \psi)$ from (13) gives
\[ \operatorname{Gap}(\theta, \psi) = \langle s_\psi + \varepsilon, \mu_\theta \rangle + \langle s_{T,\psi} + \varepsilon_T, \mu_{T,\theta} \rangle + \mathcal{R}_\theta(v_\psi) + |\mathcal{R}_\theta(v_\psi)|, \]
which is exactly (14) because $|a| + a = 2[a]_+$.

Equation (13) is the common certificate interface used throughout the rest of the section. It evaluates any realized primal pair through four terms: the certificate value at the initial measure, the running slack accumulated against the realized occupation measure, the terminal slack accumulated against the realized terminal measure, and the realization residual. Under exact OM primal feasibility, the residual term vanishes, and one recovers the exact OM lower-bound relations from Section II. Away from exact Liouville feasibility, by contrast, the raw realized objective $J(\theta)$ no longer automatically carries exact OM upper-bound meaning.

The gap decomposition in (14) naturally motivates a saddle-point optimization scheme: at fixed $\theta$, one updates the certificate by reducing the weighted slack terms against the current realized primal pair, while maintaining approximate feasibility:
\[ \begin{aligned} \min_\psi \ & \langle s_\psi, \mu_\theta \rangle + \langle s_{T,\psi}, \mu_{T,\theta} \rangle \\ \text{s.t.} \ & s_\psi(t, x, u) \ge -\varepsilon, \quad \forall (t, x, u) \in [t_0, T] \times X \times U, \\ & s_{T,\psi}(x) \ge -\varepsilon_T, \quad \forall x \in X_T. \end{aligned} \tag{15} \]
When $\mathcal{R}_\theta(v_\psi) = 0$, this coincides with reducing the certificate-relative gap terms in (13). More generally, (15) should be read as the slack side of the certificate update, while the residual contribution is realization-dependent and must be enforced, estimated, or controlled by the chosen realization mechanism.

The same certificate also induces an admissibility structure on the primal search space. Define the shifted running slack
\[ \hat{s}_\psi(t, x, u) := s_\psi(t, x, u) + \varepsilon. \]
For each state–time pair $(t, x)$ and tolerance $\tau \ge 0$, this yields the local gap-aware admissible action set
\[ U^\tau_{v_\psi}(t, x) := \Big\{ u \in U : \hat{s}_\psi(t, x, u) \le \inf_{\bar{u} \in U} \hat{s}_\psi(t, x, \bar{u}) + \tau \Big\}. \]
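A minimal sketch of the admissible action set at one fixed state–time pair is given below. It assumes that the control set is replaced by a finite sample, so the infimum over $U$ becomes a minimum over candidates; the function name `admissible_actions` and the quadratic shifted slack are hypothetical and serve only as an illustration.

```python
def admissible_actions(shifted_slack, controls, tau):
    """Keep the sampled controls whose shifted slack is within tau of the best sample."""
    values = [shifted_slack(u) for u in controls]
    best = min(values)  # finite-sample surrogate for the infimum over U
    return [u for u, s in zip(controls, values) if s <= best + tau]

# Hypothetical shifted slack at a fixed (t, x): s_hat(u) = (u + 1)^2, minimized at u = -1.
controls = [-2.0 + 0.1 * k for k in range(41)]  # samples of U = [-2, 2]
s_hat = lambda u: (u + 1.0) ** 2

kept = admissible_actions(s_hat, controls, tau=0.05)
everything = admissible_actions(s_hat, controls, tau=float("inf"))
```

With a small tolerance only the controls near the slack minimizer survive, while an infinite tolerance disables pruning, which mirrors the role of $\tau$ in the set above.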
Now let $a_\theta(t, x) \in U$ denote the pointwise control action read out from the primal parameter $\theta$; this may stand for $u_\theta(t)$ in an open-loop parameterization or $\kappa_\theta(t, x)$ in a feedback parameterization. Given any chosen finite decision context $\Xi \subset [t_0, T] \times X$, for instance a rollout batch, a collocation set, or another realization-dependent sample family, the certificate induces the admissible parameter set
\[ \mathcal{A}^\tau_{v_\psi}(\Xi) := \big\{ \theta \in \Theta^{(M)} : a_\theta(t, x) \in U^\tau_{v_\psi}(t, x) \ \ \forall (t, x) \in \Xi \big\}. \tag{16} \]
Thus the global certificate does not merely evaluate a realized primal pair after the fact; it also prunes the primal search region using only local information from $\hat{s}_\psi$.

On the primal side, one then seeks to reduce realized cost while controlling departure from exact Liouville feasibility. When the realization exposes the full residual functional $\mathcal{R}_\theta$, a scalar residual monitor can be induced from it through a realization-dependent scalarization map, for instance
\[ r_M(\theta) := \Gamma_M(\mathcal{R}_\theta). \]
In implementations where only such a monitor is numerically available, one may take $r_M(\theta)$ itself as the primitive residual-control quantity.

This leads to the certificate-pruned residual-aware primal update
\[ \theta^+ \in \arg\min_{\theta \in \mathcal{A}^\tau_{v_\psi}(\Xi)} \big\{ J(\theta) + \lambda \, r_M(\theta) \big\}, \quad \lambda \ge 0. \tag{17} \]
When $\tau = +\infty$, the certificate pruning disappears and (17) reduces to the unconstrained residual-aware primal update. Equivalently, one may use a constrained residual formulation on $\mathcal{A}^\tau_{v_\psi}(\Xi)$ by imposing $r_M(\theta) \le \delta$ for a prescribed tolerance $\delta \ge 0$.

Thus the certificate is not only evaluative. Through the shifted running slack $\hat{s}_\psi$, it induces admissible local actions and, via the readout map $\theta \mapsto a_\theta(t, x)$, admissible parameter updates. In this way, horizon-wide downstream cost is organized through a single global certificate rather than only through realized primal cost.

Remark 3 (Admissibility versus local update geometry).
Objects other than global certificates can also restrict updates before convergence. In second-order trajectory methods, the local quadratic surrogate and its regularization define acceptable parameter directions through Hessian geometry. In sampling-based methods, the search covariance analogously restricts exploration to an ellipsoidal region. These roles are related but not equivalent: only the certificate is dual-feasible, carries global Bellman lower-bound semantics, and enters an explicit decomposition of the certified gap.

The next two subsections show how the same interface is instantiated differently in the explicit and implicit realizations, through different mechanisms for representing and controlling the residual term $\mathcal{R}_\theta(v_\psi)$.

C. Explicit FOM and its asymptotic consistency

We first consider the realization in which primal feasibility is represented explicitly through a finite weak Liouville test family. This is the canonical Eulerian form of FOM: the transport constraints remain visible at finite resolution, rather than being absorbed into rollout. It separates two different approximation layers (finite trial expressivity on the primal side, and weak-form refinement on the transport side) while keeping the certificate class independent as a finite-dimensional realization of HJB-type dual objects.

In this setting, it is not convenient to enforce the full Liouville residual functional $\mathcal{R}_\theta(\cdot)$ directly. Instead, primal feasibility is monitored through finitely many weak tests. Fix a finite test family
\[ \mathcal{V}_m = \operatorname{span}\{v_1, \ldots, v_m\} \subset C^1([t_0, T] \times X), \]
and a finite-capacity trial family $\Theta^{(M)}$. For each $\theta \in \Theta^{(M)}$, let $(\mu_\theta, \mu_{T,\theta}) \in \mathcal{M}$ denote the induced global primal pair, with realized cost
\[ J(\theta) := \langle \ell, \mu_\theta \rangle + \langle g, \mu_{T,\theta} \rangle. \]
Relative to the chosen test basis, the explicit residual vector is defined by
\[ r^{(m)}_\theta := \big( \langle v_j, \mu_{T,\theta} - \mu_0 \rangle - \langle L_f v_j, \mu_\theta \rangle \big)_{j=1}^{m} \in \mathbb{R}^m, \tag{18} \]
and its size is measured by a norm $\| \cdot \|_{*,m}$ on $\mathbb{R}^m$. This yields the explicit restricted feasible family
\[ \mathcal{F}^{\mathrm{exp}}_{m,M}(\eta) := \big\{ \theta \in \Theta^{(M)} : \| r^{(m)}_\theta \|_{*,m} \le \eta \big\}, \]
and the associated explicit restricted value
\[ P^{\mathrm{exp}}_{m,M}(\eta) := \inf_{\theta \in \mathcal{F}^{\mathrm{exp}}_{m,M}(\eta)} J(\theta). \]
On the exact measure side, the corresponding exact $m$-truncated feasible set is
\[ \mathcal{F}_m := \big\{ (\mu, \mu_T) \in \mathcal{M} : \langle v, \mu_T - \mu_0 \rangle = \langle L_f v, \mu \rangle, \ \forall v \in \mathcal{V}_m \big\}, \]
with truncated value
\[ P_m := \inf_{(\mu, \mu_T) \in \mathcal{F}_m} \big\{ \langle \ell, \mu \rangle + \langle g, \mu_T \rangle \big\}. \]

This explicit realization separates two approximation layers. The first is a realization-dependent representation approximation: for fixed $m$, the finite trial family $\Theta^{(M)}$ approximates the truncated feasible set $\mathcal{F}_m$ as $M \to \infty$. The second is a weak-form refinement: as the test family $\mathcal{V}_m$ is enriched, the truncated weak Liouville formulation recovers the exact OM LP as $m \to \infty$.

The test family $\mathcal{V}_m$ and the certificate class $\mathcal{V}_\Psi$ play distinct roles. The test family appears here only to represent explicit primal residuals. The certificate class remains an independent finite-capacity realization of the dual side, used to represent exact or approximate HJB-type certificates. These two classes may coincide in particular implementations, but they need not be identified.

Example 1 (A deterministic optimal-control realization of explicit FOM). Consider a deterministic optimal-control problem with initial probability density $\rho_0$, and parameterize the control by a time–state feedback kernel
\[ \pi_\theta(\cdot \mid t, x) \in \mathcal{P}(U). \]
Once $\theta$ is fixed, this induces the averaged drift field
\[ b_\theta(t, x) := \int_U f(t, x, u) \, \pi_\theta(du \mid t, x). \]
The induced distribution density evolves by a Liouville equation
\[ \partial_t \rho_\theta + \nabla_x \cdot (b_\theta \rho_\theta) = 0, \quad \rho_\theta(t_0, \cdot) = \rho_0. \]
The realized cost can then be written as
\[ J(\theta) = \int_{t_0}^{T} \int_X \int_U \ell(t, x, u) \, \pi_\theta(du \mid t, x) \, \rho_\theta(t, x) \, dx \, dt + \int_X g(x) \, \rho_\theta(T, x) \, dx. \]
To obtain a finite-capacity explicit realization, choose a finite test space
\[ \mathcal{T}_m = \operatorname{span}\{\tau_1, \ldots, \tau_m\}. \]
Since the weak Liouville residual is linear in the test function, it is enough to enforce it on the chosen basis. This gives the explicit residual components, for $j = 1, \ldots, m$,
\[ r^{(m)}_{\theta,j} = \int_X \tau_j(T, x) \, \rho_\theta(T, x) \, dx - \int_X \tau_j(t_0, x) \, \rho_0(x) \, dx - \int_{t_0}^{T} \int_X \int_U (L_f \tau_j)(t, x, u) \, \pi_\theta(du \mid t, x) \, \rho_\theta(t, x) \, dx \, dt, \]
collected as $r^{(m)}_\theta \in \mathbb{R}^m$. At a fixed implementation level $m$, the primal step may therefore be written in the penalized form
\[ \min_{\theta \in \Theta^{(M)}} \big\{ J(\theta) + \lambda \, \| r^{(m)}_\theta \|_{*,m} \big\}. \]
On the dual side, choose a finite certificate family
\[ \mathcal{V}^{(L)}_\Psi = \{ v_\psi : \psi \in \Psi^{(L)} \}, \]
with slacks defined by (12). With $\theta$ fixed, and with the Liouville defect already exposed on the primal side, (15) motivates the dual update
\[ \begin{aligned} \min_\psi \ & \int_{t_0}^{T} \int_X \int_U s_\psi(t, x, u) \, \pi_\theta(du \mid t, x) \, \rho_\theta(t, x) \, dx \, dt + \int_X s_{T,\psi}(x) \, \rho_\theta(T, x) \, dx \\ \text{s.t.} \ & s_\psi(t, x, u) \ge -\varepsilon, \quad s_{T,\psi}(x) \ge -\varepsilon_T, \end{aligned} \]
on a chosen coarse space–time–control discretization.

Remark 4 (Moment–SOS as a dual-feasible explicit FOM). Moment–SOS [34] is a canonical explicit FOM realization: primal feasibility is enforced only through a finite weak Liouville truncation, whereas dual feasibility is built in by design through restricting certificates to polynomial HJB-type subsolutions with Putinar positivity representations (SOS). Hence, at fixed relaxation order, both primal positivity and dual certificate nonnegativity are encoded by semidefinite cone constraints, yielding a primal–dual conic program rather than the more general saddle-type update mechanism in (15)–(17).
This stronger enforcement, however, also restricts applicability to problems admitting polynomial and SOS conic representations.

We now state the consistency result for the explicit realization.

Theorem 1 (Asymptotic consistency of explicit FOM). Let $P$ denote the optimal value of the exact OM LP, and let $P_m$ be the truncated value defined above. Assume that the relevant supports are compact, so that the sets of induced measures are weak-$\star$ relatively compact, and that the linear functionals
\[ \mu \mapsto \langle \ell, \mu \rangle \quad \text{and} \quad \mu_T \mapsto \langle g, \mu_T \rangle \]
are weak-$\star$ continuous on the relevant feasible sets. Assume moreover that for every fixed $m \ge 1$, the explicit trial family is asymptotically dense in $\mathcal{F}_m$, in the sense that for every $(\mu, \mu_T) \in \mathcal{F}_m$ there exists a sequence $\theta_M \in \Theta^{(M)}$ such that
\[ (\mu_{\theta_M}, \mu_{T,\theta_M}) \overset{\star}{\rightharpoonup} (\mu, \mu_T) \quad \text{and} \quad \| r^{(m)}_{\theta_M} \|_{*,m} \to 0. \]
Then, for every fixed $m$,
\[ \lim_{\eta \downarrow 0} \lim_{M \to \infty} P^{\mathrm{exp}}_{m,M}(\eta) = P_m. \]
If, in addition, the test families are asymptotically dense in $C^1([t_0, T] \times X)$ in the sense that
\[ \overline{\bigcup_{m \ge 1} \mathcal{V}_m}^{\,C^1} = C^1([t_0, T] \times X), \]
and if the standard weak-form truncation is consistent so that $P_m \to P$ as $m \to \infty$, then
\[ \lim_{m \to \infty} \lim_{\eta \downarrow 0} \lim_{M \to \infty} P^{\mathrm{exp}}_{m,M}(\eta) = P. \]

Proof. Fix $m$ and write $P_{m,M}(\eta) := P^{\mathrm{exp}}_{m,M}(\eta)$. We first prove $\lim_{\eta \downarrow 0} \lim_{M \to \infty} P_{m,M}(\eta) = P_m$ by matching upper and lower bounds.

For the upper bound, fix $\varepsilon > 0$ and choose $(\mu^\varepsilon, \mu^\varepsilon_T) \in \mathcal{F}_m$ such that
\[ \langle \ell, \mu^\varepsilon \rangle + \langle g, \mu^\varepsilon_T \rangle \le P_m + \varepsilon. \]
By asymptotic density of the explicit trial family, there exists a sequence $\theta^\varepsilon_M \in \Theta^{(M)}$ such that
\[ (\mu_{\theta^\varepsilon_M}, \mu_{T,\theta^\varepsilon_M}) \overset{\star}{\rightharpoonup} (\mu^\varepsilon, \mu^\varepsilon_T), \quad \| r^{(m)}_{\theta^\varepsilon_M} \|_{*,m} \to 0. \]
Now fix $\eta > 0$. Since $\| r^{(m)}_{\theta^\varepsilon_M} \|_{*,m} \to 0$, there exists $M_1(\eta, \varepsilon)$ such that
\[ \| r^{(m)}_{\theta^\varepsilon_M} \|_{*,m} \le \eta \quad \forall M \ge M_1(\eta, \varepsilon). \]
Since the cost functional is weak-$\star$ continuous on the relevant feasible set, $J(\theta^\varepsilon_M) \to \langle \ell, \mu^\varepsilon \rangle + \langle g, \mu^\varepsilon_T \rangle$, so there exists $M_2(\varepsilon)$ such that
\[
J(\theta^\varepsilon_M) \le \langle \ell, \mu^\varepsilon \rangle + \langle g, \mu^\varepsilon_T \rangle + \varepsilon \le P_m + 2\varepsilon \quad \forall M \ge M_2(\varepsilon).
\]
Hence, for every $M \ge \max\{ M_1(\eta,\varepsilon), M_2(\varepsilon) \}$, the parameter $\theta^\varepsilon_M$ is feasible for $P_{m,M}(\eta)$ and satisfies
\[
P_{m,M}(\eta) \le J(\theta^\varepsilon_M) \le P_m + 2\varepsilon .
\]
Therefore
\[
\limsup_{M \to \infty} P_{m,M}(\eta) \le P_m + 2\varepsilon \quad \forall \eta > 0 .
\]
Taking $\limsup_{\eta \downarrow 0}$ and then letting $\varepsilon \downarrow 0$ yields
\[
\limsup_{\eta \downarrow 0} \limsup_{M \to \infty} P_{m,M}(\eta) \le P_m .
\]

For the lower bound, set
\[
L_m := \liminf_{\eta \downarrow 0} \liminf_{M \to \infty} P_{m,M}(\eta).
\]
Choose a sequence $\eta_n \downarrow 0$ such that
\[
a_n := \liminf_{M \to \infty} P_{m,M}(\eta_n) \to L_m .
\]
For each $n$, choose $M_n \ge n$ such that $P_{m,M_n}(\eta_n) \le a_n + \tfrac{1}{n}$. By definition of the infimum, choose $\theta_n \in \Theta^{(M_n)}$ satisfying
\[
\| r^{(m)}_{\theta_n} \|_{*,m} \le \eta_n, \qquad J(\theta_n) \le P_{m,M_n}(\eta_n) + \tfrac{1}{n} \le a_n + \tfrac{2}{n} .
\]
It follows that $\limsup_{n \to \infty} J(\theta_n) \le L_m$.

By weak-$\star$ relative compactness, after passing to a subsequence we may assume
\[
(\mu_{\theta_n}, \mu_{T,\theta_n}) \rightharpoonup^{\star} (\bar\mu, \bar\mu_T).
\]
Because $\| r^{(m)}_{\theta_n} \|_{*,m} \to 0$ in the finite-dimensional space $\mathbb{R}^m$, each component of $r^{(m)}_{\theta_n}$ converges to zero. Thus, for every basis element $v_j$ of $\mathcal{V}_m$,
\[
\langle v_j, \mu_{T,\theta_n} - \mu_0 \rangle - \langle \mathcal{L}_f v_j, \mu_{\theta_n} \rangle \to 0 .
\]
Passing to the limit by weak-$\star$ continuity of the pairings gives
\[
\langle v_j, \bar\mu_T - \mu_0 \rangle = \langle \mathcal{L}_f v_j, \bar\mu \rangle, \qquad j = 1,\dots,m .
\]
By linearity, the same holds for every $v \in \mathcal{V}_m$, so $(\bar\mu, \bar\mu_T) \in \mathcal{F}_m$. Consequently,
\[
P_m \le \langle \ell, \bar\mu \rangle + \langle g, \bar\mu_T \rangle .
\]
On the other hand, weak-$\star$ continuity of the cost gives
\[
\langle \ell, \bar\mu \rangle + \langle g, \bar\mu_T \rangle = \lim_{n \to \infty} J(\theta_n) \le \limsup_{n \to \infty} J(\theta_n) \le L_m .
\]
Hence $P_m \le L_m$, i.e.
\[
P_m \le \liminf_{\eta \downarrow 0} \liminf_{M \to \infty} P_{m,M}(\eta).
\]
Combining the upper and lower bounds yields
\[
\lim_{\eta \downarrow 0} \lim_{M \to \infty} P_{m,M}(\eta) = P_m .
\]
The second claim now follows immediately by taking $m \to \infty$ and using the assumed truncation consistency $P_m \to P$.
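The explicit residual components $r^{(m)}_{\theta,j}$ that drive Theorem 1 can be illustrated numerically. The following is a minimal Python sketch under stated assumptions, not the authors' implementation: it assumes the scalar system $\dot x = u$, a Dirac-supported candidate (a single curve $t \mapsto x(t)$ with control $u(t)$ standing in for the density $\rho_\theta$), a small monomial test basis, midpoint quadrature, and the Euclidean norm as a stand-in for $\|\cdot\|_{*,m}$. All function names are hypothetical.

```python
import math

def liouville_residual(tests, x, u, f, x0, t0, T, n=2000):
    """Weak Liouville residual for a Dirac-supported candidate (x(.), u(.)).

    tests: list of (tau, dtau_dt, dtau_dx) callables of (t, x).
    Returns [tau_j(T, x(T)) - tau_j(t0, x0) - int_{t0}^{T} (L_f tau_j) dt]_j,
    with the time integral approximated by the midpoint rule.
    """
    h = (T - t0) / n
    out = []
    for tau, tau_t, tau_x in tests:
        integral = 0.0
        for i in range(n):
            t = t0 + (i + 0.5) * h
            xt, ut = x(t), u(t)
            integral += (tau_t(t, xt) + tau_x(t, xt) * f(t, xt, ut)) * h
        out.append(tau(T, x(T)) - tau(t0, x0) - integral)
    return out

def penalized_objective(cost, residual, lam):
    """J + lam * ||r||, with the Euclidean norm assumed for ||.||_{*,m}."""
    return cost + lam * math.sqrt(sum(r * r for r in residual))

# Assumed test basis: tau in {t, x, t*x, x^2}, each with its partial derivatives.
TESTS = [
    (lambda t, x: t,     lambda t, x: 1.0, lambda t, x: 0.0),
    (lambda t, x: x,     lambda t, x: 0.0, lambda t, x: 1.0),
    (lambda t, x: t * x, lambda t, x: x,   lambda t, x: t),
    (lambda t, x: x * x, lambda t, x: 0.0, lambda t, x: 2.0 * x),
]

f = lambda t, x, u: u          # dynamics x' = u
u_const = lambda t: 1.0        # constant control

# x(t) = t is consistent with x' = 1; x(t) = 2t is not.
r_ok = liouville_residual(TESTS, lambda t: t, u_const, f, 0.0, 0.0, 1.0)
r_bad = liouville_residual(TESTS, lambda t: 2.0 * t, u_const, f, 0.0, 0.0, 1.0)
```

For the consistent curve every component vanishes up to rounding, whereas the inconsistent curve is detected by the tests $\tau = x$, $\tau = tx$, and $\tau = x^2$ but not by $\tau = t$. This is the sense in which feasibility is only enforced up to the chosen test space.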
Theorem 1 shows that convergence relies on two decoupled approximation layers: the density of the trial family at a fixed weak-form truncation level $m$, and the subsequent refinement of the test family ($m \to \infty$). Whether the fixed-$m$ density hypothesis holds is realization-dependent and depends on the expressive richness of the chosen primal family.

This factorization will contrast with the implicit rollout-based realization below, which does not approximate the truncated sets $\mathcal{F}_m$ but instead targets exact feasibility through vanishing aggregated rollout residuals and interface defects.

D. Implicit FOM and its asymptotic consistency

We next consider the rollout-native realization of FOM. Instead of representing primal feasibility by finitely many weak Liouville tests, this realization generates local dynamics directly on each time segment and measures global inconsistency through interface defects and local rollout residuals. This is the natural setting for practical trajectory optimization, including single shooting, multiple shooting, and more general segmented simulation-based schemes. We now formalize this implicit realization and state its asymptotic consistency result.

Fix a time-domain partition
\[
\Pi := \{ \tau_0 = t_0 < \tau_1 < \cdots < \tau_K = T \}.
\]
For each finite capacity level $M$, let $\Theta^{(M)}_\Pi$ denote a finite-dimensional segmented-rollout family. That is, each parameter $\theta$ specifies a segmentwise simulation-based realization, including the local rollout objects on each segment and their inter-segment coupling data.

For every parameter $\theta \in \Theta^{(M)}_\Pi$, the realization produces, on each segment $[\tau_k, \tau_{k+1}]$, a local occupation-type measure
\[
\mu_{\theta,k} \in \mathcal{M}_+([\tau_k, \tau_{k+1}] \times X \times U), \qquad k = 0,\dots,K-1,
\]
together with endpoint measures $\nu^+_{\theta,k}, \nu^-_{\theta,k+1} \in \mathcal{M}_+(X)$, representing the segmentwise initial and terminal states in deterministic realizations, or more generally the corresponding endpoint distributions in empirical rollout realizations.
The first segment is initialized from the prescribed initial measure, so under the boundary-slice identification we regard $\mu_0$ as its $X$-marginal and write $\nu^+_{\theta,0} = \mu_0$. The induced global primal pair is then
\[
\mu_\theta := \sum_{k=0}^{K-1} \mu_{\theta,k}, \qquad \mu_{T,\theta} := \delta_T \otimes \nu^-_{\theta,K},
\]
which embeds the terminal $X$-measure back into the boundary component of $\mathcal{M}$ via the boundary-slice identification. The realized cost is
\[
J(\theta) := \langle \ell, \mu_\theta \rangle + \langle g, \mu_{T,\theta} \rangle .
\]

For the implicit realization, the residual mechanism relies on two sources of inconsistency. The first is the interface defect. For each interior node $\tau_k$, define
\[
d_{\theta,k} := \nu^+_{\theta,k} - \nu^-_{\theta,k}, \qquad k = 1,\dots,K-1 .
\]
This measures the mismatch between the initial measure of segment $k$ and the terminal measure of segment $k-1$. If all interface defects vanish, the segmentwise realizations can be concatenated into a globally consistent object.

The second is the local rollout residual. For each $v \in C^1([t_0,T] \times X)$, define
\[
e_{\theta,k}(v) := \langle v(\tau_{k+1},\cdot), \nu^-_{\theta,k+1} \rangle - \langle v(\tau_k,\cdot), \nu^+_{\theta,k} \rangle - \langle \mathcal{L}_f v, \mu_{\theta,k} \rangle, \qquad k = 0,\dots,K-1 .
\]
This is the weak Liouville defect of the local rollout on the $k$th segment. For an exact segmentwise rollout, one has $e_{\theta,k}(v) = 0$ for every $v$. For a numerical ODE integrator, this term measures the local weak error of the induced segmentwise realization.

These two objects are sufficient to recover the global weak residual. Indeed, defining
\[
\mathcal{R}_\theta(v) := \langle v, \mu_{T,\theta} - \mu_0 \rangle - \langle \mathcal{L}_f v, \mu_\theta \rangle,
\]
one has the following decomposition.

Lemma 1 (Global residual decomposition for segmented rollout). For every $\theta \in \Theta^{(M)}_\Pi$ and every $v \in C^1([t_0,T] \times X)$,
\[
\mathcal{R}_\theta(v) = \sum_{k=0}^{K-1} e_{\theta,k}(v) + \sum_{k=1}^{K-1} \langle v(\tau_k,\cdot), d_{\theta,k} \rangle .
\]

Proof. By definition,
\[
\langle \mathcal{L}_f v, \mu_\theta \rangle = \sum_{k=0}^{K-1} \langle \mathcal{L}_f v, \mu_{\theta,k} \rangle .
\]
Substituting the definition of $e_{\theta,k}(v)$ gives
\[
\langle \mathcal{L}_f v, \mu_{\theta,k} \rangle = \langle v(\tau_{k+1},\cdot), \nu^-_{\theta,k+1} \rangle - \langle v(\tau_k,\cdot), \nu^+_{\theta,k} \rangle - e_{\theta,k}(v).
\]
Summing over $k$ and rearranging yields
\[
\langle \mathcal{L}_f v, \mu_\theta \rangle = \langle v, \mu_{T,\theta} \rangle - \langle v(t_0,\cdot), \nu^+_{\theta,0} \rangle - \sum_{k=1}^{K-1} \langle v(\tau_k,\cdot), d_{\theta,k} \rangle - \sum_{k=0}^{K-1} e_{\theta,k}(v).
\]
Since $\nu^+_{\theta,0} = \mu_0$, this is exactly the desired identity.

Lemma 1 identifies the exact mechanism by which segmented rollout recovers the global weak residual: not through a finite test family, but through the accumulation of local rollout residuals and interface defects. This immediately yields the exact special cases of practical interest. If each segment is generated by exact rollout, then
\[
e_{\theta,k}(v) \equiv 0 \quad \forall v, \ \forall k .
\]
If, in addition, all interface defects vanish, then
\[
\mathcal{R}_\theta(v) = 0 \quad \forall v \in C^1([t_0,T] \times X),
\]
hence the induced global pair is exactly OM-feasible.

To formulate a restricted optimization problem, fix a defect norm $\|\cdot\|_{\mathrm{def}}$ on signed endpoint measures such that, for every $\phi \in C(X)$, the pairing $\sigma \mapsto \langle \phi, \sigma \rangle$ is continuous. Denote its operator norm by
\[
\| \phi \|_{\mathrm{def},*} := \sup_{\| \sigma \|_{\mathrm{def}} \le 1} | \langle \phi, \sigma \rangle | .
\]
Define the aggregate interface defect
\[
D_\theta := \sum_{k=1}^{K-1} \| d_{\theta,k} \|_{\mathrm{def}} .
\]
Likewise, define the aggregate rollout residual size by
\[
E_\theta := \sup_{\| v \|_{C^1} \le 1} \ \sum_{k=0}^{K-1} e_{\theta,k}(v).
\]
The segmented-rollout restricted feasible family is then
\[
\mathcal{F}^{\mathrm{imp}}_{\Pi,M}(\eta,\delta) := \Big\{ \theta \in \Theta^{(M)}_\Pi : D_\theta \le \eta, \ E_\theta \le \delta \Big\},
\]
and the associated restricted value is
\[
P^{\mathrm{imp}}_{\Pi,M}(\eta,\delta) := \inf_{\theta \in \mathcal{F}^{\mathrm{imp}}_{\Pi,M}(\eta,\delta)} J(\theta).
\]
Accordingly, implicit consistency is organized directly around the decay of aggregated rollout residuals and interface defects, rather than approximation of a truncated weak feasible set $\mathcal{F}_m$.

The continuity requirement on the cost is the same as in the explicit case. What changes is the compactness and density mechanism.
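The bookkeeping behind Lemma 1 can be checked on a small deterministic example. The sketch below is illustrative only and rests on assumptions not taken from the paper: Dirac endpoint measures, the scalar system $\dot x = u$ with $u \equiv 1$, two segments given in closed form (the second one starts with a jump and has a deliberately wrong slope to mimic integrator error), and midpoint quadrature. The interface term pairs $v(\tau_k,\cdot)$ with $\nu^+_{\theta,k} - \nu^-_{\theta,k}$, i.e. the initial measure of segment $k$ minus the terminal measure of segment $k-1$, which is the orientation for which the decomposition holds with plus signs. Function names are hypothetical.

```python
def segment_lie_integral(v_t, v_x, f, path, u, a, b, n=400):
    """<L_f v, mu_k> for a Dirac-supported segment measure (midpoint rule)."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        x = path(t)
        total += (v_t(t, x) + v_x(t, x) * f(t, x, u(t))) * h
    return total

def residual_terms(v, v_t, v_x, f, u, nodes, paths):
    """Return (R, e, iface): global residual, local residuals, interface terms."""
    K = len(paths)
    lie = [segment_lie_integral(v_t, v_x, f, paths[k], u, nodes[k], nodes[k + 1])
           for k in range(K)]
    start = [paths[k](nodes[k]) for k in range(K)]      # support of nu^+_k
    end = [paths[k](nodes[k + 1]) for k in range(K)]    # support of nu^-_{k+1}
    # Local rollout residuals e_k(v) on each segment.
    e = [v(nodes[k + 1], end[k]) - v(nodes[k], start[k]) - lie[k]
         for k in range(K)]
    # Interface pairings <v(tau_k, .), nu^+_k - nu^-_k> at interior nodes.
    iface = [v(nodes[k], start[k]) - v(nodes[k], end[k - 1])
             for k in range(1, K)]
    # Global weak residual R(v) = <v, mu_T - mu_0> - <L_f v, mu>.
    R = v(nodes[-1], end[-1]) - v(nodes[0], start[0]) - sum(lie)
    return R, e, iface

# Test function v(t, x) = t*x + x^2 and its partial derivatives.
v = lambda t, x: t * x + x * x
v_t = lambda t, x: x
v_x = lambda t, x: t + 2.0 * x

f = lambda t, x, u: u
u = lambda t: 1.0
nodes = [0.0, 0.5, 1.0]
paths = [
    lambda t: t,                      # exact rollout from x0 = 0
    lambda t: 0.7 + 1.1 * (t - 0.5),  # starts at 0.7 (jump), slope 1.1 instead of 1
]

R, e, iface = residual_terms(v, v_t, v_x, f, u, nodes, paths)
```

Here `e[0]` vanishes because the first segment is an exact rollout, `e[1]` is nonzero because the second segment does not satisfy the dynamics, and `iface[0]` is nonzero because of the jump at $\tau_1$; their sum reproduces `R` up to rounding.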
We say that the segmented-rollout family is asymptotically compact if every sequence $M_n \to \infty$, $\theta_n \in \Theta^{(M_n)}_\Pi$ satisfying
\[
\sup_n J(\theta_n) < \infty, \qquad D_{\theta_n} \to 0, \qquad E_{\theta_n} \to 0,
\]
admits a weak-$\star$ convergent subsequence of induced primal pairs
\[
(\mu_{\theta_n}, \mu_{T,\theta_n}) \rightharpoonup^{\star} (\bar\mu, \bar\mu_T).
\]
We say that the segmented-rollout family is asymptotically dense in the exact OM-feasible set if for every exact OM-feasible pair $(\mu, \mu_T)$ there exists a sequence $\theta_M \in \Theta^{(M)}_\Pi$ such that
\[
(\mu_{\theta_M}, \mu_{T,\theta_M}) \rightharpoonup^{\star} (\mu, \mu_T), \qquad D_{\theta_M} \to 0, \qquad E_{\theta_M} \to 0 .
\]

Example 2 (A deterministic optimal-control realization of implicit FOM). Here we consider an implicit realization with $K = 1$. Let $\Xi$ be a control-realization space, where each $\xi \in \Xi$ determines an admissible control signal $u_\xi(\cdot)$, and let $q_\theta \in \mathcal{P}(\Xi)$ be a control-side measure depending on finitely many parameters $\theta$. Given an initial measure $\mu_0$, the deterministic ODE dynamics pushes forward $(x_0, \xi) \sim \mu_0 \otimes q_\theta$ to trajectories $x_{x_0,\xi}(\cdot)$, thereby inducing a primal pair $(\mu_\theta, \mu_{T,\theta})$ and the cost
\[
J(\theta) = \langle \ell, \mu_\theta \rangle + \langle g, \mu_{T,\theta} \rangle = \mathbb{E}_{x_0 \sim \mu_0,\ \xi \sim q_\theta} \left[ \int_{t_0}^{T} \ell\big(t, x_{x_0,\xi}(t), u_\xi(t)\big)\,dt + g\big(x_{x_0,\xi}(T)\big) \right].
\]
When $q_\theta$ collapses to a Dirac mass, this reduces to standard deterministic trajectory optimization.

In contrast with the explicit realization, the primal side does not introduce a finite Liouville test family directly. Instead, one samples $(x_0, \xi)$, rolls out the induced trajectories, and estimates the empirical cost. If the horizon is treated segmentwise, or if the rollout uses an approximate integrator, the resulting implicit residual indicators are precisely the rollout-based defect terms introduced in §III-D, such as within-segment defects and interface mismatches. A corresponding primal step therefore takes the form
\[
\min_{\theta \in \Theta^{(M)}} \Big\{ J(\theta) + \lambda_E E_\theta + \lambda_D D_\theta \Big\},
\]
or its empirical analogue computed from sampled rollouts.
The dual side remains the same finite certificate family $\mathcal{V}^{(L)}_\Psi$ as above, with slacks defined by (12). With $\theta$ fixed, and $\mathcal{R}_\theta$ monitored on the primal side, (15) yields the dual update on primal-induced samples $n = 1,\dots,N$, as
\[
\min_{\psi}\ \frac{1}{N} \sum_{n=1}^{N} \left[ \int_{t_0}^{T} s_\psi\big(t, x^{(n)}(t), u^{(n)}(t)\big)\,dt + s_{T,\psi}\big(x^{(n)}(T)\big) \right]
\quad \text{s.t.} \quad s_\psi \ge -\varepsilon, \quad s_{T,\psi} \ge -\varepsilon_T .
\]

The next result is therefore parallel to Theorem 1, except that the truncated feasible sets $\mathcal{F}_m$ are replaced by the exact OM-feasible set, and exact feasibility is recovered from vanishing rollout residuals and interface defects.

Theorem 2 (Asymptotic consistency of implicit FOM). Let $P$ denote the optimal value of the exact OM LP, and write $P_{\Pi,M}(\eta,\delta) := P^{\mathrm{imp}}_{\Pi,M}(\eta,\delta)$. Assume that the segmented-rollout families are nested,
\[
\Theta^{(M)}_\Pi \subseteq \Theta^{(M+1)}_\Pi \quad \forall M \ge 1,
\]
that the cost functionals are weak-$\star$ continuous on the relevant asymptotically feasible class, and that the segmented-rollout family is asymptotically compact and asymptotically dense in the exact OM-feasible set in the sense described above. Then
\[
\lim_{\delta \downarrow 0} \lim_{\eta \downarrow 0} \lim_{M \to \infty} P^{\mathrm{imp}}_{\Pi,M}(\eta,\delta) = P .
\]

Proof. For every fixed $(\eta,\delta)$, the restricted feasible sets are nested in $M$. Hence
\[
P_{\Pi,M+1}(\eta,\delta) \le P_{\Pi,M}(\eta,\delta),
\]
so the inner limit
\[
A(\eta,\delta) := \lim_{M \to \infty} P_{\Pi,M}(\eta,\delta)
\]
exists in $[-\infty,+\infty]$, being the limit of a nonincreasing sequence of extended reals.

The proof follows the same upper/lower-bound template as Theorem 1. The only substantive difference is that the lower-bound argument must recover exact OM feasibility from the segmented-rollout residual mechanism.

For the upper bound, fix $\varepsilon > 0$ and choose an exact OM-feasible pair $(\mu^\varepsilon, \mu^\varepsilon_T)$ such that
\[
\langle \ell, \mu^\varepsilon \rangle + \langle g, \mu^\varepsilon_T \rangle \le P + \varepsilon .
\]
By asymptotic density, there exists a sequence $\theta^\varepsilon_M \in \Theta^{(M)}_\Pi$ such that
\[
(\mu_{\theta^\varepsilon_M}, \mu_{T,\theta^\varepsilon_M}) \rightharpoonup^{\star} (\mu^\varepsilon, \mu^\varepsilon_T), \qquad D_{\theta^\varepsilon_M} \to 0, \qquad E_{\theta^\varepsilon_M} \to 0 .
\]
Fix $(\eta,\delta)$ with $\eta, \delta > 0$. For all sufficiently large $M$, one has
\[
D_{\theta^\varepsilon_M} \le \eta, \qquad E_{\theta^\varepsilon_M} \le \delta, \qquad J(\theta^\varepsilon_M) \le P + 2\varepsilon .
\]
Therefore $A(\eta,\delta) \le P + 2\varepsilon$. Letting $\varepsilon \downarrow 0$ yields
\[
\limsup_{\delta \downarrow 0} \limsup_{\eta \downarrow 0} A(\eta,\delta) \le P .
\]

For the lower bound, set
\[
L := \liminf_{\delta \downarrow 0} \liminf_{\eta \downarrow 0} A(\eta,\delta).
\]
Repeating the diagonal selection used in Theorem 1, choose sequences $\delta_n \downarrow 0$, $\eta_n \downarrow 0$, $M_n \to \infty$, and parameters $\theta_n \in \Theta^{(M_n)}_\Pi$ such that
\[
D_{\theta_n} \le \eta_n, \qquad E_{\theta_n} \le \delta_n, \qquad J(\theta_n) \le P_{\Pi,M_n}(\eta_n,\delta_n) + \tfrac{1}{n},
\]
and hence $\limsup_{n \to \infty} J(\theta_n) \le L$. By asymptotic compactness, after passage to a subsequence,
\[
(\mu_{\theta_n}, \mu_{T,\theta_n}) \rightharpoonup^{\star} (\bar\mu, \bar\mu_T).
\]
We claim that $(\bar\mu, \bar\mu_T)$ is exact OM-feasible. Indeed, Lemma 1 gives, for every $v \in C^1([t_0,T] \times X)$,
\[
\mathcal{R}_{\theta_n}(v) = \sum_{k=0}^{K-1} e_{\theta_n,k}(v) + \sum_{k=1}^{K-1} \langle v(\tau_k,\cdot), d_{\theta_n,k} \rangle .
\]
Hence
\[
| \mathcal{R}_{\theta_n}(v) | \le E_{\theta_n} \| v \|_{C^1} + C_\Pi(v)\, D_{\theta_n},
\qquad \text{where} \qquad
C_\Pi(v) := \sum_{k=1}^{K-1} \| v(\tau_k,\cdot) \|_{\mathrm{def},*} < \infty .
\]
Therefore $\mathcal{R}_{\theta_n}(v) \to 0$ for every $v \in C^1([t_0,T] \times X)$. Passing to the weak-$\star$ limit in the definition of $\mathcal{R}_{\theta_n}(v)$ gives
\[
\langle v, \bar\mu_T - \mu_0 \rangle = \langle \mathcal{L}_f v, \bar\mu \rangle \quad \forall v \in C^1([t_0,T] \times X),
\]
so $(\bar\mu, \bar\mu_T)$ is exact OM-feasible. Consequently,
\[
P \le \langle \ell, \bar\mu \rangle + \langle g, \bar\mu_T \rangle .
\]
By weak-$\star$ continuity of the cost,
\[
\langle \ell, \bar\mu \rangle + \langle g, \bar\mu_T \rangle = \lim_{n \to \infty} J(\theta_n) \le \limsup_{n \to \infty} J(\theta_n) \le L .
\]
Thus
\[
P \le \liminf_{\delta \downarrow 0} \liminf_{\eta \downarrow 0} \lim_{M \to \infty} P_{\Pi,M}(\eta,\delta).
\]
Combining the upper and lower bounds proves
\[
\lim_{\delta \downarrow 0} \lim_{\eta \downarrow 0} \lim_{M \to \infty} P_{\Pi,M}(\eta,\delta) = P .
\]

Remark 5 (Compact support as a sufficient compactness mechanism). A convenient sufficient condition for asymptotic compactness is that the rollout-induced occupation and terminal measures remain supported in fixed compact sets and have uniformly bounded total mass. We isolate asymptotic compactness in Theorem 2 because this is the property actually used in the proof; compact support is only one mechanism that guarantees it.
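The sampled dual update of Example 2 can be made concrete on a toy problem. The following Python sketch is a hedged illustration rather than the paper's algorithm. Equation (12) is not reproduced in this section, so the sketch assumes the slacks $s_\psi = \mathcal{L}_f v_\psi + \ell$ and $s_{T,\psi} = g - v_\psi(T,\cdot)$, which matches the feasibility inequalities stated in Proposition 3. It further assumes the scalar problem $\dot x = u$, $\ell = x^2 + u^2$, $g = x^2$, a one-parameter certificate family $v_c(x) = c\,x^2$, explicit Euler rollouts under the policy $u = -x$, and a plain grid search over $c$ in place of a real optimizer. All names are hypothetical.

```python
def slack(c, x, u):
    """Running slack s = L_f v + l for v = c*x^2, f = u, l = x^2 + u^2."""
    return 2.0 * c * x * u + x * x + u * u

def terminal_slack(c, x):
    """Terminal slack s_T = g - v(T, .) with g = x^2."""
    return (1.0 - c) * x * x

def rollout(x0, policy, T=1.0, n=200):
    """Explicit Euler rollout of x' = u with u = policy(x); returns (xs, us, h)."""
    h = T / n
    xs, us = [x0], []
    for _ in range(n):
        u = policy(xs[-1])
        us.append(u)
        xs.append(xs[-1] + h * u)
    return xs, us, h

def sample_objective(c, rollouts):
    """(1/N) sum_n [ int s dt + s_T(x(T)) ], left Riemann sum in time."""
    total = 0.0
    for xs, us, h in rollouts:
        total += sum(slack(c, x, u) for x, u in zip(xs, us)) * h
        total += terminal_slack(c, xs[-1])
    return total / len(rollouts)

def feasible(c, grid, eps, eps_T):
    """Check s >= -eps and s_T >= -eps_T on a coarse (x, u) grid."""
    return (all(slack(c, x, u) >= -eps for x in grid for u in grid)
            and all(terminal_slack(c, x) >= -eps_T for x in grid))

grid = [-2.0 + 0.25 * i for i in range(17)]              # coarse grid on [-2, 2]
rollouts = [rollout(x0, lambda x: -x) for x0 in (-1.0, 0.5, 1.5)]
candidates = [0.25 * i for i in range(7)]                # c in {0, 0.25, ..., 1.5}
admissible = [c for c in candidates if feasible(c, grid, 1e-9, 1e-9)]
best_c = min(admissible, key=lambda c: sample_objective(c, rollouts))
```

In this toy setting the candidates with $c > 1$ violate the slack constraints on the grid, and among the admissible ones the sample objective is minimized at $c = 1$, where the slack $(x+u)^2$ vanishes along the sampled rollouts.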
A useful special case is exact segmentwise rollout, for which $E_\theta \equiv 0$ and the consistency statement reduces to
\[
\lim_{\eta \downarrow 0} \lim_{M \to \infty} P^{\mathrm{imp}}_{\Pi,M}(\eta, 0) = P .
\]
Single shooting corresponds to $K = 1$, so $D_\theta \equiv 0$ as well. These simplifications remove parts of the residual mechanism, but not the density requirement: exact rollout and single shooting do not by themselves imply asymptotic consistency.

Both explicit and implicit realizations converge asymptotically to the exact OM LP, despite utilizing different mechanisms to enforce primal feasibility.

IV. COMPUTATIONAL CONSEQUENCES OF FOM

A. Structural complexity over state-space complexity

The canonical curse of dimensionality in dynamic programming comes from representing a global state–time object on a full-state grid. From the FOM viewpoint, that object is the certificate. Thus the relevant issue is not the ambient dimension $n$ alone, but whether the certificate admits a low-complexity structural organization, since this is what governs both tractability and online reuse.

Such structure is not limited to full separability and is far from rare in practice. In particular, constructive nonlinear control for strict-feedback, strict-feedforward, and interlaced systems, especially under suitable cost shaping or inverse-optimal design, often leads to recursive, low-dimensional, and hierarchical Lyapunov or control-Lyapunov certificates, which are more faithfully viewed as block-organized than fully separable [47], [48]. A complementary source of structure comes from compositional stability theory: small-gain arguments yield separable Lyapunov or control-Lyapunov certificates for interconnected systems [49], while weakly coupled optimal control problems may still admit approximately block-localized value representations through decaying sensitivities [49], [50].
The results below apply precisely when these structural insights can be formalized as an additive certificate organization over a block family $\{S_k\}_{k=1}^{K}$, allowing overlaps and nested blocks.

Fix an index family
\[
S_k \subset \{1,\dots,n\}, \qquad k = 1,\dots,K,
\]
where overlaps and nested blocks are allowed, and let $x_{S_k}$ denote the corresponding coordinate subvector. We assume a structured certificate class of the form
\[
\mathcal{V}_{\{S_k\}} := \Big\{ v \in C^1([t_0,T] \times X) : v(t,x) = \sum_{k=1}^{K} v_k(t, x_{S_k}), \ v_k \in C^1([t_0,T] \times X_{S_k}) \Big\}.
\]
For each block $S_k$, define the associated blockwise transport operator $\mathcal{A}_k : C^1([t_0,T] \times X_{S_k}) \to C(Z)$ by
\[
(\mathcal{A}_k \phi)(t,x,u) := \partial_t \phi(t, x_{S_k}) + \nabla_{x_{S_k}} \phi(t, x_{S_k})^{\top} f_{S_k}(t,x,u).
\]
We say that the optimal control problem admits a block-organized certificate structure over $\{S_k\}_{k=1}^{K}$ if there exist functions $\ell_k \in C(Z)$, $g_k \in C((X_T)_{S_k})$, $k = 1,\dots,K$, such that $\ell = \sum_{k=1}^{K} \ell_k$ on $Z$, and $g(x) = \sum_{k=1}^{K} g_k(x_{S_k})$ on $X_T$. For every represented certificate
\[
v(t,x) = \sum_{k=1}^{K} v_k(t, x_{S_k}), \qquad v_k \in C^1([t_0,T] \times X_{S_k}),
\]
one then has
\[
\mathcal{L}_f v + \ell = \sum_{k=1}^{K} \big( \mathcal{A}_k v_k + \ell_k \big) \quad \text{on } Z, \tag{19}
\]
and
\[
g(x) - v(T,x) = \sum_{k=1}^{K} \big( g_k(x_{S_k}) - v_k(T, x_{S_k}) \big) \quad \text{on } X_T . \tag{20}
\]
This is a strong structural hypothesis, and it is stated as such. Completely decoupled systems are immediate special cases, while layered or triangular systems motivate approximate versions of the same idea.

Theorem 3 (Global dual feasibility of exact OM can be certified blockwise). Assume the block-organized certificate structure (19)–(20). Suppose there exist functions
\[
v^\star_k \in C^1([t_0,T] \times X_{S_k}), \qquad k = 1,\dots,K,
\]
such that
\[
\mathcal{A}_k v^\star_k + \ell_k \ge 0 \ \text{on } Z, \qquad v^\star_k(T,\cdot) \le g_k \ \text{on } (X_T)_{S_k}, \tag{21}
\]
for every $k$. Define
\[
v^\star(t,x) := \sum_{k=1}^{K} v^\star_k(t, x_{S_k}).
\]
Then $v^\star$ is dual-feasible for the exact OM dual problem (7), and hence
\[
\langle v^\star(t_0,\cdot), \mu_0 \rangle \le P .
\]
If, in addition, each $v^\star_k$ is represented in a local class of size $r_k$, then $v^\star$ is represented by a total of
\[
r_{\mathrm{tot}} := \sum_{k=1}^{K} r_k
\]
coefficients.

Proof. By (19) and (21),
\[
\mathcal{L}_f v^\star + \ell = \sum_{k=1}^{K} \big( \mathcal{A}_k v^\star_k + \ell_k \big) \ge 0 \quad \text{on } Z .
\]
Likewise, by (20) and (21),
\[
g - v^\star(T,\cdot) = \sum_{k=1}^{K} \big( g_k - v^\star_k(T,\cdot) \big) \ge 0 \quad \text{on } X_T .
\]
Hence $v^\star$ is dual-feasible, and the lower bound follows by weak duality. The parameter count is additive by construction.

Theorem 3 shows that global dual feasibility can be certified blockwise whenever a block-organized exact certificate exists, avoiding a full-state tensor-product representation. We next relax this to the approximate case. For each block $k$, let
\[
\mathcal{V}^{(r)}_k \subset C^1([t_0,T] \times X_{S_k})
\]
be a nested blockwise realization family. We assume only a blockwise approximation property measured directly at the level of the HJB inequalities.

Corollary 1 (Approximate blockwise certificates preserve certified lower bounds). Under the assumptions of Theorem 3, let
\[
v^\star(t,x) = \sum_{k=1}^{K} v^\star_k(t, x_{S_k})
\]
be the exact globally feasible certificate assembled there. Suppose that for every $\eta > 0$ and every block $k$ there exists $\tilde v_{k,\eta} \in \mathcal{V}^{(r_k(\eta))}_k$ such that
\[
\| \mathcal{A}_k ( \tilde v_{k,\eta} - v^\star_k ) \|_\infty \le \eta, \qquad \| \tilde v_{k,\eta}(T,\cdot) - v^\star_k(T,\cdot) \|_\infty \le \eta . \tag{22}
\]
Define
\[
\tilde v_\eta(t,x) := \sum_{k=1}^{K} \tilde v_{k,\eta}(t, x_{S_k}).
\]
Then $\tilde v_\eta$ is $(K\eta, K\eta)$-feasible in the sense of (9). Consequently,
\[
\langle \tilde v_\eta(t_0,\cdot), \mu_0 \rangle - (T - t_0)\,\mu_0(X)\,K\eta - \mu_0(X)\,K\eta \le P .
\]
Moreover, $\tilde v_\eta$ is represented with
\[
r_{\mathrm{tot}}(\eta) := \sum_{k=1}^{K} r_k(\eta)
\]
blockwise coefficients.

Proof. By (19),
\[
\mathcal{L}_f \tilde v_\eta + \ell = \sum_{k=1}^{K} \big( \mathcal{A}_k \tilde v_{k,\eta} + \ell_k \big) = \sum_{k=1}^{K} \big( \mathcal{A}_k v^\star_k + \ell_k \big) + \sum_{k=1}^{K} \mathcal{A}_k ( \tilde v_{k,\eta} - v^\star_k ).
\]
The first sum is nonnegative by (21), and the second is bounded below by $-K\eta$ by (22). Hence
\[
\mathcal{L}_f \tilde v_\eta + \ell \ge -K\eta .
\]
Similarly,
\[
g - \tilde v_\eta(T,\cdot) = \sum_{k=1}^{K} \big( g_k - v^\star_k(T,\cdot) \big) + \sum_{k=1}^{K} \big( v^\star_k(T,\cdot) - \tilde v_{k,\eta}(T,\cdot) \big) \ge -K\eta .
\]
Thus $\tilde v_\eta$ is $(K\eta, K\eta)$-feasible. The certified lower bound follows from Proposition 1, and the coefficient count is additive by construction.

Theorem 3 and Corollary 1 together show more than the well-known fact that structured representations can ease high-dimensional HJB computation [14], [16], [51]. The key point is that the same separable class can serve both as an exact certificate class, whenever the true HJB solution admits that structure, and as an approximate certificate class whose finite realizations still preserve a certified lower bound. In this sense, the result identifies a certificate-preserving bridge from exact structured HJB objects to finite structured approximations.

For a target local accuracy $\eta$, the required number of coefficients scales as $r_{\mathrm{tot}}(\eta) = \sum_{k=1}^{K} r_k(\eta)$. For instance, if the chosen blockwise realization families satisfy approximation estimates of the form
\[
r_k(\eta) \lesssim \eta^{-\beta_k}, \qquad k = 1,\dots,K,
\]
for sufficiently small $\eta$, then
\[
r_{\mathrm{tot}}(\eta) \lesssim \sum_{k=1}^{K} \eta^{-\beta_k} \lesssim K \eta^{-\beta}, \qquad \beta := \max_{1 \le k \le K} \beta_k .
\]
Thus, when the exponents $\beta_k$ are controlled by local block complexity rather than by the full ambient dimension $n$, the dominant approximation burden is paid blockwise rather than globally. This is the intended sense in which certificate-first computation can mitigate the canonical curse of dimensionality: it does not remove hardness in the worst case, but relocates the approximation burden to the structural complexity of the certificate class.

Example 3 (A hierarchical certificate for a strict-feedback system). Consider the autonomous strict-feedback system
\[
\dot\xi = f(\xi) + g(\xi)\,\eta, \qquad \dot\eta = u,
\]
with a stabilization objective. Suppose a backstepping design yields a stabilizing feedback law together with a Lyapunov function, and that, by a suitable inverse-optimal construction, this Lyapunov function takes the form
\[
V(\xi,\eta) = v_1(\xi) + \tfrac{1}{2} \big( \eta - \alpha(\xi) \big)^2 .
\]
Let $v_2(\xi,\eta) = \tfrac{1}{2} ( \eta - \alpha(\xi) )^2$ and take $S_1 = \{\xi\}$ and $S_2 = \{\xi,\eta\}$. With the stationary transport operator (Lie derivative)
\[
\mathcal{L}_f v = \partial_\xi v \, \big( f(\xi) + g(\xi)\eta \big) + \partial_\eta v \, u,
\]
the corresponding blockwise contributions are
\[
\mathcal{A}_1 v_1 = v_1'(\xi) \big( f(\xi) + g(\xi)\eta \big), \qquad
\mathcal{A}_2 v_2 = \partial_\xi v_2(\xi,\eta) \big( f(\xi) + g(\xi)\eta \big) + \partial_\eta v_2(\xi,\eta)\, u .
\]
Hence $\mathcal{L}_f V = \mathcal{A}_1 v_1 + \mathcal{A}_2 v_2$. If the running cost admits a compatible split
\[
\ell(\xi,\eta,u) = \ell_1(\xi,\eta,u) + \ell_2(\xi,\eta,u),
\]
and the blockwise inequalities $\mathcal{A}_1 v_1 + \ell_1 \ge 0$, $\mathcal{A}_2 v_2 + \ell_2 \ge 0$ hold, then
\[
\mathcal{L}_f V + \ell = ( \mathcal{A}_1 v_1 + \ell_1 ) + ( \mathcal{A}_2 v_2 + \ell_2 ) \ge 0 .
\]
Thus the global certificate condition is verified blockwise, even though the system is coupled and the certificate is not additively separable in the original coordinates.

This does not mean that such systems are automatically easy, but that the dominant complexity is determined by the available certificate architecture. Whenever a compatible blockwise or hierarchical certificate representation exists, the FOM burden follows that structure rather than an unstructured full-state grid. This again supports the certificate-first viewpoint.

B. Certificates are reusable online objects

Trajectory reuse gives only an initialization. After a horizon shift, a terminal-cost change, or a mild model update, a previously optimal trajectory may become infeasible or may simply lose its meaning as a guide for the new problem. Certificates behave differently because they are global inequalities rather than branch objects. This subsection records two elementary invariance properties that make that distinction precise. Write the horizon length as $H := T - t_0$.

Proposition 3 (Time-shift invariance of approximate certificates). Assume that the problem data are time-homogeneous, so that $f = f(x,u)$ and $\ell = \ell(x,u)$ are independent of time, and that the terminal cost is the same function $g$ on every horizon window of length $H$.
Let $v \in C^1([0,H] \times X)$ be $(\varepsilon, \varepsilon_T)$-feasible on $[0,H]$, i.e.,
\[
\partial_t v(t,x) + \nabla_x v(t,x)^{\top} f(x,u) + \ell(x,u) \ge -\varepsilon, \qquad \text{and} \qquad v(H,x) \le g(x) + \varepsilon_T .
\]
For any shift $\tau \ge 0$, define
\[
v_\tau(t,x) := v(t - \tau, x), \qquad (t,x) \in [\tau, \tau + H] \times X .
\]
Then $v_\tau$ is $(\varepsilon, \varepsilon_T)$-feasible for the shifted-horizon problem on $[\tau, \tau + H]$.

Proof. Since $f$ and $\ell$ are time-independent,
\[
\partial_t v_\tau(t,x) + \nabla_x v_\tau(t,x)^{\top} f(x,u) + \ell(x,u) = \partial_t v(t - \tau, x) + \nabla_x v(t - \tau, x)^{\top} f(x,u) + \ell(x,u) \ge -\varepsilon .
\]
At the shifted terminal time,
\[
v_\tau(\tau + H, x) = v(H,x) \le g(x) + \varepsilon_T .
\]
Hence $v_\tau$ is $(\varepsilon, \varepsilon_T)$-feasible on the shifted window.

The exact shift statement already explains why certificates are natural warm-start objects in receding-horizon settings. The more realistic question is what happens under small drift in the model or the cost. Here the answer is again favorable because approximate feasibility is stable under perturbation.

Proposition 4 (Perturbation stability of approximate certificates). Consider two fixed-horizon problems on the same domain with data $(f, \ell, g)$ and $(\tilde f, \tilde\ell, \tilde g)$. Let $v \in C^1([t_0,T] \times X)$ be $(\varepsilon, \varepsilon_T)$-feasible for $(f, \ell, g)$. Assume
\[
\| \tilde\ell - \ell \|_\infty \le \delta_\ell, \qquad \| \tilde f - f \|_\infty \le \delta_f, \qquad \| \tilde g - g \|_\infty \le \delta_g,
\]
and let $G_v := \| \nabla_x v \|_\infty$. Then $v$ is
\[
\big( \varepsilon + \delta_\ell + G_v \delta_f, \ \varepsilon_T + \delta_g \big)\text{-feasible}
\]
for the perturbed problem $(\tilde f, \tilde\ell, \tilde g)$.

Proof. By direct expansion,
\[
\partial_t v + \nabla_x v^{\top} \tilde f + \tilde\ell = \partial_t v + \nabla_x v^{\top} f + \ell + \nabla_x v^{\top} ( \tilde f - f ) + ( \tilde\ell - \ell ).
\]
Since $v$ is $(\varepsilon, \varepsilon_T)$-feasible for $(f, \ell, g)$,
\[
\partial_t v + \nabla_x v^{\top} f + \ell \ge -\varepsilon .
\]
Moreover,
\[
\nabla_x v^{\top} ( \tilde f - f ) \ge - \| \nabla_x v \|_\infty \| \tilde f - f \|_\infty \ge - G_v \delta_f,
\]
and $\tilde\ell - \ell \ge -\delta_\ell$. Therefore
\[
\partial_t v + \nabla_x v^{\top} \tilde f + \tilde\ell \ge - \big( \varepsilon + \delta_\ell + G_v \delta_f \big).
\]
At terminal time,
\[
v(T,x) \le g(x) + \varepsilon_T \le \tilde g(x) + \varepsilon_T + \delta_g .
\]
Hence the perturbed feasibility claim follows.

Corollary 2 (Certificate warm starts for online optimization).
In receding-horizon or online settings, a previously computed certificate remains a valid warm-start object after horizon shifts and small model or cost changes. More precisely, after applying Propositions 3 and 4, the same certificate continues to provide a certified lower bound through (10), with degradation explicit in the shift or perturbation budget.

Proof. Combine Propositions 3 and 4 with (10).

Unlike trajectory warm starts, which carry no built-in lower-bound guarantee and may become infeasible or suboptimal under environmental perturbations, global certificates provide a stable lower bound that degrades smoothly according to the slack terms.

Remark 6 (Certificates as robust warm-start objects). A previously realized trajectory is often a brittle warm-start object: small changes in the initial condition or problem geometry may induce a discontinuous switch of the optimal trajectory. By contrast, an approximate global certificate is a problem-level inequality object, and can therefore be reused directly to initialize the next dual update for a new problem with similar structure. In particular, a small continuous update of the certificate may still induce a nonsmooth change in the resulting trajectory. Figure 1 illustrates this effect for a constant-speed unicycle in a planar static obstacle-avoidance problem: Figure 1(b) is obtained from Figure 1(a) by a warm-started dual update after a structural obstacle change. The heat maps show the spatial component $v_1(x,y)$ of a decomposed approximate global certificate, evaluated on a planar mesh.

(a) Before obstacle change. (b) After obstacle change and an approximate certificate-guided search.
Fig. 1: Certificate-guided replanning under a structural obstacle change.

C. Global certificate as an interface across FOM realizations

The certified interface provides a direct computational consequence of Section III.
Once a method outputs an induced global primal pair together with a certificate candidate, the quantities $P(\psi)$ and $J(\theta,\psi)$ are already defined. Hence different realizations can be judged on the same scale even when their internal parameterizations are unrelated.

Corollary 3 (Certified comparability across realizations). Suppose two algorithms, possibly based on different realizations, output pairs $(\theta_a, \psi_a)$, $(\theta_b, \psi_b)$. Then the following comparisons are well-posed and realization-independent:
\[
P(\psi_a) \ \text{vs.} \ P(\psi_b), \qquad J(\theta_a, \psi_a) \ \text{vs.} \ J(\theta_b, \psi_b).
\]
In particular, algorithmic progress need not be judged by rollout cost alone; it can be judged through a common certified lower-side interface and its induced gap information.

Proof. The lower comparison $P(\psi_a)$ vs. $P(\psi_b)$ is realization-independent by Proposition 1. The realized lower interface comparison $J(\theta_a, \psi_a)$ vs. $J(\theta_b, \psi_b)$ is realization-independent by Proposition 2, which depends only on the induced primal pair, the certificate, and the residual term already defined in Section III.

Corollary 3 shows that realizations may generate their primal pairs in very different ways, with unrelated internal parameterizations, yet become directly comparable through the same lower-side interface and induced gap information once they expose a certificate and the associated residual information. This is what makes cross-realization warm starts and hybrid algorithm design conceptually clean.

Remark 7 (Offline proposal measures for certificate synthesis). Offline rollout distributions produced by external solvers may still be useful as proposal measures for target-specific certificate updates. What transfers is not the primal solution itself, but sampling support for the dual update.
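The slack degradation predicted by Proposition 4 can be checked on a grid. The Python sketch below is a hedged illustration under assumptions that are not part of the paper: the scalar problem $\dot x = u$, $\ell = x^2 + u^2$, $g = x^2$ on $X = [-1,1]$, $U = [-2,2]$, the certificate $v(t,x) = x^2$ (exactly feasible, so $\varepsilon = \varepsilon_T = 0$), and hand-picked bounded perturbations of $f$, $\ell$, and $g$. Names and perturbation shapes are hypothetical.

```python
import math

def hjb_slack(v_t, v_x, f, l, t, x, u):
    """Left-hand side d_t v + d_x v * f + l of the HJB inequality."""
    return v_t(t, x) + v_x(t, x) * f(t, x, u) + l(t, x, u)

# Nominal certificate v = x^2; nominal slack is (x + u)^2 >= 0.
v = lambda t, x: x * x
v_t = lambda t, x: 0.0
v_x = lambda t, x: 2.0 * x

# Perturbation budgets and perturbed data with sup-norm deviations
# at most d_f, d_l, d_g from f = u, l = x^2 + u^2, g = x^2.
d_f, d_l, d_g = 0.05, 0.02, 0.01
f_new = lambda t, x, u: u + d_f * math.sin(3.0 * x)
l_new = lambda t, x, u: x * x + u * u - d_l * math.cos(u)
g_new = lambda x: x * x - d_g

xs = [-1.0 + 0.05 * i for i in range(41)]   # grid on X = [-1, 1]
us = [-2.0 + 0.05 * i for i in range(81)]   # grid on U = [-2, 2]

G_v = max(abs(v_x(0.0, x)) for x in xs)     # gradient bound, about 2 on X
eps_new = 0.0 + d_l + G_v * d_f             # predicted running slack budget
eps_T_new = 0.0 + d_g                       # predicted terminal slack budget

worst_running = min(hjb_slack(v_t, v_x, f_new, l_new, 0.0, x, u)
                    for x in xs for u in us)
worst_terminal = min(g_new(x) - v(1.0, x) for x in xs)
```

On this grid the perturbed slack does become negative, so the certificate is no longer exactly feasible, but the violation stays within the predicted budgets `eps_new` and `eps_T_new`. The same check applies to any certificate for which a gradient bound $G_v$ is available.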
In summary, the FOM framework integrates local trajectory search with global HJB certification. By leveraging blockwise structures and perturbation stability, certificates serve as robust, reusable objects for online optimization and cross-algorithm evaluation.

V. CONCLUSION

This paper introduces the Featurized Occupation Measure (FOM), a unified primal-dual framework for nonlinear optimal control. By maintaining a finite-dimensional interface, FOM bridges the fundamental gap between scalable local trajectory search and globally valid Hamilton-Jacobi-Bellman (HJB) certification.

We established that FOM realizations are asymptotically consistent with the exact infinite-dimensional occupation measure problem. This holds whether the primal feasibility is enforced through explicit weak-form tests or implicit rollout-based mechanisms. Under this framework, both trajectory exploration and dual certificate updates are evaluated through a common, residual-aware gap metric.

Crucially, the FOM formulation shifts the computational burden from the ambient state dimension to the structural complexity of the certificate class. We proved that block-organized approximations preserve certified lower bounds without requiring full-state discretization for a broad class of problem formulations. Furthermore, unlike nominal trajectories, approximate global certificates are robust inequality objects. They naturally persist under time shifts and bounded model perturbations, serving as rigorous warm starts for online replanning and providing a realization-independent baseline for cross-algorithm evaluation.

Future work will pursue a sharper theory of structured certificates within FOM, including quantitative approximation and complexity bounds, convergence guarantees for residual-aware primal–dual updates, and broader structural characterizations of problem classes admitting low-complexity global certificates.
On the computational side, these questions will be tested through large-scale implementations in high-dimensional robotic systems.
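As a minimal illustration of the certificate property summarized in the conclusion, the following Python sketch checks that an approximate HJB subsolution yields a lower bound on the cost of any rollout. It is not the paper's implementation. The toy problem (scalar dynamics x' = u, running cost x^2 + u^2, zero terminal cost, horizon T), the certificate family V(t, x) = c tanh(T - t) x^2 with 0 <= c <= 1, and all function names are assumptions chosen for this example. For this family the HJB residual, minimized over u, has the closed form (1 - c)(1 + c tanh^2(T - t)) x^2, which is nonnegative, and V(T, x) = 0 matches the terminal cost, so V(t0, x0) bounds the cost-to-go of every admissible trajectory from below.

```python
import math

# Toy problem (an assumption for illustration, not taken from the paper):
#   dynamics x' = u, running cost x**2 + u**2, terminal cost 0, horizon T.
# The exact value function is tanh(T - t) * x**2.

def certificate(t, x, T, c=0.8):
    """Candidate certificate V(t, x) = c * tanh(T - t) * x**2, with 0 <= c <= 1."""
    return c * math.tanh(T - t) * x * x

def hjb_residual(t, x, T, c=0.8):
    """Closed form of min over u of dV/dt + dV/dx * u + x**2 + u**2.

    For the certificate above this equals
    (1 - c) * (1 + c * tanh(T - t)**2) * x**2,
    which is nonnegative whenever 0 <= c <= 1 (the subsolution condition).
    """
    th = math.tanh(T - t)
    return (1.0 - c) * (1.0 + c * th * th) * x * x

def rollout_cost(t0, x0, T, k):
    """Exact cost of the linear feedback u = -k * x (k > 0) from (t0, x0) to T."""
    decay = 1.0 - math.exp(-2.0 * k * (T - t0))
    return (1.0 + k * k) * x0 * x0 * decay / (2.0 * k)

def certified_gap(t0, x0, T, k, c=0.8):
    """Rollout cost minus certified lower bound; nonnegative for a valid certificate."""
    return rollout_cost(t0, x0, T, k) - certificate(t0, x0, T, c)

if __name__ == "__main__":
    T = 2.0
    for k in (0.5, 1.0, 2.0):
        # Gap at the initial time and at a later replanning time t0 = 1.0.
        print(k, certified_gap(0.0, 1.0, T, k), certified_gap(1.0, 0.6, T, k))
```

Evaluating `certified_gap` at a later start time `t0` reuses the same certificate without recomputation, which is the sense in which a subsolution can serve as a warm start for replanning in this toy setting. Setting `c = 1` recovers the exact value function, so the gap then measures the suboptimality of the chosen feedback gain `k`.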