Sampled-Data Wasserstein Distributionally Robust Control of Multiplicative Systems: A Convex Relaxation with Performance Guarantees
This paper investigates the robust optimal control of sampled-data stochastic systems with multiplicative noise and distributional ambiguity. We consider a class of discrete-time optimal control problems where the controller jointly selects a feedback policy and a sampling period to maximize the worst-case expected concave utility of the inter-sample growth factor. Modeling uncertainty via a Wasserstein ambiguity set, we confront the structural obstacle of “concave-max” geometry arising from maximizing a concave utility against an adversarial distribution. Unlike standard convex loss minimization, the dual reformulation here requires a minimax interchange within the semi-infinite constraints, where the utility’s concavity precludes exact strong duality. To address this, we utilize a general minimax inequality to derive a tractable convex relaxation. Our approach yields a rigorous lower bound that functions as a probabilistic performance guarantee. We establish an explicit, non-asymptotic bound on the resulting duality gap, proving that the approximation error is uniformly controlled by the Lipschitz-smoothness of the stage reward and the diameter of the disturbance support. Furthermore, we introduce necessary and sufficient conditions for robust viability, ensuring state positivity invariance across the entire ambiguity set. Finally, we bridge the gap between static optimization and dynamic performance, proving that the optimal value of the relaxation serves as a rigorous deterministic floor for the asymptotic average utility rate almost surely. The framework is illustrated on a log-optimal portfolio control problem, which serves as a canonical instance of multiplicative stochastic control.
💡 Research Summary
This paper addresses the robust optimal control of sampled‑data stochastic systems whose dynamics are multiplicative in nature and whose disturbance distribution is only partially known. The decision variables are both the feedback control law and the sampling period, which together determine the inter‑sample growth factor Φₙ(u, x). The performance criterion is the worst‑case expected value of a concave, non‑decreasing utility U applied to Φₙ, for example the logarithmic utility used in Kelly‑type growth‑optimal portfolio problems.
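As a point of reference for this criterion, the nominal (non-robust) log-utility case can be sketched in a few lines. The example assumes a hypothetical scalar growth factor Φ(f, x) = 1 + f·x, where f is the fraction of wealth placed on a bet with return x, together with a made-up empirical sample; neither is taken from the paper. It maximizes the sample average of log Φ by a plain grid search.

```python
import math

def mean_log_growth(fraction, returns):
    """Sample average of log(1 + fraction * x) over the observed returns."""
    return sum(math.log(1.0 + fraction * x) for x in returns) / len(returns)

def best_fraction(returns, grid_size=100):
    """Grid search over fractions 0.00, 0.01, ..., 0.99.

    The grid stops below 1 so that 1 + f * x stays positive when x = -1.
    """
    candidates = [i / grid_size for i in range(grid_size)]
    return max(candidates, key=lambda f: mean_log_growth(f, returns))

# Hypothetical sample (assumption): six wins of +100% and four losses of -100%.
sample_returns = [1.0] * 6 + [-1.0] * 4
f_star = best_fraction(sample_returns)
```

For this sample the search returns 0.2, which matches the classical Kelly fraction 2p − 1 with p = 0.6. The distributionally robust problem replaces the sample average with a worst case over the Wasserstein ball.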
Because the utility is concave in the disturbance, the problem exhibits a “concave-max” structure: the controller maximizes a concave utility while an adversary minimizes its expectation over all probability measures within a Wasserstein ball of radius ε around an empirical distribution. Standard distributionally robust optimization (DRO) results rely on convex loss minimization and a minimax interchange (for example via Sion’s minimax theorem) to obtain exact dual reformulations. These tools do not apply directly here: the adversary's inner problem, minimizing a concave function of the disturbance, is not convex, so exact strong duality is lost.
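The role of the minimax interchange can be seen in a toy finite game. For any payoff table, max-min never exceeds min-max, but without convex-concave structure the two can differ strictly, so an interchange gives a bound rather than an equality. The sketch below uses a hypothetical 2×2 payoff table (matching pennies) that is not taken from the paper.

```python
def max_min(payoff):
    """Row player commits first: max over rows of the worst column payoff."""
    return max(min(row) for row in payoff)

def min_max(payoff):
    """Column player commits first: min over columns of the best row payoff."""
    columns = list(zip(*payoff))
    return min(max(col) for col in columns)

# Hypothetical payoff table (assumption): matching pennies, no saddle point.
payoff = [
    [1.0, -1.0],
    [-1.0, 1.0],
]

lower = max_min(payoff)   # value when the maximizer moves first
upper = min_max(payoff)   # value when the minimizer moves first
gap = upper - lower       # non-negative for every payoff table
```

Here the two orders of play give different values, so swapping max and min changes the answer; only the inequality `lower <= upper` is guaranteed in general.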
The authors overcome this obstacle by invoking a general minimax inequality, which yields a tractable convex relaxation that provides a rigorous lower bound on the true optimal value. Under three structural assumptions—compactness of the control and disturbance sets, convexity of the stage reward in the control and concavity in the disturbance, and uniform L‑smoothness of the reward with respect to the disturbance—they derive an explicit, non‑asymptotic bound on the duality gap:
0 ≤ V_opt − V_relax ≤ Lₙ·Dₙ·ε,
where Lₙ is the Lipschitz constant of the gradient of the stage reward, Dₙ is the diameter of the disturbance support, and ε is the Wasserstein radius. The bound depends on ε only through this linear factor, so the gap shrinks to zero as ε → 0 and the relaxation stays only mildly conservative for moderate ambiguity radii.
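To see how such a bound might be used, the sketch below turns a relaxation value into a certified interval for the true optimum. All numbers are hypothetical placeholders, not values from the paper; the functions simply evaluate L·D·ε.

```python
def duality_gap_bound(lipschitz_const, support_diameter, radius):
    """Upper bound L * D * eps on V_opt - V_relax."""
    if min(lipschitz_const, support_diameter, radius) < 0:
        raise ValueError("all inputs must be non-negative")
    return lipschitz_const * support_diameter * radius

def certified_interval(v_relax, lipschitz_const, support_diameter, radius):
    """Interval [V_relax, V_relax + L * D * eps] that contains V_opt."""
    gap = duality_gap_bound(lipschitz_const, support_diameter, radius)
    return v_relax, v_relax + gap

# Hypothetical inputs (assumptions): L = 2.0, D = 0.5, two ambiguity radii.
gap_small = duality_gap_bound(2.0, 0.5, 0.01)
gap_large = duality_gap_bound(2.0, 0.5, 0.02)
low, high = certified_interval(0.05, 2.0, 0.5, 0.01)
```

Doubling the radius doubles the bound, which is the linear dependence on ε stated above.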
A further contribution is the notion of robust viability: the authors define a set of admissible controls U_v(n; η) that ensures Φₙ(u, x) ≥ η for all disturbances x in the support. This condition guarantees that the utility is well‑defined (e.g., log Φₙ stays finite) and that the state remains strictly positive under any distribution in the ambiguity set, which is essential for multiplicative systems.
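A viability check of this kind is easy to sketch in a special case. The example below assumes a hypothetical linear growth factor Φ(u, x) = 1 + Σᵢ uᵢ·xᵢ and a box-shaped support with per-asset return bounds; both are illustrative assumptions, not the paper's model. For a linear Φ, the worst case over a box is reached coordinate-wise, so no search is needed.

```python
def worst_case_growth(weights, lower, upper):
    """Minimum of 1 + sum(w_i * x_i) over the box lower_i <= x_i <= upper_i.

    A non-negative weight is hurt most by the lowest return,
    a negative (short) weight by the highest return.
    """
    return 1.0 + sum(
        w * (lo if w >= 0 else hi)
        for w, lo, hi in zip(weights, lower, upper)
    )

def is_viable(weights, lower, upper, eta):
    """True if the growth factor stays at or above eta on the whole support."""
    return worst_case_growth(weights, lower, upper) >= eta

# Hypothetical support (assumption): two assets with bounded per-period returns.
lower_bounds = [-0.4, -0.2]
upper_bounds = [0.5, 0.3]

balanced = is_viable([0.5, 0.5], lower_bounds, upper_bounds, eta=0.5)
leveraged = is_viable([3.0, 0.0], lower_bounds, upper_bounds, eta=0.5)
```

The balanced weights keep the growth factor near 0.7 in the worst case and pass the check, while the leveraged position can drive it negative and fails.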
The paper then bridges static optimization and dynamic performance. Assuming the underlying stochastic process is ergodic, Theorem 3.10 shows that the optimal value of the convex relaxation serves as a deterministic floor for the long‑run average utility rate almost surely:
lim inf_{T→∞} (1/T) ∑_{k=0}^{T-1} U(Φ_{n_k}(u_k, X_{k,n_k})) ≥ V_relax a.s.
For logarithmic utility this translates into a certified lower bound on the asymptotic capital growth rate, providing a rigorous justification for rolling‑horizon implementations.
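The quantity on the left of this guarantee is straightforward to compute from a realized path. The sketch below uses a short, made-up sequence of growth factors (an assumption, not data from the paper) to evaluate the time-averaged log growth and to confirm its link to terminal wealth.

```python
import math

def average_log_growth(growth_factors):
    """Time average (1/T) * sum_k log(Phi_k) of realized growth factors."""
    if any(g <= 0 for g in growth_factors):
        raise ValueError("growth factors must be strictly positive")
    return sum(math.log(g) for g in growth_factors) / len(growth_factors)

def terminal_wealth(initial_wealth, growth_factors):
    """Wealth after compounding every growth factor in sequence."""
    return initial_wealth * math.prod(growth_factors)

# Hypothetical realized path (assumption): five inter-sample growth factors.
realized = [1.10, 0.95, 1.02, 1.07, 0.98]

rate = average_log_growth(realized)
wealth = terminal_wealth(1.0, realized)
implied = math.exp(len(realized) * rate)  # equals wealth up to rounding
```

Because the guarantee is asymptotic, a finite-horizon average such as `rate` can fall below V_relax; the comparison only becomes binding as T grows.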
Statistical guarantees are also supplied. Lemma 3.9 establishes that, when the empirical distribution is built from N i.i.d. samples and the Wasserstein radius is chosen according to standard concentration results, the relaxation’s value is, with confidence at least 1-δ, a valid lower bound on the true worst-case expected utility.
The theoretical developments are illustrated on a log‑optimal portfolio control problem using historical S&P 500 data. The controller jointly selects the rebalancing period (sampling interval) and the portfolio weights. A cutting‑plane algorithm solves the convex relaxation efficiently. Numerical results demonstrate that the proposed method outperforms conventional benchmarks (fixed rebalancing frequency, classical Kelly rule with known distribution) in terms of both downside risk (lower maximum drawdown) and risk‑adjusted return (higher Sharpe ratio). Moreover, the optimal sampling period balances transaction costs against growth, confirming the practical relevance of treating the sampling interval as a decision variable.
In summary, the paper makes four major contributions: (1) a unified sampled‑data DRO formulation for multiplicative systems; (2) a convex relaxation based on a minimax inequality with explicit duality‑gap bounds; (3) robust viability conditions guaranteeing state positivity across all admissible distributions; and (4) ergodic‑theoretic guarantees linking the relaxation’s optimal value to long‑run average performance. The work opens avenues for extensions to higher‑dimensional nonlinear growth factors, partial observability, and adaptive ambiguity‑radius selection in online learning settings.