Learning and Computation of $Φ$-Equilibria at the Frontier of Tractability

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

$Φ$-equilibria – and the associated notion of $Φ$-regret – are a powerful and flexible framework at the heart of online learning and game theory, whereby enriching the set of deviations $Φ$ begets stronger notions of rationality. Recently, Daskalakis, Farina, Fishelson, Pipis, and Schneider (STOC ’24) – abbreviated as DFFPS – settled the existence of efficient algorithms when $Φ$ contains only linear maps under a general, $d$-dimensional convex constraint set $\mathcal{X}$. In this paper, we significantly extend their work by resolving the case where $Φ$ is $k$-dimensional; degree-$\ell$ polynomials constitute a canonical such example with $k = d^{O(\ell)}$. In particular, positing only oracle access to $\mathcal{X}$, we obtain two main positive results: i) a $\text{poly}(n, d, k, \log(1/ε))$-time algorithm for computing $ε$-approximate $Φ$-equilibria in $n$-player multilinear games, and ii) an efficient online algorithm that incurs average $Φ$-regret at most $ε$ using $\text{poly}(d, k)/ε^2$ rounds. We also show nearly matching lower bounds in the online learning setting, thereby obtaining for the first time a family of deviations that captures the learnability of $Φ$-regret. From a technical standpoint, we extend the framework of DFFPS from linear maps to the more challenging case of maps with polynomial dimension. At the heart of our approach is a polynomial-time algorithm for computing an expected fixed point of any $ϕ: \mathcal{X} \to \mathcal{X}$ based on the ellipsoid against hope (EAH) algorithm of Papadimitriou and Roughgarden (JACM ’08). In particular, our algorithm for computing $Φ$-equilibria is based on executing EAH in a nested fashion – each step of EAH itself being implemented by invoking a separate call to EAH.


💡 Research Summary

This paper advances both the computation and the online learning of Φ-equilibria, a central framework in game theory and online learning that generalizes solution concepts such as correlated equilibrium by considering a set Φ of allowable deviations. The larger the set Φ, the stronger the resulting notion of rationality.
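For concreteness, Φ-regret is commonly formalized as below. The notation is assumed here for illustration and may differ from the paper's: $x_t \in \mathcal{X}$ is the learner's strategy in round $t$, $u_t$ is the utility vector observed in that round, and $T$ is the number of rounds.

$$
\mathrm{Reg}_{Φ}^{T} \;=\; \max_{ϕ \in Φ} \sum_{t=1}^{T} \langle u_t,\; ϕ(x_t) - x_t \rangle
$$

When Φ consists of the constant maps $ϕ(x) = x^\star$, this reduces to ordinary external regret. Because the maximum is taken over Φ, adding deviations can only increase the regret, so a guarantee against a richer Φ is a stronger guarantee.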

The authors build upon and substantially extend the recent breakthrough of Daskalakis et al. (DFFPS, STOC ’24), which settled the efficient computation of Φ-equilibria when Φ contains only linear maps over a general convex set. This work resolves the more challenging and broader case where Φ is k-dimensional. A canonical example is the set of degree-ℓ polynomial maps, which has dimension k = d^{O(ℓ)}. The main results, assuming only oracle access to the constraint set X, are:

  1. Computation: A poly(n, d, k, log(1/ε))-time algorithm for computing ε-approximate Φ-equilibria in n-player multilinear games (which include extensive-form games).
  2. Online Learning: An efficient online algorithm that guarantees average Φ-regret at most ε after poly(d, k)/ε² rounds.
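To get a feel for the parameter k in these bounds, the sketch below counts the coefficients of a polynomial map of degree at most ℓ from R^d to R^d. This is one natural way to measure the dimension of such a family and is an assumption for illustration; the paper's exact parametrization of Φ may differ. The function name `poly_map_dimension` is hypothetical.

```python
from math import comb


def poly_map_dimension(d: int, ell: int) -> int:
    """Count coefficients of a polynomial map R^d -> R^d of degree <= ell.

    There are C(d + ell, ell) monomials of total degree <= ell in d
    variables, and the map has one such polynomial per output coordinate.
    """
    monomials = comb(d + ell, ell)
    return d * monomials


if __name__ == "__main__":
    # Growth of k with the degree for a fixed dimension d = 10.
    for ell in (1, 2, 3):
        print(f"d=10, ell={ell}: k={poly_map_dimension(10, ell)}")
```

For fixed ℓ this count grows like d^(ℓ+1), consistent with k = d^{O(ℓ)}, so bounds that are polynomial in k remain polynomial in d only while the degree ℓ is constant.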

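A key technical innovation is the focus on expected fixed points. Instead of finding a point x such that φ(x) = x (a standard fixed point), which is intractable for nonlinear φ, the authors seek a distribution μ over X such that E_{x∼μ}[φ(x)] = E_{x∼μ}[x], at least approximately. According to the abstract, such an expected fixed point can be computed in polynomial time for any φ: X → X using the ellipsoid against hope (EAH) algorithm of Papadimitriou and Roughgarden.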
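The sketch below illustrates why relaxing to expectations helps, taking an expected fixed point to mean a distribution μ with E_{x∼μ}[φ(x)] ≈ E_{x∼μ}[x]. It is not the paper's EAH-based algorithm. It uses a simple telescoping argument: if μ is uniform over the orbit x_0, φ(x_0), ..., then the expected gap equals (x_T − x_0)/T, which is at most diam(X)/T. The example map (the logistic map on [0, 1]) and all function names are hypothetical choices for illustration.

```python
def orbit_distribution(phi, x0, T):
    """Return x_0, ..., x_{T-1} with x_{t+1} = phi(x_t); mu is uniform over them."""
    points = [x0]
    for _ in range(T - 1):
        points.append(phi(points[-1]))
    return points


def expected_gap(phi, points):
    """Compute E_{x~mu}[phi(x) - x] for mu uniform over `points`."""
    return sum(phi(x) - x for x in points) / len(points)


def logistic(x):
    """A nonlinear map from X = [0, 1] to itself (illustrative choice)."""
    return 4.0 * x * (1.0 - x)


T = 1000
points = orbit_distribution(logistic, 0.2, T)
gap = expected_gap(logistic, points)

# The sum telescopes to (x_T - x_0) / T, so |gap| <= diam(X) / T = 1 / T.
print(f"expected gap: {gap:.6f}, bound: {1.0 / T:.6f}")
```

This simple construction needs on the order of 1/ε points to reach accuracy ε, whereas the abstract reports a running time polynomial in log(1/ε) for the paper's equilibrium computation, which is where the ellipsoid-based machinery comes in.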

