Incentive Compatibility in Stochastic Dynamic Systems
While the classic Vickrey-Clarke-Groves mechanism ensures incentive compatibility for a static one-shot game, it does not appear to be feasible to construct a dominant truth-telling mechanism for agents that are stochastic dynamic systems. The contri…
Authors: Ke Ma, P. R. Kumar
Ke Ma, Member, IEEE, P. R. Kumar, Fellow, IEEE

Abstract: The classic Vickrey-Clarke-Groves (VCG) mechanism ensures incentive compatibility, i.e., truth-telling is a dominant strategy for all agents, for a static one-shot game. However, it does not appear to be feasible to construct mechanisms that ensure dominance of dynamic truth-telling for agents comprised of general stochastic dynamic systems. The agents' intertemporal net utilities depend on future controls and payments, and a direct extension of the VCG mechanism does not guarantee incentive compatibility. This paper shows that such a stochastic dynamic extension does exist for the special case of Linear-Quadratic-Gaussian (LQG) agents. In fact it achieves subgame perfect dominance of truth-telling. This is accomplished through a construction of a sequence of layered payments over time that decouples the intertemporal effect of current bids on future net utilities if system parameters are known and agents are rational.

An important motivating example arises in power systems where an Independent System Operator has to ensure balance of generation and consumption at all times, while ensuring social efficiency, i.e., maximization of the sum of the utilities of all agents. It is also necessary to satisfy budget balance and individual rationality. However, in general, even for static one-shot games, there is no mechanism that simultaneously satisfies these requirements while being incentive compatible and socially efficient. For a power market of LQG agents, we show that there is a modified "Scaled" VCG (SVCG) mechanism that does satisfy incentive compatibility, social efficiency, budget balance and individual rationality under a certain "market power balance" condition where no agent is too negligible or too dominant.
We further show that the SVCG payments converge to the Lagrange payments, defined as the payments that correspond to the true price in the absence of strategic considerations, as the number of agents in the market increases. For LQ but non-Gaussian agents, optimal social welfare over the class of linear control laws is achieved.

Index Terms: Vickrey-Clarke-Groves (VCG) mechanism, dynamic VCG, stochastic dynamic systems, LQG agents, incentive compatibility, budget balance, individual rationality, social welfare optimality, Independent System Operator (ISO).

I. INTRODUCTION

Mechanism design is the sub-field of game theory that considers how to realize socially optimal solutions to problems involving multiple self-interested agents, each with a private valuation for each outcome. This valuation is generally represented as a utility function, which captures the monetary value of each alternative. A typical approach is to provide financial incentives such as payments to promote truth-telling of utility function parameters by agents. An important example is the Independent System Operator (ISO) problem of electric power systems, in which the ISO aims to maximize social welfare and maintain balance of generation and consumption while each generator/load has a private utility function.

This work was supported in part by NSF Contract ECCS-1760554, NSF Science & Technology Center Grant CCF-0939370, the Power Systems Engineering Research Center (PSERC), and NSF Contract IIS-1636772. Ke Ma is with Pacific Northwest National Laboratory, Richland, WA 99352 and P. R. Kumar is with Texas A&M University, College Station, TX 77843-3259 (e-mail: ke.ma@pnnl.gov, prk.tamu@gmail.com).
The classic Vickrey-Clarke-Groves (VCG) mechanism [1], and its generalization, the Groves mechanism [2], play a central role in mechanism design since they ensure incentive compatibility, i.e., they ensure that truth-telling of utility functions by all agents forms a dominant strategy, as well as social welfare optimality, i.e., the sum of utilities of all agents is maximized. The outcome generated by the Groves mechanism is stronger than a Nash equilibrium in the sense that it is strategy-proof, meaning truth-telling of utility functions is optimal irrespective of what others are bidding. In fact, Green and Laffont [3] show that the Groves mechanism is the only mechanism that is both efficient and strategy-proof if net utilities are quasi-linear, i.e., linear in the amount of money.

While the Groves mechanism is applicable to a static one-shot game, it does not work for stochastic dynamic games where agents are given the opportunity to bid utility functions and states at each time instant. In a stochastic dynamic system that unfolds over time, the agents' intertemporal net utilities depend on the future controls and payments, and a direct extension of the VCG mechanism does not guarantee incentive compatibility [4]. A fundamental difference between dynamic and static mechanism design is that in the former, an agent can bid an untruthful utility function conditional on its past bids (which need not be truthful) and past allocations (from which it can make an inference about other agents' utility functions) [5]. For dynamic deterministic systems, by collecting the VCG payments as a lump sum of all the payments over the entire time horizon at the beginning, incentive compatibility is still assured. However, for dynamic stochastic systems, the states are private random variables and it is necessary to incentivize agents to bid their states truthfully.
It does not appear to be feasible to construct mechanisms that ensure the dominance of dynamic truth-telling for agents comprised of general stochastic dynamic systems. Indeed we conjecture that it is not possible to do so for general stochastic dynamic agents. Our contribution, on which we build further desirable results as described in the sequel, is to show that for the special case of Linear-Quadratic-Gaussian (LQG) agents, where agents have linear state equations, quadratic utility functions and additive white Gaussian noise, a dynamic stochastic extension of the VCG mechanism does exist, based on a careful construction of a sequence of layered payments over time. We propose a modified layered mechanism for payments that decouples the intertemporal effect of current bids on future net utilities, and prove that truth-telling of their dynamic states is a dominant strategy for every agent, if system parameters are known and agents are rational. "Rationality" is defined in a dynamic programming fashion: an agent is rational at the last time instant if it adopts a dominant strategy whenever there exists a unique one; an agent is rational at time $t$ if it adopts a unique dominant strategy, assuming that all agents are rational at all future times. In fact the mechanism achieves subgame perfect dominance of truth-telling.

An important example of a problem needing such optimal dynamic coordination of stochastic agents arises in the ISO problem of power systems. Renewable energy resources such as solar/wind are stochastic and dynamic in nature, as are consumptions by loads which are influenced by factors such as local temperatures and thermal inertias of facilities. References [6] and [7] provide details on how wind turbines can be modeled as LQG systems. In general, agents may have different approaches to responding to the prices set by the ISO.
If each agent acts as a price taker, i.e., it honestly discloses its energy consumption/production at the announced prices, a competitive equilibrium would be reached among agents. However, if agents are price anticipators, then it is critical for the ISO to design a market mechanism that is strategy-proof (i.e., incentive compatible). The challenge for the ISO is to determine a bidding scheme between agents (producers and consumers) and the ISO that maximizes social welfare, while taking into account the stochastic dynamic models of agents.

Currently, the ISO solicits bids from generators and Load Serving Entities (LSEs) and operates two markets: a day-ahead market and a real-time market. The day-ahead market lets market participants commit to buy or sell wholesale electricity one day before the operating day, to satisfy energy demand bids and to ensure adequate scheduling of resources to meet the next day's anticipated load. The real-time market lets market participants buy and sell wholesale electricity during the course of the operating day to balance the differences between the already pledged day-ahead commitments and the actual real-time demand and production [8]. Our layered VCG mechanism fits naturally in the real-time market, since at each time instant the market clears and settles, as each agent bids its random state after realization.

However, there is also a potentially fatal downside for the VCG mechanism: in general, the sum of total payments collected by the ISO is non-positive, i.e., there is a budget deficit. In fact, when agents have quadratic utility functions, the total payments collected from consumers are not enough to cover the total payments to the suppliers. In effect, in order to force agents to reveal their true utility functions, the ISO will need to subsidize the market. In this paper we will also propose a solution to this problem.
The VCG payment charges each agent $i$ the difference between the social welfare of others if agent $i$ is absent and the social welfare of others when agent $i$ is present. We exhibit a solution that ensures there is no budget deficit. It consists of inflating the first term above in all of the agents' VCG payments by a constant factor $c$, leading to a Scaled VCG (SVCG) mechanism. There are however two additional issues to be addressed when proposing such a scheme.

The first concerns the issue of individual rationality. The net utility that an agent obtains is the utility of energy consumption minus the amount it pays. The magnitude of the scaling factor $c$ is important because an agent may opt out of the process if $c$ is chosen to be too large. The reason is that not joining the market results in a net utility of zero, obtained from generating/consuming no power and collecting/making no payments, while joining the market with too large a $c$ results in a negative net utility. That is, the scheme is then not individually rational.

The second issue concerns whether the payment is Lagrange optimal for each agent. In power markets, electricity prices are calculated as the sum of all Lagrange multipliers associated with different constraints, and payments are collected as the product of price and power quantity. By Lagrange optimality we mean that the price and the payments are those that would arise if all agents were truth-tellers. The concern is that if a customer participates, the price it pays/receives need not be Lagrange optimal.

We show that under a "market power balance" condition, which essentially requires that no agent is too negligible or too powerful, there is indeed a systematic way to choose the scaling factor $c$ such that there is no budget deficit for the ISO, while at the same time guaranteeing that producers and consumers will actively participate in the market.
The factor $c$ can be chosen in a way such that the maximum distortion between the VCG payment and Lagrange payment is minimized. We argue that based on historic knowledge of the market, the ISO may be able to choose such a $c$ that does not depend on any agent's tactical announcement. Moreover, we show that asymptotically, as the number of agents increases, the Scaled VCG payments converge to the Lagrange optimal payments. This result provides an economic justification for Load Serving Entities, which aggregate a large number of small consumers into one reasonably sized consumer, as being required to achieve social welfare optimality.

In Section II, a survey of related works is presented. Section III provides a description of the classic VCG framework for the static and dynamic deterministic problems. It also introduces the Scaled VCG mechanism for individual rationality and budget balance. Section IV presents the difficulties in designing a mechanism that ensures dominance of truth-telling when agents are general stochastic dynamic systems. Section V presents the layered mechanism for LQG systems. It shows how intertemporal decoupling is achieved, and proves subgame perfect dominance of truth-telling. It also shows that the budget balance and individual rationality properties of the scaling mechanism carry over from the deterministic case. A numerical example is presented. Section VI shows how the mechanism also applies when the noises are not Gaussian, to achieve optimality in the class of linear feedback strategies. Section VII concludes the paper.

II. RELATED WORKS

In recent years, several works have explored issues arising in dynamic mechanism design. In order to achieve ex-post incentive compatibility, Bergemann and Valimaki [9] propose a generalization of the VCG mechanism based on the marginal contribution of each agent and show that ex-post participation constraints are satisfied under some conditions.
Athey and Segal [10] consider an extension of the d'Aspremont-Gerard-Varet (AGV) mechanism [11] to design a budget balanced dynamic incentive compatible mechanism. Pavan et al. [12] derive first-order conditions under which incentive compatibility is guaranteed by generalizing Mirrlees's [13] envelope formula of static mechanisms. Cavallo et al. [14] consider a dynamic Markovian model and derive a sequence of Groves-like payments which achieves Markov perfect equilibrium. Bapna and Weber [15] solve a sequential allocation problem by formulating it as a multi-armed bandit problem. Parkes and Singh [16] and Friedman and Parkes [17] consider an environment with randomly arriving and departing agents and propose a "delayed" VCG mechanism to guarantee interim incentive compatibility. Besanko et al. [18] and Battaglini et al. [19] characterize the optimal infinite-horizon mechanism for an agent modeled as a Markov process, with Besanko considering a linear AR(1) process over a continuum of states, and Battaglini focusing on a two-state Markov chain. Bergemann and Pavan [20] provide an excellent survey of recent research in dynamic mechanism design. A more recent survey paper by Bergemann and Valimaki [5] further discusses dynamic mechanism design with risk-averse agents and the relationship between dynamic mechanisms and optimal contracts.

In order to capture strategic interactions between the ISO and market participants, game theory and mechanism design have been proposed in many recent papers. Sessa et al. [21] study the VCG mechanism for electricity markets and derive conditions to ensure collusion and shill bidding are not profitable. Okajima et al. [22] propose a VCG-based mechanism that guarantees incentive compatibility and individual rationality for the day-ahead market with equality and inequality constraints. Xu et al.
[23] show that the VCG mechanism always results in higher per-unit electricity prices than the locational marginal price (LMP) mechanism under any given set of reported supply curves, and that the difference between the per-unit prices resulting from the two mechanisms is negligibly small. Bistarelli et al. [24] derive a VCG-based mechanism to drive users to shift energy consumption during peak hours. In Samadi et al. [25], it is proposed that utility companies use the VCG mechanism to collect private information of electricity users to optimize the energy consumption schedule. Taylor et al. [26] formulate the regulation pricing policy as an LQR problem and apply the VCG mechanism to induce honest participation.

There are also some related works aiming at achieving budget balance for the VCG mechanism. Parkes et al. [27] design a budget-balanced and individually rational mechanism for combinatorial exchanges that sacrifices incentive compatibility. Moulin et al. [28] discuss the trade-off between budget balance and efficiency of the mechanism. Cavallo [29] uses domain information regarding agent valuation spaces to achieve redistribution of much of the required transfer payments back among the agents. Similarly, Thirumulanathan et al. [30] propose a mechanism that is efficient and comes close to budget balance by returning much of the payments back to the agents in the form of rebates. In [31], an enhanced Arrow-d'Aspremont-Gerard-Varet (AGV) mechanism is proposed to tackle the problem of budget balance in demand side management. Karaca et al. [32] propose core-selecting mechanisms that are coalition-proof and budget balanced, but only with approximate incentive compatibility, i.e., the sum of potential profits of each bidder from a unilateral deviation is minimized. Tanaka et al.
[33] show that payments calculated by the clearing-price mechanism and the VCG mechanism are similar when each individual agent's market power is negligible. In [34], the authors propose a criterion of approximate strategy-proofness called strategy-proofness in the large (SP-L) for a large market, identifying a Lagrange multiplier based mechanism as SP-L. Our work, on the other hand, shows that payments defined in the strategy-proof VCG mechanism converge to Lagrange payments if the market is sufficiently large.

The problem of how to conduct bidding to achieve social welfare optimality in stochastic dynamic systems is examined in [35]. It assumes that all agents are price-takers.

A preliminary announcement of some of these results was presented in the conference paper [36]. The layered payment structure for LQG systems is mentioned there, and incentive compatibility results are presented without proofs. The present paper contains the complete proof of incentive compatibility, and further introduces the Scaled VCG mechanism for budget balance, individual rationality, and Lagrange optimality. To our knowledge, there does not appear to be any result that ensures dominance of dynamic truth-telling for agents comprised of LQG systems, let alone ensuring no budget deficit for the ISO, and individual rationality for all agents.

III. STATIC AND DYNAMIC DETERMINISTIC SYSTEMS

A. Static Deterministic Systems

We begin with the simpler static deterministic case. There are $N$ agents, each having a utility function $F_i(u_i)$, where $u_i$ is the amount of energy produced/consumed by agent $i$, with $u_i \le 0$ for a producer and $u_i \ge 0$ for a consumer. Let $u := (u_1, \ldots, u_N)^T$ and $u_{-i} := (u_1, \ldots, u_{i-1}, u_{i+1}, \ldots, u_N)^T$. The ISO wishes to maximize the social welfare, $\sum_i F_i(u_i)$:

$$\max_u \sum_i F_i(u_i), \tag{1}$$

subject to

$$\sum_i u_i = 0. \tag{1a}$$

Without loss of generality we only include the power balance constraint (1a).
Other constraints such as DC power flow constraints and capacity constraints can be added and will not change the mechanism structure.

The ISO does not know the individual utility functions of the agents. If it asks them to disclose their utility functions, they may lie in order to obtain a better allocation. A solution to this problem of "truth-telling" is provided by the VCG mechanism, which asks each agent to bid its utility function. Denote agent $i$'s bid by $\hat F_i$ and let $\hat F := (\hat F_1, \ldots, \hat F_N)$. After obtaining the bids, the ISO calculates $u^*(\hat F)$ as the optimal solution to:

$$\max_u \sum_i \hat F_i(u_i), \quad \text{subject to (1a)}.$$

Each agent is then assigned to produce/consume $u_i^*(\hat F)$, accruing a utility $F_i(u_i^*(\hat F))$. Following the rule that it had announced a priori before receiving the bids, the ISO collects a payment $p_i(\hat F)$ from agent $i$, defined as:

$$p_i(\hat F) := \sum_{j \ne i} \hat F_j(u^{(i)}) - \sum_{j \ne i} \hat F_j(u^*(\hat F)),$$

where $u^{(i)}$ is defined as the optimal solution to:

$$\max_{u_{-i}} \sum_{j \ne i} \hat F_j(u_j), \quad \text{subject to } \sum_{j \ne i} u_j = 0.$$

The VCG mechanism is a special case of the Groves mechanism [2], where the payment $p_i$ is defined as:

$$p_i(\hat F) := h_i(\hat F_{-i}) - \sum_{j \ne i} \hat F_j(u^*(\hat F)),$$

where $h_i$ is any arbitrary function of $\hat F_{-i} := (\hat F_1, \ldots, \hat F_{i-1}, \hat F_{i+1}, \ldots, \hat F_N)$.

Define the "net utility" of an agent as the utility derived by it minus its payment. Truth-telling is a dominant strategy in the Groves mechanism [2], i.e., each agent maximizes its net utility by bidding its true utility function, regardless of what other agents bid:

$$F_i(u^*(\bar F)) - p_i(\bar F) \ge F_i(u^*(\hat F)) - p_i(\hat F), \quad \text{for all } \hat F_i,$$

where $\bar F := (\hat F_1, \hat F_2, \ldots, \hat F_{i-1}, F_i, \hat F_{i+1}, \ldots, \hat F_N)$.

Definition 1. A mechanism is incentive compatible (IC) if truth-telling is a dominant strategy for every agent.
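To make the dominance inequality above concrete, the following is a minimal numerical sketch, not part of the original paper. It assumes quadratic utilities $F_i(u_i) = r_i u_i^2 + s_i u_i$ with $r_i < 0$, for which the allocation has a closed form. The function names (`allocate`, `welfare`, `vcg_payment`, `net_utility`), the bids of the other agents, and the grid of misreports are all illustrative assumptions.

```python
import numpy as np

def allocate(r, s):
    """Maximize sum_i r_i*u_i**2 + s_i*u_i subject to sum_i u_i = 0 (all r_i < 0)."""
    lam = np.sum(s / r) / np.sum(1.0 / r)   # multiplier of the balance constraint
    return (lam - s) / (2.0 * r)

def welfare(r, s, u):
    """Sum of quadratic utilities evaluated at allocation u."""
    return float(np.sum(r * u**2 + s * u))

def vcg_payment(i, r_bid, s_bid):
    """Bid welfare of the others without agent i, minus their bid welfare with agent i."""
    others = np.arange(len(r_bid)) != i
    u_star = allocate(r_bid, s_bid)
    u_without_i = allocate(r_bid[others], s_bid[others])
    return (welfare(r_bid[others], s_bid[others], u_without_i)
            - welfare(r_bid[others], s_bid[others], u_star[others]))

def net_utility(i, r_true_i, s_true_i, r_bid, s_bid):
    """True utility of agent i at the allocation induced by the bids, minus its payment."""
    u_i = allocate(r_bid, s_bid)[i]
    return r_true_i * u_i**2 + s_true_i * u_i - vcg_payment(i, r_bid, s_bid)

# Hypothetical data: agent 0's true parameters and arbitrary bids by the others.
r0_true, s0_true = -1.0, 1.0
r_bid = np.array([r0_true, -0.9, -1.5, -1.0])
s_bid = np.array([s0_true, 1.5, 3.5, 5.5])

def net_if_agent0_bids(r0, s0):
    rb, sb = r_bid.copy(), s_bid.copy()
    rb[0], sb[0] = r0, s0
    return net_utility(0, r0_true, s0_true, rb, sb)

truthful = net_if_agent0_bids(r0_true, s0_true)
best_misreport = max(net_if_agent0_bids(r0, s0)
                     for r0 in np.linspace(-3.0, -0.2, 15)
                     for s0 in np.linspace(0.0, 6.0, 25))
print(truthful >= best_misreport - 1e-9)
```

The check holds whatever the other agents bid, because agent 0's net utility equals its true utility plus the bid welfare of the others, up to a term that does not depend on its own bid, and that sum is maximized over feasible allocations exactly when agent 0 bids truthfully.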
One should note that an agent may not necessarily tell the truth even if truth-telling is dominant, since there may be another strategy that is also dominant. We will assume that every agent is "rational," in that if its dominant strategy is unique, then an agent will indeed tell the truth.

Definition 2. We call a mechanism efficient (EF) if the resulting allocation $u^*$ maximizes the social welfare $\sum_i F_i(u_i)$.

Two more important properties are sought in a solution.

Definition 3. A mechanism is ex-post individually rational (IR) if agents are guaranteed to gain a nonnegative net utility by participating, that is, $F_i(u_i^*) - p_i \ge 0$ (because, by abstaining, an agent can always realize a net utility of zero).

Definition 4. A mechanism satisfies budget balance (BB) if the total payment made by agents is nonnegative: $\sum_i p_i \ge 0$. (We use the descriptor "budget balance" if the ISO does not have to provide a subsidy, rather than strictly requiring that the total payments are exactly zero.)

The VCG mechanism, in general, does not satisfy BB. In fact, Green and Laffont [3] show that no mechanism can satisfy all four properties (IC, EF, IR and BB) at the same time.

If the ISO knew the true utility functions of all the agents, it could solve the social welfare problem in a centralized manner: calculate the Lagrange multiplier $\lambda^*$ (price) for the constraint (1a), and collect a payment equal to $\lambda^* u_i^*$ from agent $i$. We call this the Lagrange payment:

Definition 5. Suppose the optimal solution $(\lambda^*, u^*)$ to (1) is unique. We say that a mechanism is Lagrange Optimal if the payment $p_i$ collected from agent $i$ is equal to $\lambda^* u_i^*$.

We note that the Lagrange payment scheme is widely used in current ISO operation, since the ISO solves the social welfare problem with approximate DC power flow. When non-convexities such as AC power flow are incorporated, the Lagrange payment is seldom used because of the duality gap.
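As an illustration of Definitions 3 to 5, the sketch below compares VCG and Lagrange payments under truthful bidding. It is an assumption-laden example rather than the authors' code: utilities are taken to be quadratic, $F_i(u_i) = r_i u_i^2 + s_i u_i$, the parameter values are those of Example 1 later in this section, and the helper name `solve` is hypothetical.

```python
import numpy as np

def solve(r, s):
    """Closed-form optimum of max sum_i r_i*u_i**2 + s_i*u_i s.t. sum_i u_i = 0 (r_i < 0)."""
    lam = np.sum(s / r) / np.sum(1.0 / r)      # Lagrange multiplier (price)
    u = (lam - s) / (2.0 * r)                  # optimal allocation
    return lam, u, r * u**2 + s * u            # price, allocation, per-agent utilities

r = np.array([-1.0, -1.1, -1.2, -1.1])
s = np.array([1.0, 1.2, 4.0, 5.0])
lam, u, F = solve(r, s)

lagrange = lam * u                             # Lagrange payments lambda* u_i*
vcg = np.empty_like(u)
for i in range(len(u)):
    keep = np.arange(len(u)) != i
    H_i = solve(r[keep], s[keep])[2].sum()     # welfare of the others without agent i
    vcg[i] = H_i - F[keep].sum()               # VCG payment under truthful bids

print("sum of Lagrange payments:", lagrange.sum())   # zero up to rounding
print("sum of VCG payments:", vcg.sum())             # negative: the ISO must subsidize
print("VCG net utilities:", F - vcg)                 # all nonnegative: IR holds
```

In this instance the VCG mechanism is individually rational but runs a budget deficit, while the Lagrange payments balance exactly.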
It is also worth noting that with only constraint (1a), the sum of all Lagrange payments is exactly zero. However, budget balance generally does not hold when other constraints such as DC power flow are included [37], [38].

We show in the sequel that while there is no mechanism that satisfies all four properties (IC, EF, IR and BB) in general, there does exist such a mechanism under a certain "Market Power Balance" condition. We inflate (or deflate) the first term in the standard VCG mechanism by a constant factor $c$:

$$p_i(\hat F) = c \cdot \sum_{j \ne i} \hat F_j(u^{(i)}) - \sum_{j \ne i} \hat F_j(u^*). \tag{2}$$

We call this a Scaled VCG (SVCG) mechanism, and call $c$ the scaling factor. To achieve BB and IR, one could choose $c$ as a function of the bids $\hat F_i$, which unfortunately would cease to guarantee incentive compatibility, since the first term in (2) is not allowed to depend on $\hat F_i$ in the Groves mechanism. We show below that there is a range of values of $c$ that can ensure BB under a certain market power balance condition, and argue that through its long-term operation, the ISO may be able to learn at least a subset of this range. Our presumptive argument rests on the repetitive nature of this problem, which is played out every day, allowing the ISO to tune $c$ to avoid a net subsidy. Based on this experience, the ISO could choose a $c$ for which BB and IR hold at the same time. However, agents may play a game on the constant $c$ after interacting with the ISO for an extended period of time. Incentives or schemes to prevent such play, which occurs over a slower time-scale, are an important future research direction.

Theorem 1. Let $u^*$ be the optimal solution to (1) and suppose that $u^*$ is unique. Suppose that $u^{(i)}$ is the unique optimal solution to:

$$\max \sum_{j \ne i} F_j(u_j), \quad \text{subject to } \sum_{j \ne i} u_j = 0.$$

Let $H_i := \sum_{j \ne i} F_j(u^{(i)})$, and let $H_{max} := \max_i H_i$. Let $F_{total} = \sum_j F_j(u^*)$.
If $F_{total} > 0$, $H_i > 0$ for all $i$, and the following Market Power Balance (MPB) condition holds:

$$(N-1) H_{max} \le \sum_i H_i, \tag{3}$$

then there exists an interval $[\underline{c}, \bar{c}]$ such that for any $c$ in this interval, the SVCG mechanism satisfies IC, EF, BB and IR at the same time.

Proof. With $c$ chosen as a constant, the SVCG mechanism is within the Groves class and thus satisfies IC and EF. To achieve budget balance, we need

$$\sum_i p_i = c \sum_i H_i - (N-1) F_{total} \ge 0.$$

To achieve individual rationality for agent $i$, we also need

$$F_i(u^*) - p_i = F_i(u^*) - c \cdot H_i + \sum_{j \ne i} F_j(u^*) \ge 0,$$

that is, $F_{total} - c \cdot H_i \ge 0$ for every $i$. Combining both inequalities, we need to be able to choose a $c$ such that

$$\frac{(N-1) F_{total}}{\sum_i H_i} \le c \le \frac{F_{total}}{H_{max}}. \tag{4}$$

Let $\underline{c} := \frac{(N-1) F_{total}}{\sum_i H_i}$ and $\bar{c} := \frac{F_{total}}{H_{max}}$. Such a $c$ exists if

$$(N-1) H_{max} \le \sum_i H_i, \quad F_{total} > 0, \quad H_i > 0, \ \forall i.$$

The critical condition (3) can be interpreted as requiring that no agent has significantly bigger or smaller market power than others. Individual residential load customers generally have a much smaller scale compared to power plants, and it is thus beneficial to form load aggregators or utility companies at the consumer side, as suggested by the MPB condition. This provides an economic justification for the role of load aggregators or load serving entities in guaranteeing the achievement of social welfare maximization.

The MPB condition is one sufficient condition for budget balance and individual rationality that we have been able to identify. It would be of interest to determine if there are looser conditions than (3) that guarantee BB and IR.

In general, an SVCG mechanism is however not Lagrange optimal. Within the feasible range $[\underline{c}, \bar{c}]$, one may prefer to choose a $c$ that also achieves near-Lagrange optimality. This could be formulated as the following MinMax problem:

$$\min_c \max_i |d_i(c)|, \quad \text{subject to (4)},$$

where $d_i(c) := \lambda^* u_i^* - p_i = \lambda^* u_i^* - c \cdot H_i + \sum_{j \ne i} F_j(u^*)$.
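The interval (4) and the MinMax problem can be sketched numerically as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: utilities are quadratic, $F_i(u_i) = r_i u_i^2 + s_i u_i$, bids are truthful, and the parameters are those of Example 1 below. The helper names (`solve`, `svcg_interval`, `minmax_c`) are hypothetical, and a ternary search is used in place of the LP formulation because $\max_i |d_i(c)|$ is convex in $c$.

```python
import numpy as np

def solve(r, s):
    """Closed-form optimum of max sum_i r_i*u_i**2 + s_i*u_i s.t. sum_i u_i = 0 (r_i < 0)."""
    lam = np.sum(s / r) / np.sum(1.0 / r)
    u = (lam - s) / (2.0 * r)
    return lam, u, r * u**2 + s * u

def svcg_interval(r, s):
    """Return the bounds of (4), whether Theorem 1's conditions hold, and the data used."""
    n = len(r)
    lam, u, F = solve(r, s)
    H = np.array([solve(np.delete(r, i), np.delete(s, i))[2].sum() for i in range(n)])
    F_total = F.sum()
    ok = (n - 1) * H.max() <= H.sum() and F_total > 0 and np.all(H > 0)
    return (n - 1) * F_total / H.sum(), F_total / H.max(), ok, (lam, u, F, H)

def minmax_c(r, s, iters=200):
    """Minimize max_i |d_i(c)| over the interval (4) by ternary search."""
    lo, hi, ok, (lam, u, F, H) = svcg_interval(r, s)
    if not ok:
        raise ValueError("conditions of Theorem 1 are not satisfied")

    def worst(c):
        d = lam * u - c * H + (F.sum() - F)    # d_i(c) for every agent i
        return float(np.max(np.abs(d)))

    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3.0, hi - (hi - lo) / 3.0
        if worst(m1) <= worst(m2):
            hi = m2
        else:
            lo = m1
    c = 0.5 * (lo + hi)
    return c, worst(c)

r = np.array([-1.0, -1.1, -1.2, -1.1])
s = np.array([1.0, 1.2, 4.0, 5.0])
c_lo, c_hi, ok, _ = svcg_interval(r, s)
c_star, z_star = minmax_c(r, s)
print(c_lo, c_hi, c_star, z_star)
```

With these parameters the computed bounds and minimizer should agree, up to rounding, with the values reported in Example 1.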
Example 1. All agents have quadratic utility functions $F_i = r_i u_i^2 + s_i u_i$, with $(r_1, r_2, r_3, r_4) = (-1, -1.1, -1.2, -1.1)$ and $(s_1, s_2, s_3, s_4) = (1, 1.2, 4, 5)$. The unique Lagrange optimal solution is $u^* = (-0.86, -0.70, 0.53, 1.03)$, $\lambda^* = 2.73$, and from (4), $1.13 \le c \le 1.19$. The optimal solution to the MinMax problem is $(c^*, Z^*) = (1.14, 0.22)$, where $Z^*$ denotes the optimal value of the MinMax objective. By choosing $c = 1.14$, the SVCG mechanism satisfies IC, EF, BB and IR, and the maximum discrepancy between VCG and Lagrange payments is 0.22.

In the MinMax problem, one can also replace $d_i(c)$ by the ratio $d_i(c) / (\lambda^* u_i^*)$ to normalize the nearness of the payment to the Lagrange payment. It can also be written as an LP. Using the above ratio, the optimal solution is $(c^*, Z^*) = (1.18, 0.06)$, showing that all agents pay/receive within 6% of their Lagrange optimal payment.

If we change $s_2$ to $1.8$ (a higher marginal cost, which implies smaller market power) while keeping the remaining parameters unchanged, the optimal solution becomes $(c^*, Z^*) = (1.13, 0.144)$, with distortions $(6.5\%, 14.4\%, 14.3\%, 7.2\%)$. Therefore, unevenly distributed market power increases the maximum distortion between the SVCG payment and the Lagrange payment, with a smaller agent facing higher distortion. We also note that if we further increase $s_2$ to $2.0$, the MPB condition is violated and hence we cannot find a constant $c$ that satisfies both budget balance and individual rationality.

B. Static Deterministic Systems with Quadratic Costs: Asymptotics

We consider $N$ heterogeneous agents with quadratic costs and show that SVCG payments converge to the Lagrange payments as $N$ increases. Let $F_i(u_i) = a_i u_i^2 + b_i u_i$ be the concave quadratic utility function for agent $i$, whether supplier or consumer. Denote by $A = \mathrm{diag}(a_1, a_2, \ldots, a_N)$ the diagonal matrix consisting of all the $a_i$, $B := [b_1; \ldots; b_N]$ and $U = [u_1; \ldots; u_N]$. Suppose $A < 0$. The ISO needs to solve:

$$\max_U \, [U^T A U + B^T U], \quad \text{subject to } \mathbf{1}^T U = 0, \tag{5}$$

where $\mathbf{1}$ is the all-one vector of proper size. The solution is:

$$\lambda_N^* = \gamma \mathbf{1}^T A^{-1} B, \tag{6}$$

$$U_N^* = \frac{1}{2} A^{-1} \left( \lambda_N^* \cdot \mathbf{1} - B \right), \tag{7}$$

where $\gamma = [\mathrm{trace}(A^{-1})]^{-1} = (\mathbf{1}^T A^{-1} \mathbf{1})^{-1}$ and the index $N$ is used to keep track of the population size. Note also that, substituting (6) and (7) into (5), the optimal social welfare is

$$\frac{1}{4} \left[ (\lambda_N^*)^2 \, \mathbf{1}^T A^{-1} \mathbf{1} - B^T A^{-1} B \right] = \frac{1}{4} \left[ \frac{(\mathbf{1}^T A^{-1} B)^2}{\mathbf{1}^T A^{-1} \mathbf{1}} - B^T A^{-1} B \right].$$

Theorem 2. For the SVCG mechanism with quadratic utility functions, if $(a_i, b_i)$ satisfy the following:

1) $\underline{a} \le a_i \le \bar{a} < 0$, $0 < \underline{b} \le b_i \le \bar{b}$,
2) $(N-1) H_{max}(N) \le \sum_i H_i(N)$, $F_{total}(N) > 0$, $H_i(N) > 0$, where the argument $N$ denotes that the corresponding quantity refers to the system with agents $1, 2, \ldots, N$,

then the following holds:

1) There exists a $c_N$ satisfying:

$$\frac{(N-1) F_{total}(N)}{\sum_i H_i(N)} \le c_N \le \frac{F_{total}(N)}{H_{max}(N)}.$$

Moreover, any such $c_N$ satisfies $\lim_{N \to \infty} c_N = 1$,
2) $\lim_{N \to \infty} (\lambda_N^* u_i^{*N} - p_i^N) = 0$, for all $i$, where the superscript $N$ is used to denote that the corresponding quantity refers to the system with $N$ agents.

Proof. It suffices to prove the result for the first agent. Let $A_{-1} = \mathrm{diag}(a_2, \ldots, a_N)$, $B_{-1} = [b_2; \ldots; b_N]$, let $\mathbf{1}_{-1}$ be the all-one vector of dimension $N-1$, and let $\gamma_{-1} = [\mathrm{trace}(A_{-1}^{-1})]^{-1}$.

Lemma 1. Let $U^*$ and $W^*$ be the optimal solutions to the problem consisting of all agents, and the problem excluding the first agent, respectively. Then, as the number of agents increases,

$$\lim_{N \to \infty} \left( \begin{bmatrix} 0_{(N-1) \times 1} & I_{N-1} \end{bmatrix} U^* - W^* \right) = O(1/N) \, \mathbf{1}_{-1}, \tag{8}$$

where $0_{(N-1) \times 1}$ is the $(N-1)$-dimensional column vector of zeroes, and $I_{N-1}$ is the $(N-1)$-dimensional identity matrix.

Proof.
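From (6) and (7),

$$\begin{bmatrix} 0 & I \end{bmatrix} U^* - W^* = \frac{1}{2} \begin{bmatrix} 0 & I \end{bmatrix} \begin{bmatrix} a_1^{-1} & 0 \\ 0 & A_{-1}^{-1} \end{bmatrix} \left( \gamma \cdot \begin{bmatrix} 1 & \mathbf{1}_{-1}^T \end{bmatrix} \begin{bmatrix} a_1^{-1} & 0 \\ 0 & A_{-1}^{-1} \end{bmatrix} \begin{bmatrix} b_1 \\ B_{-1} \end{bmatrix} \begin{bmatrix} 1 \\ \mathbf{1}_{-1} \end{bmatrix} - \begin{bmatrix} b_1 \\ B_{-1} \end{bmatrix} \right) - \frac{1}{2} A_{-1}^{-1} \left( \gamma_{-1} \mathbf{1}_{-1}^T A_{-1}^{-1} B_{-1} \mathbf{1}_{-1} - B_{-1} \right)$$

$$= \frac{1}{2} \left[ \gamma a_1^{-1} b_1 + (\gamma - \gamma_{-1}) \mathbf{1}_{-1}^T A_{-1}^{-1} B_{-1} \right] A_{-1}^{-1} \mathbf{1}_{-1}.$$

Since $\underline{a} \le a_i \le \bar{a} < 0$, $\gamma = \Theta(1/N)$, and $\gamma < 0$,

$$\frac{\bar{a} \underline{b}}{\underline{a} N} \le \gamma a_1^{-1} b_1 \le \frac{\underline{a} \bar{b}}{\bar{a} N},$$

$$\frac{\bar{a}^2}{-\underline{a} N (N-1)} \le \gamma - \gamma_{-1} \le \frac{\underline{a}^2}{-\bar{a} N (N-1)},$$

$$\frac{\underline{a}^2 \bar{b}}{-\bar{a}^2 N} \le (\gamma - \gamma_{-1}) \mathbf{1}_{-1}^T A_{-1}^{-1} B_{-1} \le \frac{\bar{a}^2 \underline{b}}{-\underline{a}^2 N}.$$

So, $\lim_{N \to \infty} \left( \begin{bmatrix} 0 & I \end{bmatrix} U^* - W^* \right) = 0$.

Let $\begin{bmatrix} 0 & I \end{bmatrix} U^* = V^*$. From Lemma 1, $v_i^* - w_i^* = O(\frac{1}{N})$, where $v_i^*$ and $w_i^*$ are the $i$-th components of $V^*$ and $W^*$, respectively. Hence,

$$\frac{F_{total}}{H_1} = \frac{a_1 u_1^{*2} + b_1 u_1^* + \sum_{i=2}^N (a_i v_i^{*2} + b_i v_i^*)}{\sum_{i=2}^N (a_i w_i^{*2} + b_i w_i^*)} = \frac{a_1 u_1^{*2} + b_1 u_1^* + \sum_{i=2}^N (a_i w_i^{*2} + b_i w_i^* + G_1)}{\sum_{i=2}^N (a_i w_i^{*2} + b_i w_i^*)},$$

where $G_1 = (2 a_i w_i^* + b_i) \, O(\frac{1}{N}) + a_i \, O(\frac{1}{N^2})$. From equations (6) and (7), $w_i^* = \Theta(1)$. Therefore, $\lim_{N \to \infty} (F_{total} / H_1) = 1$. Similarly, for all other $i$, $\lim_{N \to \infty} (F_{total} / H_i) = 1$. Therefore, $\lim_{N \to \infty} \bar{c}_N = 1$. Let $H_{min} = \min_i H_i$. Since

$$\frac{(N-1) F_{total}}{N H_{max}} \le \underline{c}_N \le \frac{(N-1) F_{total}}{N H_{min}},$$

$\lim_{N \to \infty} \underline{c}_N = 1$. Consequently, $\lim_{N \to \infty} c_N = 1$.

From Lemma 1,

$$W^* - V^* = -\frac{1}{2} \left[ \gamma a_1^{-1} b_1 + (\gamma - \gamma_{-1}) \xi \right] A_{-1}^{-1} \mathbf{1}_{-1},$$

where $\xi = \mathbf{1}_{-1}^T A_{-1}^{-1} B_{-1}$ and $\xi = \Theta(N)$. The payment by Agent 1 is (here $U_{-1}^*$ denotes the optimal allocation of the problem excluding the first agent, i.e., $W^*$):

$$p_1^N = U_{-1}^{*T} A_{-1} U_{-1}^* + B_{-1}^T U_{-1}^* - V^{*T} A_{-1} V^* - B_{-1}^T V^* = (U_{-1}^* + V^*)^T A_{-1} (U_{-1}^* - V^*) + B_{-1}^T (U_{-1}^* - V^*).$$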
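The difference between Lagrange and VCG payments is:

$$\begin{aligned}
\lambda^*_N u_1^{*N} - p_1^N
&= \frac{1}{2 a_1} \gamma (a_1^{-1} b_1 + \xi) \left[ \gamma (a_1^{-1} b_1 + \xi) - b_1 \right] - p_1^N \\
&= \frac{1}{2 a_1} \gamma^2 (a_1^{-1} b_1 + \xi)^2 - \frac{b_1}{2 a_1} \gamma (a_1^{-1} b_1 + \xi) \\
&\quad - \left[ \frac{1}{2} \left( \left( \gamma a_1^{-1} b_1 + (\gamma + \gamma_{-1}) \xi \right) \mathbf{1}_{-1}^T - 2 B_{-1}^T \right) A_{-1}^{-1} A_{-1} + B_{-1}^T \right]
\cdot \left( -\frac{1}{2} \right) \left[ \gamma a_1^{-1} b_1 + (\gamma - \gamma_{-1}) \xi \right] A_{-1}^{-1} \mathbf{1}_{-1} \\
&= \frac{1}{2 a_1} \gamma^2 (a_1^{-1} b_1 + \xi)^2 - \frac{b_1}{2 a_1} \gamma (a_1^{-1} b_1 + \xi)
+ \frac{1}{4 \gamma_{-1}} \left[ \gamma^2 a_1^{-2} b_1^2 + 2 a_1^{-1} b_1 \gamma^2 \xi + (\gamma^2 - \gamma_{-1}^2) \xi^2 \right],
\end{aligned}$$

where the last step uses $\mathbf{1}_{-1}^T A_{-1}^{-1} \mathbf{1}_{-1} = 1/\gamma_{-1}$. Since $\gamma = \Theta(1/N)$,

$$\begin{aligned}
\lim_{N \to \infty} \left( \lambda^*_N u_1^{*N} - p_1^N \right)
&= \lim_{N \to \infty} \left[ \frac{\gamma^2 \xi^2}{2 a_1} - \frac{b_1 \gamma \xi}{2 a_1} + \frac{b_1 \gamma^2 \xi}{2 a_1 \gamma_{-1}} + \frac{(\gamma^2 - \gamma_{-1}^2) \xi^2}{4 \gamma_{-1}} \right] \\
&= \lim_{N \to \infty} \left[ \frac{\xi^2}{4} \left( \frac{2 \gamma^2}{a_1} + \frac{\gamma^2 - \gamma_{-1}^2}{\gamma_{-1}} \right) - \frac{b_1 \gamma \xi}{2 a_1} \left( 1 - \frac{\gamma}{\gamma_{-1}} \right) \right] \\
&= \lim_{N \to \infty} \frac{\xi^2}{4} \left( \frac{\gamma^2}{a_1} + \gamma - \gamma_{-1} \right).
\end{aligned}$$

By calculation, we have

$$\frac{\gamma^2}{a_1} + \gamma - \gamma_{-1} = -\frac{1}{a_1^2} \left[ \frac{1}{\left( \sum_{i=1}^{N} \frac{1}{a_i} \right)^2 \left( \sum_{j=2}^{N} \frac{1}{a_j} \right)} \right] = O\!\left( \frac{1}{N^3} \right).$$

Therefore, $\lim_{N \to \infty} \left( \lambda^*_N u_1^{*N} - p_1^N \right) = 0$.

C. Dynamic Deterministic Systems

The VCG scheme can be extended to the important case of deterministic dynamic systems. One can simply consider the entire sequence of actions taken by an agent as a vector action, i.e., as an open-loop control, where the entire decision on the sequence of controls to be employed is taken at the initial time, so that the problem can be treated as a static one.

For agent $i$, let $F_i(x_i(t), u_i(t))$ be its one-step utility function at time $t$. Suppose that its state evolves as:

$$x_i(t+1) = g_i(x_i(t), u_i(t)). \tag{9}$$

An example of a wind turbine model can be found in [6], where $x(t) = [x_1(t), x_2(t), x_3(t)]^T$, $x_1(t)$ denotes rotor speed, $x_2(t)$ denotes drive train torsion, and $x_3(t)$ denotes generator speed. The control $u(t)$ is the collective blade pitch angle, and the state equation can be written as $\dot{x}(t) = A x(t) + B u(t)$.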
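To make the "whole control sequence as one vector action" view concrete, the sketch below evaluates the total utility of a given open-loop sequence for a single agent whose state follows (9). It is only an illustration: the function name `open_loop_utility` and the scalar linear-quadratic instance used in the demo are assumptions, not part of the paper.

```python
from typing import Callable, Sequence


def open_loop_utility(
    x0: float,
    controls: Sequence[float],
    g: Callable[[float, float], float],
    F: Callable[[float, float], float],
) -> float:
    """Total utility sum_t F(x(t), u(t)) of an open-loop control sequence,
    with the state propagated by x(t+1) = g(x(t), u(t)) as in (9)."""
    x = x0
    total = 0.0
    for u in controls:
        total += F(x, u)   # one-step utility at the current state
        x = g(x, u)        # deterministic state update
    return total


if __name__ == "__main__":
    # Hypothetical scalar agent: x(t+1) = 0.5*x(t) + u(t), F = -x^2 - u^2.
    g = lambda x, u: 0.5 * x + u
    F = lambda x, u: -x * x - u * u
    print(open_loop_utility(1.0, [0.5, -0.5], g, F))
```

Because the dynamics are deterministic, the total utility is a function of the initial state and the control vector alone, which is what allows the dynamic problem to be handled as a static one.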
We assume that $F_i = -\infty$ when $(x_i(t), u_i(t))$ do not satisfy the state dynamic constraint $g_i$ or when constraints on $(x_i(t), u_i(t))$ are violated. The ISO asks each agent $i$ to bid its one-step utility functions, state equations and initial condition. Denote the one-step utility function bids made by agent $i$ by $\{ \hat{F}_i(x_i(t), u_i(t)), \; t = 0, 1, \ldots, T-1 \}$, its state equation bids by $\{ \hat{g}_i, \; t = 0, 1, \ldots, T-1 \}$, and its initial condition bid by $\hat{x}_{i,0}$. The ISO then calculates $(x_i^*(t), u_i^*(t))$ as the optimal solution, assumed to be unique, to:

$$\begin{aligned}
\max \; & \sum_{i=1}^{N} \sum_{t=0}^{T-1} \hat{F}_i(x_i(t), u_i(t)) \\
\text{subject to } \; & x_i(t+1) = \hat{g}_i(x_i(t), u_i(t)), \quad \forall i \text{ and } \forall t, \\
& \sum_{i=1}^{N} u_i(t) = 0, \quad \forall t, \qquad (10) \\
& x_i(0) = \hat{x}_{i,0}, \quad \forall i. \qquad (11)
\end{aligned}$$

Denote this problem as $(\hat{F}, \hat{g}, \hat{x}_0)$. We can extend the VCG payment $p_i$ to the deterministic dynamic system. Let

$$p_i := \sum_{j \ne i} \sum_{t=0}^{T-1} \hat{F}_j\big( x_j^{(i)}(t), u_j^{(i)}(t) \big) - \sum_{j \ne i} \sum_{t=0}^{T-1} \hat{F}_j\big( x_j^*(t), u_j^*(t) \big).$$

Here $(x_j^{(i)}(t), u_j^{(i)}(t))$ is the optimal solution to the following problem, which is assumed to be unique:

$$\begin{aligned}
\max \; & \sum_{j \ne i} \sum_{t=0}^{T-1} \hat{F}_j(x_j(t), u_j(t)) \\
\text{subject to } \; & x_j(t+1) = \hat{g}_j(x_j(t), u_j(t)), \quad \text{for } j \ne i \text{ and } \forall t, \\
& \sum_{j \ne i} u_j(t) = 0, \quad \forall t, \\
& x_j(0) = \hat{x}_{j,0}, \quad \text{for } j \ne i.
\end{aligned}$$

More generally, we can consider a Groves payment $p_i$:

$$p_i := h_i(\hat{F}_{-i}) - \sum_{j \ne i} \sum_{t=0}^{T-1} \hat{F}_j\big( x_j^*(t), u_j^*(t) \big),$$

where $h_i$ is an arbitrary function.

Theorem 3. Truth-telling of utility function, state dynamics and initial condition ($\hat{F}_i = F_i$, $\hat{g}_i = g_i$ and $\hat{x}_{i,0} = x_{i,0}$) is a dominant strategy equilibrium under the Groves mechanism for a dynamic system.

Proof. Let $\hat{F} := (\hat{F}_1, \ldots, \hat{F}_i, \ldots, \hat{F}_N)$, $\hat{g} := (\hat{g}_1, \ldots, \hat{g}_i, \ldots, \hat{g}_N)$, and $\hat{x}_0 := (\hat{x}_{1,0}, \ldots, \hat{x}_{i,0}, \ldots, \hat{x}_{N,0})$.
Suppose agent $i$ announces the true one-step utility function $F_i$, true state dynamics $g_i$, and true initial condition $x_{i,0}$. Let $\bar{F} := (\hat{F}_1, \ldots, \hat{F}_{i-1}, F_i, \hat{F}_{i+1}, \ldots, \hat{F}_N)$, $\bar{g} := (\hat{g}_1, \ldots, \hat{g}_{i-1}, g_i, \hat{g}_{i+1}, \ldots, \hat{g}_N)$, and $\bar{x}_0 := (\hat{x}_{1,0}, \ldots, \hat{x}_{i-1,0}, x_{i,0}, \hat{x}_{i+1,0}, \ldots, \hat{x}_{N,0})$. Let $(\bar{x}^*(t), \bar{u}^*(t))$ be what the ISO assigns and $p_i(\bar{F}, \bar{g}, \bar{x}_0)$ be what the ISO charges when $(\bar{F}, \bar{g}, \bar{x}_0)$ is announced by the agents. Let $(x^*(t), u^*(t))$ be what the ISO assigns, and $p_i(\hat{F}, \hat{g}, \hat{x}_0)$ be what the ISO charges, when $(\hat{F}, \hat{g}, \hat{x}_0)$ is announced by the agents. Define $\bar{F}(x(t), u(t)) := \sum_i \bar{F}_i(x_i(t), u_i(t))$. For agent $i$, the difference between the net utility resulting from announcing $(F_i, g_i, x_{i,0})$ and that resulting from announcing $(\hat{F}_i, \hat{g}_i, \hat{x}_{i,0})$ is

$$\begin{aligned}
& \Big[ \sum_t F_i(\bar{x}_i^*(t), \bar{u}_i^*(t)) - p_i(\bar{F}, \bar{g}, \bar{x}_0) \Big] - \Big[ \sum_t F_i(x_i^*(t), u_i^*(t)) - p_i(\hat{F}, \hat{g}, \hat{x}_0) \Big] \\
&= \sum_t F_i(\bar{x}_i^*(t), \bar{u}_i^*(t)) - h_i(\bar{F}_{-i}) + \sum_{j \ne i} \sum_t \hat{F}_j(\bar{x}_j^*(t), \bar{u}_j^*(t)) \\
&\quad - \sum_t F_i(x_i^*(t), u_i^*(t)) + h_i(\hat{F}_{-i}) - \sum_{j \ne i} \sum_t \hat{F}_j(x_j^*(t), u_j^*(t)) \\
&= \sum_t \bar{F}(\bar{x}^*(t), \bar{u}^*(t)) - \sum_t \bar{F}(x^*(t), u^*(t)), \qquad (12)
\end{aligned}$$

where the $h_i$ terms cancel because $\bar{F}_{-i} = \hat{F}_{-i}$.

There are two cases:

1. When $(x_i^*(t), u_i^*(t))$ also satisfy the state dynamic constraint $g_i$, the RHS of (12) is $\ge 0$ since $(\bar{x}^*(t), \bar{u}^*(t))$ is the optimal solution to the problem $(\bar{F}, \bar{g}, \bar{x}_0)$;

2. When $(x_i^*(t), u_i^*(t))$ does not satisfy the state dynamics, $F_i = -\infty$ and hence the RHS of (12) is $\ge 0$.

Theorem 3 extends to the case of state and input constraints since $F_i = -\infty$ outside its feasible set.

Consider now the Scaled VCG mechanism:

$$p_i := c \sum_{j \ne i} \sum_{t=0}^{T-1} \hat{F}_j\big( x_j^{(i)}(t), u_j^{(i)}(t) \big) - \sum_{j \ne i} \sum_{t=0}^{T-1} \hat{F}_j\big( x_j^*(t), u_j^*(t) \big).$$
As in the static case, there exists a range of values of $c$ that simultaneously achieves IC, EF, BB and IR.

D. Deterministic Linear Systems with Quadratic Costs

As in the static case, the SVCG mechanism is asymptotically Lagrange optimal for linear systems with quadratic costs as the number of agents goes to infinity, as shown below. Consider $F_i(x_i(t), u_i(t)) = q_i x_i^2(t) + r_i u_i^2(t)$ and $x_i(t+1) = a_i x_i(t) + b_i u_i(t)$. Suppose $q_i \le 0$ and $r_i < 0$. The following theorem can be viewed as a generalization of Theorem 2 that allows for state dynamics and a horizon larger than 1.

Theorem 4. For the SVCG mechanism with quadratic utility functions and linear state dynamics, if $(a_i, b_i, q_i, r_i)$ satisfy

1) $\underline{a} \le |a_i| \le \bar{a}$, $\underline{b} \le |b_i| \le \bar{b}$, $\underline{q} \le q_i \le \bar{q} < 0$ and $\underline{r} \le r_i \le \bar{r} < 0$,

2) $(N-1) H_{\max}(N) \le \sum_i H_i(N)$, $F_{total}(N) > 0$ and $H_i(N) > 0$ for all $i$,

then the following hold:

1) There exist $\underline{c}_N \le \bar{c}_N$ such that for any $c_N \in [\underline{c}_N, \bar{c}_N]$, BB and IR hold. Moreover, $\lim_{N \to \infty} c_N = 1$,

2) $\lim_{N \to \infty} \left[ \sum_t \lambda^*_N(t) u_i^N(t) - p_i^N \right] = 0$ for all $i$.

Proof. Let $X(t) = (x_1(t), x_2(t), \ldots, x_N(t))^T$, $U(t) = (u_1(t), u_2(t), \ldots, u_N(t))^T$, $A = \operatorname{diag}(a_1, a_2, \ldots, a_N)$, $B = \operatorname{diag}(b_1, b_2, \ldots, b_N)$, $Q = \operatorname{diag}(q_1, q_2, \ldots, q_N)$, and $R = \operatorname{diag}(r_1, r_2, \ldots, r_N)$. Without loss of generality, assume $x_j(0) = 0$ for all $j$. The utility maximization problem can be rewritten as the following Linear-Quadratic (LQ) problem:

$$\max \sum_{t=0}^{T-1} \left[ X^T(t) Q X(t) + U^T(t) R U(t) \right] \tag{13}$$

subject to

$$X(t+1) = A X(t) + B U(t), \tag{14}$$

$$\mathbf{1}^T U(t) = 0, \quad \forall t.$$

By substituting (14) into (13), and using the fact that the open-loop optimal control is equivalent to the closed-loop optimal solution of the LQ problem, we have the following equivalent augmented LQ problem:

$$\max \left[ \Omega^T W \Omega + V^T \Omega \right], \quad \text{subject to } Y^T \Omega = 0, \tag{15}$$

where $\Omega := (U_1; U_2; \ldots; U_N)$ and $U_i = (u_i(0); u_i(1); \ldots; u_i(T-1))$, $W$ and $V$ are formed by multiplication and addition of $A$, $B$, $Q$, $R$, and $Y := [I_T; I_T; \ldots; I_T]$ stacks $N$ copies of the $T$-dimensional identity matrix $I_T$. More specifically, $W$ can be partitioned into diagonal blocks, $W = \operatorname{diag}(W_1, \ldots, W_N)$, where each block $W_i$ is a $T \times T$ square matrix formed by multiplication and addition of $a_i$, $b_i$, $q_i$, $r_i$.

Noting that the optimization problem (15) is in the same form as (5), the unique Lagrange multiplier $\lambda$ is calculated as $\lambda^* = \Gamma Y^T W^{-1} V$, where $\Gamma = (Y^T W^{-1} Y)^{-1}$. The key to the proof of Theorem 2 is to show that $\gamma$ is $\Theta(1/N)$. (Note that $f(N) = \Theta(g(N))$ if $f(N) = O(g(N))$ as well as $g(N) = O(f(N))$.) Similarly, by expanding $\Gamma = (W_1^{-1} + W_2^{-1} + \ldots + W_N^{-1})^{-1}$ and applying the bounded inverse theorem [39], $\|\Gamma\|$ is also $\Theta(1/N)$ since $a_i$, $b_i$, $q_i$, $r_i$ are all uniformly bounded.

Let $\Omega^*$ be the optimal solution to problem (15) consisting of all agents, and let $\Psi^*$ be the optimal solution to the problem excluding the first agent. By replacing $A$, $B$ and $\mathbf{1}$ with $W$, $V$ and $Y$ respectively in Lemma 1,

$$\lim_{N \to \infty} \begin{bmatrix} 0_{(N-1)T \times T} & I_{(N-1)T} \end{bmatrix} \Omega^* - \Psi^* = 0.$$

Let $\begin{bmatrix} 0 & I \end{bmatrix} \Omega^* = \Phi^*$. From the above, $\Phi_i^* - \Psi_i^* = O(1/N) \mathbf{1}$, where $\Phi_i^*$ and $\Psi_i^*$ are the $i$-th $T$-length components of $\Phi^*$ and $\Psi^*$, respectively. Hence,

$$\begin{aligned}
\frac{F_{total}}{H_1}
&= \frac{U_1^{*T} W_1 U_1^* + V_1^T U_1^* + \sum_{i=2}^{N} \left( \Phi_i^{*T} W_i \Phi_i^* + V_i^T \Phi_i^* \right)}{\sum_{i=2}^{N} \left( \Psi_i^{*T} W_i \Psi_i^* + V_i^T \Psi_i^* \right)} \\
&= \frac{U_1^{*T} W_1 U_1^* + V_1^T U_1^* + \sum_{i=2}^{N} \left( \Psi_i^{*T} W_i \Psi_i^* + V_i^T \Psi_i^* + G_1 \right)}{\sum_{i=2}^{N} \left( \Psi_i^{*T} W_i \Psi_i^* + V_i^T \Psi_i^* \right)},
\end{aligned}$$

where $G_1 = (2 \Psi_i^{*T} W_i \mathbf{1} + V_i^T \mathbf{1}) \, O(1/N) + \mathbf{1}^T W_i \mathbf{1} \cdot O(1/N^2)$. Since $\Psi_i^* = \Theta(1) \mathbf{1}$, we have $\lim_{N \to \infty} F^N_{total} / H^N_1 = 1$. Similarly, for all other $i$, $\lim_{N \to \infty} F^N_{total} / H^N_i = 1$. Therefore, $\lim_{N \to \infty} \bar{c}_N = 1$. Let $H_{\min} = \min_i H_i$.
Since

$$\frac{(N-1) F_{total}}{N H_{\max}} \le \underline{c}_N \le \frac{(N-1) F_{total}}{N H_{\min}},$$

$\lim_{N \to \infty} \underline{c}_N = 1$. Consequently, $\lim_{N \to \infty} c_N = 1$.

From Lemma 1,

$$\Psi^* - \Phi^* = -\frac{1}{2} W_{-1}^{-1} Y_{-1} \left( \Gamma W_1^{-1} V_1 + (\Gamma - \Gamma_{-1}) \Xi \right),$$

where $W_{-1}$ and $V_{-1}$ are formed by removing $W_1$ and $V_1$ from $W$ and $V$, respectively, $Y_{-1} = [I_T; \ldots; I_T]$ stacks $N-1$ copies of the $T$-dimensional identity matrix, $\Xi = Y_{-1}^T W_{-1}^{-1} V_{-1}$, and $\Xi = O(N) \mathbf{1}$. Similarly as in Theorem 2,

$$\begin{aligned}
\lim_{N \to \infty} \left( \lambda^{*T} U_1^* - p_1^N \right)
&= \lim_{N \to \infty} \Bigg[ \frac{1}{2} \left( V_1^T W_1^{-1} + \Xi^T \right) \Gamma^T \left[ W_1^{-1} \Gamma \left( W_1^{-1} V_1 + \Xi \right) - W_1^{-1} V_1 \right] \\
&\qquad - \left( \frac{1}{2} \left[ Y_{-1} \left( \Gamma W_1^{-1} V_1 + (\Gamma + \Gamma_{-1}) \Xi \right) - 2 V_{-1} \right]^T W_{-1}^{-1} W_{-1} + V_{-1}^T \right) \\
&\qquad\quad \cdot \left( -\frac{1}{2} \right) W_{-1}^{-1} Y_{-1} \left( \Gamma W_1^{-1} V_1 + (\Gamma - \Gamma_{-1}) \Xi \right) \Bigg] \\
&= \lim_{N \to \infty} \frac{1}{4} \Xi^T \left( \Gamma^T W_1^{-1} \Gamma + \Gamma - \Gamma_{-1} \right) \Xi.
\end{aligned}$$

It is straightforward to see that

$$\Gamma^T W_1^{-1} \Gamma + \Gamma - \Gamma_{-1} = -\left( \sum_{i=1}^{N} W_i^{-1} \right)^{-1} W_1^{-1} \left( \sum_{i=1}^{N} W_i^{-1} \right)^{-1} W_1^{-1} \left( \sum_{i=2}^{N} W_i^{-1} \right)^{-1} = O\!\left( \frac{1}{N^3} \right).$$

Consequently, $\lim_{N \to \infty} \left( \lambda^{*T} U_i^* - p_i^N \right) = 0$.

IV. DYNAMIC STOCHASTIC VCG

In the previous section, the VCG mechanism was naturally extended to deterministic dynamic systems by employing an open-loop solution. A new complication arises when agents are stochastic dynamic systems. The states of agents evolve randomly, and so we need to consider closed-loop control laws for each agent. Such closed-loop control laws depend on the observations of the agents, which are generally private random variables. Hence the problem arises of ensuring that each agent reveals its "true" observation at each and every time instant. (Other unknowns, such as system dynamic equations, noise statistics, and utility functions, can be considered as part of the first observation.)

However, a fundamental difficulty arises with respect to ensuring social welfare optimality of the stochastic dynamic system.
Since an agent's intertemporal payoff depends on the future payments and allocations in a dynamic game, the agent's current bid need not maximize its current payoff. What is more, since dishonest bids distort current and future allocations in different ways, an agent's optimal bid will depend on others' bids.

To see this, it is sufficient to consider the case where all system parameters (system dynamics, noise statistics, and utility functions) are known to all agents, and where each agent can completely observe its own private state $x_i(t)$, with the only complication being that it cannot observe the states of other agents. For agent $i$, let $w_i(t)$ be the discrete-time noise process affecting state $x_i(t)$ via the state evolution equation:

$$x_i(t+1) = g_i(x_i(t), u_i(t), w_i(t)),$$

where $x_i(0)$ is independent of $w_i$. An LQG model for wind turbine control can be found in [6], where $w(t)$ is the turbine system noise. The uncertainties of all the agents are independent. The ISO aims to maximize the social welfare:

$$\begin{aligned}
\max \; & E \sum_{i=1}^{N} \sum_{t=0}^{T-1} F_i(x_i(t), u_i(t)) \\
\text{subject to } \; & x_i(t+1) = g_i(x_i(t), u_i(t), w_i(t)), \quad \forall t, \qquad (16) \\
& \sum_{i=1}^{N} u_i(t) = 0, \quad \forall t. \qquad (17)
\end{aligned}$$

We will assume that $F_i$, $g_i$ and the distributions of the uncertainties are known to the ISO. We comment in the sequel on the further difficulty that arises when they are unknown. We focus on the issue of truth-telling by the agents of their states. Suppose that agents bid their states $x_i(t)$ as $\hat{x}_i(t)$.
A straightforward extension of the static Groves mechanism, which we will see does not work, would be to collect a payment $p_i(t)$ at time $t$ from agent $i$, defined as

$$p_i(t) = h_i(\hat{X}_{-i}(t)) - E \sum_{j \ne i} \sum_{\tau=t}^{T-1} \left[ F_j\big( x_{j,t}(\tau), u^*_{j,t}(\tau) \big) \,\Big|\, X(t) = \hat{X}(t) \right],$$

where $\hat{x}_i(t)$ is what agent $i$ bids for its state at time $t$, $\hat{X}_{-i}(t) = [\hat{x}_1(t), \ldots, \hat{x}_{i-1}(t), \hat{x}_{i+1}(t), \ldots, \hat{x}_N(t)]^T$, and $u^*_{j,t}(\tau)$ is the optimal solution to the following problem:

$$\begin{aligned}
\max \; & E \sum_{j=1}^{N} \sum_{\tau=t}^{T-1} \left[ F_j(x_j(\tau), u_j(\tau)) \,\Big|\, X(t) = \hat{X}(t) \right] \\
\text{subject to } \; & x_j(\tau+1) = g_j(x_j(\tau), u_j(\tau), w_j(\tau)), \\
& \sum_{j=1}^{N} u_j(\tau) = 0, \quad \text{for } t \le \tau \le T-1, \qquad (18)
\end{aligned}$$

with $\hat{X}(t) = [\hat{x}_1(t), \ldots, \hat{x}_N(t)]^T$ and $x_{j,t}(\tau+1) = g_j(x_{j,t}(\tau), u^*_{j,t}(\tau), w_{j,t}(\tau))$.

It is easy to verify that truth-telling of states by all agents forms a subgame perfect Nash equilibrium, since truth-telling of $x_i(t)$ for agent $i$ is a best response given that all other agents bid truthfully for all $\tau \ge t$.

However, truth-telling of states does not constitute a dominant strategy, because another agent $j$ may bid $\hat{x}_j(t+1)$ truthfully at time $t+1$, but lie about the state $x_j(t)$ at time $t$ in order to obtain a preferable state at the next time $t+1$. More specifically, if we assume all agents will bid truthfully from $t+1$ onward, then at time $t$, if agent $j$ bids some untruthful $\hat{x}_j(t)$, truth-telling of state for agent $i$ will be an optimal strategy only if agent $j$ continues to bid "an untruthful but consistent" $\hat{x}_j(t+1)$ which stems from its untruthful bid $\hat{x}_j(t)$. By "consistent" we mean the state that would result from the untruthful $\hat{x}_j(t)$ but with the truthful state noise $w_j(t)$. In other words, agent $i$ will bid truthfully only if agent $j$ "consistently" lies about its state, which is not guaranteed under the above payment scheme.
This additional complication precludes a dominant strategy solution for general stochastic dynamic systems even in the completely observed case, and even when all system parameters (the system dynamic equations, the noise statistics, and the utility functions) are known to all agents. We conjecture that there does not exist a dominant strategy for each agent that ensures social welfare optimality even in this special context, when agents are general stochastic dynamic systems. All that one can possibly hope for in general is a subgame perfect Nash equilibrium where truth-telling by an agent is optimal when all other agents are telling the truth. We show in the sequel that there is one important exception: LQG agents.

V. LINEAR QUADRATIC GAUSSIAN SYSTEMS

We now show that while an incentive compatible strategy presents fundamental challenges for general stochastic dynamic systems even when system parameters are known to all, as noted above, there is a solution for LQG systems in both the completely observed case, where each agent observes its own private state, and the partially observed case, where each agent observes a linear transformation of its state corrupted by white Gaussian noise. In fact, we show that one can then even obtain the following stronger property:

Definition 6. We say that a mechanism attains a subgame perfect dominant strategy equilibrium if, at every time $t$, truth-telling by each agent of its remaining future private observations is optimal with respect to the conditional expectation of its remaining net utility, irrespective of the strategies of other agents in the future.

We investigate the structure of LQG systems more carefully. We begin by considering the completely observed case.
For agent $i$, let $w_i(t) \sim \mathcal{N}(0, \sigma_i)$ be the discrete-time additive Gaussian white noise process affecting state $x_i(t)$ via:

$$x_i(t+1) = a_i x_i(t) + b_i u_i(t) + w_i(t),$$

where $x_i(0) \sim \mathcal{N}(0, \zeta_i)$ and is independent of $w_i$. Each agent has a one-step utility function

$$F_i(x_i(t), u_i(t)) = q_i x_i^2(t) + r_i u_i^2(t).$$

We suppose that $q_i \le 0$ and $r_i < 0$. Let $X(t) = [x_1(t), \ldots, x_N(t)]^T$, $U(t) = [u_1(t), \ldots, u_N(t)]^T$ and $W(t) = [w_1(t), \ldots, w_N(t)]^T$. Let $Q = \operatorname{diag}(q_1, \ldots, q_N) \le 0$, $R = \operatorname{diag}(r_1, \ldots, r_N) < 0$, $A = \operatorname{diag}(a_1, \ldots, a_N)$, $B = \operatorname{diag}(b_1, \ldots, b_N)$, $\Sigma = \operatorname{diag}(\sigma_1, \ldots, \sigma_N) > 0$ and $Z = \operatorname{diag}(\zeta_1, \ldots, \zeta_N) > 0$. We assume that the ISO knows the true system parameters $A$, $B$, $Q$ and $R$.

Let $RSW := \sum_{t=0}^{T-1} [X^T(t) Q X(t) + U^T(t) R U(t)]$ be the random social welfare, i.e., the variable whose expectation is the social welfare of the agents, and let $SW := E[RSW]$ denote the (expected) social welfare. $RSW$ could also be called the "ex-post social welfare", while $SW$ could be called the "ex-ante social welfare". The ISO aims to maximize the social welfare:

$$\begin{aligned}
\max \; & E \sum_{t=0}^{T-1} \left[ X^T(t) Q X(t) + U^T(t) R U(t) \right] \\
\text{subject to } \; & X(t+1) = A X(t) + B U(t) + W(t), \\
& \mathbf{1}^T U(t) = 0, \quad \forall t, \qquad (19) \\
& X(0) \sim \mathcal{N}(0, Z), \quad W \sim \mathcal{N}(0, \Sigma).
\end{aligned}$$

The key to obtaining subgame perfect dominance is to introduce a "layered" payment structure which ensures incentive compatibility for LQG systems. We begin by rewriting the random social welfare, and thereby also the social welfare, in terms more convenient for us. We will decompose the state $X(t)$ of the entire system comprised of all agents as:

$$X(t) := \sum_{s=0}^{t} X(s, t), \quad 0 \le t \le T-1, \tag{20}$$

where $X(s, s) := W(s-1)$ for $s \ge 1$ and $X(0, 0) := X(0)$. Let

$$X(s, t) := A X(s, t-1) + B U(s, t-1), \quad 0 \le s \le t-1, \tag{21}$$

with $U(s, t)$ yet to be specified.
We suppose that $U(t)$ can also be decomposed as:

$$U(t) := \sum_{s=0}^{t} U(s, t), \quad 0 \le t \le T-1. \tag{22}$$

Then, regardless of how the $U(s, t)$'s are chosen, as long as the $U(s, t)$'s for $0 \le s \le t$ are indeed a decomposition of $U(t)$, i.e., (22) is satisfied, the random social welfare can be written in terms of the $X(s, t)$'s and $U(s, t)$'s as:

$$RSW = \sum_{s=0}^{T-1} L_s,$$

where $L_s$ for $s \ge 1$ is defined as:

$$\begin{aligned}
L_s := \sum_{t=s}^{T-1} \Bigg[ & X^T(s, t) Q X(s, t) + U^T(s, t) R U(s, t) \\
& + 2 \left( \sum_{\tau=0}^{s-1} X(\tau, t) \right)^{T} Q X(s, t) + 2 \left( \sum_{\tau=0}^{s-1} U(\tau, t) \right)^{T} R U(s, t) \Bigg], \qquad (23)
\end{aligned}$$

and $L_0$ is defined as:

$$L_0 := \sum_{t=0}^{T-1} \left[ X^T(0, t) Q X(0, t) + U^T(0, t) R U(0, t) \right].$$

Hence, $SW = E \sum_{s=0}^{T-1} L_s$.

In the scheme to follow, the ISO will choose all $U(s, t)$'s for future $t$'s at time $s$, based on the information it has at time $s$. Hence $X(s, t)$ is completely determined by $W(s-1)$ and $U(s, t)$ for $s \le t \le T-1$. Indeed, $X(s, t)$ can be regarded as the contribution to $X(t)$ of these variables.

We now define the LQG ISO Mechanism. Instead of asking agents to bid their states, we will consider a scheme where agents are asked to bid their state noises. At each stage $s$, the ISO asks each agent $i$ to bid its $x_i(s, s)$, defined as equal to $w_i(s-1)$. Let $\hat{x}_i(s, s)$ be what the agent actually bids, since it may not tell the truth. Based on the bids $\{ \hat{x}_i(s, s) \text{ for } 1 \le i \le N \}$, the ISO solves the following problem:

$$\max L_s$$

for the system

$$\hat{X}(s, t) = A \hat{X}(s, t-1) + B U(s, t-1), \quad \text{for } t > s,$$

with $\hat{X}(s, s) = [\hat{x}_1(s, s), \ldots, \hat{x}_N(s, s)]^T$, subject to the constraint

$$\mathbf{1}^T U(s, t) = 0, \quad \text{for } s \le t \le T-1.$$

Here $\hat{X}(s, t)$ is the zero-noise state variable update starting from the "initial condition" $\hat{X}(s, s)$. Let $U^*(s, t)$ denote the optimal solution. The interpretation is the following.
Based on the bids $\hat{X}(s, s)$, which is supposedly a bid of $W(s-1)$, the ISO calculates the trajectory of the linear systems from time $s$ onward, assuming zero state noise from that point on. It then allocates consumptions/generations $U(s, t)$ for future periods $t$ for the corresponding deterministic linear system, with balance of consumption and production (19) at each time $t$. These can be regarded as generation/consumption allocations taking into account the consequences of the disturbance occurring at time $s$. Thereby we are decomposing the behavior of the system into separate effects caused by the state noise random variables occurring at different times.

Next, the ISO collects a payment $p_i(s)$ from agent $i$ at time $s$ as:

$$\begin{aligned}
p_i(s) := h_i(\hat{X}_{-i}(s, s)) - \sum_{j \ne i} \sum_{t=s}^{T-1} \Bigg[ & q_j \hat{x}_j^2(s, t) + r_j u_j^{*2}(s, t) \\
& + 2 q_j \left( \sum_{\tau=0}^{s-1} \hat{x}_j(\tau, t) \right) \hat{x}_j(s, t) + 2 r_j \left( \sum_{\tau=0}^{s-1} u_j(\tau, t) \right) u_j^*(s, t) \Bigg],
\end{aligned}$$

where $\hat{X}_{-i}(s, s) = [\hat{x}_1(s, s), \ldots, \hat{x}_{i-1}(s, s), \hat{x}_{i+1}(s, s), \ldots, \hat{x}_N(s, s)]^T$, and $h_i$ is an arbitrary function (as in the Groves mechanism).

Before proving incentive compatibility, we need to define what is meant by "rationality" of an agent in a dynamic system where each agent has to take actions at different times.

Definition 7. Rational Agents: We say agent $i$ is rational at time $T-1$ if it adopts a dominant strategy whenever there exists a unique dominant strategy. An agent $i$ is rational at time $t$ if it adopts a dominant strategy at time $t$ under the assumption that all agents including itself are rational at times $t+1, t+2, \ldots, T-1$, whenever there is a unique such dominant strategy at time $t$.

A critical property of the LQ problem is that the optimal feedback gain does not depend on the state. A key result needed to show dominance of truth-telling is to achieve "intertemporal" decoupling.
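The layer decomposition (20) to (23), on which the payments $p_i(s)$ are built, can be sanity-checked numerically. The sketch below is only an illustration for a single scalar agent (the identity holds agent by agent because $Q$ and $R$ are diagonal); the layer controls are drawn at random to stress that the identity $RSW = \sum_s L_s$ does not depend on how the $U(s, t)$ are chosen. The function name `layered_check` and the parameter values are assumptions made for the example.

```python
import random


def layered_check(a, b, q, r, T, seed=0):
    """Build layers X(s,t), U(s,t) for a scalar system and compare
    (i) x(t) with sum_s X(s,t), and (ii) RSW with sum_s L_s."""
    rng = random.Random(seed)
    x0 = rng.gauss(0.0, 1.0)
    w = [rng.gauss(0.0, 1.0) for _ in range(T)]
    # Arbitrary layer controls U(s,t), 0 <= s <= t <= T-1.
    U = {(s, t): rng.uniform(-1.0, 1.0) for s in range(T) for t in range(s, T)}

    # Layer states: X(0,0) = x(0), X(s,s) = w(s-1), then noise-free updates (21).
    X = {}
    for s in range(T):
        X[(s, s)] = x0 if s == 0 else w[s - 1]
        for t in range(s + 1, T):
            X[(s, t)] = a * X[(s, t - 1)] + b * U[(s, t - 1)]

    # Aggregate control (22) and the actual noisy trajectory.
    u = [sum(U[(s, t)] for s in range(t + 1)) for t in range(T)]
    x = [x0]
    for t in range(T - 1):
        x.append(a * x[t] + b * u[t] + w[t])

    state_gap = max(abs(x[t] - sum(X[(s, t)] for s in range(t + 1)))
                    for t in range(T))
    rsw = sum(q * x[t] ** 2 + r * u[t] ** 2 for t in range(T))

    # Sum of the layer terms L_s from (23); for s = 0 the cross terms vanish.
    layers = 0.0
    for s in range(T):
        for t in range(s, T):
            prev_x = sum(X[(tau, t)] for tau in range(s))
            prev_u = sum(U[(tau, t)] for tau in range(s))
            layers += (q * X[(s, t)] ** 2 + r * U[(s, t)] ** 2
                       + 2 * q * prev_x * X[(s, t)]
                       + 2 * r * prev_u * U[(s, t)])
    return state_gap, rsw, layers


if __name__ == "__main__":
    print(layered_check(a=0.9, b=0.5, q=-1.0, r=-2.0, T=5))
```

The two returned welfare values agree up to floating-point error because, at each time $t$, the layer terms are exactly the expansion of the square of the summed layers.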
In the case of the quadratic cost, this will be achieved by showing the following Lemma in the sequel:

Lemma 2. (Intertemporal Decoupling for LQG Agents) Let $\mathcal{H}_k$ be the history up to time $k$, $\mathcal{H}_k := \{ w_i(t) : 1 \le i \le N, \; 0 \le t \le k \}$. Then, under the LQG ISO Mechanism, if the system parameters $Q \le 0$, $R < 0$, $A$ and $B$ are known, and agents are rational, then for $k < t \le T-1$,

$$E[X(k, t) \mid \mathcal{H}_{k-1}] = 0, \quad \text{and} \quad E[U(k, t) \mid \mathcal{H}_{k-1}] = 0.$$

The proof of this is provided as part of the following overarching result.

Theorem 5. Truth-telling of state $\hat{x}_i(s, s)$ for $0 \le s \le T-1$, i.e., bidding $\hat{x}_i(s, s) = w_i(s-1)$, is the unique dominant strategy for each agent $i$ under the LQG ISO Mechanism, if the system parameters $Q \le 0$, $R < 0$, $A$ and $B$ are known, and agents are rational. The LQG ISO Mechanism achieves social welfare optimization.

Proof. We show the result by backward induction. Let agent $j$, $j \ne i$, bid $\hat{x}_j(s, s)$ at time $s$. Given the bids $\hat{x}_j(s, s)$ of other agents, let $J_i(s)$ be the net utility of agent $i$ from time $s$ onward if it bids the truthful $x_i(s, s)$, i.e., $w_i(s-1)$, and let $\hat{J}_i(s)$ be the net utility if it bids a possibly untruthful $\hat{x}_i(s, s)$. Let $U^*(s, t)$ be the ISO's assignments if agent $i$ bids truthfully and let $\hat{U}^*(s, t)$ be the ISO's assignments if agent $i$ bids untruthfully.

We first consider time $T-1$, since we are employing backward induction. Suppose that $x_i(s, T-1)$ for $0 \le s \le T-2$ were the past bids, and $u_i(s, T-1)$ for $0 \le s \le T-2$ were those portions of the allocations for the future already decided in the past. Our interest is in analyzing what should be the current bid $x_i(T-1, T-1)$, and the consequent additional allocation $u_i(T-1, T-1)$. Now $x_i(s, T-1)$ for $0 \le s \le T-2$ depend only on previous bids $x_i(s, s)$, and thus those terms can be treated as constants. In addition, the $h_i$ term depends only on other agents' bids.
As a consequence, when comparing $J_i(T-1)$ with $\hat{J}_i(T-1)$, one can just regard $h_i \equiv 0$. Hence

$$\begin{aligned}
J_i(T-1) ={}& q_i x_i^2(T-1, T-1) + r_i u_i^{*2}(T-1, T-1) \\
& + 2 q_i \left( \sum_{s=0}^{T-2} x_i(s, T-1) \right) x_i(T-1, T-1) + 2 r_i \left( \sum_{s=0}^{T-2} u_i(s, T-1) \right) u_i^*(T-1, T-1) \\
& + \sum_{j \ne i} \Bigg[ q_j \hat{x}_j^2(T-1, T-1) + r_j u_j^{*2}(T-1, T-1) \\
& \qquad + 2 q_j \left( \sum_{\tau=0}^{T-2} \hat{x}_j(\tau, T-1) \right) \hat{x}_j(T-1, T-1) + 2 r_j \left( \sum_{\tau=0}^{T-2} u_j(\tau, T-1) \right) u_j^*(T-1, T-1) \Bigg].
\end{aligned}$$

It is seen that $J_i(T-1)$ is of the same form as $L_{T-1}$. $\hat{J}_i(T-1)$ is obtained by replacing $u_i^*$ with $\hat{u}_i^*$. We conclude that $J_i(T-1) \ge \hat{J}_i(T-1)$ because $u_i^*$ is the optimal solution to $L_{T-1}$ when $\hat{x}_i(T-1, T-1) = x_i(T-1, T-1)$. Moreover, truth-telling is the unique optimal strategy since this is a finite-horizon, discrete-time LQR problem.

We next employ induction, and so assume that truth-telling of states is the unique subgame perfect dominant strategy equilibrium at time $k$. Let $\mathcal{H}_t$ be the history up to time $t$. If agents are rational, we can take the expectation over future $X(s, s)$, $s \ge k$, which are i.i.d. Gaussian noise vectors, and calculate $J_i(k-1)$ (where, as before, we simply take the first Groves term $h_i \equiv 0$):

$$\begin{aligned}
J_i(k-1) ={}& q_i x_i^2(k-1) + r_i u_i^2(k-1) - p_i(k-1) + E[J_i(k) \mid \mathcal{H}_{k-1}] \\
={}& q_i \left[ x_i(k-1, k-1) + \sum_{s=0}^{k-2} x_i(s, k-1) \right]^2 + r_i \left[ u_i(k-1, k-1) + \sum_{s=0}^{k-2} u_i(s, k-1) \right]^2 \\
& - p_i(k-1) + E \left[ \sum_{t=k}^{T-1} \left( q_i x_i^2(t) + r_i u_i^2(t) - p_i(t) \right) \,\Big|\, \mathcal{H}_{k-1} \right]. \qquad (24)
\end{aligned}$$

We now prove Lemma 2. We first show that $E[U^*(k, k) \mid \mathcal{H}_{k-1}] = 0$. By completing the square for $L_k$ in (23), we have the following equivalent problem for the ISO to solve for the $k$-th layer:

$$\begin{aligned}
\max \sum_{t=k}^{T-1} \Bigg[ & \left( X(k, t) + \sum_{\tau=0}^{k-1} X(\tau, t) \right)^{T} Q \left( X(k, t) + \sum_{\tau=0}^{k-1} X(\tau, t) \right) \\
& + \left( U(k, t) + \sum_{\tau=0}^{k-1} U(\tau, t) \right)^{T} R \left( U(k, t) + \sum_{\tau=0}^{k-1} U(\tau, t) \right) \Bigg]. \qquad (25)
\end{aligned}$$

Now, for the fixed $k$ of interest, letting $Y(t) := X(k, t) + \sum_{\tau=0}^{k-1} X(\tau, t)$ and $V(t) := U(k, t) + \sum_{\tau=0}^{k-1} U(\tau, t)$, we have $Y(t) = A Y(t-1) + B V(t-1)$ for $t \ge k+1$. The "initial" condition is $Y(k) = X(k)$. For this linear system, the optimal control law for cost (25) under the balancing constraint for all $t$ is linear in the state [35]. Denoting the optimal gain by $K(t)$ (whose calculation can be found in [40]),

$$U^*(k, k) + \sum_{\tau=0}^{k-1} U(\tau, k) = K(k) \left[ X(k, k) + \sum_{\tau=0}^{k-1} X(\tau, k) \right].$$

Similarly, at time $k-1$, the ISO chooses the allocation at time $k$ by using the same gain $K(k)$ applied to that portion of the state at time $k$ resulting from disturbances up to time $k-1$:

$$U(k-1, k) + \sum_{\tau=0}^{k-2} U(\tau, k) = \sum_{\tau=0}^{k-1} U(\tau, k) = K(k) \left[ X(k-1, k) + \sum_{\tau=0}^{k-2} X(\tau, k) \right] = K(k) \left[ \sum_{\tau=0}^{k-1} X(\tau, k) \right].$$

Consequently, subtracting the two relations,

$$E[U^*(k, k) \mid \mathcal{H}_{k-1}] = K(k) \, E[X(k, k) \mid \mathcal{H}_{k-1}] = 0,$$

since all agents are truth-telling at time $k$, i.e., $E[X(k, k) \mid \mathcal{H}_{k-1}] = E[W(k-1)] = 0$. From (21), by linearity of the system,

$$E[X(k, t) \mid \mathcal{H}_{k-1}] = 0, \quad k < t \le T-1,$$

and

$$E[U(k, t) \mid \mathcal{H}_{k-1}] = 0, \quad k < t \le T-1.$$

Therefore, for $k \le t \le T-1$,

$$\begin{aligned}
E[x_i^2(t) \mid \mathcal{H}_{k-1}]
&= E \left[ \left( \sum_{\tau=k}^{t} x_i(\tau, t) + \sum_{s=0}^{k-1} x_i(s, t) \right)^2 \,\Big|\, \mathcal{H}_{k-1} \right]
= \left[ \sum_{s=0}^{k-1} x_i(s, t) \right]^2 + C \\
&= x_i^2(k-1, t) + 2 x_i(k-1, t) \sum_{s=0}^{k-2} x_i(s, t) + \left[ \sum_{s=0}^{k-2} x_i(s, t) \right]^2 + C,
\end{aligned}$$

where $C$ is a fixed term corresponding to the variance of $\sum_{\tau=k}^{t} x_i(\tau, t)$, and $\left[ \sum_{s=0}^{k-2} x_i(s, t) \right]^2$ can be treated as a constant since it depends only on previous bids.
Similarly, for $t \ge k$,

$$E[u_i^2(t) \mid \mathcal{H}_{k-1}] = u_i^2(k-1, t) + 2 u_i(k-1, t) \sum_{s=0}^{k-2} u_i(s, t) + \left[ \sum_{s=0}^{k-2} u_i(s, t) \right]^2 + C.$$

We also have

$$E[p_i(t) \mid \mathcal{H}_{k-1}] = \text{const.},$$

since $E[x_j(t, \tau) \mid \mathcal{H}_{k-1}] = 0$ and $E[u_j(t, \tau) \mid \mathcal{H}_{k-1}] = 0$ for $\tau \ge t$. By ignoring the constant terms,

$$\begin{aligned}
J_i(k-1) ={}& q_i x_i^2(k-1, k-1) + 2 q_i x_i(k-1, k-1) \sum_{s=0}^{k-2} x_i(s, k-1) \\
& + r_i u_i^2(k-1, k-1) + 2 r_i u_i(k-1, k-1) \sum_{s=0}^{k-2} u_i(s, k-1) \\
& + \sum_{t=k}^{T-1} \left[ q_i x_i^2(k-1, t) + 2 q_i x_i(k-1, t) \sum_{s=0}^{k-2} x_i(s, t) \right] \\
& + \sum_{t=k}^{T-1} \left[ r_i u_i^2(k-1, t) + 2 r_i u_i(k-1, t) \sum_{s=0}^{k-2} u_i(s, t) \right] - p_i(k-1) \\
={}& \sum_{t=k-1}^{T-1} \Bigg[ q_i x_i^2(k-1, t) + r_i u_i^2(k-1, t) + 2 q_i \left( \sum_{\tau=0}^{k-2} x_i(\tau, t) \right) x_i(k-1, t) \\
& \qquad + 2 r_i \left( \sum_{\tau=0}^{k-2} u_i(\tau, t) \right) u_i(k-1, t) \Bigg] - p_i(k-1).
\end{aligned}$$

It is straightforward to check that $J_i(k-1)$ is of the same form as $L_{k-1}$, and thus we conclude that truth-telling, $\hat{x}_i(k-1, k-1) = x_i(k-1, k-1)$, is the unique dominant strategy for agent $i$ at time $k-1$.

In the proof we have actually established the following stronger result:

Corollary 1. In the stochastic VCG mechanism, truth-telling of states constitutes a subgame perfect dominant strategy equilibrium.

A. LQG systems with unknown system parameters

The case where the ISO does not know the system parameters (system dynamic equations, noise statistics, and utility functions) poses formidable difficulties, and we conjecture that there is no mechanism with truth-telling as a dominant strategy. Above, the key to proving incentive compatibility for the layered VCG mechanism lies in the fact that the optimal feedback gain $K(k)$ remains unchanged for each round of bids. This is due to the fact that $K(k)$ is only a function of $Q$, $R$, $A$, and $B$.
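The statement that the gain depends only on the system parameters can be illustrated with a textbook backward Riccati recursion. The sketch below is deliberately simplified: it treats a single unconstrained scalar system maximizing $\sum_{t=0}^{T-1} q x^2(t) + r u^2(t)$ with $q \le 0$, $r < 0$, not the ISO's coupled problem with the balance constraint, so the gains it returns are those of this toy problem rather than the $K(k)$ of the mechanism. The names `lq_gains` and `closed_loop` are assumptions made for the example.

```python
def lq_gains(a, b, q, r, T):
    """Feedback gains K(0..T-1) for max sum_{t<T} q*x^2 + r*u^2 subject to
    x(t+1) = a*x(t) + b*u(t), computed by a backward Riccati recursion.
    Note that no state value enters the computation."""
    P = 0.0                      # value function V_T(x) = 0 beyond the horizon
    gains = [0.0] * T
    for t in range(T - 1, -1, -1):
        D = r + P * b * b        # strictly negative since r < 0 and P <= 0
        gains[t] = -P * a * b / D            # maximizer of the one-step problem
        P = q + P * a * a - (P * a * b) ** 2 / D   # V_t(x) = P * x^2
    return gains


def closed_loop(x0, a, b, gains):
    """Controls u(t) = K(t) * x(t) generated from initial state x0."""
    x, controls = x0, []
    for K in gains:
        u = K * x
        controls.append(u)
        x = a * x + b * u
    return controls


if __name__ == "__main__":
    K = lq_gains(a=1.0, b=1.0, q=-1.0, r=-1.0, T=3)
    print(K)
    print(closed_loop(1.0, 1.0, 1.0, K))
    print(closed_loop(2.0, 1.0, 1.0, K))   # same gains, controls scale with x0
```

Since the gains are fixed by $(a, b, q, r)$ and the horizon, changing the reported state only rescales the resulting controls; changing a reported parameter such as $q$, by contrast, changes the gains themselves.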
Consequently, if agents are also allowed to bid their system parameters at the beginning, then the layered VCG mechanism is not incentive compatible. We show this by the following counterexample.

Example 2. Let $T = 4$. The agents' system equations and cost matrices have the following parameters: $(a_1, a_2, a_3, a_4) = (1, 1, 1, 1)$, $(b_1, b_2, b_3, b_4) = (1, 1, 1, 1)$, $(q_1, q_2, q_3, q_4) = (-1, -1, -1, -1)$, $(r_1, r_2, r_3, r_4) = (-1, -1.1, -1.2, -1.1)$, $(\zeta_1, \zeta_2, \zeta_3, \zeta_4) = (0.3, 0.32, 0.31, 0.3)$ and $(\sigma_1, \sigma_2, \sigma_3, \sigma_4) = (0.1, 0.11, 0.11, 0.12)$. If the system operator knows all the parameters of the agents, and every agent bids its true state, then the expected net utility of agent 1 (expected total utility minus expected total payment) is $0.629$. When agents are also allowed to bid their system parameters at the beginning, truth-telling of state may not be incentive compatible. Suppose that agents 2, 3, 4 remain truthful, namely, bid their true system parameters at the beginning and their true states at all times. Suppose now that agent 1 intentionally bids an untruthful $\hat{q}_1 = -1.3$ while bidding its other parameters truthfully at the beginning. Assume also that agent 1 always bids its state as if there is no noise ($w_1(t) \equiv 0$). Now agent 1's expected net utility is $0.631$. Therefore, agent 1's optimal strategy is not to bid its true state when it is allowed to bid its system parameters at the beginning.

The assumption that the ISO knows the system parameters $A$, $B$, $Q$ and $R$ of the agents can perhaps be justified as follows: the ISO can learn these parameters by running a VCG scheme for the day-ahead market, a dynamic deterministic market in which agents are guaranteed to bid their true system parameters, as shown in the previous section, provided the system parameters remain unchanged when agents participate in the real-time stochastic market.

B. Budget Balance and Individual Rationality in LQG systems

We extend the notion of scaling and the associated SVCG mechanism to stochastic dynamic systems as follows. Consider the payments

$$\begin{aligned}
p_i(s) := {}& c \cdot \sum_{j \ne i} \sum_{t=s}^{T-1} \Bigg[ q_j \hat{x}_j^2(s, t) + r_j \big( u_j^{(i)}(s, t) \big)^2 \\
& \qquad + 2 q_j \left( \sum_{\tau=0}^{s-1} \hat{x}_j(\tau, t) \right) \hat{x}_j(s, t) + 2 r_j \left( \sum_{\tau=0}^{s-1} u_j^{(i)}(\tau, t) \right) u_j^{(i)}(s, t) \Bigg] \\
& - \sum_{j \ne i} \sum_{t=s}^{T-1} \Bigg[ q_j \hat{x}_j^2(s, t) + r_j u_j^{*2}(s, t) \\
& \qquad + 2 q_j \left( \sum_{\tau=0}^{s-1} \hat{x}_j(\tau, t) \right) \hat{x}_j(s, t) + 2 r_j \left( \sum_{\tau=0}^{s-1} u_j(\tau, t) \right) u_j^*(s, t) \Bigg],
\end{aligned}$$

where $u_j^{(i)}(s, t)$ is the optimal solution to:

$$\begin{aligned}
\max \; & \sum_{j \ne i} \sum_{t=s}^{T-1} \Bigg[ q_j x_j^2(s, t) + r_j u_j^2(s, t) + 2 q_j \left( \sum_{\tau=0}^{s-1} x_j(\tau, t) \right) x_j(s, t) + 2 r_j \left( \sum_{\tau=0}^{s-1} u_j(\tau, t) \right) u_j(s, t) \Bigg] \\
\text{subject to } \; & x_j(s, t) = a_j x_j(s, t-1) + b_j u_j(s, t-1), \quad \text{for } s < t \le T-1, \\
& \sum_{j \ne i} u_j(s, t) = 0, \quad \text{for } s \le t \le T-1, \\
& x_j(s, s) = \hat{x}_j(s, s).
\end{aligned}$$

As in the static case, based on its prior knowledge of a suitable range for $c$, the ISO can choose a range of $c$, which does not depend on the agents' bids, to achieve BB and IR.

Truth-telling is a dominant strategy under the SVCG mechanism because it falls under the Groves mechanism. Under the dominant strategy equilibrium, every agent $i$ will bid its true state $x_i(s, s)$, i.e., $w_i(s-1)$.

Theorem 6. Let $U^*(t)$ be the optimal solution to the following problem:

$$\begin{aligned}
\max \; & E \sum_{t=0}^{T-1} \left[ X^T(t) Q X(t) + U^T(t) R U(t) \right] \\
\text{subject to } \; & X(t+1) = A X(t) + B U(t) + W(t), \\
& \mathbf{1}^T U(t) = 0, \quad \forall t, \\
& X(0) \sim \mathcal{N}(0, Z), \quad W \sim \mathcal{N}(0, \Sigma).
\end{aligned}$$

Let $X^{(i)}(t) := [x_1(t), \ldots, x_{i-1}(t), x_{i+1}(t), \ldots, x_N(t)]^T$, and similarly let $Q^{(i)}$, $R^{(i)}$, $A^{(i)}$, $B^{(i)}$, $Z^{(i)}$ and $\Sigma^{(i)}$ be the matrices with the $i$-th component removed.
Let $U^{(i)}(t)$ be the optimal solution to the following problem:

$$\max\ \mathbb{E}\sum_{t=0}^{T-1}\big[X^{(i)T}(t)Q^{(i)}X^{(i)}(t) + U^T(t)R^{(i)}U(t)\big]$$

subject to

$$X^{(i)}(t+1) = A^{(i)}X^{(i)}(t) + B^{(i)}U(t) + W^{(i)}(t), \quad 1^T U(t) = 0\ \ \forall t, \quad X^{(i)}(0) \sim \mathcal{N}(0, Z^{(i)}), \quad W^{(i)} \sim \mathcal{N}(0, \Sigma^{(i)}).$$

Let $H_i := \mathbb{E}\sum_{t=0}^{T-1}\big[X^{(i)T}(t)Q^{(i)}X^{(i)}(t) + U^{(i)T}(t)R^{(i)}U^{(i)}(t)\big]$ and let $H_{max} := \max_i H_i$. Let $F_{total} = \mathbb{E}\sum_{t=0}^{T-1}\big[X^T(t)QX(t) + U^T(t)RU(t)\big]$. If $F_{total} > 0$, $H_i > 0$ for all $i$, and the MPB condition (3) holds, then there exist $\underline{c}$ and $\bar{c}$, with $\underline{c} \le \bar{c}$, such that if the constant $c$ is chosen in the range $[\underline{c}, \bar{c}]$, then the SVCG mechanism for the stochastic dynamic system satisfies IC, EF, BB and IR at the same time.

Proof. It is straightforward that under a dominant strategy,

$$\mathbb{E}\sum_{s=0}^{T-1} p_i(s) = c \cdot H_i - \mathbb{E}\sum_{j \ne i}\sum_{t=0}^{T-1}\big[q_j x_j^2(t) + r_j u_j^{*2}(t)\big],$$

since the $w_i$'s are i.i.d. and $u_i(t)$ is linear in $x_i(t)$. Hence, to achieve budget balance, we need

$$\mathbb{E}\sum_{i}\sum_{s=0}^{T-1} p_i(s) = c \cdot \sum_i H_i - (N-1)F_{total} \ge 0.$$

To achieve individual rationality for agent $i$, we need

$$\mathbb{E}\Big[\sum_{t=0}^{T-1}\big(q_i x_i^2(t) + r_i u_i^{*2}(t)\big) - \sum_{s=0}^{T-1} p_i(s)\Big] = F_{total} - c \cdot H_i \ge 0.$$

Combining both inequalities, we have

$$\frac{(N-1)F_{total}}{\sum_i H_i} \le c \le \frac{F_{total}}{H_{max}}.$$

Let $\underline{c} = \frac{(N-1)F_{total}}{\sum_i H_i}$ and $\bar{c} = \frac{F_{total}}{H_{max}}$. To ensure $\underline{c} \le \bar{c}$, one sufficient condition is

$$(N-1)H_{max} \le \sum_i H_i, \quad F_{total} > 0, \quad H_i > 0 \text{ for all } i.$$

C. Lagrange Optimality in LQG Systems

In general, just as for a static problem, the SVCG mechanism is not Lagrange optimal. Within the feasible range $[\underline{c}, \bar{c}]$, one can choose a $c$ that achieves near Lagrange optimality.
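The feasible range for the scaling constant in the proof of Theorem 6 can be evaluated directly once $F_{total}$ and the $H_i$ are available. The sketch below assumes these expectations have already been computed; the function names and the numerical values are hypothetical and are not taken from the paper.

```python
def scaling_range(f_total, h):
    """Return (c_lower, c_upper) from the two bounds in the proof of Theorem 6."""
    n = len(h)
    if f_total <= 0 or min(h) <= 0:
        raise ValueError("requires F_total > 0 and H_i > 0 for all i")
    c_lower = (n - 1) * f_total / sum(h)  # budget balance bound
    c_upper = f_total / max(h)            # individual rationality bound
    return c_lower, c_upper


def market_power_balance(h):
    """Sufficient condition (N - 1) * H_max <= sum_i H_i for c_lower <= c_upper."""
    return (len(h) - 1) * max(h) <= sum(h)


# Hypothetical values, chosen only for illustration.
H = [3.0, 3.5, 2.5, 3.0]
F_TOTAL = 4.0
c_lower, c_upper = scaling_range(F_TOTAL, H)

# Expected ISO surplus at the lower end, and worst expected net utility at the upper end.
surplus_at_lower = c_lower * sum(H) - (len(H) - 1) * F_TOTAL
worst_net_utility_at_upper = min(F_TOTAL - c_upper * h_i for h_i in H)
```

With these illustrative numbers the range is nonempty; a profile such as `[10.0, 1.0, 1.0]`, where one $H_i$ dominates, fails the `market_power_balance` check.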
The choice of $c$ within the feasible range can be formulated as a MinMax problem:

$$\min_{c}\ \max_{i}\ \frac{d_i(c)}{\mathbb{E}\sum_{t=0}^{T-1}[\lambda^*(t)u_i^*(t)]}, \quad \text{subject to (4)},$$

where

$$d_i(c) := \mathbb{E}\sum_{t=0}^{T-1}[\lambda^*(t)u_i^*(t) - p_i(t)] = \mathbb{E}\sum_{t=0}^{T-1}[\lambda^*(t)u_i^*(t)] - c \cdot H_i + \mathbb{E}\Big[\sum_{j \ne i}\sum_{t=0}^{T-1}\big(q_j x_j^2(t) + r_j u_j^{*2}(t)\big)\Big].$$

Example 3. Consider the same system parameters as in Example 2. The optimal solution to the MinMax problem is $(c^*, Z^*) = (0.96, 0.21)$. Thus, by choosing $c = 0.96$, the SVCG mechanism satisfies IC, EF, BB and IR, and all agents expect to pay/receive within 21% of their expected Lagrange optimal payments.

Just as for deterministic systems, as the number of agents increases, the scaled-VCG mechanism does achieve asymptotic Lagrange optimality.

Theorem 7. If $(a_k, b_k, q_k, r_k, \zeta_k, \sigma_k)$ for $1 \le k \le N$ satisfy the following:
1) $\underline{a} \le |a_i| \le \bar{a}$, $\underline{b} \le |b_i| \le \bar{b}$, $\underline{q} \le q_i \le \bar{q} < 0$ and $\underline{r} \le r_i \le \bar{r} < 0$,
2) $F_{total} > 0$, $H_i > 0$ for all $i$, and the MPB condition holds,
then the following holds:
1) There is a range within which $c_N$ can be chosen to achieve BB and IR, and $\lim_{N \to \infty} c_N = 1$,
2) Asymptotic Lagrange Optimality: $\lim_{N \to \infty} \mathbb{E}\sum_{t=0}^{T-1}\big[\lambda^N(t)u_i^N(t) - p_i^N(t)\big] = 0$, where the random variable $\lambda^N(t)$ is the Lagrange multiplier corresponding to the power balance constraint.

Proof. At each layer, the ISO is solving a deterministic LQR problem, and from Theorem 4 we have

$$\lim_{N \to \infty} \frac{L_s^*}{H_{s,1}} = 1, \quad \text{for } 0 \le s \le T-1,$$

where $L_s^*$ is the maximum value of $L_s$, and $H_{s,1}$ is the maximum social welfare when agent 1 is excluded. Moreover, as we have shown in Theorem 5, the sum of the $U^*(s,t)$ calculated at each layer is indeed the optimal solution, $U^*(t) = \sum_{s=0}^{t} U^*(s,t)$. Consequently,

$$\lim_{N \to \infty} \frac{F_{total}^N}{H_1^N} = \lim_{N \to \infty} \frac{\mathbb{E}\sum_{s=0}^{T-1} L_s^*}{\mathbb{E}\sum_{s=0}^{T-1} H_{s,1}} = 1.$$

Similarly we can show that $\lim_{N \to \infty} \frac{F_{total}^N}{H_k^N} = 1$ for $k \ne 1$. Therefore, $\lim_{N \to \infty} \bar{c}_N = 1$. Let $H_{min} = \min_i H_i$.
Since

$$\frac{(N-1)F_{total}^N}{N H_{max}} \le \underline{c}_N \le \frac{(N-1)F_{total}^N}{N H_{min}},$$

we have $\lim_{N \to \infty} \underline{c}_N = 1$. Hence, $\lim_{N \to \infty} c_N = 1$.

We next show that the total expected VCG payment converges to the total expected Lagrange payment as $N$ goes to infinity. To calculate $\lambda^N(t)$, we solve the following one-step problem to determine its Lagrange multiplier:

$$\max_{U(t)}\ X^T(t)QX(t) + U^T(t)RU(t) + \mathbb{E}\big[X^T(t+1)P_{t+1}X(t+1)\big] \quad \text{subject to } 1^T U(t) = 0, \tag{26}$$

where $P_t$ is the Riccati matrix of the unconstrained problem in which the balance constraint $1^T U(t) = 0$, or $u_1(t) = -\sum_{i=2}^{N} u_i(t)$, is substituted in both the objective and the state equation. The Lagrangian is

$$\mathcal{L} = X^T(t)QX(t) + U^T(t)RU(t) + \mathbb{E}\big[X^T(t+1)P_{t+1}X(t+1)\big] - \lambda^N(t)1^T U(t).$$

Taking partial derivatives with respect to $U(t)$ and $\lambda^N(t)$, we have

$$\frac{\partial \mathcal{L}}{\partial U(t)} = 2RU(t) + 2B^T P_{t+1}BU(t) + 2B^T P_{t+1}AX(t) - \lambda^N(t)1 = 0.$$

The Lagrange multiplier $\lambda^N(t)$ calculated from (26) is:

$$\lambda^N(t) = 2\big[1^T(R + B^T P_{t+1}B)^{-1}1\big]^{-1} \cdot 1^T(R + B^T P_{t+1}B)^{-1}B^T P_{t+1}AX(t) := \Phi_t X(t). \tag{27}$$

At time $s$, we denote by $\lambda^N(s,t)$ the Lagrange multipliers associated with the balance constraint $1^T U(s,t) = 0$. From Theorem 4, we have

$$\lim_{N \to \infty}\Big[\Big(\sum_{t=s}^{T-1}\lambda^N(s,t)u_i^{*N}(s,t)\Big) - p_i^N(s)\Big] = 0.$$

Summing over $s$, we have

$$\lim_{N \to \infty}\sum_{s=0}^{T-1}\Big[\Big(\sum_{t=s}^{T-1}\lambda^N(s,t)u_i^{*N}(s,t)\Big) - p_i^N(s)\Big] = \lim_{N \to \infty}\sum_{t=0}^{T-1}\Big[\Big(\sum_{s=0}^{t}\lambda^N(s,t)u_i^{*N}(s,t)\Big) - p_i^N(t)\Big] = 0.$$

From (25), we have, at time $s$,

$$\lambda^N(s,t) = \Phi_t\sum_{\tau=0}^{s}X(\tau,t),$$

and at time $s-1$,

$$\lambda^N(s-1,t) = \Phi_t\sum_{\tau=0}^{s-1}X(\tau,t).$$

Therefore, $\lambda^N(s,t) = \lambda^N(s-1,t) + \Phi_t X(s,t)$. The Lagrange multiplier $\lambda^N(t)$ associated with the balance constraint $1^T U(t) = 0$ can be calculated as:

$$\lambda^N(t) = \Phi_t X(t) = \Phi_t\sum_{s=0}^{t}X(s,t) = \lambda^N(t,t).$$
As a result,

$$\lambda^N(t)u_i^{*N}(t) = \lambda^N(t)\sum_{s=0}^{t}u_i^{*N}(s,t) = \sum_{s=0}^{t}\Big[\Big(\lambda^N(s,t) + \Phi_t\sum_{\tau=s+1}^{t}X(\tau,t)\Big)u_i^{*N}(s,t)\Big].$$

Because $X(0)$ is independent of $W(t)$ and the $W(t)$ are i.i.d., $\mathbb{E}[X(\tau,t)u_i^{*N}(s,t)] = 0$ for $\tau \ge s+1$. Therefore,

$$\lim_{N \to \infty}\mathbb{E}\sum_{t=0}^{T-1}\big[\lambda^N(t)u_i^{*N}(t) - p_i^N(t)\big] = \lim_{N \to \infty}\mathbb{E}\sum_{t=0}^{T-1}\Big[\Big(\sum_{s=0}^{t}\lambda^N(s,t)u_i^{*N}(s,t)\Big) - p_i^N(t)\Big] = 0.$$

D. Numerical Example

In this section, we provide a numerical example of the proposed Scaled-VCG scheme. For power generators, $u_i(t)$ is the power generated at time $t$. The state equation is

$$x_i(t+1) = x_i(t) + u_i(t) + w_i(t),$$

where $w_i(t)$ is the noise. The utility function is defined as the negative cost:

$$F_i(u_i(t)) = r_i u_i^2(t) + v_i u_i(t),$$

where $r_i < 0$ and $v_i > 0$ are constants. For power consumers, we adopt the virtual battery model [41] to represent aggregate load, and let $x_i(t)$ and $u_i(t)$ denote the state of charge (SoC) and power consumption at time $t$, respectively. The state equation is

$$x_i(t+1) = a_i x_i(t) + b_i u_i(t) + \eta_i h_i(t) + w_i(t),$$

where $a_i$, $b_i$ and $\eta_i$ are constants, $h_i(t)$ represents ambient temperature, and $w_i(t)$ is the noise. The utility function is

$$F_i(x_i(t), u_i(t)) = q_i(x_i(t) - \bar{x}_i)^2,$$

where $q_i < 0$ is a constant, and $\bar{x}_i$ is the desired SoC.

Let $T = 4$. The parameters are drawn from the following uniform distributions: $r_i \sim U(-0.02, -0.01)$, $v_i \sim U(40, 50)$, $a_i \sim U(0.99, 1.01)$, $b_i \sim U(0.98, 1.02)$, $\eta_i \sim U(3.5, 4)$, $q_i \sim U(-1.01, -0.99)$. The noises $w_i(t)$ are drawn from the Gaussian distribution $\mathcal{N}(0, 0.2)$. For generators, $x_i(0) = 0$; for consumers, $x_i(0) \sim \mathcal{N}(75, 1)$ and $h_i = [75, 80, 90, 75]^T$, $\forall i$.

We first investigate the issue of incentive compatibility in a 4-agent system with agents 1 and 2 as generators, and agents 3 and 4 as loads.
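The agent models used in this numerical example can be written down in a few lines. The following is a minimal sketch of the two state equations, the two utility functions, and the parameter sampling; the function names, the fixed seed, the constant unit control input, and the reading of $0.2$ in $\mathcal{N}(0, 0.2)$ as a variance are all assumptions made for illustration, and no mechanism or payment computation is included.

```python
import math
import random


def generator_step(x, u, w):
    """Generator state equation: x(t+1) = x(t) + u(t) + w(t)."""
    return x + u + w


def consumer_step(x, u, w, a, b, eta, h):
    """Virtual battery state equation: x(t+1) = a x(t) + b u(t) + eta h(t) + w(t)."""
    return a * x + b * u + eta * h + w


def generator_utility(u, r, v):
    """Negative generation cost r u^2 + v u, with r < 0 and v > 0."""
    return r * u * u + v * u


def consumer_utility(x, q, x_bar):
    """Penalty q (x - x_bar)^2 for deviating from the desired SoC, with q < 0."""
    return q * (x - x_bar) ** 2


def sample_parameters(rng):
    """Draw one parameter set from the uniform ranges quoted in the text."""
    return {
        "r": rng.uniform(-0.02, -0.01),
        "v": rng.uniform(40.0, 50.0),
        "a": rng.uniform(0.99, 1.01),
        "b": rng.uniform(0.98, 1.02),
        "eta": rng.uniform(3.5, 4.0),
        "q": rng.uniform(-1.01, -0.99),
    }


def simulate_generator(x0, controls, noises):
    """Roll the generator state forward under given controls and noises."""
    xs = [x0]
    for u, w in zip(controls, noises):
        xs.append(generator_step(xs[-1], u, w))
    return xs


rng = random.Random(0)          # fixed seed, for reproducibility only
params = sample_parameters(rng)
noise_std = math.sqrt(0.2)      # assumption: 0.2 is the noise variance
noises = [rng.gauss(0.0, noise_std) for _ in range(4)]
trajectory = simulate_generator(0.0, [1.0] * 4, noises)  # T = 4 steps
```

A consumer trajectory can be rolled out the same way by calling `consumer_step` with the temperature profile $h_i$.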
We assume that agents 2, 3, and 4 always bid their true states while agent 1 may adopt different strategies when bidding its state. We compare the expected net utility for agent 1 when the following 3 strategies are adopted:
1) Always bids the state $\hat{x}_i(s,s) = w_i(s-1) - 0.1$.
2) Always bids the true state $\hat{x}_i(s,s) = w_i(s-1)$.
3) Always bids the state $\hat{x}_i(s,s) = w_i(s-1) + 0.1$.

The results are summarized in Table I. It can be seen that truth-telling of state results in a higher expected net utility than the other, non-truth-telling strategies.

TABLE I: Comparison of expected net utility when adopting different strategies
Strategy               1      2      3
Expected net utility   25.4   27.8   25.2

We next investigate the asymptotic performance of the SVCG mechanism. For each $N$, $c_N$ is chosen as the optimal solution to the MinMax problem as shown in Section V-C. Denote $d_N = \mathbb{E}\sum_{t=0}^{T-1}\big[\lambda^N(t)u_i^N(t) - p_i^N(t)\big]$. Results are shown in Fig. 1. It is seen that as $N$ increases, $c_N$ and $d_N$ converge to 1 and 0 respectively.

Fig. 1: Convergence of $c_N$ and $d_N$ (horizontal axis: $N$).

E. LQG systems with partially observed states

The above results can be extended in a straightforward manner to the partially observed case where a linear transformation of the state is observed under additive white Gaussian noise. This can be done by considering the hyperstate, the conditional distribution of the state, which evolves as a linear system driven by the innovation process. In this case, the state noise corresponds to the innovation process, which is effectively what the agents are being asked to bid.

VI. LINEAR NON-GAUSSIAN AGENTS WITH QUADRATIC COSTS

Consider now the case where the agents are linear and still have quadratic costs, but their state noises, while independent, are not Gaussian.

Definition 8. We say that a mechanism achieves “linear efficiency” if it attains the best cost that can be obtained through linear feedback.

The mechanism for the LQG case also achieves linear efficiency when the noises are zero mean and white, but non-Gaussian. This result rests on the fact that the mechanism achieves intertemporal decoupling.

Theorem 8. The SVCG mechanism can ensure incentive compatibility and linear efficiency.

Proof. The proof follows the proof of Theorem 5 for LQG, since the crucial idea there is to obtain uncorrelatedness as in Lemma 2, which follows from the linear control law.

In general, however, nonlinear strategies may achieve even lower cost, but the SVCG mechanism can only guarantee the optimal cost in the class of linear feedback laws.

VII. CONCLUDING REMARKS

It remains an open problem whether it is possible to construct a mechanism that ensures the dominance of dynamic truth-telling for agents comprised of general stochastic dynamic systems. We conjecture that it is not feasible in general. A careful construction of a sequence of layered VCG payments over time shows that for the special case of rational LQG agents with known system parameters, the intertemporal effect of current bids on future payoffs can be decoupled, and truth-telling of dynamic states is guaranteed. It achieves subgame perfect dominance of truth-telling strategies and social welfare optimality. A modification of the VCG payments, called Scaled-VCG, achieves Budget Balance and Individual Rationality for a range of scaling factors, under a certain Market Power Balance condition. This condition provides economic justification for Load Serving Entities or Load Aggregators that group small consumers as a means for achieving social welfare optimality. If the noises are not Gaussian, then the mechanism achieves optimal social welfare in the class of linear strategies.
In the asymptotic regime of an increasing population of agents, the Scaled-VCG payments converge to the Lagrange payments, the payments that the agents would make in the absence of strategic considerations. It is of interest to determine the viability of the ISO identifying a range of the scaling constant that assures budget balance and individual rationality. It is also of interest to design incentive mechanisms or identify conditions under which there is no strategic play on the scaling constants, and to extend the current layered mechanism to the case where the ISO does not know the system parameters.

ACKNOWLEDGMENT

The authors would like to thank Le Xie and Vijay Subramanian for their valuable comments and suggestions.

REFERENCES

[1] W. Vickrey, “Counterspeculation, Auctions, and Competitive Sealed Tenders,” The Journal of Finance, vol. 16, no. 1, pp. 8–37, 1961.
[2] T. Groves, “Incentives in Teams,” Econometrica, vol. 41, no. 4, pp. 617–631, 1973.
[3] J. R. Green and J.-J. Laffont, Incentives in Public Decision Making. Amsterdam: North-Holland, 1979.
[4] D. Bergemann and M. Said, “Dynamic Auctions: A Survey,” Cowles Foundation for Research in Economics, Yale University, Cowles Foundation Discussion Papers 1757R, 2010.
[5] D. Bergemann and J. Valimaki, “Dynamic Mechanism Design: An Introduction,” Cowles Foundation for Research in Economics, Yale University, Cowles Foundation Discussion Papers 2102, Aug. 2017.
[6] A. Kalbat, “Linear Quadratic Gaussian (LQG) Control of Wind Turbines,” in 2013 3rd International Conference on Electric Power and Energy Conversion Systems, Oct 2013, pp. 1–5.
[7] A. D. Wright, “Modern Control Design for Flexible Wind Turbines,” Jul. 2004.
[8] ISO New England, “Day-Ahead and Real-Time Energy Markets,” http://www.iso-ne.com/markets-operations/markets/da-rt-energy-markets.
[9] D. Bergemann and J. Valimaki, “Dynamic Marginal Contribution Mechanism,” Cowles Foundation for Research in Economics, Yale University, Cowles Foundation Discussion Papers 1616, Jul. 2007.
[10] S. Athey and I. Segal, “An Efficient Dynamic Mechanism,” Econometrica, vol. 81, no. 6, pp. 2463–2485, 2013.
[11] C. d’Aspremont and L.-A. Gerard-Varet, “Incentives and Incomplete Information,” Journal of Public Economics, vol. 11, no. 1, pp. 25–45, 1979.
[12] A. Pavan, I. Segal, and J. Toikka, “Dynamic Mechanism Design: Incentive Compatibility, Profit Maximization and Information Disclosure,” Evanston, Discussion Paper, Center for Mathematical Studies in Economics and Management Science 1501, 2009.
[13] J. A. Mirrlees, “An Exploration in the Theory of Optimum Income Taxation,” The Review of Economic Studies, vol. 38, no. 2, pp. 175–208, 1971.
[14] R. Cavallo, D. C. Parkes, and S. P. Singh, “Optimal Coordinated Planning Amongst Self-Interested Agents with Private State,” CoRR, vol. abs/1206.6820, 2012.
[15] A. Bapna and T. A. Weber, “Efficient Dynamic Allocation with Uncertain Valuations,” 2005.
[16] D. C. Parkes and S. P. Singh, “An MDP-Based Approach to Online Mechanism Design,” in Advances in Neural Information Processing Systems 16, S. Thrun, L. K. Saul, and B. Schölkopf, Eds. MIT Press, 2004, pp. 791–798.
[17] E. J. Friedman and D. C. Parkes, “Pricing WiFi at Starbucks: Issues in Online Mechanism Design,” in Proceedings of the 4th ACM Conference on Electronic Commerce, ser. EC ’03. New York, NY, USA: ACM, 2003, pp. 240–241.
[18] D. Besanko, “Multi-period Contracts Between Principal and Agent with Adverse Selection,” Economics Letters, vol. 17, no. 1, pp. 33–37, 1985.
[19] M. Battaglini and R. Lamba, “Optimal Dynamic Contracting,” Princeton University, Department of Economics, Econometric Research Program, Working Papers 1431, Oct. 2012.
[20] D. Bergemann and A. Pavan, “Introduction to Symposium on Dynamic Contracts and Mechanism Design,” Journal of Economic Theory, vol. 159, pp. 679–701, 2015, Symposium Issue on Dynamic Contracts and Mechanism Design.
[21] P. G. Sessa, N. Walton, and M. Kamgarpour, “Exploring Vickrey-Clarke-Groves Mechanism for Electricity Markets,” CoRR, vol. abs/1611.03044, 2016.
[22] Y. Okajima, T. Murao, K. Hirata, and K. Uchida, “Integration of Day-ahead Energy Market using VCG type Mechanism under Equality and Inequality Constraints,” in 2015 IEEE Conference on Control Applications (CCA), Sept 2015, pp. 187–194.
[23] Y. Xu and S. H. Low, “An Efficient and Incentive Compatible Mechanism for Wholesale Electricity Markets,” IEEE Transactions on Smart Grid, vol. 8, no. 1, pp. 128–138, Jan 2017.
[24] S. Bistarelli, R. Culmone, P. Giuliodori, and S. Mugnoz, “Mechanism Design Approach for Energy Efficiency,” CoRR, vol. abs/1608.07492, 2016.
[25] P. Samadi, R. Schober, and V. W. S. Wong, “Optimal Energy Consumption Scheduling using Mechanism Design for the Future Smart Grid,” in 2011 IEEE International Conference on Smart Grid Communications (SmartGridComm), Oct 2011, pp. 369–374.
[26] J. A. Taylor, A. Nayyar, D. S. Callaway, and K. Poolla, “Dynamic Pricing in Consolidated Ancillary Service Markets,” in 2013 European Control Conference (ECC), July 2013, pp. 3032–3037.
[27] D. C. Parkes, J. Kalagnanam, and M. Eso, “Achieving Budget-balance with Vickrey-based Payment Schemes in Exchanges,” in Proceedings of the 17th International Joint Conference on Artificial Intelligence - Volume 2, ser. IJCAI’01. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 2001, pp. 1161–1168.
[28] H. Moulin and S. Shenker, “Strategyproof Sharing of Submodular Costs: Budget Balance versus Efficiency,” Economic Theory, vol. 18, no. 3, pp. 511–533, Nov 2001.
[29] R. Cavallo, “Optimal Decision-Making With Minimal Waste: Strategyproof Redistribution of VCG Payments,” in Proc. of the 5th Int. Joint Conf. on Autonomous Agents and Multi Agent Systems (AAMAS’06), Hakodate, Japan, 2006, pp. 882–889.
[30] D. Thirumulanathan, H. Vinay, S. Bhashyam, and R. Sundaresan, “Almost Budget Balanced Mechanisms with Scalar Bids for Allocation of a Divisible Good,” European Journal of Operational Research, vol. 262, no. 3, pp. 1196–1207, 2017.
[31] J. Ma, J. Deng, L. Song, and Z. Han, “Incentive Mechanism for Demand Side Management in Smart Grid Using Auction,” IEEE Transactions on Smart Grid, vol. 5, no. 3, pp. 1379–1388, May 2014.
[32] O. Karaca and M. Kamgarpour, “Core-Selecting Mechanisms in Electricity Markets,” CoRR, vol. abs/1811.09646, 2018.
[33] T. Tanaka, N. Li, and K. Uchida, “On the Relationship between the VCG Mechanism and Market Clearing,” in 2018 Annual American Control Conference (ACC), June 2018, pp. 4597–4603.
[34] E. M. Azevedo and E. Budish, “Strategy-proofness in the Large,” National Bureau of Economic Research, Working Paper 23771, September 2017.
[35] R. Singh, P. R. Kumar, and L. Xie, “Decentralized Control via Dynamic Stochastic Prices: The Independent System Operator Problem,” IEEE Transactions on Automatic Control, vol. PP, no. 99, pp. 1–1, 2018.
[36] K. Ma and P. R. Kumar, “The Strategic LQG System: A Dynamic Stochastic VCG Framework for Optimal Coordination,” in 2018 IEEE 57th Annual Conference on Decision and Control (CDC), Dec 2018.
[37] H. Li and L. Tesfatsion, “ISO Net Surplus Collection and Allocation in Wholesale Power Markets under LMP,” IEEE Transactions on Power Systems, vol. 26, no. 2, pp. 627–641, 2011.
[38] G. B. Alderete, “Alternative Models to Analyze Market Power and Financial Transmission Rights in Electricity Markets,” Ph.D. dissertation, University of Waterloo, 2005.
[39] M. Renardy and R. Rogers, An Introduction to Partial Differential Equations. Springer-Verlag GmbH, 2004.
[40] P. R. Kumar and P. Varaiya, Stochastic Systems: Estimation, Identification, and Adaptive Control. Prentice Hall Englewood Cliffs, NJ, 1986, vol. 986.
[41] L. Zhao and W. Zhang, “A Geometric Approach to Virtual Battery Modeling of Thermostatically Controlled Loads,” in 2016 American Control Conference (ACC), July 2016, pp. 1452–1457.

Ke Ma received the B.E. degree in automation from Tsinghua University, Beijing, China, in 2012, and the Ph.D. degree in electrical and computer engineering from the Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX, USA in 2018. He is currently an electrical engineer at the Optimization and Control Group, Pacific Northwest National Laboratory (PNNL), Richland, WA, USA. His research interests include dynamic mechanism design and its application in electricity markets, and market-based (transactive) coordination and control of distributed energy resources (DERs).

P. R. Kumar B. Tech. (IIT Madras, ‘73), D.Sc. (Washington University, St. Louis, ‘77), was a faculty member at UMBC (1977-84) and Univ. of Illinois, Urbana-Champaign (1985-2011). He is currently at Texas A&M University. His current research is focused on cyberphysical systems, cybersecurity, privacy, wireless networks, renewable energy, power system, smart grid, autonomous vehicles, and unmanned air vehicle systems. He is a member of the US National Academy of Engineering, The World Academy of Sciences, and the Indian National Academy of Engineering. He was awarded a Doctor Honoris Causa by ETH, Zurich. He has received the IEEE Field Award for Control Systems, the Donald P. Eckman Award of the AACC, Fred W. Ellersick Prize of the IEEE Communications Society, the Outstanding Contribution Award of ACM SIGMOBILE, the Infocom Achievement Award, and the SIGMOBILE Test-of-Time Paper Award.
He is a Fellow of the IEEE and an ACM Fellow. He was Leader of the Guest Chair Professor Group on Wireless Communication and Networking at Tsinghua University, is a D. J. Gandhi Distinguished Visiting Professor at IIT Bombay, and an Honorary Professor at IIT Hyderabad. He was awarded the Distinguished Alumnus Award from IIT Madras, the Alumni Achievement Award from Washington Univ., and the Daniel Drucker Eminent Faculty Award from the College of Engineering at the Univ. of Illinois.