15
1 Decentralized Control via Dynamic Stochastic Prices: The Independent System Operator Problem Rahul Singh, Student Member, IEEE, P. R. Kumar, Fellow, IEEE, and Le Xie, Senior Member, IEEE Abstract—A smart grid connects several electricity con- sumers/producers, e.g., wind/solar/storage farms, fossil- fuel plants, industrial/commercial loads, or load-serving aggregators, all modeled as stochastic dynamical systems. In each time period, each consumes/supplies some electrical energy. Each such agent’s utility is the benefit accrued from its consumption or the negative of its generation cost. The social welfare, the sum of all these utilities, is the total benefit accrued from all consumption minus the total cost of generation. The Independent System Operator is charged with maximizing the social welfare subject to total generation equalling consumption in each time period, but without the agents revealing their system states, dynamic models, utility functions or uncertainties. It has to announce prices after interacting with agents via bid-price interactions. This paper examines the case where the agents respond in a compliant price-taking manner. It is shown that there is an iterative bid-price interaction, where agents respond to price announcements by complying to the requirement to announce their optimal responses according to their true stochastic model, or, in the case where the agents are LQG systems, according to determinis- tic versions of their true stochastic models, that leads to the same global maximum value of social welfare attainable if all agents had pooled their information. In the important LQG case, the bid-price iteration is dramatically simple, exchanging only real-valued vectors of future prices and consumptions/generations at each time step. Agents need not even know of the existence of other agents. DC Power Flow Equations can also be incorporated. The results may be of broader interest via-a-vis general equilibrium theory of economics for stochastic dynamic agents. Index Terms—Independent System Operator, Energy Mar- ket, General Equilibrium Theory, Renewable Energy, Power Systems, Social Welfare, Demand Response, Decentralized Stochastic Control. I. I NTRODUCTION I N the electricity grid, the power generated should equal the power consumed at all times, neglecting line losses. Unlike other commodities, electricity cannot be stored in the grid. This task of balancing them, and in the most R. Singh is at 32-D716, LIDS, MIT, Cambridge, MA 02139; P. R. Kumar and Le Xie are at Dept. of ECE, Texas A&M Univ., 3259 TAMU, College Station, Tx 77843-3259). [email protected], prk,[email protected]. Preferred address for correspondence: P. R. Kumar, Dept. of ECE, Texas A&M Univ., 3259 TAMU, College Station, Tx 77843-3259. This material is based upon work partially supported by NSF under Contract Nos. ECCS-1546682, NSF Science & Technology Center Grant CCF- 0939370, NSF ECCS-1150944 and DGE-1303378. economical way, is entrusted to the Independent System Operator (ISO) in deregulated electricity markets [1]. Energy is increasingly obtained from uncertain, dy- namically varying renewables such as wind/solar. In the future, besides controlling traditional sources such as coal plants, demand will also be continually controlled to some extent to balance generation and consumption. Both gen- erators and loads are dynamical systems that may posses thermal or other inertias. Both are also subject to uncer- tainties, such as wind, cloud cover, ambient temperature, or customer traffic. The problem we address is how the ISO can perform its task in the emerging scenario where loads and generators are stochastic dynamic systems. The primary mechanism for coordinating entities is through time-varying stochastic prices. However, entities will need to know the probability distribution of future prices to plan their optimal consumption/generation over time. Future prices in turn depend on the dynamics of all agents and their future uncertainties. However, an entity is generally unaware of the dynamics or uncertainties of other agents or how they will respond, and may not wish to explicitly reveals the details of its dynamic model or utility function to the ISO. The problem we address is that of choosing generations/consumptions to achieve the same social welfare optimum that could have been attained with full state and model information of all agents, but through only a constrained price-bid frame- work between agents and the ISO. The role played by price in coordinating agents was initiated by Walras [2] in general equilibrium theory. Arrow and Debreu [3]–[5] showed that a correct choice of prices for commodities ensures that for quasi-concave utility functions, a system of individual entities, where each optimizes its own response given prices, results in a systemwide Pareto optimal Subsequently, Arrow, Block and Hurwicz [6] showed that the prices can be discovered by Walrasian tatonnement [2] under appropriate condi- tions such as gross substitutes. Their theory extends to allow for uncertainty by simply considering each good under a different random state of nature as a different good, as shown by Arrow [7]. Subsequently, Radner [8] has shown the existence of prices corresponding to an equilibrium even if different agents have different ran- dom observations of the uncertainties. However, if agents communicate, the problem is more complicated. Radner emphasizes tractability, since schemes may require un-

Decentralized Control via Dynamic Stochastic Prices: The ...cesg.tamu.edu/.../2014/09/Decentralized-Control-via-Dynamic-Stocha… · generation equalling consumption in each time

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Decentralized Control via Dynamic Stochastic Prices: The ...cesg.tamu.edu/.../2014/09/Decentralized-Control-via-Dynamic-Stocha… · generation equalling consumption in each time

1

Decentralized Control via Dynamic StochasticPrices: The Independent System Operator

ProblemRahul Singh, Student Member, IEEE, P. R. Kumar, Fellow, IEEE, and Le Xie, Senior Member, IEEE

Abstract—A smart grid connects several electricity con-sumers/producers, e.g., wind/solar/storage farms, fossil-fuel plants, industrial/commercial loads, or load-servingaggregators, all modeled as stochastic dynamical systems.In each time period, each consumes/supplies some electricalenergy. Each such agent’s utility is the benefit accrued fromits consumption or the negative of its generation cost.The social welfare, the sum of all these utilities, is thetotal benefit accrued from all consumption minus the totalcost of generation. The Independent System Operator ischarged with maximizing the social welfare subject to totalgeneration equalling consumption in each time period, butwithout the agents revealing their system states, dynamicmodels, utility functions or uncertainties. It has to announceprices after interacting with agents via bid-price interactions.This paper examines the case where the agents respond ina compliant price-taking manner.

It is shown that there is an iterative bid-price interaction,where agents respond to price announcements by complyingto the requirement to announce their optimal responsesaccording to their true stochastic model, or, in the casewhere the agents are LQG systems, according to determinis-tic versions of their true stochastic models, that leads to thesame global maximum value of social welfare attainable ifall agents had pooled their information. In the importantLQG case, the bid-price iteration is dramatically simple,exchanging only real-valued vectors of future prices andconsumptions/generations at each time step. Agents neednot even know of the existence of other agents. DC PowerFlow Equations can also be incorporated.

The results may be of broader interest via-a-vis generalequilibrium theory of economics for stochastic dynamicagents.

Index Terms—Independent System Operator, Energy Mar-ket, General Equilibrium Theory, Renewable Energy, PowerSystems, Social Welfare, Demand Response, DecentralizedStochastic Control.

I. INTRODUCTION

IN the electricity grid, the power generated should equalthe power consumed at all times, neglecting line losses.

Unlike other commodities, electricity cannot be stored inthe grid. This task of balancing them, and in the most

R. Singh is at 32-D716, LIDS, MIT, Cambridge, MA 02139; P. R.Kumar and Le Xie are at Dept. of ECE, Texas A&M Univ., 3259TAMU, College Station, Tx 77843-3259). [email protected],prk,[email protected].

Preferred address for correspondence: P. R. Kumar, Dept. of ECE, TexasA&M Univ., 3259 TAMU, College Station, Tx 77843-3259.

This material is based upon work partially supported by NSF underContract Nos. ECCS-1546682, NSF Science & Technology Center GrantCCF- 0939370, NSF ECCS-1150944 and DGE-1303378.

economical way, is entrusted to the Independent SystemOperator (ISO) in deregulated electricity markets [1].

Energy is increasingly obtained from uncertain, dy-namically varying renewables such as wind/solar. In thefuture, besides controlling traditional sources such as coalplants, demand will also be continually controlled to someextent to balance generation and consumption. Both gen-erators and loads are dynamical systems that may possesthermal or other inertias. Both are also subject to uncer-tainties, such as wind, cloud cover, ambient temperature,or customer traffic. The problem we address is how theISO can perform its task in the emerging scenario whereloads and generators are stochastic dynamic systems.

The primary mechanism for coordinating entities isthrough time-varying stochastic prices. However, entitieswill need to know the probability distribution of futureprices to plan their optimal consumption/generation overtime. Future prices in turn depend on the dynamics of allagents and their future uncertainties. However, an entityis generally unaware of the dynamics or uncertaintiesof other agents or how they will respond, and may notwish to explicitly reveals the details of its dynamic modelor utility function to the ISO. The problem we addressis that of choosing generations/consumptions to achievethe same social welfare optimum that could have beenattained with full state and model information of allagents, but through only a constrained price-bid frame-work between agents and the ISO.

The role played by price in coordinating agents wasinitiated by Walras [2] in general equilibrium theory.Arrow and Debreu [3]–[5] showed that a correct choiceof prices for commodities ensures that for quasi-concaveutility functions, a system of individual entities, whereeach optimizes its own response given prices, results ina systemwide Pareto optimal Subsequently, Arrow, Blockand Hurwicz [6] showed that the prices can be discoveredby Walrasian tatonnement [2] under appropriate condi-tions such as gross substitutes. Their theory extends toallow for uncertainty by simply considering each goodunder a different random state of nature as a differentgood, as shown by Arrow [7]. Subsequently, Radner [8]has shown the existence of prices corresponding to anequilibrium even if different agents have different ran-dom observations of the uncertainties. However, if agentscommunicate, the problem is more complicated. Radneremphasizes tractability, since schemes may require un-

Page 2: Decentralized Control via Dynamic Stochastic Prices: The ...cesg.tamu.edu/.../2014/09/Decentralized-Control-via-Dynamic-Stocha… · generation equalling consumption in each time

2

bounded computation.The idea of employing prices to perform this task in

the electrical power domain was introduced by Cara-manis, Bohn and Schweppe [9] and Bohn, Caramanisand Schweppe [10]. Hogan [11] further elaborated thedetailed implementation of a locational marginal price-based electricity market operation. Fundamentally basedon a static dispatch with no uncertainty, today’s electricitymarket design and corresponding price signal are simplynot designed for achieving social welfare optimality fordynamic and uncertain generators and loads. The currentmarket mechanism requires participants to make decou-pled bids for separate time intervals. In the day-aheadmarket, a generator has to bid a price-generation curvefor the 8am-9am slot, another separate curve for the 9am-10am slot, and so on, for each hour of the next day.However, generators have ramping constraints, such as50 MW/hour, which give rise to inter-temporal constraintsbetween different time slots. These are typically handledby ad hoc out-of-market (OOM) merit order measures[12].The bidding procedure fundamentally does not allowa generator to bid a time function even though that iscritical to its operation. Even more challengingly, in thereal-time market, the bidding process does not allow aparticipant to optimize with respect to stochastic processmodels of uncertain resources. There have been manystudies on the potential problems associated with thismarket design, such as unnecessarily price volatility [13],network externalities [14], and lack of investment signals[15]. While the deterministic and static approach to ap-proximating the underlying dynamic and stochastic powersystem may be practically appealing without much loss ofoptimality in conventional systems, it is not so for demandresponse and intermittent renewables [16].

The ultimate holy grail is to to determine a biddingscheme for the strategic case where agents may be un-truthful. This paper takes a first step by examining inthe simpler case, where agents are complaint and price-takers, whether we can (i) determine how to computewhat is the “price” in real-time in the case of dynamicstochastic agents as the system evolves; (ii) determinehow compliant agents (i.e., price-takers) and ISO shouldinteract; (iii) determine whether this could be achievedin a computationally tractable manner; and (iv) if so, forwhat classes of models. The next phase to be addressed ina future paper is to determine whether and how this price-taking solution can be incentive-compatibly approximatedin the case where the agents are strategic and may lie.

The paper is organized as follows. Section II presentsthe main results. Section III formulates the ISO prob-lem. Section IV presents a preliminary modification toa straightforward bid-price iteration to satisfy energybalance. The case of general stochastic dynamic agentsis addressed in Section V. The case where all agents areLQG systems is addressed in Section VI. The LQG casewith DC power flow constraints or other linear constraintsis treated in Section VII. Section VIII presents the resultsof illustrative simulations, concluding in Section IX.

II. THE MAIN RESULTS

(i) For a system where different agents are subject to dif-ferent private uncertainties, a bid-price iteration schemebetween the agents and the ISO, repeated at every timeinstant, where the ISO employs an averaging procedure todetermine allocations, does achieve the global maximumof the social welfare that would have been attainablewere all agents pooling their information (Theorem 2).Such bid-price interaction between the agents and theISO, or some form of “two-way exploration” by the ISO,appears to be needed to be able to assign optimal con-sumptions/generations, since even exchanging supply anddemand functions and knowledge of optimal prices, with-out any two-way interaction, is inadequate even in a staticdeterministic system to determine optimal allocation ofconsumptions/generations or even assure energy balance,when utility functions are concave but not necessarilystrictly concave. (Example 1).

This result is related to the work of Radner [8], whichextends the work of Arrow [7] and Debreu [17] to thecase of general information structures. Under conditionson the preference functions of agents similar to thoseinvestigated by Arrow-Debreu, he shows that there existequilibrium prices under which all agents can choose op-timal actions, with the resulting optimal actions attainingbalance between supply and demand. Radner does notspecifically address how these equilibrium prices are tobe obtained. In the second portion of Radner’s work in[8], he formulates the problem where agents can sendmessages to each other during the evolution of the system.In our work, we aim not at the particular equilibriumthat would result from the private information structure,but at the social welfare optimal solution that could beattained if all agents had pooled their information andmade it available to all. To achieve this, we specificallydelve into what sorts of “messages” (using the languageof Radner) need to be exchanged between agents in orderto achieve this particular social welfare optimal solution.Of specific interest is how this can be done throughthe restrictive medium of an ISO, and not by arbitrarybilateral exchanges of information between agents, withthe interaction between the ISO and the agents beingprice announcements and consumption/generation bids.This problem involves the problem of recovering primaloptimal solution from dual optimal prices. We exploitrecent results from [18] that employ averaging of iteratesto obtain a primal solution that is feasible, in this case,satisfying energy balance, and optimal. This requires aninteractive scheme of exploratory price announcementsby the ISO, and the agents responding to the ISO in aprice-taking fashion. We thereby provide a solution thatinvolves a bid-price interaction between agents and theISO at each time step and attains the goal of maximizingsocial welfare. We note that we do not make a “non-satiation” assumption, as in Radner.(ii) In the case where all agents are linear quadraticGaussian (LQG) systems with private uncertainties, and

Page 3: Decentralized Control via Dynamic Stochastic Prices: The ...cesg.tamu.edu/.../2014/09/Decentralized-Control-via-Dynamic-Stocha… · generation equalling consumption in each time

3

are price-takers compliant to the requirement to bid withrespect to a deterministic model of their systems, there isa simple bid-price iteration scheme, where each agent’sbid is only a simple function of time, which attainsthe maximal social welfare attainable when all agentspool their information (Theorem 3). This is a substantialsimplification since agents don’t have to bid for all futureuncertainty realizations. Also, it does not require any sub-sequent averaging of the bids by the ISO. We should notethat this problem falls outside work on team theory sinceunlike in team theory the agents do not know the modelsof the dynamics of other agents or their utility functions.Also, almost sure constraints, such as balancing, are nottypically considered in team problems. In his work exam-ining general information structures, Radner emphasizesthat in the case of general information structures thesolutions may require unbounded computation. The LQGproblem may therefore be of interest as a special case ofsome application interest where the solution is tractable.

The social welfare optimality of the simple bid-priceiteration scheme continues to hold if there are multiplelinear equalities besides energy balance (Theorem 4). Inparticular it holds for the case where the power flowshave to be transmitted over an electrical network modeledby the DC Power Flow equations that approximate thenonlinear AC Power Flow equations (Theorem 4). Giventhe widespread usage of the LQG modeling framework,this potentially provides an implementable scheme. Thisis illustrated through simulation examples in Section VIII.

III. THE SYSTEM MODEL AND THE ISO PROBLEM

We consider a smart grid consisting of M agents, eachof which may act as a producer, consumer or both, i.e., aprosumer, evolving over a time interval t = 0, 1, . . . , T−1.Agents are modeled as stochastic dynamical systems.The state xi(t) of agent i at t is known to it, andevolves as xi(t+ 1) = fi(xi(t), ui(t), wi(t), wc(t), t), wherefi describes the dynamics of the agent i. The initialcondition xi(0) can be random. The special case ofparticular interest is where ui(t) is a scalar that denotesthe amount of electricity consumed (negative if supplied)from the grid by agent i at time t, but we allow it tobe a vector of several commodities being produced andconsumed. wc(ω, t), 0 ≤ t ≤ T − 1 (defined on someprobability space (Ω,F ,P)) is a “common uncertainty”(e.g., temperature of a city) that affects and is observedby all the agents, while wi(ω, t), 0 ≤ t ≤ T − 1 isa “private” uncertainty that specifically affects onlyagent i, e.g., wind at a particular wind farm, and isknown causally only to agent i. As we will see, thisdecomposition of uncertainties clarifies the task ofconstructing interaction schemes between the agents andISO. We suppose without loss of generality that wc(·)and wi(·) for 1 ≤ i ≤ M are all i.i.d. processes witheach random variable uniformly distributed on somealphabet W, and that are also independent of each other.The function fi(·, t) allows us to model dependent and

time-varying stochastic process uncertainties as beingformed out of these primitive i.i.d. Unif(W) uncertainties.Remark 1: The time horizon T could be 96, correspondingto one day of 15 minute slots in the real-time market.However, there is considerable flexibility to incorporateother scenarios. For example, one can model the risk-limited dispatch of [19] where purchases of forwardenergy are made for blocks of time, with blocks gettingshorter as operations approach real time. The uniformdistribution over W is not essential; any distribution isfine and it could even be time varying, as long as it isknown to all agents; wi(t) could be the “innovationsprocess” of the uncertainty process of agent i.Consumption/generation constraints. Letwti := (wi(0), wi(1), . . . , wi(t) denote the past of wi,and similarly define wtc. Agent i’s choice has to satisfythe local capacity constraints Fi(w

ti , w

tc, t)ui(t) ≤

gi(wti , w

tc, t) +

∑t−1s=0 Ji(w

ti , w

tc, s, t)ui(s) and

hi(wti , w

tc, t, ui(t)) ≤ 0 for each t, almost surely. The

affineness of the former constraints in past ui’s can beused to model, for example, constraints on the rate oframping up of coal plant output, the dependence on tallows for seasonality, and the dependence on wi, wcallows random effects on capacity.The one-step cost function of an agent i, 1 ≤ i ≤ M ,denoted ci(xi, ui, t) (or its negative, the one-step utilityfunction −ci(xi, ui, t)), is a function of its state andaction, in period t. For producers, this could be the costof labor or coal. For consumers, this could representthe cost incurred due to the high temperature of ahouse/business facility, or due to a delay in performinga task resulting from inadequate purchase of electricity,or the negative of some benefit of the electricity usage.Pigovian taxes may be imposed on negative externalitiessuch as pollution [20], e.g., a carbon tax, with one-steptax ei(ui, t). By allowing ei(ui, t) to be positive/negative,we can also allow for cross-subsidies to be incorporated.Energy balance should be maintained in each period,i.e.,

∑Mi=1 ui(t) = 0 for all t = 0, 1, . . . , T − 1 almost

surely. We will allow even more general linear vectorconstraints:

∑Mi=1Ki(t)ui(t) = d(t) almost surely, which

can be used to capture other constraints, e.g., DC powerflow constraints as is done in Section VII, or budgetbalance under linear cross-subsidies. As earlier, theconstraints are to hold almost surely.Remark 2: Beyond balancing generation andconsumption, the ISO also needs to model the electricaltransmission network governed by algebraic equationsbased on Kirchoff’s laws that have to be satisfied bythe voltage and current magnitudes and phase angles.They impose constraints on ui(t), 1 ≤ i ≤ M. Theycan be modeled through additional linear constraints ifone employs an approximation to the AC Power Flowequations called DC Power Flow equations. For linearlevies, ei(ui(t), t) := ei(t)ui(t), ISO budget balance,∑Mi=1 ei(t)ui(t) = 0, can be incorporated as an additional

linear constraint, if desired.

Page 4: Decentralized Control via Dynamic Stochastic Prices: The ...cesg.tamu.edu/.../2014/09/Decentralized-Control-via-Dynamic-Stocha… · generation equalling consumption in each time

4

Knowledge available to the agents and the ISO. Inthe general case, we assume that all agents and the ISOknow the alphabet W. Each agent also knows its ownone-step cost function and its own constraints. In thespecial case of LQG systems in Section VI we will onlyassume that each agent only knows its own LQG system.Remark 3: These are severe constraints on the informationavailable to the ISO. It does not know the states, dynamicmodels, or utility functions of individual agents. Agentsmay be averse to disclosing information for competitivereasons or to ensure privacy.1 Moreover, even if allagents were willing to share all their information withthe ISO, it would be such an intractably large amountof information, amounting to a complete state of theworld, that the ISO would not be able to handle it withacceptable complexity and delay anyway.The Independent System Operator (ISO) solicitselectricity purchase/sale bids from the agents in eachtime slot t = 0, 1, . . . , T − 1, and announces prices.Our model allows for agents and ISO to iterate on thebids. Once the price iterations have converged, the ISOdeclares the market clearing prices, and the electricalenergy to be consumed/generated by the agents.Bidding schemes allow the ISO and agents to reacha solution for prices and generations/consumptions.Depending on the assumptions made about the systemmodel, there will be different bidding schemes. Anexample is the following. Consider time s. The ISOannounces a price sequence for current and future timess ≤ t ≤ T −1, or future events, to all agents. Agent i bidsthe amount of electricity it is willing to purchase/generateat the current and future times s ≤ t ≤ T −1, or at futureevents, at the prices indicated by the ISO. After collectingthe bids, the ISO updates the prices. An iteration of priceupdates followed by bid updates continues till the pricesand the bids converge, and then the ISO announces theallocations of generations/consumptions to agents forthe current period s. This entire process can be repeatedin each discrete time slot s in real-time.Total system operating cost, the negative ofthe social welfare, is the sum of the expectedvalue of the finite horizon total of the one-stepcosts incurred by all the agents plus any taxes,E∑M+1i=1

∑T−1t=0 [ci (xi(t), ui(t), t) + ei (ui(t), t)]. It

is the total electricity generation cost plus anytaxes assessed minus the utility provided to theconsumers. The expectation above is taken withrespect to the combined uncertainty or “noise”process w(t) := (wc(t), w1(t), w2(t), . . . , wM (t)) fort = 0, 1, . . . , T − 1, consisting of all the privateuncertainties and the common uncertainties, as wellas the random initial conditions of all the M agents.Remark 4: The utility of a load is the “benefit” that theload accrues from the consumed power. The utility ofa generator is the negative of its cost of generation.

1Interestingly, privacy is nowhere mentioned in the seminal paper [9]that introduced spot prices, indicative of how new issues arise.

The total of all agents’ utilities, called social welfare, istherefore the benefit of the power consumed minus thecost of generating it.Goal of social welfare maximization: Let Ft be theσ-algebra generated by all the noises upto time t, privateand common, as well as all initial conditions. Thisrepresents the complete information available to allagents. If u(t) is allowed to be adapted to Ft, we callit the full state information case. Now we come to thestringent goal of this paper. We would like to determinebidding schemes that attain the same maximum value ofthe social welfare as could be attained in the full stateinformation case.

The resulting ISO Problem is:

minET−1∑t=0

M∑i=1

[ci (xi(t), ui(t), t) + ei (ui(t), t)] (1)

such that xi(t+ 1) = fi(xi(t), ui(t), wi(t), wc(t), t) a.s.;

with capacity constraints hi(wti , wtc, t, ui(t)) ≤ 0 a.s.,

(2)

Fi(wti , w

tc, t)ui(t) ≤ gi(wti , wtc, t)

+

t−1∑s=0

Ji(wti , w

tc, s, t)ui(s) a.s., ; (3)

M∑i=1

Ki(t)ui(t) = d(t) a.s. for 0 ≤ t ≤ T − 1. (4)

The central issue is the following: How should the ISOdetermine pricing and allocations to dynamic stochasticagents so that the overall system is as optimal as it couldbe in the full state information case, even though agents donot know each other’s states, constraints, dynamic models orcost functions, and neither does the ISO? Due to the lackof knowledge of other agents’ dynamic models or costfunctions, this problem falls outside of usual stochasticcontrol. Though written as an optimization problem in(1-21), it is not a standard one since the agents do evenknow the quantities involved in the optimization problem.Remark 5: A more general formulation of the problemwould take into consideration the power flows in the elec-trical network connecting the nodes. These power flowsare determined by the AC Power Flow Equations [21],an approximation of which leads to the so-called DCPower Flow equations which are linear equations thatare typically used in today’s market clearing models [21].These power flows can be incorporated by adding a setof linear constraints as in (21) at each node. The resultsof the paper extend to this situation. For brevity we onlyillustrate how this generalization proceeds in the specialcase of LQG systems in Section VII.Remark 6: There have been many efforts since the dereg-ulation of the electricity sector on a market-based frame-work to clear the system. Today’s locational marginalprice-based nodal market design is based on seminal workin [10], [11]. This has been followed up by a large bodyof literature focusing on designing a transmission pricing

Page 5: Decentralized Control via Dynamic Stochastic Prices: The ...cesg.tamu.edu/.../2014/09/Decentralized-Control-via-Dynamic-Stocha… · generation equalling consumption in each time

5

mechanism in support of an efficient market [22] [23].The naive belief that deregulation of electricity industrywould simply work was critically re-assessed following theEnron crisis and lack of long-term investment incentives[24] [25]. There has been pioneering work on gametheoretic approaches to modeling the market power issuesin the electricity market [26] [27]. With increasing pen-etration of stochastic resources, there have been effortsat designing a market bidding mechanism that achievesthe social welfare optimum. Ilic et al. [28] have pro-posed a two-layered approach that internalizes individualconstraints of market participants while allowing the ISOto manage the spatial complexity. References [29], [30]contain some heuristic approaches. Reference [31] appliesprogressive hedging to deal with uncertainties on theproduction side; however the solution is centralized, anddoes not provide any theoretical guarantees. Reference[19] studies how the ISO should dispatch, i.e., purchaseenergy and call options in different markets, under fore-cast errors about future loads and renewable generation,when future decisions can mitigate current errors.

IV. A PRELIMINARY ISSUE IN PRICE-BASED COORDINATION

Even if prices are correctly determined, it turns out thatagents bidding in a price-taking manner in response to thecorrect prices need not bid solutions that lead to energybalance. This necessitates modifications to the biddingeither by the agents or by the ISO. We begin by illustratingthe difficulty through an example, and then present asolution.

Consider the simple situation where all generators andconsumers are static and deterministic:

minu1,...,uM

M∑i=1

[ci(ui) + ei(ui)], (5)

subject to: Fiui ≤ gi, hi(ui) ≤ 0 for 1 ≤ i ≤M, (6)

andM∑i=1

Kiui = d. (7)

Assumption 1:(i) ci(·), ei(·), hi(·) are convex, ui : Fiui ≤ gi, hi(ui) ≤ 0is compact, and (5,6,7) has an optimal solution.(ii) Slater’s Condition: There exists a feasible ui satisfyinghi(ui) < 0 in RelInt(Dom(ci)) ∩ RelInt(Dom(ei)).

Dualizing only the constraint (7), and denoting u :=(u1, u2, . . . , uM ), yields, respectively, the Lagrangian, Dualfunction, and optimal reward of the Dual Problem:

L (u, λ) :=

M∑i=1

[ci(ui) + ei(ui) + λTKiui]− λT d,

D(λ) := minu:Fiui≤gi,hi(ui)≤0 ∀i

L (u, λ) ,

J? := maxλ

D(λ) = D(λ?). (8)

From (ii), J? is also the optimal cost of the Primal (5).

Since D(λ) can be decomposed agent-by-agent as

D(λ) =

M∑i=1

minui: s.t. (6)

[ci(ui) + ei(ui) + λTKiui]− λT d,

the ISO can conceivably simply announce the “optimalprice” λ? per unit of power as that which attains themax in (8), along with the additional levy ei(ui) onagent i. Each agent i can then respond with either itsgeneration −u∗i or consumption u∗i that minimizes its“net” disutility ci(ui)+ei(ui)+λ?ui over (6). The ISO canfinally announce the generation/consumption allocationsto the agents.

Since agents do not disclose their cost functions, thereneeds to be a price discovery process, as in a Wal-rasian auction [32]. The ISO’s price needs to be re-duced/increased according to whether the agents’ re-sponse results in excess total generation/consumption. Weconsider the following iterative bid-price process:

λk+1 = λk +1

k[

M∑i=1

Kiuki − d], (9)

uk+1i = argmin

ui : s.t. (6)[ci(ui) + ei(ui) + (λk+1)TKiui]. (10)

This iteration of prices2 and bids is a subgradient algo-rithm that converges to an optimal price for the Dualunder Assumption 1 [33].

However, the recovery of optimal genera-tions/consumptions from optimal price is problematic.

Example 1 (Counterexample to generation/consumptionrecovery from optimal price): Consider one generator andone load. The generator’s cost of producing −u1 units ofenergy is − 2

5u1, with u1 restricted to [−1, 0], and it isassessed a Pigovian tax − 1

10u1. The load’s utility fromconsuming u2 units of energy is log(1 + u2) with u2restricted to [0, 2]. Energy should be balanced. The socialwelfare problem is:

Min − 2

5u1 −

1

10u1 − log(1 + u2)

Subject to: − 1 ≤ u1 ≤ 0, 0 ≤ u2 ≤ 2, u1 + u2 = 0.

The optimal solution is (u?1, u?2) = (−1, 1).

The Dual function of price λ is

D(λ) = min1≤u1≤0

[−1

2u1 + λu1] + [− log(1 + u2)] + λu2].

The minimizers and minimum, (−u1(λ), u2(λ), D(λ)), are

=

(0,Min 1λ − 1, 2, 1− λ+ log λ) if λ < 1

2 ,

(any point in [0, 1], 1λ − 1, 1− λ+ log λ) if λ = 1

2 ,

(1, 1λ − 1, 12 + log λ) if 1

2 < λ ≤ 1,

(1, 0,−λ+ 12 ) if 1 < λ,

The optimal solution of the Dual is λ? = 12 .

However, when the price λ? = 12 is announced by the

ISO, the generator can bid −u1 = 0 since any point in

2The gain 1k

can be replaced by αkδ

for 12< δ ≤ 1 with α > 0 in

Sections IV and V.

Page 6: Decentralized Control via Dynamic Stochastic Prices: The ...cesg.tamu.edu/.../2014/09/Decentralized-Control-via-Dynamic-Stocha… · generation equalling consumption in each time

6

[0,1] is optimal. The load’s bid is u2 = 1, and there willnot be balance between generation and consumption. Therefore one cannot determine the optimal bids fromagents’ responses to the optimal prices. However, they canbe obtained from the iterations of the bidding process bytaking weighted averages of previous bids [18].

Theorem 1 (Determining optimal bids by generatorsand loads [18]): Consider the bid-price iteration scheme(9,10) for the problem (5,6,7) under Assumption 1.(i) Then λk → λ?, the optimal price.(ii) Let θ ≥ 0. Suppose the ISO recursively averages theobtained bids as follows:

uki =

∑k−1s=1 s

θ∑ks=1 s

θuk−1i +

kθ∑ks=1 s

θuki ; u0i = u0i . (11)

Then uki → u?i which is optimal for (5). A larger θ weights more recent values of the iterates forui more heavily, while θ = 0 takes a plain average.

Example 2 (Continued): Choosing θ = 2, one obtains:

λk : 0, 0, 1, 0.6667, 0.5416, . . .→ 1

2,

uk :

(−1

0

),

(02

),

(−0.94120.1176

),

(−0.9898

0.4133

),(

0− 0.99720.7263

),

(−0.99900.9526

), . . .→

(−1

1

).

This example illustrates the important point that knowl-edge of the optimal price alone is not sufficient to de-termine optimal generations/consumptions of agents. Infact the information gathered during the very process ofiterative bidding is itself important. The journey is asimportant as the destination. The convergence rate ofaveraging methods is a topic of active research, cf. [18].

Remark 7: This difficulty in price-based coordinationin the absence of strong convexity is also addressed inthe work of Culioli and Cohen [34]. They propose anauxiliary problem principle that appends a differentiablestrongly convex function, an approach which subsumesmany variants. Employing a quadratic auxiliary function,it yields a variant that requires each agent to add anadditional term (ui − uki )2 to the criterion minimized in(10) at each iterate. It therefore requires asking the agentsto bid according to some artificially specified cost. Incontrast, our approach allows them to make their naturalbids for generation/consumption based on the price an-nouncement they have heard, but shifts the modificationinstead to the ISO side which is required at the endto simply allocate weighted averages of previous bids.Importantly, this however entails no loss for the agentssince the weighted averages are guaranteed to be optimalresponses for them at the converged price.

V. STOCHASTIC DYNAMIC AGENTS: BID-PRICE ITERATIONS

We now turn to the general problem of interestwhere the agents are stochastic dynamic systems. De-note the combined state of the system by x(t) :=

N(3)=0 N(3)=1 N(3)=0 N(3)=1 N(3)=0 N(3)=1 N(3)=0 N(3)=1

N(2)=0 N(2)=1 N(2)=0 N(2)=1

N(1)=0 N(1)=1

Initial State x(0)

v’ v’’0.80.2

Fig. 1: A tree visualization of i.i.d. uncertainty for a systemevolving over three bid times, where the uncertaintyvalues are binary, either 0 or 1. Each sequence of noiserealizations w(0, ω), w(1, ω), . . . , w(s−1, ω) correspondsto a specific node v at the level s. Note that the red andblue subtrees with roots at nodes v′ and v′′ at the samelevel are identical due to the i.i.d. nature of the noise.

(x1(t), x2(t), . . . , xM (t)), and the combined actions byu(t) := (u1(t), u2(t), . . . , uM (t)).

A tree visualization of the system randomness, as inFig. 1, is helpful. Suppose that w(t) assumes only finitelymany values. We can then construct an uncertainty tree ofdepth T , with the root node corresponding to the initialsystem state at level 0, and each sequence of transpirednoises w(0, ω), w(1, ω), . . . , w(s−1, ω) corresponding toa specific node v at the level s.

Due to the i.i.d. nature of the noises, all the subtreeswith roots at nodes at level s and comprised of nodesbelow it, are identical. This is shown in Fig. 1 where thesubtrees of depth T −s rooted at the nodes v′ and v′′ thatare at the same level are identical.

At each time s, a sequence of iterative tentative priceannouncements by the ISO for each node at or belowthe current node at level s, followed by tentative bidsby all agents for such nodes responding optimally tothe price announcement, takes place, until they con-verge. At each iteration, the ISO revises the tentativeprice announcement to drive the “excess consumption”at each node towards zero, and agents respond optimallyaccording to their own cost-to-go function. This iterationof tentative prices and tentative bids continues till theprices converge. At that point the ISO announces andagents consume/generate the weighted average amountthey bid for the particular node occupied at time s. Thesystem then moves forward to time s + 1, arriving at arandom node at level s + 1 according to w(s), and theentire process is repeated. This is in the same fashion asModel Predictive Control.

This process is a dynamic modification of Arrow’s [3]approach of treating each “good” available at a certaintime and place as a separate good. Since agents do notknow each other’s dynamics or states or actions thereis the added critical proviso of bidding for future “time-places” by each agent, with only the current price beingactually implemented a la Model Predictive Control.Bid-Price IterationAt each time s, the agents and the ISO engage in a Bid-Price Iteration as follows:k-th Bid Update at time s: Suppose that the ISO hasdeclared a price stochastic process λks at time s thatassociates a price with each node downstream of the

Page 7: Decentralized Control via Dynamic Stochastic Prices: The ...cesg.tamu.edu/.../2014/09/Decentralized-Control-via-Dynamic-Stocha… · generation equalling consumption in each time

7

current node at level (s+1) that the system is at, where kis an index that we will use for iteration. Since neither theISO nor the agents know exactly which node they are atdue to the existence of private noises, they take advantageof the fact that all subtrees of nodes rooted at level s+ 1are identical, and simply perform these announcementsfor a tree of depth T − s, with the understanding thatthese bids apply to the subtree rooted at wherever thecurrent node is among the nodes at level s + 1. In theBid Update, each agent i, in response, chooses its bidstochastic process (uki,s(s), u

ki,s(s + 1), . . . , uki,s(T − 1)) in

response to the price stochastic process λks by solving thefollowing problem, dubbed Agent i’s Problem,

minui s.t. (2,3)

E[

T−1∑t=s

[ci(xi(t), ui(t), t) + ei(ui(t), t)

+ λks(t)TKi(t)ui(t)]|Fi,s], (12)

where Fi,t := σ(xi(0), wi(0), wi(1), . . . , wi(t −1), wc(0), wc(1), . . . , wc(t − 1)) denotes the sigma-algebra generated by agent i’s observations up to time t.It thereby generates a consumption/generation for eachnode at a lower level in the subtree with root at thepresent node that the system is at.(k+1)-th Price Update at time s: The ISO updatesthe price stochastic process in response to theagents’ bids. (We note that in (12), the variablesxi(s+ 1), xi(s+ 2), . . . , xi(T − 1) are not the actual statesof the agent at those times, but are the variables fora hypothetical problem that needs to be solved by theagent at time s in order to determine its bid at times. This hypothetical problem takes as input the pricestochastic process, including its law, as announced bythe ISO). Guided by the “excess consumption function”∑Mi=1Ki(t)u

ki,s(t) − d(t), it raises or lowers prices to

satisfy the general linear constraint, as follows:

λk+1s (t) = λks(t)+

1

k[

M∑i=1

Ki(t)uki,s(t)−d(t)], s ≤ t ≤ T−1.

(13)The averaged allocations of consumption/generation:At time s, after the prices have converged, i.e.,

λ?s(t) := limk→∞

λks(t) for s ≤ t ≤ T − 1, (14)

the ISO announces the allocations at the current time s asthe limit

u?i,s(s) := limk→∞

uki,s(s), (15)

of the following average of the iterates of the bids fortime s,

uki,s(s) =

∑k−1s=1 s

θ∑ks=1 s

θuk−1i,s (s) +

kθ∑ks=1 s

θuki,s(s), (16)

with u0i,s(s) = u0i,s(s).This is presented in Algorithm 1.Assumption 2:

(i) There is an optimal solution of (1) with finite cost.(ii)

∑T−1t=0 ci(xi(t), ui(t), t),

∑T−1t=0 ei(ui(t), t), and

Algorithm 1 : Stochastic Dynamic Agents with PrivateUncertainties

for bidding times s = 0 to T − 1 dok = 0

repeatEach agent i solves the problem

Min ET−1∑t=s

[ci(xi(t), ui(t), t) + ei(ui(t), t)

+ λk(t)TKi(t)ui(t)],

with initial condition xi(s) to obtain the optimaluki,s(t), s ≤ t ≤ T − 1 subject to (17,18), andsubmits it to the ISO.The ISO declares new prices for s ≤ t ≤ T − 1, as

λk+1(t) = λk(t) +1

k[

M∑i=1

Ki(t)uki (t)− d(t)].

k → k + 1until λk(t) converges a.s. to λ?(t) for s ≤ t ≤ T − 1.ISO computes uki,s(s) as in (16), and announcesgenerations/consumptions u?i,s(s) := limk→∞ uki,s(s).

end for

hi(wti , w

tc, t, ui(t)) are convex in uT−1i for each noise

sequence wT−1, with xi(t) expanded out recursively andwritten in terms of ut−1i and wt.(iii) For each fixed noise sequence wtc, w

ti , there

exists a feasible u satisfying hi(wti , w

tc, t, ui(t)) < 0

in RelInt(Dom(ci)) ∩ RelInt(Dom(ei)) for for 1 ≤ i ≤M, 0 ≤ t ≤ T − 1.

Theorem 2: The above bid-price solution, with priceupdates (13), bid updates determined as the optimalsolution of (12), and allocations at each t given bylimk→∞ uk where uk is obtained as the averaged versionof uk as in (11), achieves the maximum social welfarethat could have been attained in the full state informationcase, when the cost functions satisfy Assumption 2.Proof: First we analyze how the above Iterative BiddingScheme functions in the case of common uncertainties,i.e., w(t) ≡ wc(t), and there are no private noises wi.In this case it will turn out that we need to conduct thebid-price iteration only at time 0.

Let us suppose that x(0) is fixed, without loss of gen-erality.

For simplicity of exposition only we suppose that thenoise processes wc(t), wi(t) assume only finitely manyvalues, allowing them to be represented by a tree asin Fig. 1. Let δ(v) denote the depth t of a node v =(w(0, ω), w(1, ω), . . . , w(t − 1, ω) in the tree of Fig. 1;it is the time t associated with the node v. Given aMarkov policy π that takes action u = π(x, s) whenthe state x(s) = x, and a noise realization v, one candetermine the resulting state x(t) by recursively applyingthe system function x(s + 1) = f(x(s), u(s), w(s), s) for

Page 8: Decentralized Control via Dynamic Stochastic Prices: The ...cesg.tamu.edu/.../2014/09/Decentralized-Control-via-Dynamic-Stocha… · generation equalling consumption in each time

8

s = 0, 1, . . . , t − 1. One can thereby also determine theaction u(t) = π(x(t), t). Thus, for a given Markov policyπ, one has a mapping v 7→ (x, u) that specifies the statereached, and the action taken, at each node v. Sincethe policy is Markov, this mapping has to satisfy theconsistency condition that two nodes that are associatedwith the same state x, and that are at the same level(and hence correspond to the same time t), have to takethe same action. Now let us consider a more generaltree policy σ that specifies an action σ(v) to be taken ineach node v without requiring this consistency condition.The class of tree policies is therefore more general thanthe class of Markov policies. Since the class of Markovpolicies contains an optimal policy, so does the class oftree policies.

Denote by σv := u(0), . . . , u(t) the sequence of ac-tions taken in the preceding t+ 1 steps, where t denotesthe depth of node v. The state x(t) corresponding tov is thereby determined by (v, σv). Let σi(v) := ui(t)and σvi := ui(0), . . . , ui(t) denote the correspondingactions by agent i. The problem (1) can then be writtenequivalently as the following optimization problem, witha decision variable σ(v) associated with each node:

MinM∑i=1

∑v

pv[ci (v, σv) + ei (σv)]

Fi(v)σi(v) ≤ gi(v) +∑

v′:δ(v′)<δ(v)

Ji(δ(v′), v)σi(v

′),

(17)

hi(σi(v), v) ≤ 0, (18)

such thatM∑i=1

Ki(v)σi(v) = d(v),∀v.

Under Assumption 2, the convex programming problemhas no duality gap. Let λ(v) be the Lagrange multiplierfor the constraint

∑Mi=1Ki(v)σi(v) = d(v), and define the

vector λ := λ(v). We obtain,

L (σ, λ) :=

M∑i=1

∑v

pv[∑v

ci(v, uv) + ei(u

v)+

λ(v)TKi(v)σi(v)]−∑v

pvλ(v)T d(v).

The process λ(v) is the “price process”. Each agent sub-mits a bid for each possible future realization v of thenoise process, while the ISO specifies a price at eachv. Now the rest of the proof for the case of commonuncertainties parallels the deterministic one of Theorem1.

The key point to notice is that all the optimal actionsand prices corresponding to each node can be determinedat time 0. After that, since the uncertainties are allcommon and all agents can observe the uncertainty, allagents know exactly which node at level t + 1 the noiseprocess is at at each time t. Hence each agent knowswhich action to apply at each time as the system evolvesover time.

Now we turn to the case with private uncertainties. Inthe private uncertainty case also, all the optimal actionsand prices corresponding to each node can be determinedat time 0. They know what actions to apply at time 0, andthe ISO knows what price to assign at time 0. However,the main problem is that at future times, the agents donot know which node at level t+ 1 the noise process is atat time t ≥ 1 due to the presence of private uncertainties.Hence they do not know what action to apply at futuretimes t ≥ 1.

However, at each future time t ≥ 1 we can simplyconsider the state at that time as a fresh initial condition,and reconduct the bidding process, which then yields theoptimal actions and prices at that time t. Thus the Bid-Price Iteration repeated at each time, with only the initialaction and price implemented at each time, leads to anoptimal solution for the system with private uncertainties.

The major drawback of this algorithm is that it isexponentially complex in T due to the number of states inthe tree, even if each w(t) is finite valued. One may notethe following gradation of complexity in the ISO Problem.

1) In the case where agents have private uncertainties,this is a complex iteration consisting of prices andbids for all future noise states that has to moreoverbe repeated at each time during the evolution of thesystem.

2) If the agents have only a common uncertainty, thenas the proof of Theorem 2 shows, the complexiteration needs to be conducted only at the initialtime.

3) If the system is a deterministic system, thenthe bids or price updates simplify to open-looptime-functions of future consumption/generation orprices, respectively. The reason is that there beingno noise, future noise states are singletons. More-over as in the common information case, of whichthis is special case, the bid-price iteration needs tobe conducted only at the outset at time 0.

Remark 8: In the deterministic static and deterministic dy-namic cases, the problem of bidding between consumersand generators has been extensively studied. Reference[35] studies the robustness and stability of a tattonmentprocess that decides electricity prices for a single timehorizon, and [36] takes a penalty function approach todesign an iterative algorithm that yields optimal operationpoint and simultaneously regulates frequency, also for anessentially static set-up. Li, Chen, and Low [37] consider abidding problem between several households and a singleutility company for the deterministic dynamic case thatfeatures a price update involving the derivative of thecost of obtaining electricity from the wholesale market.Iterative algorithms similar to ours for the deterministicdynamic case are proposed in [38], [39] for particularmodels of dynamic systems, e.g., involving batteries. Thecommon information case is handled in the work of Arrow[7].

Page 9: Decentralized Control via Dynamic Stochastic Prices: The ...cesg.tamu.edu/.../2014/09/Decentralized-Control-via-Dynamic-Stocha… · generation equalling consumption in each time

9

VI. THE ISO PROBLEM FOR LQG AGENTS

Start with s = 0

ISO declares prices λ0(t) for times s ≤ t ≤ T − 1, sets k = 0

Each Agent i optimizes consump-tion/generation for s ≤ t ≤ T − 1

for the deterministic model xi(t + 1) =Aixi(t) + Biu

ki (t), xi(s) = xi(s),

with cost∑T−1t=s [xᵀi (t)Qixi(t) +

ui(kt)ᵀRiu

ki (t) + λk(t)uki (t)]

Agents submit bids uki (t) for s ≤ t ≤ T − 1

Bids converged?

UpdatePrices

λk+1(t) =λk(t) +

αk∑i u

ki (t)

for s ≤t ≤ T − 1

Implement the first entry of the convergedbids as ui(s). Agents update statesxi(s + 1) = Aixi(s) + Biui(s) +wi(s). Increment time s by one

Is s = T?

Stop

yes

yes

no

no

Fig. 2: Scheme for ISO Problem with LQG Agents.

We will now show that when the agents are all LQGsystems, one can dramatically simplify the bids and thebidding process. Even though agents have private uncer-tainties, each bid consists of only a simple time-function,just as in the deterministic case. The only difference isthat the bidding needs to be carried out at each time t.Similar to Model Predictive Control, only the first step ofthe prices and consumptions/generations at each time tis implemented. This bidding scheme appears practicallyfeasible with bid periods separated by minutes. Anothersimplifying feature is that the ISO need not average thebids. The bids of agents converge at each time instantwithout averaging to a feasible solution that satisfiesenergy balance and other constraints.

The M agents have linear dynamics affected by Gaus-sian noise and have quadratic costs. Initial conditionsand noises are Gaussian: xi(0) ∼ N(0,Σi,0) and wi(t) ∼w(0, Pi,t), and independent of all others. The cost func-tions of agents, are quadratic, with Qi ≥ 0 and Ri > 0.The ISO Problem is:

Min ET−1∑t=0

M∑i=1

[xᵀi (t)Qixi(t) + uᵀi (t)Riui(t)] (19)

where xi(t+ 1) = Aixi(t) +Biui(t) + wi(t), (20)

withM∑i=1

ui(t) = 0 a.s., for t = 0, 1, . . . , T − 1. (21)

The case of time-varying systems is entirely analogous.

Agents have no knowledge even of each other’s pres-ence. Agent i does not know the value of M , the numberof agents, the matrices Aj , Bj , Qj , Rj ,Σj,0, Pj,·)j 6=i ofother agents, the realizations of their state processesxj(·), j 6= i or noises wj(·), j 6= i.

The k-th iterate of the bid function submitted at time sspecifying the quantity of electricity that agent i is willingto purchase at times s, s+ 1, . . . , T − 1 is not a functionof the outcomes of the noise sequence w(t), t > s. Itis simply a vector (us,ki (s), us,ki (s + 1), . . . , us,ki (T − 1))comprised of T − s+ 1 entries. The same is also true forprices. The ISO just specifies a vector (λs,k(s), λs,k(s +1), . . . , λs,k(T − 1)) of T − s entries.

The key to showing the existence of such a simplebidding scheme lies in utilizing the certainty equivalenceproperty of LQG systems [40].

Algorithm 2 : ISO Problem with LQG Agents

for bidding times s = 0 to T − 1 dok = 0Initialize λ0s(t) : s ≤ t ≤ T − 1 arbitrarily.

repeatEach agent i solves the problem (22) for adeterministic system (23) with initial conditionxki,s(s) := xi(s), where xi(s) is the state of the i-thagent at time s, and submits the optimal values,denoted uki,s(t), for s ≤ t ≤ T − 1 to the ISO.ISO updates the prices according to (25,26).Increment k by 1.

until uki,s(t) converges to u?i,s(t),Implement (u?1,s(s), u

?2,s(s), . . . , u

?M,s(s))

end for

The iterative bidding scheme is illustrated in Fig. 2and Algorithm 2. We bring attention to the following twoassumptions that are formally stated in Theorem 3 below:Assumption 1: The agents are assumed to be compliant inthe sense that they conform to the described scheme.Assumption 2: The scheme specifically requires that agentsemploy a deterministic version of their system for com-puting their bids.

This iterative bid-price scheme achieves the same op-timal social welfare for the LQG ISO Problem attainableunder full-state information, as described below.

Theorem 3: Consider the overall system comprised ofthe M agents (20) for i = 1, 2, . . . ,M , where xi(0) ∼N(0,Σi,0), wi(t) ∼ w(0, Pi,t), and all random variablesxi(0) : 1 ≤ i ≤ M, wi(t) : 1 ≤ i ≤ M, 0 ≤ t ≤ T − 1are independent. Let u(t) := (u1(t), u2(t), . . . , uM (t)). LetFt := σ(xi(0) : 1 ≤ i ≤ M, wi(τ) : 1 ≤ i ≤ M, 0 ≤ τ ≤t) be the σ-algebra generated by all the uncertaintiesupto time t. It is desired to minimize the overall socialwelfare cost (19) subject to the balancing constraints(21), over all u(t) : 0 ≤ t ≤ T − 1 that are adaptedto Ft.

Consider the following bid-price scheme where theprice vector (λ0s(s), λ

0s(s + 1), . . . , λ0s(T − 1)) with real-

Page 10: Decentralized Control via Dynamic Stochastic Prices: The ...cesg.tamu.edu/.../2014/09/Decentralized-Control-via-Dynamic-Stocha… · generation equalling consumption in each time

10

valued entries is initialized arbitrarily at each time s =0, 1, . . . , T − 1.

At time s, in response to the (k−1)-th price vector iter-ate (λks(s), λks(s+ 1), . . . , λks(T )) with real-valued entries,announced by the ISO, agent i announces the optimalopen-loop sequence (uki,s(s), u

ki,s(s+ 1), . . . , uki,s(T − 1)) for

the following deterministic Linear Quadratic Regulator(LQR) problem:

min

T−1∑t=s

[xki,s(t)ᵀQix

ki,s(t) + uki,s(t)

ᵀRiu

ki,s(t) + λks(t)uki,s(t)]

(22)

s.t. xki,s(t+ 1) = Aixki,s(t) +Biu

ki,s(t), s ≤ t ≤ T − 1,

(23)

with initial condition xki,s(s) := xi(s). (24)

The ISO thenadjusts the price vector (λk+1s (s), λk+1

s (s+1), . . . , λk+1

s (T − 1)) as:

λk+1s (t) = λks(t) + αk

M∑i=1

uki,s(t), s ≤ t ≤ T − 1. (25)

where αkis any sequence of positive numbers satisfying

αk > 0, limkαk = 0,

∞∑k=0

αk = +∞. (26)

At time s, the iterations in k are continued till the priceiterations (λks(s), λks(s + 1), . . . , λks(T − 1)) converge to(λ?s(s), λ

?s(s+1), . . . , λ?s(T−1)). Denote the corresponding

limit of the input sequence of agent i by (u?i,s(s), u?i,s(s+

1), . . . , u?i,s(T − 1)).The price at time s is then set to λ?s(s) and each agent

i applies the input u?i,s(s). This is repeated at time s+ 1.Then the sequence u?i,t(t) : 0 ≤ t ≤ T − 1 attains the

minimum of the cost (19) over all control laws adaptedto Ft that satisfy the balancing constraint (21).Proof: Let

X := (X1, X2, . . . , XM ), U := (U1, U2, . . . , UM ),

A := diag(A1, A2, . . . , AM ), B := diag(B1, B2, . . . , BM ),

Q = diag(Q1, Q2, . . . , QM ), R = diag(R1, R2, . . . , RM ),

and consider the following deterministic constrained LQRproblem, with no noise, and featuring energy balance:

min

T∑t=0

[Xᵀ(t)QX(t) + Uᵀ(t)RU(t)], subject to (27)

X(t+ 1) = AX(t) +BU(t);X(0) = x(0) and (21). (28)

Since the state is affine in U , after substituting forthe states, we have a positive definite quadratic pro-gramming problem with equality constraints. The Karush-Kuhn-Tucker matrix is nonsingular (Section 10.1 of [41])since Ri > 0, and so there are unique U?, λ? optimalfor the primal and dual, respectively. The Dual functionis a differentiable concave quadratic function, and thesubgradient method is actually a gradient method thatconverges under non-summability of step-sizes, without

even requiring square summability (Section 2.5 of [42]).The iterative bids Uki are affine functions of the pricesλk. The limiting prices yield bids that satisfy balancing.Hence this deterministic problem can be solved by theBid-Price iteration involving only time-functions of pricesand consumptions/generations between the agents andthe ISO to obtain the optimal inputs U(t) for 0 ≤ t ≤ T−1,as noted for the case of deterministic problems in SectionV.

However, at the particular time s = 0 with xki,0(0) =xi(0), the Bid-Price Iteration (22,23,24) and (25) is thesame as the Bid-Price Iteration in Section IV with the sim-plification that the constraint (5) is absent, and Ki ≡ 1,d = 0 in (7). Hence the end result of Algorithm 2 at times = 0 is the optimal action for (27,28),

u(0) = U(0). (29)

Now note that due to energy balance, no matterhow the first (M − 1) agents choose their consump-tions/generations, agent M ’s choice is forced to be

UM (t) = −M−1∑i=0

Ui(t) for all t, (30)

due to the energy balance constraint. Hence one cansubstitute for UM (t) and obtain an equivalent standard,i.e., unconstrained, deterministic LQR problem featuringonly (M−1) inputs Ureduced := (U1, U2, . . . , UM−1), wherethere is no energy balance constraint:

min

T∑t=0

[Xᵀ(t)QX(t) + Uᵀreduced(t)RreducedUreduced(t)]

(31)

subject to X(t+ 1) = AX(t) +BreducedUreduced(t), (32)

the deterministic reduced unconstrained LQR problem.For this problem (31,32), which is just a standard

unconstrained LQR Problem, the optimal solution is givenby linear feedback Ureduced(0) = Γreduced(0)X(0), whereΓreduced(·) is the optimal feedback gain.

Noting that UM is linear in Ureduced, we deduce that forthe full system (27,28) with all M agents, the optimalsolution for the deterministic constrained LQR problemwith the energy balance constraint, is U(0) = Γ(0)x(0),where Γ(·) is the optimal feedback gain obtained fromΓreduced through (30).

Now consider the corresponding reduced unconstrainedstochastic LQG problem where there is white Gaussiannoise in the state equations (28):

minET∑t=0

[xᵀ(t)Qx(t) + uᵀreduced(t)Rreducedureduced(t)]

(33)

with x(t+ 1) = Ax(t) +Breducedureduced(t) + w(t). (34)

By Certainty Equivalence [40], the same linear feedbackgain as in the deterministic reduced LQR problem is alsooptimal. In particular, u(0) = Γ(0)x(0) continues to be

Page 11: Decentralized Control via Dynamic Stochastic Prices: The ...cesg.tamu.edu/.../2014/09/Decentralized-Control-via-Dynamic-Stocha… · generation equalling consumption in each time

11

optimal at time t = 0. Thus u(0) given by (29) is optimalfor (33,34).

However, reduced unconstrained stochastic LQG prob-lem (33,34) is equivalent to unreduced constrained LQGproblem (19,20,21), and so the same u(0) is optimal.

Thus the Bid-Price iteration scheme determines theoptimal actions for the agents at time 0. Our scheme(22,23,24) for the LQG problem repeats such a Bid-Pricescheme iteration at each time s = 0, 1, . . . , T−1. Each x(s)can be regarded as an initial state for a subsequent systemre-started at time s, and the above argument shows thatthe actions u(s) that it results in for the agents at all timess are also optimal, completing the proof.

Concerning the convergence rate of the bid-price iter-ates, it may be noted that in this primal strictly positivedefinite quadratic problem with linear equality constraints(27,28), the dual is a strictly negative definite quadraticin prices. Hence standard results on convergence of thegradient method for quadratics apply for the price itera-tions [42], and the convergence rate for the bids followssince they are simply affine functions of the prices. Forexample, one may even use a small enough constant stepsize αk ≡ α > 0, in which case the iterates convergelinearly, i.e., geometrically.

The result of this section extends to LQG systems whereeach agent i only has noisy observations yi(t) = Dixi(t)+vi(t), where vi are independent and Gaussian.

We note that the LQG modeling constitutes a richframework that has been successfully used in controlwith soft quadratic costs used to capture preferences andconstraints. This is illustrated in the example treated inSection VIII.Remark 9: Team problems have been extensively studied,e.g., [43]–[45], but it should be noted that those formu-lations do not apply here since agents don’t know thesystem dynamics or models of other agents. Moreover,typically, almost sure constraints such as balancing arenot treated. Even setting these issues aside, and sup-posing that the models are known to all and assumingthere are no almost sure constraints, the ISO Problemwith private measurements would still lie at the coreof decentralized stochastic control with a non-classicalinformation structure [29], [44]–[49], since agents’ ob-servations are influenced by the unknown actions of otheragents, as happens here because the price announcementsthat constitute part of an agent’s observations are de-pendent on unknown actions taken by other agents. Thisposes serious difficulties potentially leading to intractabil-ity even for LQG systems, as shown by Witsenhausen’scounterexample of a two-stage problem [46]. So thereare two questions that one may ponder: (i) Given theprices, how should the agents act? (ii) What are theright prices and how are they determined? Concerning thefirst question, the linear in state (for given prices) controllaw may be explained by the fact that given the prices,the agents are conditionally independent of each other.Yuksel [50] has examined such conditional independenceconditions on agents’ observations, actions and states,

and addressed when tractable solutions can be found.Concerning the second question, one can view the priceannouncements as a form of “signaling.” In decentralizedstochastic control [45], [49], [51], controllers can gener-ally signal some private information to other agents overa “channel” which may even be the physical plant itself.The roles of observation, signaling [44], and the trade-off between communication and control are evident fromWitsenhausen’s counterexample [46].

VII. INCORPORATING ADDITIONAL LINEAR CONSTRAINTS:THE DC OPTIMAL POWER FLOW EQUATIONS

As noted in Remark 3, the bid-price iterations can beextended to encompass any additional linear constraints,such as those arising from the DC Power Flow Equations.The only difference is that there are several prices, onefor each constraint, that each agent needs to incorporatein choosing its actions.

Theorem 4: Consider a system consisting of M agents,where each agent i’s system is a Linear Gaussian System:

xi(t+ 1) = Aixi(t) +Biui(t) + wi(t)] a.s.

Agent i has a quadratic cost (negative utility):

minE

(T−1∑t=0

[xᵀi (t)Qixi(t) + uᵀi (t)Riui(t)]

).

There are N linear constraints that need to be satisfied:M∑i=1

γi,nui(t) = 0 for 1 ≤ n ≤ N, t = 0, 1, . . . , T − 1, a.s.

Neither ISO nor agents know the number M or thedynamics/costs/states/noises of other agents.

Consider the following Bid-Multiple Price Iteration. Ateach time s = 0, 1, . . . , T−1, at each iterate k, in responseto prices λkn,s(t) : s ≤ t ≤ T − 1, announced by the ISO,agent i solves the deterministic LQR problem:

min

T−1∑t=s

[xᵀi (t)Qixi(t) + ui(t)ᵀRiui(t) +

N∑n=1

λkn,s(t)ui(t)],

with xi(s) = xi(s), determines the optimal uks(t) : s ≤t ≤ T − 1, and communicates this sequence to the ISO.Upon receiving the bids at iterate k from all the agents attime s, the ISO updates the N price sequences:

λk+1n,s (t) = λkn,s(t) + αk

(M∑i=1

γi,nuki,s(t)

),

for 1 ≤ n ≤ N and s ≤ t ≤ T − 1, with the step-sizessatisfying (26). The multiple iterations converge, and letλ?n,s(t) : s ≤ t ≤ T−1 denote the limit. Correspondinglylet u?s(t) : s ≤ t ≤ T −1 denote the limits of the bids bythe agents. At each time s, agent i applies ui(s) = u?s(s).Then this Bid-Multiple Price Iteration yields the maximumsocial welfare that would have been attainable under fullstate information, under the multiple constraints.Proof: The proof parallels the single constraint case.

Page 12: Decentralized Control via Dynamic Stochastic Prices: The ...cesg.tamu.edu/.../2014/09/Decentralized-Control-via-Dynamic-Stocha… · generation equalling consumption in each time

12

VIII. SIMULATION EXAMPLES

In the following, we use the space conditioning examplefrom [52] for thermal inertial load agents. Let S1, S2, S3

be sets of conditioning facilities (loads), conventionalgenerators, and renewable suppliers, respectively, and leti ∈ S1, j ∈ S2, k ∈ S3. The dynamics of the temperaturexi(t) of the i-th facility is given by (35), where xO(t)= the outside temperature at time t, ε = e−τ/TC =“factor of inertia”, TC = 2.5 hours = time-constant ofthe system, τ = time duration between control epochs,which is the same as the inter-bid duration, η = 2.5= thermal conversion efficiency, and A = 0.14kW/F= overall thermal conductivity. With xdi (t) the desiredfacility temperature, the cost incurred is a quadratic inthe temperature deviation. For fossil-fuel generators, theunit-time conventional generation cost curves [53] forsupplying energy are quadratic in generation uj . Wereplace hard constraints on ramp-rates |uj(t)−uj(t−1)| bya quadratic penalty, with J3 below chosen so that the hardbounds are met, with state given by (36). For a renewableenergy facility k, Bk denotes its buffer capacity, wk(t)stochastic wind/solar energy, and xk(t) the renewableenergy level satisfying (37). Its operating cost is constant.The resulting ISO Problem (1) is

minE

∑i∈S1

T−1∑t=0

(xi(t)− xdi (t)

)2+∑j∈S2

(Jj,1uj(t) + Jj,2u

2j (t) + Jj,3 (uj(t)− xj(t))2

)such that

M∑`∈S1∪S2∪S3

u`(t) = 0, for t = 1, 2, . . . , T − 1,

xi(t+ 1) = εxi(t) + (1− ε)(xOi (t) +

η

Aui(t)

), (35)

xj(t+ 1) = uj(t), (36)

xk(t+ 1) = Minxk(t)− uk(t) + wk(t), Bk. (37)

We will compare the performance of the proposedStochastic Dynamic Optimal Bid-Price Iteration schemeof Sections V or VI, called ”Optimal” below, with thecurrently followed Static Dispatch scheme of Section IVused in dynamic situations as explained in Section I,under which the agents perform separate and uncoupledbid-price iterations at each time t to optimize the staticcost c(x(t), u(t)) incurred at that time t.Bidding with LQG Systems: A day is divided into twelveτ = 2 hour bid-slots, so ε = 0.4493. There are onlythermal loads, and wind-farms which have a cost func-tion 1

2x2(t) and with infinite storage capacity B. Outside

temperatures and available wind power are modeledas i.i.d. and normal. (This is only a first step towardsmodeling the uncertainty, and other types of distributionscan potentially be similarly explored). Variance of windenergy is 1 unit for all t. The scenario is described inTable I. At the beginning of day, the thermal loads havetemperature of 70F , while wind-farms have 100 units of

energy. The price vector is projected at each update ontoa large compact set, and, at termination, the bid vector isprojected onto the hyperplane

∑i ui = 0.

Figs. 3-5 compare performance of the two schemes asthe number of bid-price updates, the number of agentsconnected to the grid, and variance of wind energy pro-cess, are varied. Figs. 6 and 7 show how the Optimalscheme is able to attain better social welfare for scale 2.

55 60 65 70 75 80 70 60 50 50 50 5080 80 80 80 60 60 60 60 80 80 80 8030 40 0 70 70 70 70 0 0 0 0 0Wind Power

Outside Temp. Desired Temp.

TABLE I: Mean outside and desired thermal load temper-atures (in F ), and mean wind power for the 12 periods.

10 12 14 16 18 20 22 24 26 28 30

Number of Bid-Price Updates per bid instant

0.8

1

1.2

1.4

1.6

1.8

2

Co

st

×105

Optimal Dispatch

Static Dispatch

Fig. 3: Cost, i.e., negative social welfare, vs. numberof Bid-Price iterations with five thermal loads and twowindfarms.

1 1.5 2 2.5 3 3.5 4 4.5 5

Scale

0

2

4

6

8

10

Co

st

×105

Optimal Dispatch

Static Dispatch

Fig. 4: Cost as number of users is scaled linearly by i, withratio of thermal loads to windfarms held constant at 5/2,and with = 15 + 5i bid-price iterations at each time t.

0.5 1 1.5 2 2.5 3 3.5 4 4.5 5

Wind Energy Variance

1.5

2

2.5

3

3.5

4

Co

st

×105

Optimal Dispatch

Static Dispatch

Fig. 5: Cost vs. wind variance with five thermal loads andtwo windfarms, with 30 Bid-Price iterations.

Bidding in Tree Scenario: The time-horizon is 2 and timeduration between two bids is 5 hours, roughly coincidingwith morning (7 am)/12 noon, giving ε = 0.1353. Table II

Page 13: Decentralized Control via Dynamic Stochastic Prices: The ...cesg.tamu.edu/.../2014/09/Decentralized-Control-via-Dynamic-Stocha… · generation equalling consumption in each time

13

0 2 4 6 8 10 12

Time

0

100

200

300

400

500

Ne

t P

ow

er

Su

pp

lied

Optimal Dispatch

Static Dispatch

Fig. 6: Power generation: Optimal scheme “predicts” theincumbent energy shortage in advance, thereby elicitingsmoother generator response. The power production costsfor the two schemes are, respectively, 4.37, 28.06 (×104),while thermal loads disutility are 13.15, 9.62 (×104).Adding these two costs, the net costs are 1.75, 3.76 (×105),so that savings achieved by Optimal Scheme is 53.5%.

0 2 4 6 8 10 12

Time

-200

-150

-100

-50

0

50

Mark

et

Cle

ari

ng

Pri

ce

Optimal Dispatch

Static Dispatch

Fig. 7: Prices: Optimal scheme “declares” the energyshortage/surplus well in advance, allowing users to reactappropriately and eliciting demand response.

lists stochasticity parameters of wind for two scenarios.For fossil plants, J1 = 0.1, J2 = 0.01, J3 = 0.1. Windfarmsincur no operational cost. Bid/price vectors at t = 0 havethree entries, while they are scalar at time t = 1.

TO(1),TO(2) Td(1),Td(2) w1,w2 P1,P2 S1/S2/S3 BCase 1 30,40 (in F) 60,80 5,0 0.5,0.5 7/1/1 30Case 2 40,60 60,90 10,0 0.95,0.05 4/1/1 40

TABLE II: The only stochasticity is wind availability at time1, with possible realizations w1, w2 with respective prob-abilities P1, P2. |S1|/|S2|/|S3| are the relative numbers ofthermal loads, fossil plants and windmills.

Figures 8-10 compare the costs averaged over multiplewind realizations of the two policies under various sce-narios, for the two schemes. Thermal loads are allowedto become energy producers, while wind-farm operatorsare allowed to store energy in case there is excess energysupply in the market, showcasing potential prosumerbehavior in energy markets. The particular prices andpower generations for Scenario 1, are shown in Table III.

λ(1) λ(2) Powerfort=1 Powerfort=2 NetOperationCost %Savings7.6421 6.6159 69.0021 142.2757 878.2477 25.59

30 40 86.5085 116.4229 11803 -StaticOptimal

TABLE III: Prices, power generation and cost savings.

4 6 8 10 12 14 16 18 20

Number of Bid-Price Updates

0

500

1000

1500

2000

Co

st

Optimal Dispatch-1

Static Dispatch-1

Optimal Dispatch-2

Static Dispatch-2

Fig. 8: Performance as a function of number of Bid-Priceupdates for the two scenarios in Table II.

1 1.5 2 2.5 3 3.5 4

Scale

0

1000

2000

3000

4000

5000

6000

Co

st

Optimal Dispatch-1

Static Dispatch-1

Optimal Dispatch-2

Static Dispatch-2

Fig. 9: Cost as number of agents is increased linearly withscale, in the ratio S1/S2/S3 shown in Table II.

IX. CONCLUDING REMARKS

The ISO problem gives rise to a problem in generalequilibrium theory that is complicated by the facts thatagents have private uncertainties but one wants to attainnot the price equilibrium that would hold naturally underthe corresponding private information structures, but thesocial welfare optimal solution that could be attainedwere all agents to pool all their observations, with thefurther restriction that this is to be accomplished throughthe medium of an ISO that can only interact through priceannouncements and consumption/generation bids withagents. This problem of maximizing the social welfareof a collection of distributed dynamic stochastic agentsis more complex than decentralized stochastic controlsince agents do not know the dynamical equations orutility functions of others, and also by the almost sureconstraints.

We have exhibited iterative bidding schemes that attainthe performance that could have been attained by acentralized control policy that is aware of the dynamics,utilities, uncertainties and states of all agents, underappropriate compactness-convexity or LQG assumptions.The ISO critically exploits the information obtained duringthe iterative bid-price process to determine the optimalprices and generation/consumption allocations. In theLQG case, the bid-price iteration is particularly simpleand tractable. It yields the optimal stochastic dynamiclocational marginal prices.

The social-welfare optimality can potentially result insignificant economic benefits in energy markets with deeprenewable penetration.

The results may be of interest vis-a-vis general equilib-

Page 14: Decentralized Control via Dynamic Stochastic Prices: The ...cesg.tamu.edu/.../2014/09/Decentralized-Control-via-Dynamic-Stocha… · generation equalling consumption in each time

14

1 1.5 2 2.5 3 3.5 4 4.5 5

Buffer Size/10

0

200

400

600

800

1000

1200

1400

1600

1800

Co

st

Optimal Dispatch-1

Static Dispatch-1

Optimal Dispatch-2

Static Dispatch-2

Fig. 10: Cost as wind availability at t = 1, and storagebuffer at windfarms are increased. Buffer capacity in i-thsimulation is 10i, while wind energy w(1) is i. Tempera-ture conditions and agents are as in Table II.

rium theory for its treatment of systems with stochasticdynamic agents.

The agents are all presumed to be “price takers.” Ex-amining this in a broader context is an important issueand is the subject of a future work.

ACKNOWLEDGMENT

The authors thank Pravin Varaiya for identifying asignificant error in an earlier version of the paper.

REFERENCES

[1] F. Wu, K. Moslehi, and A. Bose, “Power system control centers:Past, present, and future,” Proc. IEEE, vol. 93, 1890-1908, 2005.

[2] L. Walras, Elements of Pure Economics, ser. History of economicthought. Routledge, 2010.

[3] K. J. Arrow, “An extension of the basic theorems of classicalwelfare economics,” Proceedings of the Second Berkeley Symposiumon mathematical Statistics and Probability, pp. 507–532, 1951.

[4] G. Debreu, “The coefficient of resource utilization,” Econometrica,vol. 19, no. 3, pp. 273–292, 1951.

[5] K. Arrow and G. Debreu, “Existence of an equilibrium for acompetitive economy,” Econometrica, vol. 22, pp. 265–290, 1954.

[6] K. Arrow, H. Block, and L. Hurwicz, “On the stability of thecompetitive equilibrium, II,” Econometrica, vol. 27, 82-109, 1959.

[7] K. J. Arrow, “The role of securities in the optimal allocation of risk-bearing,” The Review of Economic Studies, vol. 31, 91-96, 1964.

[8] R. Radner, “Competitive equilibrium under uncertainty,” Economet-rica, vol. 36, pp. 31–58, Jan 1968.

[9] M. C. Caramanis, R. E. Bohn, and F. C. Schweppe, “Optimal spotpricing: practice and theory,” IEEE Transactions on Power Apparatusand Systems, no. 9, pp. 3234–3245, 1982.

[10] R. E. Bohn, M. C. Caramanis, and F. C. Schweppe, “Optimal pricingin electrical networks over space and time,” The Rand Journal ofEconomics, pp. 360–376, 1984.

[11] W. Hogan, “Contract networks for electric power transmission,”Jour. Regulatory Economics, vol. 4, pp. 211–242, 1992.

[12] “ERCOT Business Practice: Ancillary Service Market Trans-action in the Day-Ahead and Real-Time Adjustment Pe-riod, Version 1.0,” http://www.ercot.com/content/mktrules/bpm/BusinessPractice AS MarketSubmissions Version1 1.doc.

[13] M. Roozbehani, M. A. Dahleh, and S. K. Mitter, “Volatility of powergrids under real-time pricing,” IEEE Transactions on Power Systems,vol. 27, no. 4, pp. 1926–1940, Nov 2012.

[14] H.-P. Chao and S. Peck, “A market mechanism for electric powertransmission,” Jour. Regulatory Economics, vol. 10, pp. 25–59,1996.

[15] M. Huneault, F. Galiana, and G. Gross, “A review of restructuringin the electricity business,” in Proceedings of 13th Power SystemsComputation Conference, 1999, pp. 19–31.

[16] L. Xie, P. Carvalho, L. Ferreira, J. Liu, B. Krogh, N. Popli, andM. Ilic, “Wind integration in power systems: Operational chal-lenges and possible solutions,” Proc. IEEE, vol. 99, 214-232, 2011.

[17] G. Debreu, “Theory of value, new york city,” 1959.[18] E. Gustavsson, M. Patriksson, and A.-B. Stromberg, “Primal con-

vergence from dual subgradient methods for convex optimization,”Mathematical Programming, vol. 150, pp. 365–390, 2015.

[19] R. Rajagopal, E. Bitar, P. Varaiya, and F. Wu, “Risk-limiting dis-patch for integrating renewable power,” International Journal ofElectrical Power and Energy Systems, vol. 44, pp. 615–628, 2013.

[20] “Pigovian tax,” https://en.wikipedia.org/wiki/Pigovian tax.[21] A. Bergen and V. Vittal, Power Systems Analysis. Prentice, 2000.[22] F. Wu, P. Varaiya, P. Spiller, and S. Oren, “Folk theorems on

transmission access: Proofs and counterexamples,” Journal of Reg-ulatory Economics, vol. 10, no. 1, pp. 5–23, 1996.

[23] D. Kirschen, R. Allan, and G. Strbac, “Contributions of individualgenerators to loads and flows,” IEEE Transactions on Power Systems,vol. 12, no. 1, pp. 52–60, 1997.

[24] L. B. Lave, J. Apt, and S. Blumsack, “Rethinking electricity dereg-ulation,” The Electricity Journal, vol. 17, no. 8, pp. 11–26, 2004.

[25] W. W. Hogan, “Transmission market design,” 2003, KSG WorkingPaper No. RWP03-040.

[26] A. Chuang, F. Wu, and P. Varaiya, “A game-theoretic model forgeneration expansion planning: problem formulation and numer-ical comparisons,” IEEE Trans. Power Systems, vol. 16, 885–891,2001.

[27] R. Baldick, R. Grant, and E. Kahn, “Theory and application oflinear supply function equilibrium in electricity markets,” Jour.Regulatory Economics, vol. 25, pp. 143–167, 2004.

[28] M. Ilic, J. Joo, L. Xie, M. Prica, and N. Rotering, “A decision-makingframework and simulator for sustainable electric energy systems,”IEEE Trans. Sustainable Energy, vol. 2, pp. 37–49, 2011.

[29] P. Carpentier, J. Chancelier, G. Cohen and M. De Lara, StochasticMulti-Stage Optimization. At the Crossroads between Discrete TimeStochastic Control and Stochastic Programming. Springer, 2015.

[30] M. De Lara, P. Carpentier, J.P. Chancelier and V. Leclere, “Op-timization methods for the smart grid,” Report Commissioned byConseil Franais de l’Energie, Oct 2014.

[31] S. Ryan, R. Wets, D. Woodruff, C. Silva-Monroy, and J. Watson,“Toward scalable, parallel progressive hedging for stochastic unitcommitment,” in 2013 IEEE PES, July 2013, pp. 1–5.

[32] M. Morishima, Walras’ Economics : A Pure Theory of Capital andMoney. Cambridge, U.K.: Cambridge University Press, 1977.

[33] K. Anstreicher and L. Wolsey, “Two “well-known” properties ofsubgradient optimization,” Mathematical Programming, vol. 120,pp. 213–220, 2009.

[34] J.-C. Culioli and G. Cohen, “Decomposition/coordination algo-rithms in stochastic optimization,” SIAM Journal on Control andOptimization, vol. 28, no. 6, pp. 1372–1403, 1990.

[35] A. K. Bejestani and A. Annaswamy, “A dynamic mechanism forwholesale energy market: Stability and robustness,” IEEE Transac-tions on Smart Grid, vol. 5, no. 6, pp. 2877–2888, 2014.

[36] D. J. Shiltz, M. Cvetkovic, and A. M. Annaswamy, “An integrateddynamic market mechanism for real-time markets and frequencyregulation,” IEEE Transactions on Sustainable Energy, vol. 7, no. 2,pp. 875–885, 2016.

[37] N. Li, L. Chen, and S. H. Low, “Optimal demand response basedon utility maximization in power networks,” in 2011 IEEE powerand energy society general meeting. IEEE, 2011, pp. 1–8.

[38] Q. Wang, M. Liu, and R. Jain, “Dynamic pricing of power in smart-grid networks,” in 2012 IEEE 51st IEEE Conference on Decision andControl (CDC). IEEE, 2012, pp. 1099–1104.

[39] J. Knudsen, J. Hansen, and A. M. Annaswamy, “A dynamic marketmechanism for the integration of renewables and demand re-sponse,” IEEE Transactions on Control Systems Technology, vol. 24,no. 3, pp. 940–955, 2016.

[40] P. R. Kumar and P. Varaiya, Stochastic Systems: Estimation, Identi-fication and Adaptive Control. Philadelphia: SIAM, 2016.

[41] S. Boyd and L. Vandenberghe, Convex Optimization. New York,NY, USA: Cambridge University Press, 2004.

[42] D. P. Bertsekas, Nonlinear Programming, ser. Athena scientific.Athena Scientific, 1999.

[43] J. Marschak and R. Radner, Economic Theory of Teams. Yale, 1972.[44] S. Yuksel and T. Basar, Stochastic Networked Control Systems: Sta-

bilization and Optimization under Information Constraints. NewYork, NY: Springer, 2013.

[45] J. H. van Schuppen and T. Villa, Coordination Control of DistributedSystems. Springer Publishing Company, 2014.

[46] H. S. Witsenhausen, “A counterexample in stochastic optimumcontrol,” SIAM J. on Control, vol. 6, pp. 131–147, 1968.

Page 15: Decentralized Control via Dynamic Stochastic Prices: The ...cesg.tamu.edu/.../2014/09/Decentralized-Control-via-Dynamic-Stocha… · generation equalling consumption in each time

15

[47] H. Witsenhausen, “Separation of estimation and control for dis-crete time systems,” Proc. IEEE, vol. 59, pp. 1557–1566, 1971.

[48] Demosthenis Teneketzis, “Perturbation methods in decentralizedstochastic control,” Ph.D. dissertation, MIT, Nov 1976.

[49] A. Nayyar, A. Mahajan, and D. Teneketzis, “The Common-Information Approach to Decentralized Stochastic Control,” inInformation and Control in Networks. Springer, 123-156, 2014.

[50] S. Yuksel, “Stochastic nestedness and the belief sharing informa-tion pattern,” IEEE Transactions on Automatic Control, vol. 54,no. 12, pp. 2773–2786, 2009.

[51] N. Sandell, P. Varaiya, M. Athans, and M. Safonov, “Survey ofdecentralized control methods for large scale systems,” IEEE Trans.Auto. Cont., vol. 23, no. 2, pp. 108–128, Apr 1978.

[52] P. Constantopoulos, F.C. Schweppe and R.C. Larson, “Estia: A real-time consumer control scheme for space conditioning usage underspot electricity pricing,” Computers & Operations Research, vol. 18,no. 8, pp. 751 – 765, 1991.

[53] A. Wood and B. Wollenberg, Power Generation Operation andControl. Wiley & Sons, New York, 1996.

Rahul Singh received the B.E. degree in electri-cal engineering from Indian Institute of Technol-ogy, Kanpur, India, in 2009, the M.Sc. degree inElectrical Engineering from University of NotreDame, South Bend, IN, in 2011, and the Ph.D.degree in electrical and computer engineeringfrom the Department of Electrical and ComputerEngineering Texas A&M University, College Sta-tion, Tx, in 2015.

He is currently a Postdoctoral Associate atthe Laboratory for Information Decision Systems

(LIDS), Massachusetts Institute of Technology. His research interestsinclude decentralized control of large-scale complex cyberphysical sys-tems, operation of electricity markets with renewable energy, andscheduling of networks serving real time traffic.

P. R. Kumar B. Tech. (IIT Madras, ‘73), D.Sc.(Washington University, St. Louis, ‘77), was afaculty member at UMBC (1977-84) and Univ. ofIllinois, Urbana-Champaign (1985-2011). He iscurrently at Texas A&M University. His currentresearch is focused on stochastic systems, energysystems, wireless networks, security, automatedtransportation, and cyberphysical systems.

He is a member of the US National Academyof Engineering, The World Academy of Sciences,and a Foreign Fellow of the Indian National

Academy of Engineering. He was awarded a Doctor Honoris Causaby ETH, Zurich. He has received the IEEE Field Award for ControlSystems, the Donald P. Eckman Award of the AACC, Fred W. EllersickPrize of the IEEE Communications Society, the Outstanding ContributionAward of ACM SIGMOBILE, the Infocom Achievement Award, and theSIGMOBILE Test-of-Time Paper Award. He is a Fellow of IEEE andACM Fellow. He was Leader of the Guest Chair Professor Group onWireless Communication and Networking at Tsinghua University, is aD. J. Gandhi Distinguished Visiting Professor at IIT Bombay, and anHonorary Professor at IIT Hyderabad. He was awarded the DistinguishedAlumnus Award from IIT Madras, the Alumni Achievement Award fromWashington Univ., and the Daniel Drucker Eminent Faculty Award fromthe College of Engineering at the Univ. of Illinois.

Le Xie (S’05-M’10-SM’16) received the B.E. de-gree in electrical engineering from Tsinghua Uni-versity, Beijing, China, in 2004, the M.Sc. degreein engineering sciences from Harvard Univer-sity, Cambridge, MA, in 2005, and the Ph.D.degree in electrical and computer engineeringfrom Carnegie Mellon University, Pittsburgh, PA,in 2009.

He is currently an Associate Professor withthe Department of Electrical and Computer Engi-neering, Texas A&M University, College Station.

His research interest includes modeling and control of large-scale com-plex systems, smart grid application with renewable energy resources,and electricity markets.