Model of Approximate Dynamic Programming Applied on Day-Ahead Trading of a Renewable Producer of Energy

Vadym Omelchenko

Faculty of Mathematics and Physics, Charles University in Prague, and Institute of Information Theory and Automation, Academy of Sciences of the Czech Republic

Page 1: Vadym Omelchenko Model of Approximate Dynamic Programming

Outline: Introduction, Value Function, Calculating Transition Matrix, Approximate Dynamic Programming

Vadym Omelchenko

Faculty of Mathematics and Physics, Charles University in Prague, and Institute of Information Theory and Automation, Academy of Sciences of the Czech Republic

Model of Approximate Dynamic Programming Applied on Day-Ahead Trading of a Renewable Producer of Energy

Page 2

Figure: Set-up of a renewable producer

Page 3

1) The renewable producer generates energy but does not know how much it will generate the following day, owing to the uncertainty entailed by the weather.
2) We assume the producer is penalized for insufficient delivery of energy, because this corresponds to market conditions and because some countries, e.g. Bulgaria, have introduced such a system.
3) In our setting, the state space is two-dimensional, consisting of wind data and the electricity price.
4) Our goal is to determine a bidding strategy for the producer by using dynamic programming.

Page 4

The value function is defined as follows:

V_t(S_t) = max_{x_t} ( C_t(S_t, x_t) + E{ V_{t+1}(S_{t+1}) | S_t } ),

where V_{T+1} = 0 and C_t(·, ·) is the reward function.

By the way,

1) V_1(S_1) = max_{x_1, x_2, ..., x_T} Σ_{t=1}^{T} C_t(S_t, x_t)

2) V_t(S_t) = max_{x_t ≥ 0} ( C_t(S_t, x_t) + Σ_{s'∈S} P(s' | x, s) V_{t+1}(s') ),

where P(s' | x, s) is the transition function, whose calculation is a challenging task of dynamic programming.
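The backward recursion in 2) can be sketched directly on a discretized state space. A minimal sketch, assuming the rewards C and transition probabilities P are already tabulated; the array shapes and the toy values in the test are hypothetical, not the model from the slides:

```python
import numpy as np

def backward_induction(C, P, T):
    """Solve V_t(s) = max_x ( C_t(s, x) + sum_{s'} P(s'|x, s) V_{t+1}(s') )
    backward in time, with the terminal condition V_{T+1} = 0.

    C : array (T+1, n_x, n_s), C[t, x, s] is the reward at time t
        (index 0 is unused, so t runs over 1..T as on the slide)
    P : array (n_x, n_s, n_s), P[x, s, s'] transition probabilities
    """
    _, n_x, n_s = C.shape
    V = np.zeros((T + 2, n_s))                # enforces V[T+1] = 0
    policy = np.zeros((T + 1, n_s), dtype=int)
    for t in range(T, 0, -1):
        Q = C[t] + P @ V[t + 1]               # Q[x, s]
        V[t] = Q.max(axis=0)                  # optimal value
        policy[t] = Q.argmax(axis=0)          # optimal decision
    return V, policy
```

Knowing V at the terminal time is exactly what makes this single backward sweep possible, as noted on the finite-horizon slide.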


Page 7

There is a special case of reward functions that depend not only on the current state but also on the next state(s). In this case, the reward function is written as

C_t(S_t, x_t, S_{t+1})

The value function then takes the following form:

V_t(S_t) = max_{x_t ≥ 0} E( C_t(S_t, x_t, S_{t+1}) + V_{t+1}(S_{t+1}) | S_t )

Page 8

In our setting, we have S_t = (y_t, p_t), where y_t is the amount of electricity produced and p_t is the market price of electricity.

If x_t > y_{t+1}, then C_t(S_t, x_t, S_{t+1}) = (y_{t+1} + c^-) p_{t+1} - u · p_{t+1} (x_t - c^- - y_{t+1}).

If x_t ≤ y_{t+1}, then C_t(S_t, x_t, S_{t+1}) = x_t p_{t+1} + o · p_{t+1} (y_{t+1} - x_t - c^+).

c^+ (c^-) is the amount of energy charged (discharged).
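The two branches of the reward translate directly into code. A minimal sketch, where the under- and over-delivery coefficients u and o and the discharged/charged amounts c_minus and c_plus are passed in as parameters; the numbers in the test are hypothetical:

```python
def reward(x_t, y_next, p_next, u, o, c_minus, c_plus):
    """C_t(S_t, x_t, S_{t+1}) for a bid x_t when next-day production
    turns out to be y_next at price p_next."""
    if x_t > y_next:
        # under-delivery: sell the produced amount plus the discharged
        # energy, pay the penalty u on the remaining shortfall
        return (y_next + c_minus) * p_next - u * p_next * (x_t - c_minus - y_next)
    # bid fully covered: sell the bid; the surplus beyond the charged
    # amount is remunerated at the rate o
    return x_t * p_next + o * p_next * (y_next - x_t - c_plus)
```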


Page 10

Finite and Infinite Horizon Problems

1) Finite horizon: at some T < ∞ we have V_{T+1} = 0. Knowledge of the value function at the terminal state enables us to calculate the value function backward in time.
2) Infinite horizon: T tends to infinity.

Page 11

Modeling prices

We model prices by means of an AR(1) model with stable residuals, i.e.

Price_t = a · Price_{t-1} + ε_t,  t = 1, 2, 3, ...

Then we determine the estimate â of the parameter a as follows:

â = argmin_a Σ_{t=1}^{T} | Price_t - a · Price_{t-1} |
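The absolute-deviation criterion can be minimized numerically. A minimal sketch; the bounds on a are an assumption, and the least-absolute-deviations objective is used precisely because it stays well behaved under heavy-tailed stable residuals:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def fit_ar1_lad(price):
    """Estimate a in Price_t = a * Price_{t-1} + eps_t by minimizing
    sum_t |Price_t - a * Price_{t-1}| (least absolute deviations)."""
    price = np.asarray(price, dtype=float)
    lagged, current = price[:-1], price[1:]
    total_abs_dev = lambda a: np.abs(current - a * lagged).sum()
    return minimize_scalar(total_abs_dev, bounds=(-2.0, 2.0),
                           method="bounded").x
```

On an exact AR(1) path such as 1, 0.5, 0.25, 0.125 the estimate recovers a = 0.5.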

Page 12

Modeling the residuals of prices

Having obtained the estimate â, we can obtain the series of residuals as follows:

ε_t^p = Price_t - â · Price_{t-1}

The analysis of the residuals follows below.

Page 13

Modeling the Wind Production

Wind production depends on the wind speed (let us denote it "Wind"):

WindProduction = c · Wind³, where c is a positive constant.

The square root of the wind speed can be modeled by an AR(1) process. Taking this cubic dependence into account, we model WindProduction^{1/6} by an AR(1) process.

Page 14

THE DATA

We have data on Polish wind production and Polish electricity prices for the period from May 2011 to March 2013.

Page 15

Figure: Visualization of the goodness-of-fit test. Residuals of the stable AR(1) price model. Kolmogorov-Smirnov and Anderson-Darling tests confirmed the hypothesis that the residuals have the stable distribution S_{1.562}(1, 0, 0).

Page 16

Figure: Visualization of the goodness-of-fit test. Residuals of the AR(1) model applied to WindProduction^{1/6}. Kolmogorov-Smirnov and Anderson-Darling tests confirmed the hypothesis that the residuals have the stable distribution S_{1.651}(1, 0, 0).

Page 17

Assumptions on the Residuals of the Autoregressive Models of Wind Production and Prices

Assumption 1. The residuals are independent. We can assume a different tail index for each series.
Assumption 2. The residuals are not independent, because wind affects prices. Sub-Gaussian stable distribution.
Assumption 3. The residuals are not independent, and we assume that the tail index differs between wind production and prices.

Page 18

Assumptions on the Residuals of the Autoregressive Models of Wind Production and Prices

Assumption 1. We can analyse the residuals separately. Easy to implement.
Assumption 2. Sub-Gaussian stable distributions can be expressed as follows:

X = W^{1/2} · Z, where W ~ S_{α/2}((cos(πα/4))^{2/α}, 1, 0) and Z ~ N(0, Q).

We need to approximate the distribution function.
Assumption 3. It is complicated due to the spectral measure. It is an operator stable distribution.

In the following slides, we comment on what follows from these assumptions.
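The sub-Gaussian representation under Assumption 2 can be sampled directly. A sketch using scipy's levy_stable for the totally skewed subordinator W; the values of α and Q in the test are placeholders, not the fitted parameters:

```python
import numpy as np
from scipy.stats import levy_stable

def sample_sub_gaussian_stable(alpha, Q, n, seed=None):
    """Draw n samples of X = W^(1/2) * Z with
    W ~ S_{alpha/2}((cos(pi*alpha/4))^(2/alpha), 1, 0), which is
    totally skewed to the right (hence positive), and Z ~ N(0, Q)."""
    rng = np.random.default_rng(seed)
    scale = np.cos(np.pi * alpha / 4) ** (2 / alpha)
    W = levy_stable.rvs(alpha / 2, 1.0, loc=0.0, scale=scale,
                        size=n, random_state=rng)
    W = np.clip(W, 0.0, None)   # guard against round-off below zero
    Z = rng.multivariate_normal(np.zeros(len(Q)), Q, size=n)
    return np.sqrt(W)[:, None] * Z
```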

Page 19

Assumptions on the Residuals of the Autoregressive Models of Wind Production and Prices

Assumption 1. The tail index α of the residuals of wind production equals 1.651, and the tail index of the residuals of prices equals 1.562.
Assumption 2. The classical correlation equals 45%. The dependence parameter between the residuals, under the assumption that the joint distribution is sub-Gaussian, equals 63%. The tail index is 1.61. We need to approximate the distribution function.

Page 20

Assumptions on the Residuals of the Autoregressive Models of Wind Production and Prices

Assumption 3. Any univariate stable distribution can be simulated by means of exponential and uniform distributions. In our case it looks as follows:

W(α, exp(1), U(-π/2, π/2)) ~ S_{α/2}((cos(πα/4))^{2/α}, 1, 0)

Any state is a two-dimensional vector S = (Price, Wind)^T.

X_price = W(α_price)^{1/2} · Z,  X_wind = W(α_wind)^{1/2} · Z,

X* = (X_price,1, X_wind,2).

In this case, we approximate the distribution function by means of the empirical distribution function, because it converges uniformly to the true distribution function.

Page 21

The knowledge of the distribution function enables us to calculate the transition matrix.

For each current state s = (p, y) and each following state s' = (p', y') we have

P(s' | s) = P(p', y' | p, y) = P(ε^p = p' - a_p · p, ε^y = y' - a_y · y)

and, for all s, Σ_{s'} P(s' | s) = 1.
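Given the fitted coefficients and a joint sample of residuals, the transition matrix over the discretized grid can be assembled by snapping each implied next state to its nearest cell, so every row sums to 1 by construction. A sketch; the grids, coefficients, and residual sample in the test are placeholders:

```python
import numpy as np

def transition_matrix(grid_p, grid_y, a_p, a_y, eps_p, eps_y):
    """P[(p, y) -> (p', y')] estimated from an empirical joint sample
    of residuals (eps_p[i], eps_y[i]).  States are flattened as
    index = i_p * len(grid_y) + i_y."""
    grid_p, grid_y = np.asarray(grid_p, float), np.asarray(grid_y, float)
    eps_p, eps_y = np.asarray(eps_p, float), np.asarray(eps_y, float)
    n_p, n_y, n = len(grid_p), len(grid_y), len(eps_p)
    P = np.zeros((n_p * n_y, n_p * n_y))
    for i, p in enumerate(grid_p):
        for j, y in enumerate(grid_y):
            # next-state values implied by each residual draw,
            # snapped to the nearest grid point
            ip = np.abs(grid_p[:, None] - (a_p * p + eps_p)).argmin(axis=0)
            iy = np.abs(grid_y[:, None] - (a_y * y + eps_y)).argmin(axis=0)
            np.add.at(P[i * n_y + j], ip * n_y + iy, 1.0 / n)
    return P
```

Using an empirical residual sample here matches the slide's point that the empirical distribution function converges uniformly to the true one.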


Page 24

The following results will be demonstrated only for Assumption 1.

Page 25

Approximating the value function by iteration:

Step 0. Set v^0(s) = 0 for all s ∈ S. Fix a tolerance parameter ε > 0. Set n = 1.

Step 1. For each s ∈ S compute

V^n(s) = max_{x∈X} ( C(s, x) + γ Σ_{s'∈S} P(s' | x, s) V^{n-1}(s') )   (1)

Let x^n be the decision vector that solves equation (1).

Step 2. If |v^n - v^{n-1}| < ε(1 - γ)/2γ, let x^π be the resulting policy that solves (1), let v^ε = v^n, and stop (| · | denotes the maximum norm). Otherwise set n = n + 1 and go to Step 1.
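Steps 0-2 map onto a few lines of code. A minimal sketch, assuming stationary rewards C[x, s] and transitions P[x, s, s']; the two-state problem in the test is a hypothetical check, not the wind/price model:

```python
import numpy as np

def value_iteration(C, P, gamma, eps=1e-6, max_iter=100_000):
    """Value iteration with the Step 2 stopping rule: stop once the
    sup-norm change falls below eps*(1-gamma)/(2*gamma)."""
    n_x, n_s = C.shape
    v = np.zeros(n_s)                            # Step 0: v^0 = 0
    policy = np.zeros(n_s, dtype=int)
    for _ in range(max_iter):
        Q = C + gamma * (P @ v)                  # Step 1, equation (1)
        v_new, policy = Q.max(axis=0), Q.argmax(axis=0)
        if np.max(np.abs(v_new - v)) < eps * (1 - gamma) / (2 * gamma):
            return v_new, policy                 # Step 2: stop
        v = v_new
    return v, policy
```

With a single decision and identity transitions the fixed point is C/(1 - γ) componentwise, which makes the stopping rule easy to sanity-check.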

Page 26

THEOREM 1.

If we apply the value iteration algorithm with stopping parameter ε and the algorithm terminates at iteration n with value function v^{n+1}, then

|v^{n+1} - v*| ≤ ε/2.

Page 27

Formulation of the Problem

Discount Factor: we have chosen the value γ = 0.08.

Reward Function:
If x_t > y_{t+1}, then C_t(S_t, x_t, S_{t+1}) = (y_{t+1} + c^-) p_{t+1} - u · p_{t+1} (x_t - c^- - y_{t+1}).
If x_t ≤ y_{t+1}, then C_t(S_t, x_t, S_{t+1}) = x_t p_{t+1} + o · p_{t+1} (y_{t+1} - x_t - c^+).

Transition Matrix: Assumption 1.

Discretization: 25 values of wind and 25 values of prices (625 states).

Page 28

Figure: Value Iteration. Difference between the 20-th and the 21-st iteration.

Page 29

After the 21-st iteration we have |v^21 - v^20| = 41.7. By THEOREM 1 we have |v* - v^21| ≤ 3.62609 (v* is the optimal value).

But v1, v2, ..., v21 are measured in millions!

Page 30

Application of Random Forests to Estimate the Value Function

We apply random forests to estimate the value function after reformulating the problem in terms of post-decision variables.

We used the value function obtained by value iteration as a benchmark. We express the value function as a function of Price, WindProduction, Price², WindProduction², Price·WindProduction, and Price²·WindProduction². In the case of regression and instrumental variables it is a linear function of these variables. This approximation yields similar results, and random forests outperform regression and instrumental variables: the relative error is 2.5 percent for instrumental variables and 2.1 percent for random forests.
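The approximation scheme can be sketched with scikit-learn. The quadratic feature set mirrors the slide, while the target values below are a synthetic stand-in for the value-iteration benchmark, not the fitted model from the talk:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
price = rng.uniform(20.0, 80.0, 400)
wind = rng.uniform(0.0, 10.0, 400)

# feature set from the slide: the two state variables, their squares
# and cross terms
X = np.column_stack([price, wind, price**2, wind**2,
                     price * wind, price**2 * wind**2])
# hypothetical stand-in for the benchmark value function
v = 3.0 * price * wind + 0.1 * price**2 + rng.normal(0.0, 5.0, 400)

forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, v)
linear = LinearRegression().fit(X, v)
```

Both fits can then be compared against the benchmark through their relative errors, as done in the talk.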


Page 32

Figure: Random Forests versus Regression

Page 33

Figure: Random Forests versus Instrumental variables.

Page 34

Figure: Software used for implementing dynamic programming

Page 35

FURTHER RESEARCH

1) To reduce the simplifying assumptions.
2) To combine the ADP technique with techniques for price prediction.
3) To implement ADP for Assumption 2 and Assumption 3.
4) To handle only a one-sided dependence structure: wind can affect prices but not vice versa.
5) To use the bidding strategies that follow from the improved model for trading purposes.

Page 36

BIBLIOGRAPHY

1) L. Breiman. Random Forests. Statistics Department, University of California, Berkeley, CA 94720. January 2001.
2) N. Lohndorf, S. Minner. Optimal Day-Ahead Trading and Storage of Renewable Energies - An Approximate Dynamic Programming Approach. Department of Business Administration, University of Vienna. December 2009.
3) W.R. Scott, W.B. Powell. Approximate Dynamic Programming for Energy Storage with New Results on Instrumental Variables and Projected Bellman Errors. Submitted to Operations Research.
4) S. Snih. Random Forests for Classification Trees and Categorical Dependent Variables: an Informal Quick Start R Guide. Stanford University. February 2011.

Page 37

THANK YOU FOR YOUR ATTENTION!