
Energy Syst
DOI 10.1007/s12667-013-0110-4

ORIGINAL PAPER

Adaptive monitoring of the progressive hedging penalty for reservoir systems management

Luckny Zéphyr · Pascal Lang · Bernard F. Lamond

Received: 12 November 2013 / Accepted: 26 November 2013
© Springer-Verlag Berlin Heidelberg 2013

Abstract Reservoir systems operations problems are in essence stochastic because of the uncertain nature of natural inflows. This leads to very large stochastic models that may not be easy to handle numerically. In this paper, we revisit the decomposition method developed by Rockafellar and Wets (Math Oper Res 119–147, 1991) by proposing new heuristics to initialize and dynamically adjust the penalty parameter of the augmented Lagrangian function on which this method is based. The heuristics are tested on multi-reservoir problems generated randomly and compared with the traditional strategy of setting the penalty parameter to a fixed value.

Keywords Reservoir systems operations · Stochastic programming models · Decomposition method · Heuristic · Progressive hedging algorithm

1 Introduction

The management of hydroelectric production is usually concerned with release and storage policies for a possibly large network of reservoirs. These policies are based on available information on current stocks and predictions of future inflows and load demands that are in essence stochastic. In multistage contexts, this information is periodically updated. Storage and release policies are then based on a very large-dimensional, potentially infinite, set of possible outcomes.

This research was supported in part by the National Science and Engineering Research Council of Canada, under Grant 0105560.

L. Zéphyr (B) · P. Lang · B. F. Lamond
Operations and Decision Systems Department, Université Laval, Quebec, Canada
e-mail: [email protected]

P. Lang
e-mail: [email protected]

B. F. Lamond
e-mail: [email protected]

Finding good policies depends on two complementary tasks: (a) given an approximate description of the random processes, find an optimal storage and release policy (an optimization phase); (b) find an acceptable approximation of the underlying stochastic process, where “acceptable” is meant as a compromise between sample scale and sampling error (an estimation phase). The interface between these two aspects is a challenging field of investigation. In classical approaches (e.g. [6]) the estimation and optimization phases obey independent performance criteria. By contrast, the approach of Kaut and Wallace [8] proposes a common performance evaluation basis for both estimation and optimization phases, an important step towards overall consistency.

This paper is concerned with the optimization phase alone, taking as given a particular discrete estimation of the underlying stochastic inflows and load processes. Our interest is motivated by challenges currently experienced by hydroelectric utilities, in several parts of the world, in just solving very large scale stochastic optimization problems in this framework. In particular, experiments with the progressive hedging algorithm (PHA, [14]) are often aborted due to excessive run-time. Several ideas have been advanced toward accelerating the PHA, among which adjusting the penalty parameter(s), but none to our knowledge has led to marked progress.

This paper investigates an “adaptive” version of the PHA, as an alternative to the classical version. We show that, in principle, the classical algorithm can be improved in terms of (a) convergence rate, and (b) number of iterations. Section 2 states a general hydroelectric optimization problem arising in the context of a given finite event tree. Sections 3 and 4 discuss possible decomposition approaches, among which the PHA. Section 5 reports known variants of the PHA concerned with adjusting the penalty parameter; it then states our adaptive scheme for initializing and controlling this parameter. Section 6 reports computational experiments and their results. Concluding remarks follow.

2 Optimization of reservoir systems management

We consider a multi-reservoir and multi-stage hydrothermal stochastic optimization problem. At each period, a volume of water is discharged from each reservoir and sent through turbines to produce electricity. We assume that each reservoir is equipped with a spillway system, so the water surplus is discharged in a downstream reservoir or out of the system.

The state of the system is described by the vector of water volumes stored in each reservoir at the end of period $t$, denoted by $v_t$, and natural inflows during period $t$ are denoted by $R_t$. At each period, two decisions have to be made: the volume of water to discharge from each reservoir for energy production, denoted by $q_t$, and the volume of water to spill, denoted by $y_t$.

We model the reservoir and the spillway systems by a directed graph, where each node stands for a reservoir and each arc for a river. The arcs have the same label as the nodes they leave. Let us denote by

$B$, the reservoir-river incidence matrix,
$m$, the number of reservoirs, and
$C$, the reservoir-spillway incidence matrix.

$B$ and $C$ are $m \times m$. We set $B_{ij} = 1$ if arc $j$ leaves node $i$, and $B_{ij} = -1$ if arc $j$ enters node $i$; similarly for matrix $C$. The dynamics of the system are governed by the water balance equation

$$v_t = v_{t-1} - B q_t - C y_t + R_t \tag{1}$$

Natural inflows and energy demand are uncertain. The information structure of this uncertainty is described by a given arborescence named event tree (Fig. 1). With this structure, decision variables, constraints and the objective are divided into stages.

Fig. 1 Example of an event tree
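As a small numerical illustration of the water balance equation (1), the following Python sketch applies it to a hypothetical cascade of two reservoirs in which reservoir 1 discharges and spills into reservoir 2, and reservoir 2 releases out of the system. The topology, the numbers and the helper names (`mat_vec`, `water_balance`) are assumptions made for illustration only; they are not taken from the test problems of this paper.

```python
def mat_vec(A, x):
    """Plain matrix-vector product for small dense matrices stored as lists of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def water_balance(v_prev, q, y, R, B, C):
    """Equation (1): v_t = v_{t-1} - B q_t - C y_t + R_t."""
    Bq = mat_vec(B, q)
    Cy = mat_vec(C, y)
    return [vp - bq - cy + r for vp, bq, cy, r in zip(v_prev, Bq, Cy, R)]


# Hypothetical cascade: arc 1 leaves node 1 and enters node 2, arc 2 leaves node 2
# and exits the system. Entry (i, j) is +1 if arc j leaves node i, -1 if it enters it.
B = [[1, 0],
     [-1, 1]]
C = [[1, 0],
     [-1, 1]]

v_next = water_balance(v_prev=[50.0, 30.0], q=[10.0, 15.0],
                       y=[0.0, 5.0], R=[8.0, 2.0], B=B, C=C)
print(v_next)  # [48.0, 22.0]
```

The $-1$ entries of $B$ and $C$ are what route upstream releases and spills into the downstream reservoir.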

Each node of this tree corresponds to a particular realization of the random variables, and hence can be viewed as a state of information (or state, for short). States are indexed by $k \in K$. Let $a(k)$ be the unique predecessor of node $k$, and $S$ be the set of terminal states, that is, nodes without successors. We denote by $S_k$ the subset of terminal states of the subtree with root $k$. We associate with each node $k$ a time period $t_k$. The terminal states are associated with the end-horizon $T$. Finally, let $K' = \{k \in K : |S_k| > 1\}$ be the set of nodes which lead to more than one terminal state.
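The tree notation above can be made concrete with a short Python sketch. It assumes, hypothetically, that the event tree is stored as a dictionary `pred` mapping each non-root node $k$ to its predecessor $a(k)$; the function name `terminal_sets` and the six-node tree are illustrative only.

```python
def terminal_sets(pred):
    """Return (S, S_k, K') for a tree given by its predecessor map a(k)."""
    nodes = set(pred) | set(pred.values())
    S = sorted(nodes - set(pred.values()))   # terminal states: nodes without successors
    S_k = {k: set() for k in nodes}
    for s in S:
        k = s
        while True:                          # walk from the leaf back up to the root
            S_k[k].add(s)
            if k not in pred:
                break
            k = pred[k]
    K_prime = sorted(k for k in nodes if len(S_k[k]) > 1)
    return S, S_k, K_prime


# Hypothetical tree: root 1 -> {2, 3}, node 2 -> {4, 5}, node 3 -> {6}.
pred = {2: 1, 3: 1, 4: 2, 5: 2, 6: 3}
S, S_k, K_prime = terminal_sets(pred)
print(S, K_prime)  # [4, 5, 6] [1, 2]
```

Node 3 leads to a single terminal state, so it does not belong to $K'$ in this example.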

The overall optimization problem considered here seeks to minimize the expected hydrothermal operation cost over a planning horizon of $T$ periods. As in Pereira and Pinto [11], with each period is associated a thermal cost (fuel, etc.) and a penalty cost for not being able to satisfy all the energy demand for this period (load shedding cost). The model parameters are defined as follows.

$p_k$: Probability of state $k$;
$d_k$: Load in state $k$;
$R_k$: Natural inflows in state $k$;
$\underline{v}$: Lower bounds on the storage volumes;
$\overline{v}$: Upper bounds on the storage volumes;
$\underline{q}$: Lower bounds on turbined water;
$\overline{q}$: Upper bounds on turbined water;
$\vartheta$: Technical coefficients that convert the turbined water into energy.

With the model are also associated the following decision variables:

$G_k \in \mathbb{R}_+$: Thermal generation in state $k$;
$Z_k \in \mathbb{R}_+$: Load shedding in state $k$;
$v_k \in \mathbb{R}^m_+$: Stored volumes in state $k$;
$q_k \in \mathbb{R}^m_+$: Water turbined in state $k$;
$y_k \in \mathbb{R}^m_+$: Water spilled in state $k$.

Then the overall stochastic problem (P) reads

\begin{align}
\min\ & \sum_{k \in K} p_k f_k(G_k, Z_k) \notag \\
\text{s.t.}\ & G_k + Z_k + \vartheta' q_k = d_k, && k \in K \tag{2} \\
& v_k = v_{a(k)} - B q_k - C y_k + R_k, && k \in K \tag{3} \\
& \underline{v} \le v_k \le \overline{v}, && k \in K \tag{4} \\
& \underline{q} \le q_k \le \overline{q}, && k \in K \tag{5} \\
& y_k \ge 0, && k \in K \tag{6}
\end{align}

where $f_k(G_k, Z_k)$ is the operating cost in state $k$ and is considered a linear function of decision variables $G_k$ and $Z_k$. The equality constraints (2) and (3) are respectively the load balance equation and the water conservation equality, while the inequality constraints (4) and (5) are bounds on storage and flow variables.

3 Decomposition approaches

The size of the multi-stage stochastic problem illustrated by the scenario tree of Fig. 1 can grow exponentially with $T$ and $S$ if one considers multiple realisations of the natural inflows and of the load processes at each stage. Thus, the problem can become very difficult to solve due to the computational burden. One way to tackle this hurdle is to decompose the overall problem, by exploiting its structure, into much easier to solve subproblems.

Several generic decomposition schemes have been proposed for multistage stochastic programs. See Birge and Louveaux [3], and Ruszczynski [15] for earlier accounts of state of the art and research perspectives. Earlier works focused on variations of Benders decomposition [1], initially for two-stage stochastic linear programs (e.g. [17]), later extended to the multi-stage case (e.g. [2]; see also [12]) and to regularized variants. Under suitable pre-processing, Benders-type cutting plane methods are said to remain competitive for fixed multiple recourse linear problems.


A second strand of research stems from the seminal work of Rockafellar and Wets [14] on scenario decomposition. This is a variable splitting (or replication) scheme where each subproblem is the deterministic program that would arise if one had perfect foresight of a “scenario”, i.e. a particular realization of the random variables’ trajectories. The linking constraints, called “non-anticipative”, require that scenario decisions based on a common information history should be equal. These non-anticipativity constraints are dualized and enforced by a “Progressive Hedging” (PH) algorithm. This approach has been applied in a variety of contexts, notably in finance and logistics (e.g. [9,18]), whereas very few applications exist in the context of reservoir operations problems (see [5]).

Of common ancestry, operator splitting methods also apply the proximal point algorithm in a more general primal-dual setting. In the multi-stage stochastic programming context, Salinger and Rockafellar [16] propose a decomposition into event-based (rather than scenario-based) subproblems (thus saving duplicate scenario variables), with a coordination mechanism using the efficient recursion of unconstrained quadratic-linear dynamic programming. Pennanen and Kallio [10] offer some operational improvements. However, a parameter akin to a penalty coefficient is invoked. No extensive experimentation is reported.

Finally, although not strictly falling under the caption of decomposition, interior point methods offer potentially efficient alternatives for very large scale problems. For multistage stochastic linear problems, the challenge is exploiting the event tree structure in successive Newton steps. For primal interior point methods, Blomvall and Lindberg [4] propose a recursive iteration scheme for a quadratic approximation of the Newton step.

4 Scenario-based decomposition and the progressive hedging algorithm

A scenario in the event tree is a simple directed path from the initial state to some terminal state. There are thus $|S|$ distinct scenarios, each one corresponding to a possible sequence of realisations of the random variables. For instance, in the case of the event tree of Fig. 1, there are seven scenarios as depicted below (Fig. 2). The dashed lines connect scenarios that share the same trajectory up to period $t$.

Scenario decomposition uses duality theory to separate an equivalent problem (P′) into $|S|$ independent deterministic optimization problems, one for each scenario.

Scenario decomposition is a form of variable splitting where state $k$’s variables are duplicated into $|S_k|$ copies. These variable duplicates are now indexed by scenario and time period.

Given any scenario $s \in S$ and time period $t$, $1 \le t \le T$, there is a unique state $k = \kappa(s, t)$ such that $s \in S_k$ and $t = t_k$. We have thus a one-to-one correspondence $k \leftrightarrow \kappa(s, t)$. With this formulation the decision variables $G_k$, $Z_k$, $v_k$, $q_k$ and $y_k$ associated with problem (P) are replaced respectively by

$G_{st} \in \mathbb{R}_+$: Thermal generation in period $t$ under scenario $s$, $1 \le t \le T$;
$Z_{st} \in \mathbb{R}_+$: Load shedding in period $t$ under scenario $s$, $1 \le t \le T$;
$v_{st} \in \mathbb{R}^m_+$; $q_{st} \in \mathbb{R}^m_+$ and $y_{st} \in \mathbb{R}^m_+$.
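A minimal Python sketch of the mapping $\kappa(s, t)$ follows. It assumes two conventions of our own for the illustration: each scenario is identified with its terminal node, and the root sits at period 1. The names `pred` (the predecessor map $a(k)$) and `scenario_paths`, as well as the small tree, are hypothetical.

```python
def scenario_paths(pred, root=1):
    """Return the list of scenarios (one per terminal node) and the map kappa[(s, t)] = k."""
    nodes = set(pred) | {root}
    leaves = sorted(nodes - set(pred.values()))
    kappa = {}
    for s in leaves:
        path = [s]
        while path[-1] != root:              # climb from the terminal node to the root
            path.append(pred[path[-1]])
        path.reverse()                       # path[0] is the root, reached at period 1
        for t, k in enumerate(path, start=1):
            kappa[(s, t)] = k
    return leaves, kappa


# Hypothetical tree: root 1 -> {2, 3}, node 2 -> {4, 5}, node 3 -> {6}.
pred = {2: 1, 3: 1, 4: 2, 5: 2, 6: 3}
scenarios, kappa = scenario_paths(pred)
print(kappa[(4, 2)], kappa[(5, 2)], kappa[(6, 2)])  # 2 2 3
```

Scenarios 4 and 5 pass through the same node at period 2, so their period-2 copies are the ones that a non-anticipativity constraint would tie together.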

Fig. 2 Scenarios associated with the event tree of Fig. 1

In addition, consensus variables $w_k \in \mathbb{R}^m_+$, $k \in K'$, will be useful for expressing the non-anticipativity constraints (NA) below. Then the reformulation (P′) reads

\begin{align}
\min\ & \sum_{s \in S} p_s \sum_{t=1}^{T} f_{st}(G_{st}, Z_{st}) \notag \\
\text{s.t.}\ & G_{st} + Z_{st} + \vartheta' q_{st} = d_{\kappa(s,t)}, && s \in S,\ 1 \le t \le T \tag{7} \\
& v_{st} = v_{s,t-1} - B q_{st} - C y_{st} + R_{\kappa(s,t)}, && s \in S,\ 1 \le t \le T \tag{8} \\
& \underline{v} \le v_{st} \le \overline{v}, && s \in S,\ 1 \le t \le T \tag{9} \\
& \underline{q} \le q_{st} \le \overline{q}, && s \in S,\ 1 \le t \le T \tag{10} \\
& y_{st} \ge 0, && s \in S,\ 1 \le t \le T \tag{11} \\
& v_{st} = w_{\kappa(s,t)}, && \kappa(s,t) \in K' \quad \text{(NA)} \tag{12}
\end{align}

Constraints (7–11) are analogous to (2–6), while (12) are the (NA) constraints which ensure that multiple scenario variables corresponding to a common node in the event tree are equal. We apply these constraints (through variables $w_k$) to stocks only, since they then trivially extend to flows.

Consider the augmented Lagrangian of problem (P′) where the NA constraints are priced out and penalized in the objective:

$$L(X, W, U, \rho) = \sum_{s \in S} p_s \left[ \sum_{t=1}^{T} f_{st}(G_{st}, Z_{st}) + \sum_{t:\,\kappa(s,t) \in K'} \left( u_{st}' \left( v_{st} - w_{\kappa(s,t)} \right) + \frac{\rho}{2} \left\| v_{st} - w_{\kappa(s,t)} \right\|^2 \right) \right] \tag{13}$$

where

$$X = \{ X_s \mid s \in S \}, \qquad X_s = \{ (G_{st}, Z_{st}, v_{st}, q_{st}, y_{st}) \mid 1 \le t \le T \}, \quad s \in S,$$

$$W = \{ w_k \mid k \in K' \},$$

$$U = \Big\{ u_{st} \;\Big|\; \kappa(s,t) \in K',\ \sum_{s:\,\kappa(s,t_k) = k} u_{s,t_k} = 0 \ \ \forall\, k \in K' \Big\},$$

and $\rho > 0$ is the (scalar) penalty parameter. With (13) is associated the dual problem

$$\max_{U}\ \inf_{X, W}\ L(X, W, U, \rho) \tag{14}$$

Expression (13) is not separable by scenarios because of the coupling variables $W$ in the quadratic terms. One way to circumvent this difficulty is by way of the Proximal Point scheme of block-coordinate minimization, sequentially in $X$ then in $W$. The PHA of Rockafellar and Wets [14] is one implementation of this scheme. Each iteration (numbered $\nu$) of PHA comprises three steps, as shown in Table 1.

Finally, let us note that the PHA easily accommodates multiple penalty coefficients associated with distinct classes of variables. This active line of research will not be pursued here.

5 Adapting the penalty parameter

Given a fixed $\rho$, in the convex case, the PHA is assured to converge globally to an optimal solution if one exists. In the case of a unique saddle-point, the convergence rate is linear. In other cases, however, this algorithm is an instance of sub-gradient optimisation, with possibly sub-linear convergence. Slow convergence is indeed reported in several applications.

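Table 1 Progressive hedging algorithm

Step 1: For fixed $W$, $U$ and $\rho$, minimize the augmented Lagrangian with respect to $X$. That is, find an

$$X^{\nu} \in \operatorname*{Argmin}_{X} \left\{ L(X, W^{\nu-1}, U^{\nu-1}, \rho) \mid \text{s.t. } (7)\text{--}(11) \right\}.$$

For fixed $W$, the augmented Lagrangian $L$ is separable by scenarios, so that this minimization translates into: for all $s \in S$, find an

$$X^{\nu}_s \in \operatorname*{Argmin}_{X_s} \left\{ L_s(X_s, W^{\nu-1}, U^{\nu-1}_s, \rho) \mid \text{s.t. } (7)\text{--}(11) \text{ for scenario } s \right\},$$

where

$$L_s(X_s, W^{\nu-1}, U^{\nu-1}_s, \rho) = p_s \left[ \sum_{t=1}^{T} f_{st}(G_{st}, Z_{st}) + \sum_{t:\,\kappa(s,t) \in K'} \left( u_{st}' \left( v_{st} - w_{\kappa(s,t)} \right) + \frac{\rho}{2} \left\| v_{st} - w_{\kappa(s,t)} \right\|^2 \right) \right]. \tag{15}$$

Step 2: For fixed $X = X^{\nu}$, $U^{\nu-1}$ and $\rho$, minimize the augmented Lagrangian with respect to $W$. This trivially results in

$$w^{\nu}_k = \frac{\sum_{s \in S_k} p_s\, v^{\nu}_{s,t_k}}{p_k}, \quad k \in K'.$$

Step 3: Perform one dual ascent step towards the maximization of the dual function $D(U, \rho) \equiv \min_{X,W} \{ L(X, W, U, \rho) \}$ through the following dual variables update

$$u^{\nu}_{st} = u^{\nu-1}_{st} + \rho \left( v^{\nu}_{st} - w^{\nu}_{\kappa(s,t)} \right), \quad \kappa(s,t) \in K'.$$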
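Steps 2 and 3 of Table 1 are simple enough to sketch in a few lines of Python. This is only an illustration under stated assumptions: the scenario subproblems of Step 1 are not solved here (their solutions `v` are taken as given), the dictionary layout (`bundles` mapping a node $k$ to its period $t_k$ and the scenarios of $S_k$) is hypothetical, and $p_k$ is assumed to equal the sum of the $p_s$ over $S_k$.

```python
def consensus(v, p, bundles):
    """Step 2: probability-weighted average of the scenario copies at each node k."""
    w = {}
    for k, (t, scenarios) in bundles.items():
        p_k = sum(p[s] for s in scenarios)   # assumed: p_k is the sum of p_s over S_k
        m = len(v[(scenarios[0], t)])
        w[k] = [sum(p[s] * v[(s, t)][i] for s in scenarios) / p_k for i in range(m)]
    return w


def dual_update(u, v, w, rho, bundles):
    """Step 3: u_st <- u_st + rho * (v_st - w_k) for every copy attached to node k."""
    u_new = dict(u)
    for k, (t, scenarios) in bundles.items():
        for s in scenarios:
            u_new[(s, t)] = [ui + rho * (vi - wi)
                             for ui, vi, wi in zip(u[(s, t)], v[(s, t)], w[k])]
    return u_new


# Hypothetical data: one node (k = 1, period 1) shared by scenarios "a" and "b", m = 1.
bundles = {1: (1, ["a", "b"])}
p = {"a": 0.25, "b": 0.75}
v = {("a", 1): [40.0], ("b", 1): [80.0]}
u = {("a", 1): [0.0], ("b", 1): [0.0]}

w = consensus(v, p, bundles)
u = dual_update(u, v, w, rho=0.5, bundles=bundles)
print(w[1], u[("a", 1)], u[("b", 1)])  # [70.0] [-15.0] [5.0]
```

In this toy instance the multiplier of the scenario storing less than the consensus becomes negative and the other positive, which is how the next Step 1 is nudged toward agreement.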


This paper is concerned with the performance of the PHA, as measured by convergence rate and number of iterations. The overall performance of the PHA is known to be sensitive to the choice of the penalty parameter. The latter scales the penalty term in the augmented Lagrangian and serves as a step length in the updating of the dual variables. High values of this factor may cause oscillations in the dual variables and little movement in the penalty term. On the other hand, small values of this parameter may guarantee good quality solutions, but may slow down convergence in the dual variables (see [9]). It is apparent that a suitable value for $\rho$ depends on contextual elements. However, no further theory is available.

Several heuristics have been proposed to adapt the penalty parameter. Watson et al. [18] associate a distinct penalty parameter to each variable involved in NA constraints. This allows them to fix parameter values in some proportion of unit costs. Other authors let parameter values change over iterations. Although convergence is not guaranteed any more, these authors claim improved empirical convergence when using $\rho$-trajectories (in iteration space) of various shapes. Thus, Mulvey and Vladimirou [9] mention a concave increasing trajectory, while also stating that decreasing $\rho$ near the optimum accelerates convergence.

Reis et al. [13] propose a convex decreasing trajectory, where the parameter is forced to take on large values at the initial steps to speed up convergence and small values at the final ones “to guarantee decisions convergence”. In the spirit of Gonçalves et al. [7], Watson et al. [18] test both decreasing and increasing trajectories in the iterations space to update the parameters.

In the aforementioned proposals the penalty parameter is adjusted according to a fixed trajectory, rather than dynamically as a function of the solution state. Such trajectories are governed by parameters which cannot easily be related to the problem’s context. This points to a theoretical pitfall, namely that the augmented Lagrangian mixes a cost function and a penalty function expressed on fundamentally different scales. In the spirit of bi-criteria optimization, care should be exercised regarding any type of inter-scale comparison.

5.1 A learning heuristic for adjusting the penalty parameter

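We seek to control $\rho$ in an adaptive “learning” fashion. To that end, we track the following indicators:

$$\theta^{\nu} = \sum_{s,t:\,\kappa(s,t) \in K'} p_s \left\| v_{st} - w^{\nu}_{\kappa(s,t)} \right\|^2$$

$$\delta^{\nu} = \theta^{\nu} + \sum_{k \in K'} p_k \left\| w^{\nu}_k - w^{\nu-1}_k \right\|^2$$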
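The two indicators can be computed directly from the scenario copies and the consensus values. The Python sketch below is illustrative only: the dictionary layout, the helper names (`sq_dist`, `gap_indicators`) and the numbers are assumptions, with `kappa` restricted to the pairs $(s, t)$ such that $\kappa(s,t) \in K'$.

```python
def sq_dist(x, y):
    """Squared Euclidean distance between two vectors stored as lists."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def gap_indicators(v, w, w_prev, p_s, p_k, kappa):
    """Return (theta, delta): the non-anticipativity gap and the optimality gap."""
    theta = sum(p_s[s] * sq_dist(v[(s, t)], w[k]) for (s, t), k in kappa.items())
    delta = theta + sum(p_k[k] * sq_dist(w[k], w_prev[k]) for k in w)
    return theta, delta


# Hypothetical data: scenarios "a" and "b" share node 1 at period 1, m = 1.
kappa = {("a", 1): 1, ("b", 1): 1}
p_s = {"a": 0.25, "b": 0.75}
p_k = {1: 1.0}
v = {("a", 1): [40.0], ("b", 1): [80.0]}
w, w_prev = {1: [70.0]}, {1: [60.0]}

theta, delta = gap_indicators(v, w, w_prev, p_s, p_k, kappa)
print(theta, delta)  # 300.0 400.0
```

Here the non-anticipativity gap accounts for $300/400 = 75\,\%$ of the optimality gap.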

where $\nu$ is an iteration count. $\theta^{\nu}$ represents a non-anticipativity gap at iteration $\nu$. Under constant $\rho$, the sequence $\{\delta^{\nu}\}$ converges monotonically to 0 ($\delta = 0$ is in fact a necessary and sufficient condition for attaining optimality). Hence $\delta^{\nu}$ will be interpreted as an optimality gap. It should be noted however that as $\rho$ varies with iterations, the sequence $\{\delta^{\nu}\}$ is not guaranteed to be monotonic.

Table 2 Learning heuristic for $\rho$

\begin{align*}
\tau^{\nu} &= \delta^{\nu} / \delta^{\nu-1} \\
\gamma^{\nu} &= \max\left\{ 0.1,\ \min\left\{ 0.9,\ \tau^{\nu} - 0.6 \right\} \right\} \\
\sigma^{\nu} &= (1 - \gamma^{\nu})\, \sigma^{\nu-1} + \gamma^{\nu} \tau^{\nu} \\
g^{\nu} &= \sqrt{1.1\, \sigma^{\nu}} \\
\alpha^{\nu} &= 0.8\, \alpha^{\nu-1} + 0.2\, \theta^{\nu} / \delta^{\nu} \\
b^{\nu} &= 0.98\, b^{\nu-1} + 0.02\, \alpha^{\nu} \\
c^{\nu} &= \max\left\{ 0.95,\ \frac{1 - 2 b^{\nu}}{1 - b^{\nu}} \right\} \\
h^{\nu} &= \max\left\{ c^{\nu} + \frac{1 - c^{\nu}}{b^{\nu}}\, \alpha^{\nu},\ 1 + \frac{\alpha^{\nu} - b^{\nu}}{1 - b^{\nu}} \right\} \\
q^{\nu} &= \left( \max\left\{ g^{\nu}, h^{\nu} \right\} \right)^{\frac{1}{1 + 0.01(\nu - 2)}} \\
\rho^{\nu} &= \max\left\{ 0.01,\ \min\left\{ 100,\ \rho^{\nu-1} q^{\nu} \right\} \right\}
\end{align*}

Fig. 3 Progress indicator $\sigma^{\nu}$ as a function of $\tau^{\nu}$ (curves shown for $\sigma^{\nu-1} = 1.1$ and $\sigma^{\nu-1} = 0.8$)

Our learning heuristic entails an updating rule of the form $\rho^{\nu} = q^{\nu} \rho^{\nu-1}$ where the response $q^{\nu}$ is guided by two indicators: (a) a “progress” indicator reflecting an average contraction rate of the optimality gap, and (b) a “balance” indicator reflecting an average proportion of the optimality gap accounted for by the non-anticipativity gap. Both indicators are dimensionless ratios, thus avoiding previously discussed measurement pitfalls. A particular implementation of this heuristic is summarized in Table 2.
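One pass of the updating rule of Table 2 might be sketched in Python as follows. Several points are assumptions rather than statements of the paper: the square root in $g^{\nu}$ is read as covering the product $1.1\,\sigma^{\nu}$, the smoothed progress rate is started at $\sigma = 1$, the first update is taken at $\nu = 2$, and the function name `update_rho` and the state dictionary are hypothetical. The starting values $b = 0.5$ and $\alpha = 1$ follow the text.

```python
import math


def update_rho(state, theta, delta, nu):
    """Apply the Table 2 recursions once and return the updated state dictionary."""
    tau = delta / state["delta"]                      # instant progress rate
    gamma = max(0.1, min(0.9, tau - 0.6))             # smoothing weight, more reactive as tau grows
    sigma = (1 - gamma) * state["sigma"] + gamma * tau
    g = math.sqrt(1.1 * sigma)                        # assumed reading of the square root
    alpha = 0.8 * state["alpha"] + 0.2 * theta / delta
    b = 0.98 * state["b"] + 0.02 * alpha              # slowly moving balance target
    c = max(0.95, (1 - 2 * b) / (1 - b))
    h = max(c + (1 - c) / b * alpha, 1 + (alpha - b) / (1 - b))
    q = max(g, h) ** (1 / (1 + 0.01 * (nu - 2)))      # response damped as iterations accumulate
    rho = max(0.01, min(100.0, state["rho"] * q))     # keep rho within [0.01, 100]
    return {"delta": delta, "sigma": sigma, "alpha": alpha, "b": b, "rho": rho}


# Hypothetical starting state and one observation (theta, delta) at iteration 2.
state0 = {"delta": 400.0, "sigma": 1.0, "alpha": 1.0, "b": 0.5, "rho": 1.0}
state1 = update_rho(state0, theta=150.0, delta=300.0, nu=2)
print(round(state1["sigma"], 4), round(state1["rho"], 4))
```

With these made-up numbers the smoothed balance indicator stays well above its target, so the balance response dominates the progress response and $\rho$ is increased.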

The progress indicator $\sigma^{\nu}$ is a non-linearly smoothed average of the instant progress rate $\tau^{\nu} = \delta^{\nu} / \delta^{\nu-1}$. The smoothing formula is more reactive as $\tau^{\nu}$ increases, thus providing early warning of misdirection as can be seen by the increasing slope in Fig. 3.

Fig. 4 Stabilized balance indicator $h^{\nu}$ as a function of $\alpha^{\nu}$ (curves shown for $b^{\nu} = 0.25$ and $b^{\nu} = 0.5$)

The balance indicator $\alpha^{\nu}$ is compared to a reference tracking target $b^{\nu}$, which represents a “desired” proportion of the non-anticipativity gap in the overall optimality gap. This target is initially set at 0.5, then very slowly adjusted according to our “long-term” experience of feasible $\alpha$s (initialized at 1). A value $\alpha^{\nu} / b^{\nu} > 1$ suggests that the NA constraints need some tightening by increasing $\rho$. A value less than one suggests an opposite move, though quite dampened to prevent primal instability (see Fig. 4).

5.2 Initialization of the penalty parameter

To initialize $\rho$, we consider a form of “variance decomposition” of the augmented Lagrangian function (13), which is the sum of variations in the cost function and the NA constraint violations. With the dual variables initialized to zero, this variation is written

$$\Delta F = \sum_{s \in S} p_s \left[ \Delta \sum_{t=1}^{T} f_{st}(G_{st}, Z_{st}) + \frac{\rho}{2}\, \Delta \sum_{t:\,\kappa(s,t) \in K'} \left\| v_{st} - w_{\kappa(s,t)} \right\|^2 \right].$$

To estimate the first part of the above expression we solve the deterministic problem associated with problem (P) taking at each time period the expected values of the natural inflows and of the load. Denote by $E(P)$ the objective value of this deterministic problem. For each scenario $s$, we also solve the linear problem

$$\min \left\{ \sum_{t=1}^{T} f_{st}(G_{st}, Z_{st}) \mid \text{s.t. } (7)\text{--}(12) \right\}.$$

Denote by $O_s$ the objective value of this problem. Then $\Delta \sum_{t=1}^{T} f_{st}(G_{st}, Z_{st})$ is approximated by $|O_s - E(P)|$.

We also approximate $\Delta \sum_{t:\,\kappa(s,t) \in K'} \left\| v_{st} - w_{\kappa(s,t)} \right\|^2$ by

$$\sum_{t:\,\kappa(s,t) \in K'} \left\| v^{*}_{st} - w^{*}_{\kappa(s,t)} \right\|^2,$$

where the $v^{*}_{st}$ are solutions to the linear problems, and $w^{*}_{\kappa(s,t)}$ are computed as in step 2 of the PHA. Then we roughly have

$$\Delta F \approx \sum_{s \in S} p_s \left[ |O_s - E(P)| + \frac{\rho}{2} \sum_{t:\,\kappa(s,t) \in K'} \left\| v^{*}_{st} - w^{*}_{\kappa(s,t)} \right\|^2 \right].$$

Letting

$$a = \sum_{s \in S} p_s\, |O_s - E(P)| \quad \text{and} \quad b = \sum_{s \in S} p_s \sum_{t:\,\kappa(s,t) \in K'} \left\| v^{*}_{st} - w^{*}_{\kappa(s,t)} \right\|^2,$$

we have $\Delta F = a + \frac{\rho}{2}\, b$. We could choose $\rho$ in such a way that

$$\beta a = (1 - \beta)\, \frac{\rho}{2}\, b, \quad 0 < \beta < 1,$$

where $\beta$ is a given weight. For instance if $\beta = 0.3$, we want the variation in the cost function to be 70 % of the total variation. This value of $\beta$ will be used for the numerical experiments. From the above equation we have

$$\rho = \frac{2 \beta a}{(1 - \beta)\, b}.$$

For stability purposes, the last formula is moderated to

$$\rho = \frac{2 \beta a}{1 + (1 - \beta)\, b}.$$
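The two closed-form expressions for the initial $\rho$ can be checked on made-up numbers. In the Python sketch below, the values of `a` and `b` are hypothetical (in the paper they come from the deterministic and per-scenario linear problems), and the function name `initial_rho` is ours.

```python
def initial_rho(a, b, beta=0.3, moderated=True):
    """Initial penalty: 2*beta*a / ((1-beta)*b), with 1 added to the denominator if moderated."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    denom = (1.0 - beta) * b
    if moderated:
        denom += 1.0                     # stabilizing term of the moderated formula
    return 2.0 * beta * a / denom


a, b = 70.0, 10.0                        # hypothetical cost and NA-violation variations
rho_raw = initial_rho(a, b, moderated=False)
rho_mod = initial_rho(a, b)

# With the unmoderated rho, the cost variation is a share (1 - beta) of the total.
cost_share = a / (a + rho_raw / 2.0 * b)
print(round(rho_raw, 6), round(rho_mod, 6), round(cost_share, 6))  # 6.0 5.25 0.7
```

The moderated value is always the smaller of the two, and the gap between them narrows as $b$ grows.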

6 Numerical experiments

We recorded the performances of the adaptive and fixed penalty PH algorithm on 60 test problems with varied characteristics. This experimental framework is presented in Sect. 6.1. For each problem of the test set, we ran the adaptive algorithm (hereafter superscripted with an $A$) and the classical (fixed parameter) version (hereafter superscripted with a $C$). Details of the analysis are provided in Sect. 6.2.


Table 3 Simulation plan

| $m$ | Approx. # scenarios | # Replications per case | # Problems |
|---|---|---|---|
| 2 and 3 | 100 | 5 | 10 |
| 2 and 3 | 300 | 5 | 10 |
| 4 and 5 | 500 | 5 | 10 |
| 4 and 5 | 900 | 5 | 10 |
| 6 and 7 | 1,300 | 5 | 10 |
| 6 and 7 | 1,800 | 5 | 10 |
| Total | | | 60 |

Fig. 5 Reservoir configurations

6.1 Test set

The 60 test problems were constructed via Monte-Carlo simulation under various conditions of problem horizon, reservoir network, event tree, and inflows/demands realizations. The overall simulation plan is depicted in Table 3 and Fig. 5, $m$ denoting the number of reservoirs.

Event trees are simulated as follows. Let $m$ be the number of reservoirs, $\bar{n}_{st}$ the average number of successors of a node $k$ at period $t$, $K$ the set of the event tree nodes and $K_t$ the set of nodes at period $t$. As of the second period, for each node $k$ belonging to the set $K_{t-1}$, we draw a random integer number of successors to this node from a uniform distribution over the interval $(\bar{n}_{st} - 1, \bar{n}_{st} + 1)$. The drawn nodes are mapped to their ancestors and added to the list $K_t$. This procedure is summarized in Table 4.


Table 4 Procedure to construct a scenario tree

Root node:
    $k \leftarrow 1$
    $K_1 = \{1\}$
For all $t = 2, \ldots, T$
    $K_t = \emptyset$
    For $k' \in K_{t-1}$
        Draw an integer $n_{st} \approx \bar{n}_{st}$
        For $h = 1, \ldots, n_{st}$
            $k \leftarrow k + 1$
            $a(k) = k'$
            Add $k$ to $K_t$
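The procedure of Table 4 translates almost line by line into Python. The sketch below rests on assumptions of ours: the number of successors is drawn uniformly among the integers $\bar{n} - 1$, $\bar{n}$, $\bar{n} + 1$ (with a floor of one successor so that every branch reaches the horizon), the average is supplied per period through a hypothetical dictionary `n_bar`, and the function name `build_event_tree` is illustrative.

```python
import random


def build_event_tree(T, n_bar, rng):
    """Return (pred, stages): the predecessor map a(k) and the node lists K_t."""
    k = 1
    pred = {}
    stages = {1: [1]}                                  # root node, K_1 = {1}
    for t in range(2, T + 1):
        stages[t] = []
        for parent in stages[t - 1]:
            # assumed integer draw around the average number of successors
            n_succ = rng.randint(max(1, n_bar[t] - 1), n_bar[t] + 1)
            for _ in range(n_succ):
                k += 1
                pred[k] = parent                       # a(k) = k'
                stages[t].append(k)                    # add k to K_t
    return pred, stages


rng = random.Random(0)                                 # fixed seed for reproducibility
pred, stages = build_event_tree(T=4, n_bar={2: 3, 3: 3, 4: 3}, rng=rng)
print(len(stages[4]), "scenarios")
```

The number of scenarios is the size of the last stage, so the averages in `n_bar` are the natural lever for hitting an approximate scenario count.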

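Table 5 Possible realizations of the natural inflows and of the load

Natural inflows

For all $k \in K$
    For all $i = 1, \ldots, m$

$$\text{Draw } R_{ik} = \begin{cases} 1.2\, \overline{v}_i & \text{with probability } 1/3 \\ 0.6\, \overline{v}_i & \text{with probability } 1/3 \\ 0 & \text{with probability } 1/3 \end{cases}$$

Load

For all $k \in K$

$$\text{Draw } d_k = \begin{cases} 1.1 \sum_{i=1}^{m} \overline{v}_i \vartheta_i & \text{with probability } 1/3 \\ 0.8 \sum_{i=1}^{m} \overline{v}_i \vartheta_i & \text{with probability } 1/3 \\ 0.1 \sum_{i=1}^{m} \overline{v}_i \vartheta_i & \text{with probability } 1/3 \end{cases}$$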

Realizations of demands and water inflows are drawn independently, as a challenge to our algorithms. Indeed, the NA constraints are more difficult to meet as the scenarios are more contradictory. If the algorithm converges reasonably in such situations, it could be expected to exhibit still better behaviour in more realistic ones with often positive spatial and serial correlations. For each node of the tree, we draw contrasted values representing high, moderate and no inflows as relative to reservoir capacities. Table 5 summarizes this procedure.
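The independent draws of Table 5 can be sketched as follows. This Python fragment is illustrative: it assumes that the inflow and load levels scale the storage upper bounds (named `v_max` here), that `conv` holds the conversion coefficients $\vartheta_i$, and that all names and numbers are hypothetical.

```python
import random


def draw_inflows_and_load(nodes, v_max, conv, rng):
    """Draw, independently for each node, inflows R[k] (one per reservoir) and load d[k]."""
    energy_cap = sum(v * th for v, th in zip(v_max, conv))
    # each level is chosen with probability 1/3, independently across nodes and reservoirs
    R = {k: [rng.choice((1.2, 0.6, 0.0)) * v for v in v_max] for k in nodes}
    d = {k: rng.choice((1.1, 0.8, 0.1)) * energy_cap for k in nodes}
    return R, d


rng = random.Random(1)                       # fixed seed for reproducibility
v_max = [100.0, 60.0]                        # hypothetical storage upper bounds
conv = [0.5, 0.8]                            # hypothetical conversion coefficients
R, d = draw_inflows_and_load(nodes=range(1, 7), v_max=v_max, conv=conv, rng=rng)
print(R[1], d[1])
```

Because every node is drawn independently, sibling nodes can receive opposite extremes, which is the "contradictory scenarios" situation the text describes.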

6.2 Analysis and results

We implemented both adaptive ($A$) and classical ($C$) versions of the PH algorithm with Matlab 7.11.0, and the quadratic subproblems were solved with Cplex 12.2. For each problem $i$ we ran the adaptive algorithm first, and then used the average $\rho$ value over this run as the fixed $\rho$ value of the classical algorithm. This we felt gave the latter a fair edge, since no guidelines are available for choosing this value.


We measured for each problem $i$ the number of iterations $\nu^{A}_i$ and $\nu^{C}_i$ and the average convergence rates $\kappa^{A}_i$ and $\kappa^{C}_i$, defined as $\kappa_i = \left( \delta_{i,\nu_i} / \delta_{i,1} \right)^{1/\nu_i}$. We used as stopping criterion a threshold on the optimality gap of $\delta_{i,\nu_i} \le 0.01$. Let us define

$NS_i$: number of scenarios in problem $i$.
$CO_i = \kappa^{A}_i / \kappa^{C}_i$: “relative” convergence rate of algorithm $A$ in reference to algorithm $C$.
$RN_i = \nu^{A}_i / \nu^{C}_i$: “relative” number of iterations of algorithm $A$ over algorithm $C$.
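These performance measures are straightforward to compute from a run log. The Python sketch below uses invented run data (first and last optimality gaps and iteration counts for one problem); only the formulas come from the definitions above, and the function name `avg_rate` is ours.

```python
def avg_rate(delta_first, delta_last, n_iter):
    """Average convergence rate kappa = (delta_last / delta_first) ** (1 / n_iter)."""
    return (delta_last / delta_first) ** (1.0 / n_iter)


# Hypothetical runs of one problem, both stopped at an optimality gap of 0.01.
nu_A, nu_C = 40, 100
kappa_A = avg_rate(100.0, 0.01, nu_A)
kappa_C = avg_rate(100.0, 0.01, nu_C)

CO = kappa_A / kappa_C        # relative convergence rate (below 1 favours the adaptive run)
RN = nu_A / nu_C              # relative number of iterations
print(round(kappa_A, 4), round(kappa_C, 4), round(CO, 4), RN)
```

A value of $RN_i$ of 0.4, as in this invented case, would mean that the adaptive run needed 60 % fewer iterations than the classical one.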

In order to mitigate possibly spurious results due to Monte Carlo draws, two steps were taken: elimination of outliers, and problem perturbation. Since robustness is often associated with scenario richness, we ran two univariate regressions of $CO_i$ against $NS_i$ and of $RN_i$ against $NS_i$. Problem instances with absolute residuals larger than 1.5 standard errors in either regression were discarded. The remaining sample contained 46 problem instances. The average number of scenarios tested ranged approximately from 107 to 1,793 (see Table 6).
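The outlier screen can be sketched with an ordinary least-squares fit written in plain Python. Two assumptions are made here: "standard error" is read as the residual standard error of the regression (with $n - 2$ degrees of freedom), and the data are invented. The function names `ols_residuals` and `keep_mask` are hypothetical.

```python
import math


def ols_residuals(x, y):
    """Fit y = intercept + slope * x and return (residuals, residual standard error)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s = math.sqrt(sum(r * r for r in resid) / (n - 2))
    return resid, s


def keep_mask(x, y, k=1.5):
    """True for instances whose absolute residual does not exceed k standard errors."""
    resid, s = ols_residuals(x, y)
    return [abs(r) <= k * s for r in resid]


# Invented data: number of scenarios and relative convergence rates, one clear outlier.
NS = [100, 300, 500, 700, 900, 1100, 1300, 1500]
CO = [0.9, 0.9, 0.9, 0.5, 0.9, 0.9, 0.9, 0.9]
mask = keep_mask(NS, CO)
print(mask)
```

The same mask would be computed for $RN_i$ against $NS_i$, and an instance retained only if it passes both screens.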

Table 6 provides, for each case defined in Table 3, the number of instances that were retained, and some descriptive statistics pertaining to variables $NS_i$, $CO_i$ and $RN_i$. In practically all cases, the adaptive algorithm outperformed the classical approach on both convergence rate and number of iterations. Observe that the average relative number of iterations varied from 0.1864 to 0.9429, leading to an overall average of 0.3865, meaning that overall, our method reduced the number of iterations by approximately 61 % as compared to the classical strategy. Furthermore, overall, at each iteration, on average, under our adaptive heuristic, the optimality gap was reduced approximately by 8 % more than under constant $\rho$.

Statistical tests indicate that with high significance $E[CO_i] < 1$ and $E[RN_i] < 1$. Table 7 provides one such example among many.

Table 6 Descriptive statistics

| Case | # inst. | # scenarios: Average | Min | Max | $CO_i$: Average | Min | Max | $RN_i$: Average | Min | Max |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 4 | 124.75 | 90 | 169 | 0.9344 | 0.9186 | 0.9481 | 0.3358 | 0.1705 | 0.5922 |
| 2 | 5 | 107.20 | 84 | 137 | 0.9405 | 0.9103 | 0.9604 | 0.3699 | 0.1502 | 0.6000 |
| 3 | 5 | 346.20 | 308 | 423 | 0.9331 | 0.9035 | 0.9520 | 0.3677 | 0.2317 | 0.5349 |
| 4 | 5 | 300.40 | 246 | 353 | 0.9458 | 0.9214 | 0.9685 | 0.3586 | 0.2355 | 0.5836 |
| 5 | 5 | 594.20 | 495 | 656 | 0.9498 | 0.9270 | 0.9673 | 0.4416 | 0.2637 | 0.6094 |
| 6 | 4 | 455.00 | 348 | 653 | 0.8462 | 0.7816 | 0.8882 | 0.1864 | 0.1631 | 0.2092 |
| 7 | 5 | 963.40 | 807 | 1,139 | 0.9195 | 0.9060 | 0.9455 | 0.2094 | 0.1317 | 0.2954 |
| 8 | 3 | 851.67 | 837 | 876 | 0.8395 | 0.7684 | 0.9546 | 0.4974 | 0.3034 | 0.8000 |
| 9 | 3 | 1,228.00 | 1,148 | 1,295 | 0.8465 | 0.8015 | 0.8750 | 0.2843 | 0.1626 | 0.3939 |
| 10 | 1 | 1,354.00 | 1,354 | 1,354 | 0.9692 | 0.9692 | 0.9692 | 0.9429 | 0.9429 | 0.9429 |
| 11 | 3 | 1,793.33 | 1,540 | 2,022 | 0.8998 | 0.8828 | 0.9136 | 0.5321 | 0.3427 | 0.8077 |
| 12 | 3 | 1,786.67 | 1,611 | 2,024 | 0.8952 | 0.8290 | 0.9543 | 0.6901 | 0.3404 | 1.0000 |
| Overall | 46 | 700.20 | 84 | 2,024 | 0.9126 | 0.7684 | 0.9692 | 0.3865 | 0.1317 | 1.0000 |


Table 7 Significance tests on means

| Performance measure | $CO$ | $RN$ |
|---|---|---|
| Null hypothesis $H_0$ | $E[CO] \ge 0.93$ | $E[RN] \ge 0.46$ |
| t statistic (45 df) | −2.3048 | −2.3029 |
| Test conclusion | Reject $H_0$ (p value = 1.3 %) | Reject $H_0$ (p value = 1.3 %) |

Table 8 Performance comparison of original versus perturbed problems

| Performance measure | $CO$ | $RN$ |
|---|---|---|
| Null hypothesis $H_0$ | $E[CO] = E[CO]^P$ | $E[RN] = E[RN]^P$ |
| t statistic (df) | 0.609283 (42.23 df) | −1.315171 (65.46 df) |
| Test conclusion | Accept $H_0$ (p value = 54.6 %) | Accept $H_0$ (p value = 18.8 %) |

Finally, we introduced another test of stability involving problem perturbation. On a randomly chosen subset of 29 of the 46 initial problem instances, each demand and random inflow was equiprobably increased by 10 % or decreased by 10 %. We performed a comparison test on means for non-homogeneous groups, as reported in Table 8, where superscripts P refer to perturbed problem instances.
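The perturbation and the comparison of means described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code. The function names `perturb` and `welch_t` and all sample values are hypothetical. Reading the "test for non-homogeneous groups" as Welch's unequal-variance t test is also an assumption, suggested by the fractional degrees of freedom reported in Table 8.

```python
import math
import random
from statistics import mean, variance


def perturb(values, rate=0.10, rng=random):
    """Scale each value by (1 + rate) or (1 - rate) with equal probability."""
    return [v * (1 + rate if rng.random() < 0.5 else 1 - rate) for v in values]


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    # Estimated variance of each sample mean.
    va = variance(sample_a) / na
    vb = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical inflows for one instance, perturbed by +/- 10 %.
inflows = [120.0, 95.0, 143.0, 110.0]
print(perturb(inflows, rng=random.Random(1)))

# Hypothetical CO values for original and perturbed instance groups.
co_original = [0.93, 0.91, 0.95, 0.88, 0.92]
co_perturbed = [0.94, 0.90, 0.93, 0.89]
t_stat, dof = welch_t(co_original, co_perturbed)
print(f"t = {t_stat:.4f}, df = {dof:.2f}")
```

Because the two groups have different sizes and possibly different variances, the Welch-Satterthwaite formula generally yields non-integer degrees of freedom, as in Table 8.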

Table 8 provides some reassurance as to the robustness of our earlier conclusions. Hence we may reasonably conclude that an adaptive scheme for controlling the penalty parameter can enhance algorithmic performance of the PHA applied to hydroelectric production.

7 Conclusion

We have presented new heuristics to initialize and update the PHA penalty parameter in the context of stochastic reservoir systems operations problems. In contrast to previous approaches, our adaptive learning heuristic updates this parameter according to a dynamic trajectory. The updating rule attempts to balance two concerns, namely overall convergence and controlled reduction of the non-anticipativity gap.

Tests on 46 randomly generated problems of different sizes in the specific context of hydroelectric energy generation suggest that dynamic updating may provide a marked improvement in the solution process over a constant penalty parameter. We conjecture that this approach could fairly easily be extended to cases of multiple penalty parameters associated with different classes of variables. Integrating scenario generation into this process may be an interesting future line of research.

References

1. Benders, J.: Partitioning procedures for solving mixed-variables programming problems. Numerische Mathematik 4, 238–252 (1962)

2. Birge, J.: Decomposition and partitioning methods for multistage stochastic linear programs. Oper. Res. 33, 989–1007 (1985)


3. Birge, J., Louveaux, F.: Introduction to Stochastic Programming. Springer-Verlag, New York (1997)

4. Blomvall, J., Lindberg, P.: A Riccati-based primal interior point solver for multistage stochastic programming-extensions. Optimization Methods and Software 17(3), 383–407 (2002)

5. Dos Santos, M., Da Silva, E., Finardi, E., Gonçalves, R.: Practical aspects in solving the medium-term operation planning problem of hydrothermal power systems by using the progressive hedging method. International Journal of Electrical Power & Energy Systems 31(9), 546–552 (2009)

6. Dupačová, J., Gröwe-Kuska, N., Römisch, W.: Scenario reduction in stochastic programming - an approach using probability metrics. Mathematical Programming 95(3), 493–511 (2003)

7. Gonçalves, R., Finardi, E., da Silva, E.: Exploring the progressive hedging characteristics in the solution of the medium-term operation planning problem. In: Proceedings of 17th Power Systems Computation Conference, PSCC, Stockholm, Sweden (2011)

8. Kaut, M., Wallace, S.: Evaluation of scenario-generation methods for stochastic programming. Pacific Journal of Optimization 3(2), 257–271 (2007)

9. Mulvey, J., Vladimirou, H.: Applying the progressive hedging algorithm to stochastic generalized networks. Annals of Operations Research 31(1), 399–424 (1991)

10. Pennanen, T., Kallio, M.: A splitting method for stochastic programs. Ann. Oper. Res. 142, 259–268 (2006)

11. Pereira, M., Pinto, L.: Stochastic optimization of a multireservoir hydroelectric system: A decomposition approach. Water Resources Research 21(6), 779–792 (1985)

12. Pereira, M., Pinto, L.: Multi-stage stochastic optimization applied to energy planning. Mathematical Programming 52(1), 359–375 (1991)

13. Reis, F., Carvalho, P., Ferreira, L.: Reinforcement scheduling convergence in power systems transmission planning. IEEE Transactions on Power Systems 20(2), 1151–1157 (2005)

14. Rockafellar, R., Wets, J.R.: Scenarios and policy aggregation in optimization under uncertainty. Math. Oper. Res. 16, 119–147 (1991)

15. Ruszczynski, A.: Some advances in decomposition methods for stochastic linear programming. Annals of Operations Research 85, 153–172 (1999)

16. Salinger, D., Rockafellar, R.: Dynamic splitting: An algorithm for deterministic and stochastic multiperiod optimization. Department of Mathematics, University of Washington, Seattle, Working paper (2003)

17. Van Slyke, R., Wets, R.: L-shaped linear programs with applications to optimal control and stochastic programming. SIAM J. Appl. Math. 17, 638–663 (1969)

18. Watson, J., Woodruff, D., Strip, D.: Progressive hedging innovations for a class of stochastic resource allocation problems. Working paper, Sandia National Laboratories, Albuquerque (2008)
