On parallelizing dual decomposition in stochastic integer programming
Cosmin Petra, Mathematics and Computer Science Division, Argonne National Laboratory, USA
Joint work with Miles Lubin (MIT), Burhaneddin Sandıkçı, Kipp Martin (U Chicago)
INFORMS Annual Meeting, Oct 2012
Slide 2
Overview
Block-angular structure
Motivation: stochastic optimization of the power grid
Revisiting the dual decomposition algorithm of Carøe and Schultz
Parallelizing the solution of the Lagrangian dual
Parallel numerical experiments
Slide 3
Stochastic Formulation
Discrete distribution leads to block-angular (MI)LP
Slide 4
Large-scale (dual) block-angular LPs
Extensive form
In the terminology of stochastic LPs:
First-stage variables (decision now): x_0
Second-stage variables (recourse decisions): x_1, ..., x_N
Each diagonal block is a realization of a random variable (scenario)
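The block structure described above can be sketched in a few lines. This is a minimal illustration, assuming the usual two-stage notation A x_0 = b_0 and T_i x_0 + W_i x_i = b_i; the names A, T_blocks and W_blocks and the tiny data are hypothetical, not taken from the slides.

```python
# Sketch: assemble the extensive-form constraint matrix of a two-stage problem.
# Assumed notation (not given on the slide):
#   A x_0 = b_0,   T_i x_0 + W_i x_i = b_i   for each scenario i.

def extensive_form(A, T_blocks, W_blocks):
    """Return the dense extensive-form matrix as a list of rows.

    Column order is [x_0, x_1, ..., x_N]; row order is the first-stage rows
    followed by the rows of each scenario.
    """
    n0 = len(A[0])
    widths = [len(W[0]) for W in W_blocks]
    total_cols = n0 + sum(widths)
    rows = [list(row) + [0] * (total_cols - n0) for row in A]
    offset = n0
    for T, W, width in zip(T_blocks, W_blocks, widths):
        for t_row, w_row in zip(T, W):
            row = [0] * total_cols
            row[:n0] = t_row                      # linking (border) columns
            row[offset:offset + width] = w_row    # diagonal block of this scenario
            rows.append(row)
        offset += width
    return rows

# Made-up data: two first-stage variables, two scenarios with two recourse variables each.
A = [[1, 1]]
T = [[[1, 0]], [[0, 1]]]
W = [[[2, 3]], [[4, 5]]]
M = extensive_form(A, T, W)
for row in M:
    print(row)
```

The first-stage columns appear in every scenario row (the border), while each W_i occupies its own diagonal block; a real code would store the blocks separately rather than densely.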
Slide 5
Stochastic Optimization and the Power Grid
Unit Commitment: Determine optimal on/off schedule of thermal (coal, natural gas, nuclear) generators. Day-ahead market prices (hourly). Mixed-integer.
Economic Dispatch: Set real-time market prices (every 5-10 min.). Continuous linear/quadratic.
Challenge: Integrate energy produced by highly variable renewable sources into these control systems.
Minimize operating costs, subject to:
Physical generation and transmission constraints
Reserve levels
Demand
Slide 6
Stochastic unit commitment with wind power
Scenarios obtained using numerical weather prediction codes
Real-time grid-nested 24h parallel simulation using WRF
Slide courtesy of V. Zavala & E. Constantinescu
[Figure labels: Wind farm, Thermal generator]
Slide 7
Computational challenges and difficulties in power grid
May require many scenarios (100s, 1,000s, 10,000s) to accurately model uncertainty
Large scenarios (W_i up to 100,000 x 100,000)
Large 1st stage (1,000s, 10,000s of variables)
Easy to build a practical instance that requires 100+ GB of RAM to solve
Requires distributed memory
Integer constraints
Real-time solution needed in our applications
Slide 8
Dual decomposition - formulation
Feasibility sets
Extensive form in this notation is
Split-variable formulation
Non-anticipativity constraints are explicitly enforced
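The formulations named on this slide can be sketched as follows, in the notation of Carøe and Schultz (1999). The symbols c, q_j, p_j, T_j, W_j, h_j, X, Y and r (number of scenarios) are assumed here, since the slide's own equations are not part of this text; X and Y carry the integrality restrictions.

```latex
% Feasibility set of scenario j (j = 1, ..., r):
S_j = \{ (x, y_j) : A x \le b,\ x \in X,\ T_j x + W_j y_j \le h_j,\ y_j \in Y \}

% Extensive form:
z = \min \Big\{ c^\top x + \sum_{j=1}^{r} p_j\, q_j^\top y_j
      \;:\; (x, y_j) \in S_j,\ j = 1, \dots, r \Big\}

% Split-variable formulation with explicit non-anticipativity:
z = \min \Big\{ \sum_{j=1}^{r} p_j \big( c^\top x_j + q_j^\top y_j \big)
      \;:\; (x_j, y_j) \in S_j,\ j = 1, \dots, r,\ \ x_1 = x_2 = \dots = x_r \Big\}
```

The two forms agree because the probabilities p_j sum to one, so the first-stage cost of the shared x equals the probability-weighted cost of the identical copies x_j.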
Slide 9
Dual decomposition (Carøe and Schultz, 1999)
Apply Lagrangian relaxation (LR) to non-anticipativity constraints
For mixed-integer problems, LR provides a lower bound on the optimal value
Branch and bound on non-anticipativity variables
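The lower-bound property can be checked on a toy instance. This is a minimal sketch with made-up data: one integer first-stage variable, two scenarios whose costs already include the optimal recourse, and multipliers (+mu, -mu) on the two copies of x. Enumeration stands in for a MILP solver.

```python
# Toy illustration (all data below is made up for the example).
X = range(4)                       # first-stage integer decisions 0..3
p = [0.5, 0.5]                     # scenario probabilities
f = [lambda x: (x - 1) ** 2,       # scenario 1 cost, recourse already optimized
     lambda x: (x - 3) ** 2]       # scenario 2 cost

def primal_optimum():
    """min_x sum_j p_j f_j(x) with a single shared x (non-anticipative)."""
    return min(sum(pj * fj(x) for pj, fj in zip(p, f)) for x in X)

def lagrangian_dual(mu):
    """D(mu): each scenario picks its own x_j; multipliers are (+mu, -mu)."""
    lam = [mu, -mu]
    return sum(min(pj * fj(x) + lj * x for x in X)
               for pj, fj, lj in zip(p, f, lam))

z_star = primal_optimum()
bounds = {mu: lagrangian_dual(mu) for mu in (-2.0, -1.0, 0.0, 1.0, 2.0)}
for mu, value in bounds.items():
    print(f"mu = {mu:+.1f}: D(mu) = {value:.2f} <= z* = {z_star:.2f}")
```

Every value of D(mu) stays at or below the optimal value z*. In this particular instance the bound happens to be tight at mu = -1; in general a mixed-integer problem can leave a duality gap, which is why branch and bound is needed.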
Slide 10
Dual decomposition - computational appeal
Relaxing non-anticipativity decouples the scenarios
Typically a better lower bound than from the LP relaxation
Lagrangian dual has a similar computational pattern to Benders
At each iteration, solve a continuous master problem and an independent MILP for each scenario
Trivially parallel? Not quite; there are interesting algorithmic and computational questions, as we'll see.
To our knowledge, there are no previously published parallel implementations, although scope for parallelism has been observed
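The "independent MILP for each scenario" step can be sketched structurally as a parallel map. This is only an illustration: the talk's implementation uses C++ and MPI, whereas the sketch below uses Python threads, hypothetical toy scenarios, and enumeration in place of a MILP solver, so it shows the pattern rather than real speedup.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy data: scenario j has cost p_j * (x - target_j)^2
# over integer x in 0..5; enumeration stands in for a real MILP solve.
SCENARIOS = [(0.25, 1), (0.25, 2), (0.25, 4), (0.25, 5)]
X = range(6)

def solve_scenario(args):
    """Return (optimal value, minimizer) of one scenario subproblem."""
    (prob, target), lam = args
    return min((prob * (x - target) ** 2 + lam * x, x) for x in X)

def evaluate_dual(lams, workers=4):
    """Evaluate all scenario subproblems concurrently for multipliers lams."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(solve_scenario, zip(SCENARIOS, lams)))
    value = sum(v for v, _ in results)        # dual function value
    minimizers = [x for _, x in results]      # one subgradient component per scenario
    return value, minimizers

value, minimizers = evaluate_dual([0.0, 0.0, 0.0, 0.0])
print(value, minimizers)
```

The subproblems share no data once the multipliers are fixed, which is what makes this step easy to distribute; the master problem that produces the next multipliers is the part that remains to be parallelized.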
Slide 11
State variable formulation of non-anticipativity
Carøe and Schultz consider r-1 constraints in the form
We use the state variable formulation with r constraints (Sen, 2005)
Lagrangian dual problem can be stated as
We will see later why this formulation is useful for parallel computations
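A sketch of the two ways of writing non-anticipativity and of the resulting dual, under assumed notation: x_j is the first-stage copy of scenario j, z a free state variable, λ_j the multiplier of scenario j, and S_j, p_j, c, q_j as in Carøe and Schultz (1999). The chain form below is one common choice of r-1 constraints; the slide's exact form is not reproduced in this text.

```latex
% Chain form (r-1 vector constraints):
x_j - x_{j+1} = 0, \qquad j = 1, \dots, r-1.

% State-variable form (r vector constraints, z free):
x_j - z = 0, \qquad j = 1, \dots, r.

% Lagrangian dual of the state-variable form:
\max_{\lambda_1, \dots, \lambda_r} \ \sum_{j=1}^{r} D_j(\lambda_j)
\quad \text{s.t.} \quad \sum_{j=1}^{r} \lambda_j = 0,
\qquad
D_j(\lambda_j) = \min_{(x_j, y_j) \in S_j} \ p_j \big( c^\top x_j + q_j^\top y_j \big) + \lambda_j^\top x_j .
```

The equality constraint on the multipliers arises from minimizing the Lagrangian over the free variable z: the term -(λ_1 + ... + λ_r)^T z is unbounded below unless the multipliers sum to zero.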
Slide 12
Characterization of the solution
Proposition (Carøe and Schultz): The optimal value of the Lagrangian dual equals the optimal value of the linear program
Optimal objective of the Lagrangian dual is that of a partially convexified LP relaxation
Theoretical equivalence with Lulli and Sen's (2004) branch-and-price (column generation)
However, solving the Lagrangian dual also gives a solution to (1) (useful in branch-and-bound)
Slide 13
Optimization of the Lagrangian Dual
Each scenario term D_j of the dual objective is concave and non-differentiable
For a fixed multiplier λ_j, x_j is a subgradient of D_j, where x_j is the first-stage part of a minimizer of the scenario subproblem
By solving/evaluating D_j (a mixed-integer LP), we get at least one subgradient for free
Carøe and Schultz suggest using proximal bundle black-box solvers.
Alternatives are other variants of cutting-plane algorithms (boxstep or level regularization)
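To make the "subgradient for free" point concrete, here is a minimal sketch on made-up data. It uses a plain projected subgradient ascent, which is a simpler stand-in and not the proximal bundle method discussed in the talk. The multipliers are kept on the set where they sum to zero (the state-variable form of the dual is assumed), and enumeration replaces the MILP solve.

```python
# Toy data (hypothetical): scenario j has cost P[j] * (x - T[j])^2, x in 0..4.
P = [0.5, 0.5]
T = [1, 3]
X = range(5)

def scenario_oracle(j, lam_j):
    """Evaluate D_j(lam_j) by enumeration; the minimizer x is a subgradient."""
    value, x = min((P[j] * (x - T[j]) ** 2 + lam_j * x, x) for x in X)
    return value, x

def subgradient_ascent(steps=50):
    """Maximize sum_j D_j(lam_j) subject to sum_j lam_j == 0."""
    lam = [0.0] * len(P)
    best = float("-inf")
    for k in range(1, steps + 1):
        evals = [scenario_oracle(j, lam[j]) for j in range(len(P))]
        best = max(best, sum(v for v, _ in evals))
        g = [x for _, x in evals]                 # subgradient from the minimizers
        mean = sum(g) / len(g)
        # Subtracting the mean projects the step onto sum(lam) == 0;
        # the step length 1/k is a standard diminishing choice.
        lam = [l + (gj - mean) / k for l, gj in zip(lam, g)]
    return best, lam

best, lam = subgradient_ascent()
print(best, lam)
```

When all scenario minimizers agree, the projected subgradient is zero and the multipliers stop moving; a bundle method would instead reuse all collected values and subgradients as cuts in a master problem.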
Slide 14
Let's open the proximal bundle black box
Proximal bundle QP master, parameterized by the trial point and the regularization parameter
The dual (below) is typically advantageous for computation
The proximal bundle master QP has a (dual) block-angular structure
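One way to write such a master problem is sketched below. The notation is assumed, not taken from the slide: D_j is the dual function of scenario j, λ_j^i are earlier evaluation points with minimizers x_j^i (the subgradients), B_j indexes the cuts kept for scenario j, \hat{λ} is the point around which the quadratic term is centered, and τ > 0 is the regularization parameter.

```latex
\begin{aligned}
\max_{\theta,\, \lambda} \quad
  & \sum_{j=1}^{r} \theta_j \;-\; \frac{\tau}{2} \sum_{j=1}^{r} \big\| \lambda_j - \hat{\lambda}_j \big\|_2^2 \\
\text{s.t.} \quad
  & \theta_j \le D_j(\lambda_j^i) + (x_j^i)^\top (\lambda_j - \lambda_j^i),
    \qquad i \in B_j,\ j = 1, \dots, r, \\
  & \sum_{j=1}^{r} \lambda_j = 0 .
\end{aligned}
```

Each cut and each quadratic term involves a single scenario, so the only coupling between scenario blocks is the constraint that the multipliers sum to zero. Its own multiplier has the dimension of the first-stage vector, which is consistent with the linking variables of the dual QP having first-stage size.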
Slide 15
Dual of Proximal Bundle QP Master
Block-angular structure
Direct result of the equality-constrained formulation; it doesn't hold for other formulations of non-anticipativity.
But it also applies to other forms of regularization.
Slide 16
Computational significance
The master QP is sparse overall (but may be block dense).
Dense QP solvers, which are typically used in black-box proximal bundle codes, can't solve it efficiently.
The master QP's structure can be exploited: PIPS (Petra) or OOPS (Gondzio)
With the ability to solve the master in parallel, we address a serial bottleneck of execution.
Now, both the MILP subproblems and the QP master can be solved in parallel.
Greater potential for parallel speedup (Amdahl's law)
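The reference to Amdahl's law can be made concrete with a short calculation. The serial fractions below are hypothetical and chosen only to illustrate the effect of parallelizing the master; they are not measurements from the talk.

```python
def amdahl_speedup(serial_fraction, workers):
    """Upper bound on speedup when a fraction of the work stays serial."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)

# Hypothetical fractions, for illustration only.
serial_master = amdahl_speedup(0.10, 64)     # master QP left serial
parallel_master = amdahl_speedup(0.01, 64)   # most of the master parallelized
print(f"serial master:   at most {serial_master:.1f}x on 64 workers")
print(f"parallel master: at most {parallel_master:.1f}x on 64 workers")
```

Even a modest serial share caps the achievable speedup well below the worker count, which is why a serial master becomes the bottleneck once the subproblems are distributed.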
Slide 17
Numerical experiments
Implementation in C++ using MPI.
Looking at solving the Lagrangian dual; no branching implemented
SCIP used for MILP subproblems.
No compression of bundle (removing cuts/subgradients)
Serial experiments
Cutting plane vs. black-box proximal bundle vs. our implementation
Parallel experiments
Scalability of full proximal bundle algorithm
More detailed look at scalability of parallel QP solver
Slide 18
Test instances
Stochastic mixed-integer instances dcap and sslp from SIPLIB by Shabbir Ahmed
Stochastic LP product instance from Huseyin Topaloglu
Slide 19
Test architecture
Fusion high-performance cluster at Argonne
320 nodes
InfiniBand QDR interconnect
Two 2.6 GHz Xeon processors per node (total 8 cores)
Most nodes have 36 GB of RAM, some have 96 GB
We use 1 MPI process (= parallel process) per core
Slide 20
Serial experiments
OOQP - General sparsity-exploiting QP IPM solver
PIPS - Specialized IPM solver for block-angular structure
ConicBundle - Open-source off-the-shelf proximal bundle code
Time limit 7200 seconds
Slide 21
Parallel experiments - serial master, parallel subproblems
Slide 23
Scalability of master QP
Scope for parallelism in solving the master QP depends on the number of linking variables (= number of first-stage variables) being small relative to the diagonal blocks (= subgradient cuts per scenario).
Stochastic integer test problems have a very small first stage (order of 10).
Important to consider performance on a larger first stage for practical problems.
Stochastic LP product instances used for this (1,000 scenarios)
We look only at QP solves, not proximal bundle convergence.
Slide 24
Small first stage
Slide 25
Medium first stage
Slide 26
Large first stage
Slide 27
Energy application - stochastic unit commitment
State of Illinois power grid
12-hour horizon
64 scenarios
3,621,180 vars., 3,744,468 cons., 3,132 binary
LP relaxation objective: 939,208
LP relaxation + CglProbing cuts: 939,626
Feasible solution (rounding): 942,237
Optimality gap: 0.27% (0.5% is acceptable in industry practice)
Lagrangian relaxation: 941,176
Feasible solution (rounding): 943,351
Optimality gap: 0.23% (combined: 0.11%)
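The reported gaps can be recomputed from the bounds on this slide. The sketch assumes the gap is measured relative to the upper bound, and that "combined" pairs the best lower bound with the best feasible solution; neither convention is stated on the slide.

```python
def relative_gap(upper, lower):
    """Relative optimality gap (upper - lower) / upper, as a fraction."""
    return (upper - lower) / upper

# Values copied from the slide.
lp_cuts_bound, lp_rounding = 939_626, 942_237
lagr_bound, lagr_rounding = 941_176, 943_351

gap_lp = relative_gap(lp_rounding, lp_cuts_bound)
gap_lagr = relative_gap(lagr_rounding, lagr_bound)
# Assumed meaning of "combined": best upper bound against best lower bound.
gap_combined = relative_gap(min(lp_rounding, lagr_rounding),
                            max(lp_cuts_bound, lagr_bound))

for name, gap in [("LP + cuts", gap_lp), ("Lagrangian", gap_lagr),
                  ("combined", gap_combined)]:
    print(f"{name}: {100 * gap:.3f}%")
```

Under these assumptions the Lagrangian and combined gaps match the slide's 0.23% and 0.11%. The LP-based gap comes out slightly above 0.27%, so the slide may truncate rather than round, or use a slightly different convention.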
Slide 28
Conclusions and future work
Revisited dual decomposition from the perspective of parallel computation; addressed the bottleneck of solving the master
Dual decomposition is a promising approach for the parallel solution of stochastic mixed-integer programs
More work needed to address load imbalance, perhaps asynchronism as in Linderoth and Wright
Branch and bound implementation
Large-scale computational study for stochastic unit commitment
Slide 29
[Figure labels: Supplier Consumer Distribution System Operator ON/OFF Generation Levels Demand Weather]