On parallelizing dual decomposition in stochastic integer programming Cosmin Petra Mathematics and Computer Science Division Argonne National Laboratory,

On parallelizing dual decomposition in stochastic integer programming
Cosmin Petra
Mathematics and Computer Science Division, Argonne National Laboratory, USA
Joint work with Miles Lubin (MIT), Burhaneddin Sandıkçı, Kipp Martin (U Chicago)
INFORMS Annual Meeting, Oct 2012

Overview
- Block-angular structure
- Motivation: stochastic optimization of the power grid
- Revisiting the dual decomposition algorithm of Carøe and Schultz
- Parallelizing the solution of the Lagrangian dual
- Parallel numerical experiments

Stochastic Formulation
- A discrete distribution leads to a block-angular (MI)LP

Large-scale (dual) block-angular LPs
- In the terminology of stochastic LPs:
  - First-stage variables (decision now): x_0
  - Second-stage variables (recourse decision): x_1, ..., x_N
  - Each diagonal block is a realization of a random variable (scenario)
- Extensive form
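A standard way to write the extensive form of a two-stage stochastic LP with N scenarios is sketched below. Only x_0, x_i and W_i appear in the slides; the symbols c_i, A, T_i and b_i are assumed notation, and the scenario probabilities are taken to be folded into the cost vectors c_i.

```latex
\begin{aligned}
\min_{x_0, x_1, \dots, x_N} \quad & c_0^T x_0 + \sum_{i=1}^{N} c_i^T x_i \\
\text{s.t.} \quad & A x_0 = b_0, \\
& T_i x_0 + W_i x_i = b_i, \quad i = 1, \dots, N, \\
& x_0 \ge 0, \quad x_i \ge 0, \quad i = 1, \dots, N.
\end{aligned}
\qquad
\text{Constraint matrix:} \quad
\begin{bmatrix}
A      &        &        &        \\
T_1    & W_1    &        &        \\
\vdots &        & \ddots &        \\
T_N    &        &        & W_N
\end{bmatrix}
```

The first block column (A, T_1, ..., T_N) multiplies the first-stage variables x_0 and is the only coupling between the scenario blocks W_i.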

Stochastic Optimization and the Power Grid
- Unit Commitment: determine the optimal on/off schedule of thermal (coal, natural gas, nuclear) generators
  - Day-ahead market prices (hourly)
  - Mixed-integer
- Economic Dispatch: set real-time market prices (every 5-10 min.)
  - Continuous linear/quadratic
- Challenge: integrate energy produced by highly variable renewable sources into these control systems
- Minimize operating costs, subject to:
  - Physical generation and transmission constraints
  - Reserve levels
  - Demand

Stochastic unit commitment with wind power
- Scenarios obtained using numerical weather prediction codes
- Real-time grid-nested 24h parallel simulation using WRF
- [Figure labels: wind farm, thermal generator]
Slide courtesy of V. Zavala & E. Constantinescu

Computational challenges and difficulties in power grid
- May require many scenarios (100s, 1,000s, 10,000s) to accurately model uncertainty
- Large scenarios (W_i up to 100,000 x 100,000)
- Large first stage (1,000s, 10,000s of variables)
- Easy to build a practical instance that requires 100+ GB of RAM to solve
  - Requires distributed memory
- Integer constraints
- Real-time solution needed in our applications

Dual decomposition - formulation
- Feasibility sets
- Extensive form in this notation
- Split-variable formulation
  - Non-anticipativity constraints are explicitly enforced
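The three objects listed on this slide can be written as follows. This is a sketch in the notation commonly used for dual decomposition, assumed here rather than taken from the slides: r scenarios with probabilities p_j, first-stage cost c, recourse costs q_j, and sets X and Y that carry the integrality restrictions.

```latex
% Feasibility set of scenario j
S_j = \{ (x, y_j) : A x \le b,\; x \in X,\; T_j x + W_j y_j \le h_j,\; y_j \in Y \}, \quad j = 1, \dots, r

% Extensive form
z = \min \Big\{ c^T x + \sum_{j=1}^{r} p_j q_j^T y_j \;:\; (x, y_j) \in S_j,\; j = 1, \dots, r \Big\}

% Split-variable formulation: one copy x_j of the first-stage vector per scenario
z = \min \Big\{ \sum_{j=1}^{r} p_j \big( c^T x_j + q_j^T y_j \big) \;:\; (x_j, y_j) \in S_j,\; j = 1, \dots, r,\; x_1 = \dots = x_r \Big\}
```

The two problems have the same optimal value because the probabilities sum to one and the constraint x_1 = ... = x_r (non-anticipativity) forces all copies to agree.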

Dual decomposition (Carøe and Schultz, 1999)
- Apply Lagrangian relaxation (LR) to the non-anticipativity constraints
- For mixed-integer problems, LR provides a lower bound on the optimal value
- Branch and bound on non-anticipativity variables
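The lower-bound property can be made explicit with a short derivation. The notation is assumed: S_j is the feasibility set of scenario j, p_j its probability, and non-anticipativity is written generically as a linear constraint with matrices H_j.

```latex
% Non-anticipativity written as a linear equality on the scenario copies
\sum_{j=1}^{r} H_j x_j = 0

% Lagrangian function: separable across scenarios for a fixed multiplier \lambda
D(\lambda) = \min \Big\{ \sum_{j=1}^{r} \big[ p_j ( c^T x_j + q_j^T y_j ) + \lambda^T H_j x_j \big] \;:\; (x_j, y_j) \in S_j \Big\}
           = \sum_{j=1}^{r} D_j(\lambda)

% Weak duality: any feasible point of the original problem has a zero penalty term
D(\lambda) \le z \quad \text{for all } \lambda,
\qquad
z_{LD} = \max_{\lambda} D(\lambda) \le z
```

Here z denotes the optimal value of the original minimization problem. With integer variables the inequality z_LD <= z can be strict, which is why branch and bound is still needed.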

Dual decomposition: computational appeal
- Relaxing non-anticipativity decouples the scenarios
- Typically gives a better lower bound than the LP relaxation
- The Lagrangian dual has a similar computational pattern to Benders
  - At each iteration, solve a continuous master problem and an independent MILP for each scenario
- Trivially parallel? Not quite: there are interesting algorithmic and computational questions, as we'll see
- To our knowledge, there are no previously published parallel implementations, although the scope for parallelism has been observed

State variable formulation of non-anticipativity
- Carøe and Schultz consider r-1 non-anticipativity constraints
- We use the state variable formulation with r constraints (Sen, 2005)
- The Lagrangian dual problem can then be stated with one multiplier vector per scenario
- We will see later why this formulation is useful for parallel computations
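A sketch of the state variable formulation and the resulting Lagrangian dual follows. The notation is assumed: the state variable is written as \bar{x}, S_j is the feasibility set of scenario j, and \lambda_j is the multiplier of the j-th constraint.

```latex
% State variable formulation: r constraints, one per scenario
x_j - \bar{x} = 0, \quad j = 1, \dots, r

% Lagrangian: \bar{x} is unconstrained, so the minimum over \bar{x} is finite
% only if its coefficient vanishes, i.e. \sum_j \lambda_j = 0
\sum_{j=1}^{r} \big[ p_j ( c^T x_j + q_j^T y_j ) + \lambda_j^T ( x_j - \bar{x} ) \big]

% Lagrangian dual
z_{LD} = \max_{\lambda_1, \dots, \lambda_r} \Big\{ \sum_{j=1}^{r} D_j(\lambda_j) \;:\; \sum_{j=1}^{r} \lambda_j = 0 \Big\},
\qquad
D_j(\lambda_j) = \min \big\{ p_j ( c^T x_j + q_j^T y_j ) + \lambda_j^T x_j \;:\; (x_j, y_j) \in S_j \big\}
```

Each D_j depends only on its own multiplier \lambda_j, and the scenarios are tied together by a single equality constraint on the multipliers.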

Characterization of the solution
- Proposition (Carøe and Schultz): the optimal objective of the Lagrangian dual is that of a partially convexified LP relaxation
  - The optimal value equals the optimal value of the linear program in which each scenario feasibility set is replaced by its convex hull
- Theoretical equivalence with Lulli and Sen's (2004) branch-and-price (column generation)
- However, solving the Lagrangian dual also gives a solution to (1) (useful in branch-and-bound)

Optimization of the Lagrangian Dual
- Each scenario term of the dual function is concave and non-differentiable
- For a fixed multiplier, a subgradient of a scenario term is obtained from an optimal solution of that scenario's subproblem
- By solving/evaluating the subproblem (a mixed-integer LP), we get at least one subgradient for free
- Carøe and Schultz suggest using black-box proximal bundle solvers
- Alternatives are other variants of cutting-plane algorithms (boxstep or level regularization)
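The "subgradient for free" remark can be illustrated on a toy problem. The sketch below assumes the state variable form, in which scenario j minimizes p_j (c x + q y) + lambda_j x over its own feasible set. The data, the recourse constraint x + y >= demand, and the function name `evaluate_scenario` are hypothetical, and brute-force enumeration stands in for the MILP solver.

```python
from itertools import product


def evaluate_scenario(lam, p, c, q, demand, x_values=range(4), y_values=range(6)):
    """Evaluate one scenario term D_j(lam) by brute-force enumeration.

    Returns (value, x_star). The first-stage part x_star of a minimizer is a
    subgradient of the concave function D_j at lam, so it costs nothing extra.
    """
    best_value, best_x = None, None
    for x, y in product(x_values, y_values):
        if x + y < demand:  # toy recourse constraint (hypothetical)
            continue
        value = p * (c * x + q * y) + lam * x
        if best_value is None or value < best_value:
            best_value, best_x = value, x
    return best_value, best_x


# Two equally likely scenarios that differ only in demand (hypothetical data).
scenarios = [
    dict(p=0.5, c=1.0, q=2.0, demand=1),
    dict(p=0.5, c=1.0, q=2.0, demand=3),
]
multipliers = [0.0, 0.0]  # one multiplier per scenario; they must sum to zero

# The scenario evaluations are independent, so they could run in parallel.
results = [evaluate_scenario(lam, **s) for lam, s in zip(multipliers, scenarios)]
lower_bound = sum(value for value, _ in results)
subgradients = [x_star for _, x_star in results]
print(lower_bound, subgradients)  # prints: 2.0 [1, 3]
```

For this toy instance the best common first-stage decision has cost 3.0, so the value 2.0 at zero multipliers is a valid but loose lower bound. The differing subgradients (1 and 3) show that the scenarios disagree on x, which is the information a master problem uses to update the multipliers.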

Let's open the proximal bundle black box
- Proximal bundle QP master, defined by the trial point and the regularization parameter
- The dual is typically advantageous for computation
- The proximal bundle master QP has a (dual) block-angular structure

Dual of Proximal Bundle QP Master
- Block-angular structure
- Direct result of the equality-constrained formulation; it does not hold for other formulations of non-anticipativity
- But it also applies to other forms of regularization
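One way to write the proximal bundle master and its dual under the state variable formulation is sketched below. The notation is assumed rather than taken from the slides: \hat\lambda is the trial point (proximal center) with \sum_j \hat\lambda_j = 0, \tau > 0 is the regularization parameter, and for scenario j the bundle B_j holds points \lambda_j^k with function values D_j(\lambda_j^k) and subgradients x_j^k.

```latex
% Primal master: cutting-plane model of each D_j plus a proximal term
\max_{\theta, \lambda} \;\; \sum_{j=1}^{r} \theta_j - \frac{\tau}{2} \sum_{j=1}^{r} \| \lambda_j - \hat\lambda_j \|^2
\quad \text{s.t.} \quad
\theta_j \le D_j(\lambda_j^k) + (x_j^k)^T ( \lambda_j - \lambda_j^k ), \;\; k \in B_j, \;\; j = 1, \dots, r,
\qquad
\sum_{j=1}^{r} \lambda_j = 0

% Dual master: \alpha_j are weights on the cuts of scenario j,
% \mu is the multiplier of the constraint \sum_j \lambda_j = 0
\min_{\alpha, \mu} \;\; \sum_{j=1}^{r} \Big[ \frac{1}{2\tau} \| X_j \alpha_j + \mu \|^2 + g_j^T \alpha_j \Big]
\quad \text{s.t.} \quad
\mathbf{1}^T \alpha_j = 1, \;\; \alpha_j \ge 0, \;\; j = 1, \dots, r

% where X_j has the subgradients x_j^k as columns and
% g_j^k = D_j(\lambda_j^k) + (x_j^k)^T ( \hat\lambda_j - \lambda_j^k )

% Recovering the next multipliers from a dual solution
\lambda_j = \hat\lambda_j + \frac{1}{\tau} ( X_j \alpha_j + \mu )
```

In the dual, each block \alpha_j has its own simplex constraint and interacts with the other blocks only through \mu, whose dimension is the number of first-stage variables. That is the block-angular pattern: one diagonal block per scenario, linked by a small set of shared variables.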

Computational significance
- The master QP is sparse overall (but may be block dense)
- Dense QP solvers, which are typically used in black-box proximal bundle codes, cannot solve it efficiently
- The master QP's structure can be exploited: PIPS (Petra) or OOPS (Gondzio)
- With the ability to solve the master in parallel, we address a serial bottleneck of execution
- Now both the MILP subproblems and the QP master can be solved in parallel
- Greater potential for parallel speedup (Amdahl's law)
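The reference to Amdahl's law can be made concrete with a small calculation. The 10% share assigned to the master QP below is a hypothetical figure chosen for illustration, not a measurement from the experiments.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when only `parallel_fraction` of the runtime scales."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Hypothetical split: the master QP takes 10% of the serial runtime.
for workers in (8, 64, 512):
    serial_master = amdahl_speedup(0.90, workers)    # only subproblems run in parallel
    parallel_master = amdahl_speedup(1.00, workers)  # idealized: the master scales too
    print(workers, round(serial_master, 2), round(parallel_master, 2))
```

With a serial master that takes 10% of the time, the speedup can never exceed 10 no matter how many processes solve subproblems. Parallelizing the master removes that ceiling, although the idealized 100% figure ignores communication and load imbalance.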

Numerical experiments
- Implementation in C++ using MPI
- Looking at solving the Lagrangian dual; no branching implemented
- SCIP used for MILP subproblems
- No compression of the bundle (removing cuts/subgradients)
- Serial experiments
  - Cutting plane vs. black-box proximal bundle vs. our implementation
- Parallel experiments
  - Scalability of the full proximal bundle algorithm
  - More detailed look at the scalability of the parallel QP solver

Test instances
- Stochastic mixed-integer instances dcap and sslp from SIPLIB by Shabbir Ahmed
- Stochastic LP product instance from Huseyin Topaloglu

Test architecture
- Fusion high-performance cluster at Argonne
  - 320 nodes
  - InfiniBand QDR interconnect
  - Two 2.6 GHz Xeon processors per node (total 8 cores)
  - Most nodes have 36 GB of RAM, some have 96 GB
- We use 1 MPI process (= parallel process) per core

Serial experiments
- OOQP: general sparsity-exploiting QP IPM solver
- PIPS: specialized IPM solver for block-angular structure
- ConicBundle: open-source off-the-shelf proximal bundle code
- Time limit: 7200 seconds

Parallel experiments: serial master, parallel subproblems

Parallel experiments: parallel master, parallel subproblems

Scalability of master QP
- The scope for parallelism in solving the master QP depends on the number of linking variables (= number of first-stage variables) being small relative to the diagonal blocks (= subgradient cuts per scenario)
- The stochastic integer test problems have a very small first stage (order of 10)
- It is important to consider performance on a larger first stage for practical problems
- The stochastic LP product instances are used for this (1,000 scenarios)
- We look only at QP solves, not at proximal bundle convergence
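Why the number of linking variables matters can be seen from a Schur complement solve of a block-arrowhead linear system, which is a common way to exploit this structure. The sketch below is a generic NumPy illustration with hypothetical sizes and random data. It is not the PIPS implementation, and the function name `solve_arrowhead` is made up for this example.

```python
import numpy as np


def solve_arrowhead(D_blocks, B_blocks, C, f_blocks, f0):
    """Solve the symmetric block-arrowhead system

        D_i x_i + B_i x0        = f_i   (i = 1..N)
        sum_i B_i^T x_i + C x0  = f0

    by eliminating the diagonal blocks (Schur complement on x0).
    """
    schur = C.copy()
    rhs = f0.copy()
    # Each iteration touches a single block, so blocks could be spread over processes;
    # only the small contributions to `schur` and `rhs` would need to be summed.
    for D, B, f in zip(D_blocks, B_blocks, f_blocks):
        schur -= B.T @ np.linalg.solve(D, B)
        rhs -= B.T @ np.linalg.solve(D, f)
    # Dense solve whose size equals the number of linking variables.
    x0 = np.linalg.solve(schur, rhs)
    # Back-substitution is again independent per block.
    x_blocks = [np.linalg.solve(D, f - B @ x0)
                for D, B, f in zip(D_blocks, B_blocks, f_blocks)]
    return x_blocks, x0


rng = np.random.default_rng(0)
n_blocks, block_size, n_link = 3, 6, 2  # hypothetical sizes
D_blocks, B_blocks, f_blocks = [], [], []
for _ in range(n_blocks):
    M = rng.standard_normal((block_size, block_size))
    D_blocks.append(M @ M.T + block_size * np.eye(block_size))  # symmetric positive definite
    B_blocks.append(rng.standard_normal((block_size, n_link)))
    f_blocks.append(rng.standard_normal(block_size))
C = 50.0 * np.eye(n_link)
f0 = rng.standard_normal(n_link)

x_blocks, x0 = solve_arrowhead(D_blocks, B_blocks, C, f_blocks, f0)
```

The per-block work parallelizes, while the dense system in x0 does not in this sketch. When x0 is small (few first-stage variables) that dense step is negligible; as the first stage grows it takes a larger share of the time, which is why the small, medium and large first-stage cases are examined separately.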

Small first stage

Medium first stage

Large first stage

Energy application: stochastic unit commitment
- State of Illinois power grid, 12-hour horizon, 64 scenarios
- 3,621,180 variables, 3,744,468 constraints, 3,132 binary variables
- LP relaxation objective: 939,208
- LP relaxation + CglProbing cuts: 939,626
  - Feasible solution (rounding): 942,237
  - Optimality gap: 0.27% (0.5% is acceptable in industry practice)
- Lagrangian relaxation: 941,176
  - Feasible solution (rounding): 943,351
  - Optimality gap: 0.23% (combined: 0.11%)
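The reported gaps can be reproduced from the bounds on this slide. The sketch assumes the gap is measured relative to the upper bound (the slide does not state the formula) and that "combined" pairs the best feasible solution with the best lower bound.

```python
def relative_gap(upper_bound: float, lower_bound: float) -> float:
    """Relative optimality gap, measured against the upper bound (assumed convention)."""
    return (upper_bound - lower_bound) / upper_bound


lp_cuts_bound, lp_rounding = 939_626, 942_237  # LP relaxation + CglProbing cuts
lagrangian_bound, lagrangian_rounding = 941_176, 943_351

gap_lp = relative_gap(lp_rounding, lp_cuts_bound)
gap_lagrangian = relative_gap(lagrangian_rounding, lagrangian_bound)
# Combined: best feasible solution from either approach against the best lower bound.
gap_combined = relative_gap(min(lp_rounding, lagrangian_rounding),
                            max(lp_cuts_bound, lagrangian_bound))

for name, gap in [("LP + cuts", gap_lp),
                  ("Lagrangian", gap_lagrangian),
                  ("combined", gap_combined)]:
    print(f"{name}: {100 * gap:.3f}%")
```

Under these assumptions the three values come out at roughly 0.277%, 0.231% and 0.113%, in line with the 0.27%, 0.23% and 0.11% quoted on the slide. The combined figure improves on both because the tighter lower bound comes from the Lagrangian relaxation while the better feasible solution comes from rounding the LP solution.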

Conclusions and future work
- Revisited dual decomposition from the perspective of parallel computation and addressed the bottleneck of solving the master
- Dual decomposition is a promising approach for the parallel solution of stochastic mixed-integer programs
- More work is needed to address load imbalance, perhaps through asynchronism as in Linderoth and Wright
- Branch and bound implementation
- Large-scale computational study for stochastic unit commitment

[Figure labels: Supplier, Consumer, Distribution, System Operator, ON/OFF, Generation Levels, Demand, Weather]
