# Sampling-based Approximation Algorithms for Multi-stage Stochastic Optimization

• Published on
30-Dec-2015


• Sampling-based Approximation Algorithms for Multi-stage Stochastic Optimization

Chaitanya Swamy (Caltech and U. Waterloo). Joint work with David Shmoys (Cornell University).

• Stochastic Optimization
  - A way of modeling uncertainty.
  - Exact data is unavailable or expensive, so the data is uncertain and is specified by a probability distribution.
  - Want to make the best decisions given this uncertainty in the data.
  - Applications in logistics, transportation models, financial instruments, network design, production planning, among others.
  - Dates back to the 1950s and the work of Dantzig.

• Stochastic Recourse Models
  - Given: a probability distribution over inputs.
  - Stage I: make some advance decisions, i.e., plan ahead or hedge against uncertainty.
  - Uncertainty evolves through various stages; new information is learned in each stage.
  - Can take recourse actions in each stage: can augment the earlier solution, paying a recourse cost.
  - Choose initial (stage I) decisions to minimize (stage I cost) + (expected recourse cost).

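• 2-stage problem $\equiv$ 2 decision points; k-stage problem $\equiv$ k decision points
  - Choose stage I decisions to minimize expected total cost = (stage I cost) + $E_{\text{all scenarios}}[\text{cost of stages } 2, \ldots, k]$.
  - [Figure: two scenario trees with probability-labeled edges. The 2-stage tree runs from stage I to the stage II scenarios; the k-stage tree runs from stage I through stage II down to the scenarios in stage k.]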

• Stochastic Set Cover (SSC)
  - Universe $U = \{e_1, \ldots, e_n\}$, subsets $S_1, S_2, \ldots, S_m \subseteq U$; set $S$ has weight $w_S$.
  - Deterministic problem: pick a minimum-weight collection of sets that covers each element.
  - Stochastic version: the set of elements to be covered is given by a probability distribution.
    - The subset $A \subseteq U$ to be covered (the scenario) is revealed after $k$ stages.
    - Choose some sets initially (stage I).
    - Can pick additional sets in each stage, paying a recourse cost.
  - Minimize expected total cost = $E_{\text{scenarios } A \subseteq U}[\text{cost of sets picked for scenario } A \text{ in stages } 1, \ldots, k]$.
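The 2-stage objective above can be made concrete on a toy instance. The following is a minimal brute-force sketch, not the algorithm from the talk: the instance, the function names `min_cover_cost` and `expected_total_cost`, and the choice of stage II costs equal to three times the stage I costs are all assumptions made for illustration.

```python
from itertools import combinations

def min_cover_cost(sets, weights, targets):
    """Cheapest sub-collection of `sets` covering `targets` (brute force)."""
    best = float("inf")
    for r in range(len(sets) + 1):
        for chosen in combinations(sets, r):
            covered = set().union(*(sets[s] for s in chosen))
            if targets <= covered:
                best = min(best, sum(weights[s] for s in chosen))
    return best

def expected_total_cost(sets, w1, w2, scenarios, stage1):
    """(stage I cost) + E_A[cheapest stage II completion for scenario A]."""
    covered = set().union(*(sets[s] for s in stage1))
    cost = sum(w1[s] for s in stage1)
    for A, p in scenarios:
        cost += p * min_cover_cost(sets, w2, set(A) - covered)
    return cost

# Hypothetical instance: explicit scenario list, stage II costs 3x stage I.
sets = {"S1": {1, 2}, "S2": {2, 3}, "S3": {3}}
w1 = {"S1": 1.0, "S2": 1.0, "S3": 1.0}
w2 = {s: 3.0 * c for s, c in w1.items()}
scenarios = [({1, 2}, 0.5), ({3}, 0.1), (set(), 0.4)]

# Enumerate every possible stage I decision and keep the cheapest one.
candidates = [c for r in range(len(sets) + 1) for c in combinations(sets, r)]
best = min(candidates,
           key=lambda c: expected_total_cost(sets, w1, w2, scenarios, c))
print(best, expected_total_cost(sets, w1, w2, scenarios, best))
```

On this instance it pays to buy `S1` up front, because the scenario needing it is likely, and to defer covering element 3 to stage II. Enumeration is exponential in the number of sets and scenarios, so this only illustrates the objective.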

• Stochastic Set Cover (SSC) (contd.)
  - How is the probability distribution on subsets specified?
    - A short (polynomial) list of possible scenarios.
    - Independent probabilities that each element exists.
    - A black box that can be sampled.
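The three ways of specifying the distribution can all be viewed through the black-box interface: a zero-argument function that returns one scenario per call. The sketch below is an illustration under that assumption; the names `scenario_list_sampler` and `independent_sampler` are hypothetical.

```python
import random

def scenario_list_sampler(scenarios, rng):
    """Black box built from an explicit list of (scenario, probability) pairs."""
    subsets = [frozenset(A) for A, _ in scenarios]
    probs = [p for _, p in scenarios]
    return lambda: rng.choices(subsets, weights=probs, k=1)[0]

def independent_sampler(activation, rng):
    """Black box where element e appears independently w.p. activation[e]."""
    return lambda: frozenset(
        e for e, p in activation.items() if rng.random() < p
    )

rng = random.Random(0)
draw_listed = scenario_list_sampler([({1, 2}, 0.7), ({3}, 0.3)], rng)
draw_indep = independent_sampler({1: 0.5, 2: 0.1, 3: 0.9}, rng)

# A sampling-based algorithm only ever calls draw(); it never inspects
# how the distribution is represented internally.
for draw in (draw_listed, draw_indep):
    print([sorted(draw()) for _ in range(3)])
```

The black-box model is the most general of the three, since the other two can be wrapped as samplers but not the other way around in polynomial size.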

• Approximation Algorithm
  - Hard to solve the problem exactly; even special cases are #P-hard.
  - Settle for approximate solutions: give a polytime algorithm that always finds near-optimal solutions.
  - $A$ is an $\alpha$-approximation algorithm if
    - $A$ runs in polynomial time;
    - $A(I) \le \alpha \cdot OPT(I)$ on all instances $I$, where $A(I)$ is the cost of the solution that $A$ returns on $I$.
  - $\alpha$ is called the approximation ratio of $A$.

• Previous Models Considered
  - 2-stage problems
    - Polynomial scenario model: Dye, Stougie & Tomasgard; Ravi & Sinha; Immorlica, Karger, Minkoff & Mirrokni.
    - Immorlica et al.: also consider the independent activation model, with proportional costs: (stage II cost) = $\lambda \cdot$ (stage I cost), e.g., $w_S^A = \lambda \cdot w_S$ for each set $S$, in each scenario $A$.
    - Gupta, Pál, Ravi & Sinha: black-box model, but also with proportional costs.
    - Shmoys, S (SS04): black-box model with arbitrary costs. Gave an approximation scheme for 2-stage LPs, plus a rounding procedure that reduces stochastic problems to their deterministic versions.

• Previous Models (contd.)
  - Multi-stage problems
    - Hayrapetyan, S & Tardos: $2k$-approximation algorithm for $k$-stage Steiner tree.
    - Gupta, Pál, Ravi & Sinha: also other $k$-stage problems. $2k$-approximation algorithm for Steiner tree; factors exponential in $k$ for vertex cover, facility location.
    - Both only consider proportional, scenario-dependent costs.

• Results from S, Shmoys 05
  - Give the first fully polynomial approximation scheme (FPAS) for a large class of $k$-stage stochastic linear programs for any fixed $k$.
    - Black-box model: arbitrary distribution.
    - No assumptions on costs.
    - The algorithm is the Sample Average Approximation (SAA) method.
  - First proof that SAA works for (a class of) $k$-stage LPs with poly-bounded sample size.
    - Shapiro 05: $k$-stage programs, but with independent stages.
    - Kleywegt, Shapiro & Homem-de-Mello 01: bounds for 2-stage programs.
    - Charikar, Chekuri & Pál 05: another proof that SAA works for (a class of) 2-stage programs.

• Results (contd.)
  - FPAS + rounding technique of SS04 gives approximation algorithms for $k$-stage stochastic integer programs.
    - No assumptions on distribution or costs.
    - Improve upon various results obtained in more restricted models: e.g., $O(k)$-approx. for $k$-stage vertex cover (VC), facility location.
  - Munagala has improved the factor for $k$-stage VC to 2.

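• A Linear Program for 2-stage SSC
  - $p_A$: probability of scenario $A \subseteq U$. $w_S$: stage I cost of set $S$. Let the stage II cost be $w_S^A = W_S$ for each set $S$ and scenario $A$.
  - $x_S$: 1 if set $S$ is picked in stage I. $y_{A,S}$: 1 if $S$ is picked in scenario $A$.
  - Minimize $\sum_S w_S x_S + \sum_{A \subseteq U} p_A \sum_S W_S\, y_{A,S}$
    s.t. $\sum_{S: e \in S} x_S + \sum_{S: e \in S} y_{A,S} \ge 1$ for each $A \subseteq U$, $e \in A$;
    $x_S,\ y_{A,S} \ge 0$ for each $S$, $A$.
  - Exponentially many variables and constraints.
  - Equivalent compact, convex program:
    Minimize $h(x) = \sum_S w_S x_S + \sum_{A \subseteq U} p_A f_A(x)$ s.t. $0 \le x_S \le 1$ for each $S$, where
    $f_A(x) = \min \{ \sum_S W_S\, y_{A,S} : \sum_{S: e \in S} y_{A,S} \ge 1 - \sum_{S: e \in S} x_S \text{ for each } e \in A;\ y_{A,S} \ge 0 \text{ for each } S \}$.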
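As a sanity check on the compact program, consider the smallest possible instance. This instance is an illustrative assumption, not taken from the slides: one element $e$, one set $S = \{e\}$, and a single nonempty scenario $A = \{e\}$ that occurs with probability $p$.

```latex
\begin{align*}
f_A(x) &= \min\{\, W_S\, y_{A,S} : y_{A,S} \ge 1 - x_S,\ y_{A,S} \ge 0 \,\}
        = W_S \max(0,\, 1 - x_S), \\
h(x)   &= w_S x_S + p\, W_S (1 - x_S)
        = p\, W_S + (w_S - p\, W_S)\, x_S
        \quad \text{for } 0 \le x_S \le 1 .
\end{align*}
```

Here $h$ is linear on $[0,1]$ with slope $w_S - p\,W_S$, so the optimum buys the set in stage I ($x_S = 1$) when $p\,W_S > w_S$ and defers to stage II ($x_S = 0$) when $p\,W_S < w_S$.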

• Sample Average Approximation
  - Sample Average Approximation (SAA) method:
    - Sample $N$ times from the distribution.
    - Estimate $p_A$ by $q_A$ = frequency of occurrence of scenario $A$ = $n_A / N$.
  - True problem: $\min_{x \in P} \big( h(x) = w \cdot x + \sum_{A \subseteq U} p_A f_A(x) \big)$ (P)
  - Sample average problem: $\min_{x \in P} \big( h'(x) = w \cdot x + \sum_{A \subseteq U} q_A f_A(x) \big)$ (SA-P)
  - The size of (SA-P) as an LP depends on $N$: how large should $N$ be?
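The estimation step of SAA is just empirical frequency counting. A minimal sketch, assuming the black box is a zero-argument function `draw` returning a hashable scenario (the sampler below and the name `sample_average_distribution` are hypothetical):

```python
import random
from collections import Counter

def sample_average_distribution(draw, N):
    """Estimate scenario probabilities by frequencies q_A = n_A / N."""
    counts = Counter(draw() for _ in range(N))
    return {A: n_A / N for A, n_A in counts.items()}

# Hypothetical black box: each of 3 elements appears independently w.p. 0.3.
rng = random.Random(1)

def draw():
    return frozenset(e for e in ("e1", "e2", "e3") if rng.random() < 0.3)

q = sample_average_distribution(draw, N=1000)

# Only scenarios that were actually sampled get nonzero weight, so the
# sample average problem has at most N distinct scenarios.
for A, q_A in sorted(q.items(), key=lambda item: -item[1]):
    print(sorted(A), q_A)
```

Because at most $N$ scenarios receive nonzero weight, (SA-P) can be written as an LP whose size is polynomial in $N$, which is why the sample-size question matters.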

• Sample Average Approximation (contd.)
  - Wanted result: with polynomial $N$, $x$ solves (SA-P) $\Rightarrow$ $h(x) \approx OPT$.
  - Possible approach: try to show that $h'(\cdot)$ and $h(\cdot)$ take similar values.
  - Problem: rare scenarios can significantly influence the value of $h(\cdot)$, but will almost never be sampled.
  - Key insight: rare scenarios do not much affect the optimal solution $x^*$.
    - Instead of the function value, look at how the function varies with $x$.
    - Show that the slopes of $h'(\cdot)$ and $h(\cdot)$ are close to each other.
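The key insight can be seen on a one-variable toy problem. The sketch below assumes a single set with stage I cost `w`, stage II cost `W`, and one scenario of probability `p` that needs the set, so that the objective is $w x + p\,W \max(0, 1 - x)$; all numbers are hypothetical.

```python
def h(x, w, W, p):
    """Toy objective: stage I cost plus expected recourse cost."""
    return w * x + p * W * max(0.0, 1.0 - x)

def argmin_on_grid(f, steps=100):
    """Minimizer of f over an evenly spaced grid on [0, 1]."""
    grid = [i / steps for i in range(steps + 1)]
    return min(grid, key=f)

w, W = 1.0, 10.0                  # W / w = 10
p_true, q_sampled = 0.001, 0.0    # the rare scenario is never sampled

x_true = argmin_on_grid(lambda x: h(x, w, W, p_true))
x_saa = argmin_on_grid(lambda x: h(x, w, W, q_sampled))

# Values disagree badly in relative terms (0.01 versus 0) ...
value_true = h(x_saa, w, W, p_true)
value_saa = h(x_saa, w, W, q_sampled)
# ... but the slopes on [0, 1] differ by only p * W = 0.01.
slope_true = w - p_true * W
slope_saa = w - q_sampled * W

print(x_true, x_saa, value_true, value_saa, slope_true, slope_saa)
```

The sampled objective reports a cost of zero where the true cost is positive, yet both objectives have the same minimizer because their slopes are nearly identical.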

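• Closeness-in-subgradients
  - True problem: $\min_{x \in P} \big( h(x) = w \cdot x + \sum_{A \subseteq U} p_A f_A(x) \big)$ (P)
  - Sample average problem: $\min_{x \in P} \big( h'(x) = w \cdot x + \sum_{A \subseteq U} q_A f_A(x) \big)$ (SA-P)
  - Slope $\equiv$ subgradient.
    - $d \in \mathbb{R}^m$ is a subgradient of $h(\cdot)$ at $u$ if $\forall v$, $h(v) - h(u) \ge d \cdot (v - u)$.
    - $d$ is an $\epsilon$-subgradient of $h(\cdot)$ at $u$ if $\forall v \in P$, $h(v) - h(u) \ge d \cdot (v - u) - \epsilon\, h(v) - \epsilon\, h(u)$.
  - Closeness-in-subgradients: at many points $u$ in $P$, $\exists$ a vector $d'_u$ s.t.
    (*) $d'_u$ is a subgradient of $h'(\cdot)$ at $u$, AND an $\epsilon$-subgradient of $h(\cdot)$ at $u$.
    True with high probability for $h(\cdot)$ and $h'(\cdot)$.
  - Lemma: for any convex functions $g(\cdot)$, $g'(\cdot)$, if (*) holds then $x$ solves $\min_{x \in P} g'(x)$ $\Rightarrow$ $x$ is a near-optimal solution to $\min_{x \in P} g(x)$.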
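The two definitions can be checked numerically on a finite set of test points. The sketch below is only a spot check (passing it is necessary but not sufficient for the inequality to hold on all of $P$); the function names and the example $h(x) = x^2 + 1$ on $P = [0, 2]$ are assumptions for illustration.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def is_subgradient(h, d, u, points, tol=1e-12):
    """Check h(v) - h(u) >= d.(v - u) at every test point v."""
    hu = h(u)
    return all(
        h(v) - hu >= dot(d, [vi - ui for vi, ui in zip(v, u)]) - tol
        for v in points
    )

def is_eps_subgradient(h, d, u, points, eps, tol=1e-12):
    """Check h(v) - h(u) >= d.(v - u) - eps*h(v) - eps*h(u) at each v."""
    hu = h(u)
    return all(
        h(v) - hu
        >= dot(d, [vi - ui for vi, ui in zip(v, u)]) - eps * h(v) - eps * hu - tol
        for v in points
    )

h = lambda x: x[0] ** 2 + 1.0                  # convex and nonnegative
points = [[i / 10] for i in range(21)]         # grid on P = [0, 2]
u = [1.0]

print(is_subgradient(h, [2.0], u, points))           # exact gradient at u
print(is_subgradient(h, [2.2], u, points))           # perturbed vector
print(is_eps_subgradient(h, [2.2], u, points, 0.1))  # same vector, eps = 0.1
```

The perturbed vector fails the exact inequality but satisfies the relaxed one, which is the sense in which a slightly wrong slope estimate is still usable.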

• Closeness-in-subgradients (contd.)
  - Intuition:
    - The subgradient determines the minimizer of a convex function.
    - The ellipsoid-based algorithm of SS04 for convex minimization only uses ($\epsilon$-)subgradients: it uses an ($\epsilon$-)subgradient to cut the ellipsoid at a feasible point $u$ in $P$.
    - (*) $\Rightarrow$ can run the algorithm on both $\min_{x \in P} g(x)$ and $\min_{x \in P} g'(x)$ using the same vector $d'_u$ at $u \in P$ $\Rightarrow$ the algorithm will return an $x$ that is near-optimal for both problems.
  - [Figure: the region $P$, a point $u$, the vector $d_u$, and the half-space $g(x) \le g(u)$.]

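• Proof for 2-stage SSC
  - True problem: $\min_{x \in P} \big( h(x) = w \cdot x + \sum_{A \subseteq U} p_A f_A(x) \big)$ (P)
  - Sample average problem: $\min_{x \in P} \big( h'(x) = w \cdot x + \sum_{A \subseteq U} q_A f_A(x) \big)$ (SA-P)
  - Let $\lambda = \max_S W_S / w_S$, and let $z_A \equiv$ an optimal solution to the dual of $f_A(x)$ at the point $x = u \in P$.
  - Facts from SS04:
    - A. The vector $d_u = \{d_{u,S}\}$ with $d_{u,S} = w_S - \sum_A p_A \sum_{e \in A \cap S} z_{A,e}$ is a subgradient of $h(\cdot)$ at $u$; can write $d_{u,S} = E[X_S]$, where $X_S = w_S - \sum_{e \in A \cap S} z_{A,e}$ in scenario $A$.
    - B. $X_S \in [-W_S, w_S]$ $\Rightarrow$ $Var[X_S] \le W_S^2$ for every set $S$.
    - C. If $d' = \{d'_S\}$ is a vector such that $|d'_S - d_{u,S}| \le \epsilon \cdot w_S$ for every set $S$, then $d'$ is an $\epsilon$-subgradient of $h(\cdot)$ at $u$.
  - A $\Rightarrow$ the vector $d'_u$ with components $d'_{u,S} = w_S - \sum_A q_A \sum_{e \in A \cap S} z_{A,e} = E_q[X_S]$ is a subgradient of $h'(\cdot)$ at $u$.
  - B, C $\Rightarrow$ with $poly\big(\frac{\lambda^2}{\epsilon^2} \cdot \log\frac{1}{\delta}\big)$ samples, $d'_u$ is an $\epsilon$-subgradient of $h(\cdot)$ at $u$ with probability $\ge 1 - \delta$.
  - $\Rightarrow$ polynomially many samples ensure that, with high probability, at many points $u \in P$, $d'_u$ is an $\epsilon$-subgradient of $h(\cdot)$ at $u$: this is property (*).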
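One way to see where a sample bound of this shape can come from is Hoeffding's inequality applied to the bounded variable $X_S$. This is a sketch under the assumptions that the $N$ samples are independent, that the dual solutions $z_A$ are fixed, and that there are $m$ sets; it is not necessarily the argument used in SS04. Here $d'_{u,S}$ is the average of $N$ independent copies of $X_S$ and $d_{u,S} = E[X_S]$.

```latex
\begin{align*}
X_S \in [-W_S,\, w_S]
  \;&\Rightarrow\; \text{range of } X_S \le W_S + w_S \le (1+\lambda)\, w_S, \\
\Pr\big[\, |d'_{u,S} - d_{u,S}| > \epsilon\, w_S \,\big]
  \;&\le\; 2\exp\!\left(-\frac{2N\epsilon^2 w_S^2}{(W_S + w_S)^2}\right)
  \;\le\; 2\exp\!\left(-\frac{2N\epsilon^2}{(1+\lambda)^2}\right), \\
N \;\ge\; \frac{(1+\lambda)^2}{2\epsilon^2}\,\ln\frac{2m}{\delta}
  \;&\Rightarrow\; |d'_{u,S} - d_{u,S}| \le \epsilon\, w_S
  \text{ for all } m \text{ sets, with probability} \ge 1-\delta .
\end{align*}
```

The last line uses a union bound over the $m$ sets, each failing with probability at most $\delta/m$; combined with fact C, it gives an $\epsilon$-subgradient at a single point $u$.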

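• 3-stage SSC
  - [Figure: two scenario trees, one for the true distribution and one for the sampled distribution; $T_A$ is the subtree rooted at stage II scenario $A$.]
  - The true distribution $\{p_{A,B}\}$ in $T_A$ is only estimated by the distribution $\{q_{A,B}\}$ $\Rightarrow$ the true and sample average problems solve different recourse problems for a given scenario $A$.
  - True problem: $\min_{x \in P} \big( h(x) = w \cdot x + \sum_A p_A f_A(x) \big)$ (3-P)
  - Sample avg. problem: $\min_{x \in P} \big( h'(x) = w \cdot x + \sum_A q_A g_A(x) \big)$ (3SA-P)
  - $f_A(x)$, $g_A(x)$ $\equiv$ 2-stage set-cover problems specified by the tree $T_A$.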

• Proof sketch for 3-stage SSC
  - True problem: $\min_{x \in P} \big( h(x) = w \cdot x + \sum_A p_A f_A(x) \big)$ (3-P)
  - Sample avg. problem: $\min_{x \in P} \big( h'(x) = w \cdot x + \sum_A q_A g_A(x) \big)$ (3SA-P)
  - Want to show that $h(\cdot)$ and $h'(\cdot)$ are close in subgradients. Main difficulty: $h(\cdot)$ and $h'(\cdot)$ solve different recourse problems.
  - Subgradient of $h(\cdot)$ at $u$ is $d_u$; $d_{u,S} = w_S - \sum_A p_A (\text{dual soln. to } f_A(u))$.
  - Subgradient of $h'(\cdot)$ at $u$ is $d'_u$; $d'_{u,S} = w_S - \sum_A q_A (\text{dual soln. to } g_A(u))$.
  - To show that $d'_u$ is an $\epsilon$-subgradient of $h(\cdot)$, need that (dual soln. to $g_A(u)$) is a near-optimal (dual soln. to $f_A(u)$).
  - This is a sample average theorem for the dual of a 2-stage problem!

• Proof sketch for 3-stage SSC (contd.)
  - Idea: show that the two dual objective functions are close in subgradients.
  - Problem: cannot get closeness-in-subgradients by looking at the standard exponential-size LP-dual of $f_A(x)$, $g_A(x)$.
    - Formulate a new compact non-linear dual of polynomial size.
    - An (approximate) subgradient of the dual objective function comes from a (near-)optimal solution to a 2-stage primal LP: use the earlier SAA result.
  - Recursively apply this idea to solve $k$-stage stochastic LPs.

• Summary of Results
  - Give the first approximation scheme to solve a broad class of $k$-stage stochastic linear programs for any fixed $k$.
    - Prove that the Sample Average Approximation method works for our class of $k$-stage programs.
  - Obtain approximation algorithms for $k$-stage stochastic integer problems, with no assumptions on costs or distribution.
    - $k \cdot \log n$-approx. for $k$-stage set cover.
    - $O(k)$-approx. for $k$-stage vertex cover, multicut on trees, uncapacitated facility location (FL), some other FL variants.
    - $(1+\epsilon)$-approx. for multicommodity flow.
  - Results generalize and/or improve previous results obtained in restricted $k$-stage models.

• Thank You.
