# Boosted Sampling: Approximation Algorithms for Stochastic Problems

• Published on 23-Jan-2016

<ul><li><p>Boosted Sampling: Approximation Algorithms for Stochastic Problems. Martin Pál</p><p>Joint work with Anupam Gupta, R. Ravi, and Amitabh Sinha</p><p>Boosted Sampling</p></li>
<li><p>Infrastructure Design Problems</p><p>Build a solution Sol of minimal cost, so that every user is satisfied: minimize cost(Sol) subject to satisfied(j, Sol) for j = 1, 2, …, n.</p><p>For example, Steiner tree. Sol: set of edges to build. satisfied(j, Sol) iff there is a path from terminal j to the root. cost(Sol) = Σ<sub>e∈Sol</sub> c<sub>e</sub>.</p></li>
<li><p>Infrastructure Design Problems</p><p>Assumption: Sol is a set of elements, and cost(Sol) = Σ<sub>elem∈Sol</sub> cost(elem).</p><p>Facility location: satisfied(j) iff j is connected to an open facility. Vertex Cover: satisfied(e={uv}) iff u or v is in the cover. Steiner network: satisfied(j) iff j's terminals are connected by a path. Cut problems: satisfied(j) iff j's terminals are disconnected.</p></li>
<li><p>Dealing with uncertainty</p><p>Often, we do not know the exact requirements of users.</p><p>Building in advance reduces cost, but we do not have enough information. As time progresses, we gain more information about the demands, but building under time pressure is costly.</p><p>Tradeoff between information and cost.</p></li>
<li><p>The model</p><p>Two-stage stochastic model with recourse:</p><p>On Monday, elements are cheap, but we do not know how many/which clients will show up. We can buy some elements.</p><p>On Tuesday, clients show up. Elements are now more expensive (by an inflation factor σ). 
We have to buy more elements to satisfy all clients.</p></li>
<li><p>The model</p><p>Two-stage stochastic model with recourse:</p><p>Find Sol1 ⊆ Elems and Sol2 : 2<sup>Users</sup> → 2<sup>Elems</sup> to minimize cost(Sol1) + σ·E<sub>π(T)</sub>[cost(Sol2(T))] subject to satisfied(j, Sol1 ∪ Sol2(T)) for all sets T ⊆ Users and all j ∈ T. Here cost(·) is measured at Monday prices, π is the distribution over client sets, and σ is the inflation factor.</p></li>
<li><p>Related work</p><p>Stochastic linear programming dates back to the work of Dantzig and Beale in the mid-50s. Only moderate progress on stochastic IP/MIP. Scheduling literature: various distributions of job lengths. Single-stage stochastic: maybecast [Karger&amp;Minkoff00], bursty connections [Kleinberg,Rabani&amp;Tardos00]. Stochastic versions of NP-hard problems (restricted π) [Ravi &amp; Sinha 03], [Immorlica, Karger, Minkoff &amp; Mirrokni 04]. Extensive literature on each deterministic problem.</p></li>
<li><p>Our work</p><p>We propose a simple but powerful framework to find approximate solutions to two-stage stochastic problems using approximation algorithms for their deterministic counterparts. For a number of problems, including Steiner Tree, Facility Location, Single Sink Rent or Buy, and Steiner Forest (weaker model), our framework gives a constant approximation.</p><p>Analysis is based on strict cost sharing, developed by [Gupta,Kumar,P.&amp;Roughgarden03].</p></li>
<li><p>No restriction on distributions</p><p>Previous works often assume special distributions. Scenario model: there are k sets of users (scenarios); each scenario T<sub>i</sub> has probability p<sub>i</sub> [Ravi &amp; Sinha 03]. Independent decisions model: each client j appears with prob. p<sub>j</sub> independently of others [Immorlica et al 04].</p><p>In contrast, our scheme works for arbitrary distributions (although the independent coinflips model sometimes allows us to prove improved guarantees).</p></li>
<li><p>The Framework</p><p>1. Boosted Sampling: Draw σ samples of clients S<sub>1</sub>, S<sub>2</sub>, …, S<sub>σ</sub> from the distribution π. 2. 
Build the first stage solution Sol1: use Alg to build a solution for clients S = S<sub>1</sub> ∪ S<sub>2</sub> ∪ … ∪ S<sub>σ</sub>. 3. Actual set T of clients appears. To build the second stage solution Sol2, use Alg to augment Sol1 to a feasible solution for T. (Alg is a given approximation algorithm for the deterministic problem.)</p></li>
<li><p>Performance Guarantee</p><p>Theorem: Let P be a sub-additive problem with an α-approximation algorithm that admits β-strict cost sharing. Then Stochastic(P) has an (α+β)-approximation.</p><p>Corollary: Stochastic Steiner Tree, Facility Location, Vertex Cover, and Steiner Network (restricted model) have constant factor approximation algorithms.</p><p>Corollary: Deterministic and stochastic Rent-or-Buy versions of these problems have constant approximations.</p></li>
<li><p>First Stage Cost</p><p>Recall: we sample S<sub>1</sub>, S<sub>2</sub>, …, S<sub>σ</sub> from π, and use Alg to build a solution Sol1 feasible for S = ∪<sub>i</sub> S<sub>i</sub>.</p><p>Lemma: E[cost(Sol1)] ≤ α·Z*, where the optimal cost is Z* = cost(Opt1) + σ·E[cost(Opt2(T))].</p></li>
<li><p>Second stage cost</p><p>After Stage 2, we have a solution for S′ = S<sub>1</sub> ∪ … ∪ S<sub>σ</sub> ∪ T. Let Sol = Opt1 ∪ [Opt2(S<sub>1</sub>) ∪ … ∪ Opt2(S<sub>σ</sub>) ∪ Opt2(T)]. Then E[cost(Sol)] ≤ cost(Opt1) + (σ+1)·E[cost(Opt2(S<sub>i</sub>))] ≤ ((σ+1)/σ)·Z*.</p><p>T is responsible for a 1/(σ+1) part of Sol. If built in Stage 1, it would cost Z*/σ. Need to build it in Stage 2 ⇒ pay Z*. Problem: we do not know T when building a solution for S<sub>1</sub> ∪ … ∪ S<sub>σ</sub>.</p></li>
<li><p>Idea: cost sharing</p><p>Scenario 1: Pretend to build a solution for S′ = S ∪ T. Charge each j ∈ S′ some amount ξ(S′, j).</p><p>Scenario 2: Build a solution Alg(S) for S. Augment Alg(S) to a valid solution for S′ = S ∪ T.</p><p>Assume: Σ<sub>j∈S′</sub> ξ(S′, j) ≤ Opt(S′). We argued: E[Σ<sub>j∈T</sub> ξ(S′, j)] ≤ Z*/σ (by symmetry). Want to prove: augmenting cost in Scenario 2 ≤ β·Σ<sub>j∈T</sub> ξ(S′, j).</p></li>
<li><p>Cost sharing function</p><p>Input: instance of P and set of users S. Output: cost share ξ(S, j) of each user j ∈ S.</p><p>Example: Build a spanning tree on S ∪ {root}. Let ξ(S, j) = cost of parental edge/2. Note: 2·Σ<sub>j∈S</sub> ξ(S, j) = cost of MST(S), and Σ<sub>j∈S</sub> ξ(S, j) ≤ cost of Steiner(S).</p></li>
<li><p>What 
properties of ξ(·,·) do we need?</p><p>(P1) Good approximation: cost(Alg(S)) ≤ α·Opt(S). (P2) Cost shares do not overpay: Σ<sub>j∈S</sub> ξ(S, j) ≤ Opt(S). (P3) Strictness: for any S, T ⊆ Users, cost of Augment(Alg(S), T) ≤ β·Σ<sub>j∈T</sub> ξ(S ∪ T, j).</p><p>Second stage cost = σ·cost(Augment(Alg(∪<sub>i</sub> S<sub>i</sub>), T)) ≤ σβ·Σ<sub>j∈T</sub> ξ(∪<sub>i</sub> S<sub>i</sub> ∪ T, j), and E[Σ<sub>j∈T</sub> ξ(∪<sub>i</sub> S<sub>i</sub> ∪ T, j)] ≤ Z*/σ. Hence E[second stage cost] ≤ σβ·Z*/σ = β·Z*.</p></li>
<li><p>Strictness for Steiner Tree</p><p>Alg(S) = min-cost spanning tree MST(S). ξ(S, j) = cost of parental edge/2 in MST(S). Augment(Alg(S), T): for all j ∈ T, build its parental edge in MST(S ∪ T).</p><p>Alg is a 2-approx for Steiner Tree, and ξ is a 2-strict cost sharing function for Alg. Theorem: We have a 4-approx for Stochastic Steiner Tree.</p></li>
<li><p>Vertex Cover</p><p>Users: edges. Solution: a set of vertices that covers all edges. Edge {uv} is covered if at least one of u, v is picked.</p><p>Alg: Edges uniformly raise contributions. Once a vertex can be paid for by its neighboring edges ⇒ freeze all edges adjacent to it and buy the vertex. Edges may be paying for both endpoints ⇒ 2-approximation. Natural cost shares: ξ(S, e) = contribution of e.</p></li>
<li><p>Strictness for Vertex Cover</p><p>[Figure: example instance with labels 1, n, and n+1]</p></li>
<li><p>Making Alg strict</p><p>Alg′: run Alg on the same input, then buy all vertices that are at least 50% paid for.</p><p>½ of each vertex paid for, each edge paying for two vertices ⇒ still a 4-approximation.</p></li>
<li><p>Why should strictness hold?</p><p>Alg′: run Alg on the same input, then buy all vertices that are at least 50% paid for.</p><p>Suppose vertex v is fully paid for in Alg(S ∪ T). If Σ<sub>j∈T</sub> ξ<sub>j</sub> ≥ ½ cost(v), then T can pay for ½ of v in the augmentation step. If Σ<sub>j∈S</sub> ξ<sub>j</sub> ≥ ½ cost(v), then v would be open in Alg(S). (Almost: we need to worry that Alg(S ∪ T) and Alg(S) behave differently.)</p><p>[Figure: run of Alg(S ∪ T) around vertex v; S = blue edges, T = red edges]</p></li>
<li><p>Metric facility location</p><p>Input: a set of facilities and a set of cities living in a metric space. 
Solution: a set of open facilities and a path from each city to an open facility.</p><p>Off-the-shelf components: a 3-approx. algorithm [Mettu&amp;Plaxton00]. It turns out that the cost sharing function of [P.&amp;Tardos03] is 5.45-strict.</p><p>Theorem: There is an 8.45-approx for stochastic FL.</p></li>
<li><p>Steiner Network</p><p>client j = pair of terminals s<sub>j</sub>, t<sub>j</sub>. satisfied(j): s<sub>j</sub>, t<sub>j</sub> connected by a path.</p><p>2-approximation algorithms are known ([Agrawal,Klein&amp;Ravi91], [Goemans&amp;Williamson95]), but they do not admit strict cost sharing.</p><p>[Gupta,Kumar,P.,Roughgarden03]: a 4-approx algorithm that admits 4-uni-strict cost sharing. Theorem: 8-approx for Stochastic Steiner Network in the independent coinflips model.</p></li>
<li><p>The Buy at Bulk problem</p><p>client j = pair of terminals s<sub>j</sub>, t<sub>j</sub>. Solution: an s<sub>j</sub>, t<sub>j</sub> path for j = 1, …, n. cost(e) = c<sub>e</sub>·f(# paths using e).</p><p>Rent or Buy: two pipes. Rent: $1 per path. Buy: $M, unlimited # of paths.</p></li>
<li><p>Special distributions: Rent or Buy</p><p>Stochastic Steiner Network: client j = pair of terminals s<sub>j</sub>, t<sub>j</sub>. satisfied(j): s<sub>j</sub>, t<sub>j</sub> connected by a path.</p><p>Suppose π({j}) = 1/n and π(S) = 0 if |S| ≠ 1. Then Sol2({j}) is just a path!</p></li>
<li><p>Rent or Buy</p><p>The trick works for any problem P (can solve Rent-or-Buy Vertex Cover, ...). These techniques give the best approximation for Single-Sink Rent-or-Buy (3.55 approx [Gupta,Kumar,Roughgarden03]) and Multicommodity Rent or Buy (8-approx [Gupta,Kumar,P.,Roughgarden03], 6.83-approx [Becchetti, Konemann, Leonardi,P.04]).</p><p>Bootstrap to stochastic Rent-or-Buy: a 6-approximation for Stochastic Single-Sink RoB, and a 12-approx for Stochastic Multicommodity RoB (indep. coinflips).</p></li>
<li><p>What if σ is also stochastic?</p><p>Suppose σ is also a random variable, with π(S, σ) the joint distribution.</p><p>For i = 1, 2, …, σ<sub>max</sub> do: sample (S<sub>i</sub>, σ<sub>i</sub>) from π; with prob. 
σ<sub>i</sub>/σ<sub>max</sub> accept S<sub>i</sub>. Let S be the union of the accepted S<sub>i</sub>'s. Output Alg(S) as the first stage solution.</p></li>
<li><p>Multistage problems</p><p>Three-stage stochastic Steiner Tree: On Monday, edges cost 1. We only know the probability distribution π. On Tuesday, results of a market survey come in. We gain some information I, and update π to the conditional distribution π|I. Edges cost σ<sub>1</sub>. On Wednesday, clients finally show up. Edges now cost σ<sub>2</sub> (σ<sub>2</sub> &gt; σ<sub>1</sub>), and we must buy enough to connect all clients.</p><p>Theorem: There is a 6-approximation for three-stage stochastic Steiner Tree (in general, a 2k-approximation for the k-stage problem).</p></li>
<li><p>Conclusions</p><p>We have seen a randomized algorithm for a stochastic problem: using sampling to solve problems involving randomness. Do we need strict cost sharing? Our proof requires strictness; maybe there is a weaker property? Maybe we can prove guarantees for arbitrary subadditive problems? Prove strictness for Steiner Forest; so far we have only uni-strictness. Cut problems: can we say anything about Multicut? Single-source multicut?</p></li>
<li><p>The End</p><p>Note that if π consists of a small number of scenarios, this can be transformed to a deterministic problem.</p><p>Suppose you are a telecom company and plan to build a network. You need to start planning and building early to reduce the cost; however, not all information about the structure of the demand may be available. If you wait until the information becomes available, you will be able to make much better decisions. Unfortunately, your cost may go up significantly, because you will need to pay more for installing cables on very short notice. 
</p><p>First stage solution Sol1. Recourse solution Sol2.</p><p>Stochastic problems are in general hard; a popular approach is to assume that π consists of a small number (poly, 3) of possible scenarios.</p><p>There is a huge amount of work on stochastic programming, about which, I have to admit, I know far less than I would like to: heuristics and empirical results; works that try to approximate the distribution; structural results (convexity) on the second stage cost.</p><p>Suppose you have a problem for which you have an algorithm for solving (approximating) it. A natural thing to do would be to sample just once in step 1, to get an estimate of the distribution π. However, this would not take into account that the second stage is σ times more expensive.</p><p>We'll define Rent-or-Buy problems later. Add picture.</p></li></ul>
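
The three steps of the framework, together with the Steiner Tree rule "Alg = MST, Augment = buy the parental edge of each new client", can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: clients are points in the Euclidean plane, σ is an integer, and the distribution π is stood in for by a hypothetical independent coin-flip sampler. All function names (`mst_parents`, `first_stage`, `augment`, `coin_flip_sampler`, `connects_to_root`, `cost`) are illustrative and do not come from the talk.

```python
import math
import random


def mst_parents(points, root):
    """Prim's algorithm on the complete Euclidean graph over `points`.

    Returns {node: parent} for every node except the root, i.e. the
    'parental edge' of each node in the MST rooted at `root`.
    """
    best = {p: (math.dist(p, root), root) for p in points if p != root}
    parent = {}
    while best:
        v = min(best, key=lambda p: best[p][0])  # closest node to the tree
        parent[v] = best.pop(v)[1]
        for u in best:                           # relax distances via v
            d = math.dist(u, v)
            if d < best[u][0]:
                best[u] = (d, v)
    return parent


def coin_flip_sampler(clients, p):
    """Assumed stand-in for the distribution pi: independent coin flips."""
    return lambda rng: {c for c in clients if rng.random() < p}


def first_stage(root, sample, sigma, rng):
    """Steps 1-2: draw sigma samples, build MST on their union."""
    S = set()
    for _ in range(sigma):
        S |= sample(rng)
    S.discard(root)
    edges = set(mst_parents(S | {root}, root).items())
    return S, edges


def augment(root, S, T):
    """Step 3: each new client buys its parental edge in MST(S ∪ T)."""
    parents = mst_parents(S | T | {root}, root)
    return {(v, parents[v]) for v in T if v not in S and v != root}


def cost(edges):
    """Total Euclidean length of a set of edges (first-stage prices)."""
    return sum(math.dist(u, v) for u, v in edges)


def connects_to_root(root, edges, clients):
    """Check that every client reaches the root using the given edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    seen, stack = {root}, [root]
    while stack:
        x = stack.pop()
        for y in adj.get(x, ()):
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return all(c in seen for c in clients)
```

With `S, sol1 = first_stage(root, sample, sigma, rng)` and `sol2 = augment(root, S, T)` for the realized client set `T`, the total cost in the two-stage model is `cost(sol1) + sigma * cost(sol2)`. The union `sol1 | sol2` connects every client in `T` to the root: following parental edges from a new client either reaches the root directly or reaches a sampled client, which the first-stage tree already connects.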