1/38
Randomized Weighted Paging
(Online Algorithms meet Linear Programming)
Nikhil Bansal
IBM Research
Niv Buchbinder
Technion,
Israel
Seffi Naor
Technion,
Israel
2/38
Caching / Paging
Browser cache
web
CPU cache
3/38
The Paging/Caching Problem
Set of pages {1,2,…,n} .
Cache can hold k << n pages.
Request sequence of pages 1, 6, 4, 1, 4, 7, 6, 1, 3, …
a) If requested page already in cache, no penalty.
b) Else, cache miss. Need to fetch page in cache
(possibly) evicting some other page.
Goal: Minimize the number of cache misses.
Main Question: Upon a request, which page to evacuate?
4/38
Measuring Algorithm Quality
Several natural page replacement algorithms: Least Recently used, Least Freq. Used, FIFO, …
How to measure their performance?
Statistical or Queueing Theoretic approach:
Assume requests generated by some statistical process.
Sampled from a probability distribution, Markov Chain, …
Analyze an algorithm’s performance under this distribution.
5/38
Competitive Analysis
No restriction on input sequence.
Competitive ratio (Alg): maxI Alg(I) / OPT(I)
OPT(I): The best possible offline solution for I
Example: Stock Market
Make decisions based on current knowledge,
But compare with best possible outcome in hindsight.
Can we do anything at all. Surprisingly, Yes.
Sleator Tarjan
(1985)
6/38
Example: Ski-Rental Problem
Problem: It costs $500 to buy skis, and $25 to rent. I don’t know how often I will ski. What strategy to use every time I go skiing?
Answer: Rent first 19 times, then buy the 20th time you go skiing. (20 = 500/25).
Never worse than twice off. Competitive ratio = 2.(In worst case, go 20 times, pay both renting and buying costs)
7/38
Randomized Competitive Analysis
Randomized Algorithm: Can toss coins and act accordingly
Expected Competitive Ratio : maxI E[Alg(I)] / OPT(I)
?
Game between adversary and algorithmBad instance
8/38
Example: Ski-Rental Problem
Problem: It costs $500 to buy skis, and $25 to rent. I don’t know how often I will ski. What strategy to use every time I go skiing?
Answer: pt: Probability of buying on day t.
p1 + p2 + … + pt ¼ (et/b-1 ) / (e-1)
Can achieve a ratio of e/(e-1) = 1.58 (no matter what adversary does, our expected cost within 1.58 times)
9/38
Outline
1) Competitive Analysis
2) Paging Problem: History
3) Linear Programming Framework
4) LP Duality and Proofs
5) Weighted Paging
6) Conclusions
10/38
Paging Problem
Set of pages {1,2,…,n} . Cache can hold k pages.
Request sequence of pages 1, 6, 4, 1, 4, 7, 6, 1, 3, …
a) If requested page already in cache, no penalty.
b) Else, cache miss. Need to fetch page in cache
(possibly) evicting some other page.
Goal: Minimize cache misses.
Historically, led to developments in competitive analysis.
11/38
Previous Results: Paging
Paging (Deterministic) [Sleator Tarjan 85]:
• Any det. algorithm >= k-competitive.
• LRU is k-competitive (also other algorithms)
• LRU is k/(k-h+1)-competitive if optimal has cache of size h · k.
Paging (Randomized):
• Rand. Marking O(log k) [Fiat, Karp, Luby, McGeoch, Sleator, Young 91].
• Lower bound Hk [Fiat et al. 91], tight results known.
• O(log(k/k-h+1))-competitive algorithm if optimal hascache of size h · k [Young 91]
12/38
The Weighted Paging Problem
One small change:
• Each page i has a different fetching cost w(i).• Models scenarios where cost of bringing pages is not
uniform:
Main memory, disk, internet …
Goal • Minimize the total cost of cache misses.
web
13/38
Weighted Paging (Previous Work)
Lower bound k
LRU k competitive
k/(k-h+1) if opt’s cache size h
k-competitive [Chrobak, Karloff, Payne, Vishwanathan 91]
k/(k-h+1) [Young 94]
O(log k) Randomized Marking
O(log k/(k-h+1))
O(log k) for two distinct weights [Irani 02]
No o(k) algorithm known for even 3 distinct weights.
Paging Weighted Paging
Det
erm
inis
tic
Ran
do
miz
ed
14/38
The k-server Problem
• k servers lie in an n-point metric space.
• Requests arrive at points.
• To serve request: Must move some server there.
Goal: Minimize total distance traveled.
• Paging = k-server on a uniform metric. (every page is a point, page in cache iff server on the point)
• Weighted paging = k-server on a weighted star metric.
Lower Bound: (log k) (widely believed right answer)
No < k competitive algorithms known, even for very simple spaces.
15/38
Our Results
Weighted Paging (Randomized):
• O(log k)-competitive algorithm for weighted paging.
• O(log (k/k-h+1))-competitive if opt’s cache size h<k.
Much simpler than previous approaches.
Based on linear programming approach pioneered by
Buchbinder and Naor.
A general technique to design randomized Algorithms.
16/38
Outline
1) Competitive Analysis
2) Paging Problem: History
3) Linear Programming Framework
4) LP Duality and Proofs
5) Weighted Paging
6) Conclusions
17/38
Linear Programming
Linear constraints, linear objective
Min x1 – 3 x2 + 2 x3
x1 – x2 + 7 x3 <= 2
x2 - x3 >= 0
x1 + 3 x2 + x3 >= -3
Polynomial time solvable. Lies at the heart of optimization.
18/38
An Abstract Online Problem
min 3 x1 + 5 x2 + x3 + 4 x4 + …
2 x1 + x3 + x6 + … ¸ 3x3 + x14 + x19 + … ¸ 8x2 + 7 x4 + x12 + … ¸ 2
Goal: Find feasible solution x* with min cost.
Requirements: 1) Upon arrival constraint must be satisfied2) Cannot decrease a variable (online nature)
Covering LP (non-negative entries)
19/38
Example
min x1 + x2 + … + xn
x1 + x2 + x3 + … + xn ¸ 1
x2 + x3 + … + xn ¸ 1
x3 + … + xn ¸ 1
…
xn ¸ 1
Online ¸ ln n (1+1/2+ 1/3+ … + 1/n)
Opt = 1 ( xn=1 suffices)
Set all xi to 1/n
Increase x2 ,x3,…,xn to 1/n-1
…
Increase xn to 1
20/38
Powerful Abstraction
The online covering LP problem (and its dual packing counterpart) is a powerful framework
Ski-Rental, Adword auctions, Dynamic TCP acknowledgement, Online Routing, Load Balancing, Congestion Minimization, Caching, Online Matching, Online Graph Covering, Parking Permit Problem, …
Unified Framework for various previously studied problems.Exposes underlying structure / gives improved guaranteesfor many problems.
But let first see how to model problems.
21/38
Ski Rental – Integer Program
Subject to:
For each day i:
1 - Rent on day i
0 - Don't rent on day i iz
1 - Buy
0 - Don't Buyx
1
mink
ii
Bx z
1ix z
, {0,1}ix z
(either buy or rent)
22/38
General Covering/Packing Results
For a {0,1} covering/packing matrix: [Buchbinder Naor 05]
– Competitive ratio O(log D) – Can get e/e-1 for ski rental and other problems.(D – max number of non-zero entries in a constraint).
Remarks:• Fractional solutions
• Number of constraints/variables can be exponential.
• There can be a tradeoff between the competitive ratio and the factor by which constraints are violated.
Fractional solution ! randomized algorithm (online rounding)
23/38
General Covering/Packing Results
For a general covering/packing matrix [BN05] :
Covering: – Competitive ratio O(log n) (n – number of variables).
Packing: – Competitive ratio O(log n + log [a(max)/a(min)])
a(max), a(min) – max/min non-zero entry
Remarks:• Results are tight.
• Can add “box” constraints to covering LP (e.g. x · 1)
24/38
Outline
1) Competitive Analysis
2) Paging Problem: History
3) Linear Programming Framework
4) LP Duality and Proofs
5) Weighted Paging
6) Conclusions
25/38
Duality
Min 3 x1 + 4 x2
x1 + x2 >= 3
x1 + 2 x2 >= 5
Want to convince someone thatthere is a solution of value 12.
Easy, just demonstrate a solution, x2 = 3
26/38
Duality
Min 3 x1 + 4 x2
x1 + x2 >= 3
x1 + 2 x2 >= 5
Want to convince someone thatthere is no solution of value 10.
How?
2 * first eqn + second eqn 3 x1 + 4 x2 >= 11
LP Duality Theorem: This seemingly ad hoc trick always works!
27/38
LP Duality
Min cj xj
j aij xj ¸ bi
So, for any y ¸ 0 satisfying i aij yi · cj for all ij xj cj ¸ i yi bi
Equality when Complementary Slacknessi.e. yi > 0 (only if corresponding primal constraint is tight) xi > 0 (only if corresponding dual constraint is tight)
i yi j aij xj ¸ i yi bi
j xj ( i aij yi ) ¸ i yi bi
Dual LPDual cost
Linear combination
(y ¸ 0)
28/38
Generic Primal-Dual Approach
min cx (primal) max b y (dual)
Ax ¸ b At y · c
x ¸ 0 y ¸ 0
Generic Primal Dual Algorithm:
0) Start with x=0, y=0 (primal infeasible, dual feasible)
1) Increase dual and primal together,
s.t. if dual cost increases by 1, primal increases by · c2) If both dual and primal feasible ) c approximate solution
29/38
Key Idea for Online Primal Dual
Primal: Min i ci xi Dual
Step t, new constraint: New variable yt
a1x1 + a2x2 + … + ajxj ¸ bt + bt yt in dual objective
How much: xi ? yt ! yt + 1 (additive update)
primal cost =
dx/dy proportional to x so, x varies as exp(y)
= Dual Cost
30/38
How to initialize
A problem: dx/dy is proportional to x, but x=0 initially.
So, x will remain equal to 0 ?
Answer: Initialize to 1/n.
When: Complementary slackness tells us that x > 0 only if dual constraint corresponding to x is tight.
Set x=1/n when its dual constraint becomes tight.
31/38
The Algorithm
Min j cj xj
j aij xj ¸ bi
On arrival of i-th constraint, Initialize yi=0 (dual var. for constraint)
If current constraint unsatisfied, gradually increase yi
If xj =0, set xj = 1/n when i aij yi = cj (dual tight)
else update xj exponentially as 1/n ¢ exp( (i aij yi / cj) - 1 )
Proof: 1) Primal Cost · Dual Cost 2) Dual solution violated by O(log n) factor.
32/38
Outline
1) Competitive Analysis
2) Paging Problem: History
3) Linear Programming Framework
4) LP Duality and Proofs
5) Weighted Paging
6) Conclusions
33/38
Fractional Weighted Paging
Model:
• Fractions of pages are kept in cache:
probability distribution over pages p1,…,pn
• The total sum of fractions of pages in the cache is at most k.
• If pi changes by , cost = w(i)
k units of cache
34/38
Weight Paging – Linear Program
Time line
Pg i Pg iPg i
At any time step t, can have at most k such intervals.
x(i,j): How much interval (i,j) evacuated thus far
If interval present, no cache miss.
Equivalently, at least n-k intervals must be absent
Cost = i w(i) j x(i,j)
B(t): number of distinct pages requested by time t
tPg i’ Pg i’Pg i’
0 · x(i,j) · 1 i : i pt
x(i,r(i,t)) ¸ n-k 8 t
(i,1)(i,2)
35/38
Direct application of Primal-Dual Framework gives
O(log n) competitive ratio.
Specialized tricks to obtain an O(log k) ratio
(details omitted)
Thm: This gives the fractional O(log k) competitive algorithm for weighted paging.
36/38
Corresponding Dual constraint
Generalizes Randomized Marking
( , )x i j
1/k
1
Dual is tight Dual violated by O(log k)
Page fully in memory (marked)
Page fully evacuated
Page is “unmarked”
0
37/38
Need for careful rounding
K=2, Pages A,B,C,D A,B have wt. 1, C,D have wt. M
LP state: (1/2,1/2,1/2,1/2) Cache state: ½ (A,B) + ½ (C,D)
New LP state: (1,0,1/2,1/2) Only consistent cache state: ½ (A,C) + ½ (A,D)
(need to move C or D, incur cost >> than LP cost)
Thm: Can maintain state s.t. only incur O(1) times LP cost.
Together with fractional O(log k) result, implies an O(log k)
competitive algorithm for weighted paging.
38/38
Concluding Remarks
Primal-dual gives simple unifying framework for caching.
(LP needed only for analysis, algorithm much simpler)
Can also give O(log2 k) compet. algorithm for general caching.
(Pages have both arbitrary sizes and weights)
Extend (partially) beyond covering/ packing problems. Apply to online learning with experts.
Future Directions:
1. O(log k) algorithms for k-server. (or even < k for a start ?)
2. Unified theory connecting online learning/prediction and competitive analysis.
39/38
Thank you