1
A Framework for Lazy Replication in P2P VoD
Bin Cheng1, Lex Stein2, Hai Jin1, Zheng Zhang2
1 Huazhong University of Science & Technology (HUST)
2 Microsoft Research Asia (MSRA)
NOSSDAV 2008, Braunschweig, Germany, May 30, 2008
2
Background
VoD, popular Internet service
- YouTube, Hulu
P2P, useful technology
- File sharing, live streaming
- BitTorrent, PPLive
GridCast with caching
- 36% decrease
- 43% departure misses
Replication in P2P VoD
Can P2P help VoD?
- Feasibility
- Performance improvement
5
Motivation -GridCast system overview
Hybrid architecture (client-server + P2P)
- Tracker: indexes all joined peers
- Source Server: stores a complete copy of every video
- Peer: fetches chunks from source servers or other peers
- Web Portal: provides the video catalog
[Figure: GridCast architecture, with labels for the tracker, source server, and web portal]
6
Motivation -trace collection
GridCast has been deployed on CERNET since May 2006
- Network (CERNET)
  • 1,500 universities, 20 million hosts
  • Good bandwidth, 2 to 100 Mbps to the desktop (core is complicated)
- Content
  • 2,000 videos
  • 48 minutes on average
  • 400 to 800 Kbps, 610 Kbps on average
7
Motivation -trace analysis
Classify misses by their causes
Chunk X does not hit in the peer cache. Why?
New content
- Never fetched by any peer
Peer departed
- Fetched by some peers, but all of them are offline
Peer evicted
- Fetched by an online peer, but evicted
Cannot connect
- Cached by some online peer that is not in the neighborhood
Insufficient bandwidth
- Cached by some neighbor, but cannot retrieve it
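The five causes above can be read as an ordered decision procedure applied to a chunk that missed in the peer cache. The following is a minimal sketch under assumptions: the names `ChunkState`, `Miss`, and `classify_miss`, and the use of peer-ID sets, are hypothetical and are not taken from the GridCast implementation.

```python
from dataclasses import dataclass, field
from enum import Enum


class Miss(Enum):
    NEW_CONTENT = "new content"
    PEER_DEPARTED = "peer departed"
    PEER_EVICTED = "peer evicted"
    CANNOT_CONNECT = "cannot connect"
    INSUFFICIENT_BANDWIDTH = "insufficient bandwidth"


@dataclass
class ChunkState:
    # Hypothetical bookkeeping for one chunk, keyed by peer ID.
    ever_fetched_by: set = field(default_factory=set)  # peers that ever fetched it
    cached_by: set = field(default_factory=set)        # peers that still hold it


def classify_miss(chunk, online, neighbors):
    """Return the cause of a miss, checking the causes in the slide's order."""
    if not chunk.ever_fetched_by:
        return Miss.NEW_CONTENT               # never fetched by any peer
    if not (chunk.ever_fetched_by & online):
        return Miss.PEER_DEPARTED             # every peer that fetched it is offline
    holders = chunk.cached_by & online
    if not holders:
        return Miss.PEER_EVICTED              # an online peer fetched it, then evicted it
    if not (holders & neighbors):
        return Miss.CANNOT_CONNECT            # cached online, but outside the neighborhood
    return Miss.INSUFFICIENT_BANDWIDTH        # a neighbor has it, but it cannot be retrieved
```

The function assumes it is only called for requests that already missed, so the final branch is reached when a neighbor holds the chunk and the transfer still failed.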
[Figure: bar chart of misses by cause, as a percentage of all played chunks (%). Values in the order extracted: new content 5.3, peer departure 27.6, peer eviction 15.6, connection issue 11.3, insufficient bandwidth 4]
Departure misses become a big issue: 43%
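The 43% figure appears to be the departure share of all misses rather than of all played chunks. The sketch below checks that arithmetic, assuming the chart values map to the categories in the order they appear (new content 5.3, peer departure 27.6, peer eviction 15.6, connection issue 11.3, insufficient bandwidth 4). The function name `miss_shares` is hypothetical.

```python
def miss_shares(misses_pct):
    """Convert per-cause miss rates (% of all played chunks) into shares of all misses."""
    total = sum(misses_pct.values())
    return {cause: value / total for cause, value in misses_pct.items()}


# Values read from the chart, in percent of all played chunks (mapping is an assumption).
misses_pct = {
    "new content": 5.3,
    "peer departure": 27.6,
    "peer eviction": 15.6,
    "connection issue": 11.3,
    "insufficient bandwidth": 4.0,
}

shares = miss_shares(misses_pct)
total_miss_pct = sum(misses_pct.values())    # about 63.8% of played chunks miss
departure_share = shares["peer departure"]   # about 0.43 of all misses
```

Under this reading, roughly 36% of played chunks hit in the peer cache, which is consistent with the 36% decrease attributed to caching earlier in the talk.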
8
Motivation -challenges and chances
Caching is not enough. Can we do better?
Replication
Challenges
- Short user sessions
- Peers depart at any time
Chances
- Unused network resources: 72% (down), 81% (up)
- Disk space: 37% of disk available
10
Replication –fundamental tradeoff
Benefit:
• Reduce departure misses
• Reduce some eviction misses if the cache is not full
Cost:
• Increase network traffic
• Increase bandwidth misses
• Increase some eviction misses if the cache is full
11
Replication -eager replication
[Figure: neighborhood of peers A, B, and C]
- Replicate all missed chunks
- Use all of the unused bandwidth
12
Replication -lazy replication
Based on two predictors
- Peer departure predictor
- Chunk request predictor
- Lazy-oracle and lazy-simple
Lazy factor
- How much of the remaining bandwidth can be used
Target peer selection
- Random, sequential, file locality first
[Figure: neighborhood of peers A, B, and C; axis labels: increasing chunk requests, increasing online time]
13
Replication -peer departure predictor
Based on the observation of online time
- 50% of user sessions last less than 10 minutes
- A peer with a longer online time is likely to stay longer
Simple departure predictor
- online time <= 10 minutes: leave
- online time > 10 minutes: stay
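The simple departure predictor is a single threshold test on how long the peer has been online. A minimal sketch follows; the function and constant names are hypothetical, and only the 10-minute threshold comes from the slide.

```python
DEPARTURE_THRESHOLD_MINUTES = 10.0  # threshold stated on the slide


def predict_departure(online_minutes, threshold=DEPARTURE_THRESHOLD_MINUTES):
    """Return True if the peer is predicted to leave, False if predicted to stay."""
    # Peers online for at most the threshold are predicted to leave;
    # peers that have stayed longer are predicted to keep staying.
    return online_minutes <= threshold
```

For example, `predict_departure(3.0)` predicts a departure, while `predict_departure(25.0)` predicts that the peer stays.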
14
Replication -chunk request predictor
Chunks requested recently are more likely to be requested in the near future
Simple chunk request predictor
- Use the chunk access history in the last several hours
- Give higher weight to the recent requests
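One way to give recent requests a higher weight is to score each chunk by a decayed count of its requests inside a history window. The sketch below uses exponential decay, which is an assumption made for illustration: the slides do not state the actual weighting. The names `chunk_scores`, `window_s`, and `half_life_s` are hypothetical, and the default window is set to the one hour used in the simulation setup.

```python
from collections import defaultdict


def chunk_scores(requests, now, window_s=3600.0, half_life_s=900.0):
    """Score chunks from (chunk_id, timestamp) pairs; higher means more likely to be requested.

    Requests older than window_s seconds (or timestamped after `now`) are ignored.
    Each remaining request contributes 0.5 ** (age / half_life_s), so a request
    that is one half-life old counts half as much as one made right now.
    """
    scores = defaultdict(float)
    for chunk_id, timestamp in requests:
        age = now - timestamp
        if 0 <= age <= window_s:
            scores[chunk_id] += 0.5 ** (age / half_life_s)
    return dict(scores)


def rank_chunks(requests, now, **kwargs):
    """Return chunk IDs ordered from highest to lowest score."""
    scores = chunk_scores(requests, now, **kwargs)
    return sorted(scores, key=scores.get, reverse=True)
```

A replication policy could then walk the list returned by `rank_chunks` and replicate the top-ranked chunks first.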
[Figure: request timeline t divided at "now" into history and future, with a popularity value per chunk]
15
Performance Evaluation -simulation setup
Trace-driven
- 1GB
- Realized bandwidth
- Last 1 hour of history for the chunk request predictor
- 10-minute interval for the peer departure predictor
- Use the existing neighborhood
Metrics
- Benefit: decrease in chunks served by the source servers
- Cost: increase in chunks replicated between peers
- Efficiency: Benefit / Cost
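The three metrics can be computed from chunk counts taken from two simulation runs, one without replication and one with it. This is a minimal sketch: the function name and parameter names are hypothetical, and the numbers in the usage note are made up for illustration and are not results from the trace.

```python
def replication_metrics(server_chunks_before, server_chunks_after, replicated_chunks):
    """Return (benefit, cost, efficiency) for one replication configuration.

    benefit    = decrease in chunks served by the source servers
    cost       = chunks replicated between peers (zero in the no-replication baseline)
    efficiency = benefit / cost, or None when nothing was replicated
    """
    benefit = server_chunks_before - server_chunks_after
    cost = replicated_chunks
    efficiency = benefit / cost if cost > 0 else None
    return benefit, cost, efficiency
```

With illustrative counts, `replication_metrics(1000, 850, 500)` gives a benefit of 150 chunks, a cost of 500 chunks, and an efficiency of 0.3, meaning each replicated chunk saves 0.3 chunks of server load.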
16
Performance Evaluation -exploring configurations
[Figure: number of chunks per category (replication, SS load, new content, departure, eviction, connection, bandwidth), comparing random, sequential, and file-locality-first target selection]
File locality first achieves the best performance
17
Performance Evaluation -lazy factor
- More chunks have their replication delayed until the peer leaves
- Smaller lazy factor, more efficient
[Figure: number of chunks per category (replication, SS load, new content, departure, eviction, connection, bandwidth), for lazy factors a = 0.0, 0.1, 0.5, and 1.0]
Lower lazy factor is better
18
Performance Evaluation -comparison
Lazy-simple is close to lazy-oracle, in terms of benefits
Lazy-simple is better than eager, in terms of efficiency
Lazy-simple decreases server load by 15%
[Figure: number of chunks per category (replication, SS load, new content, departure, eviction, connection, bandwidth), comparing:
- before replication
- eager replication (efficiency = 0.21)
- lazy-oracle, a = 0.0 (efficiency = 0.78)
- lazy-simple, a = 0.0 (efficiency = 0.33)]
19
Conclusions
1. We identify departure misses as a major issue for P2P VoD with caching
2. With two simple predictors, lazy replication can decrease server load by 15%
3. Lazy replication is more efficient than eager replication