CoBlitz: A Scalable Large-file Transfer Service
(COS 461)
KyoungSoo Park, Princeton University
Large-file Distribution
• Increasing demand for large files
  • Movie or software releases
  • On-line movie downloads
  • Linux distributions
• Files are 100MB ~ tens of GB
• One-to-many downloads

How to serve large files to many clients?
• Content Distribution Network (CDN)?
• Peer-to-peer system?
What CDNs Are Optimized For
Most Web files are small (1KB ~ 100KB)
Why Not Web CDNs?
• Whole-file caching in participating proxies
  • Optimized for ~10KB objects
  • 2GB = 200,000 x 10KB
• Memory pressure
  • Working sets do not fit in memory
  • Disk access is 1000 times slower
• Waste of resources
  • More servers needed
  • Provisioning is a must
Peer-to-Peer?
• BitTorrent takes up ~30% of Internet BW
1. Download a “torrent” file
2. Contact the tracker
3. Enter the “swarm” network
4. Chunk exchange policy
- Rarest chunk first or random
- Tit-for-tat: incentive to upload
- Optimistic unchoking
5. Validate the checksums
[Diagram: a new peer downloads the torrent file, contacts the tracker, and joins the swarm, exchanging chunks up/down with other peers.]
Benefit: extremely good use of resources!
Peer-to-Peer?
• Custom software
  • Deployment is a must
  • Configuration needed
• Companies may want a managed service
  • Handles flash crowds
  • Handles long-lived objects
• Performance problems
  • Hard to guarantee service quality
  • Others are discussed later
What We’d Like Is
Large-file service with:
• No custom client
• No custom server
• No prepositioning
• No rehosting
• No manual provisioning
CoBlitz: Scalable Large-file CDN
• Reducing the problem to a small-file CDN
  • Split large files into chunks
  • Distribute chunks across proxies
  • Aggregate memory/cache
  • HTTP needs no deployment
• Benefits
  • Faster than BitTorrent by 55-86% (~500%)
  • One copy from the origin serves 43-55 nodes
  • Incremental build on existing CDNs
How It Works
[Diagram: DNS resolves coblitz.codeen.org to a nearby CDN node; the client's local agent splits the download into chunks (chunk 1 ... chunk 5) and issues HTTP range queries, which CDN nodes serve from cache or fetch from the origin server. Only the reverse proxy (CDN) caches the chunks; each CDN node = redirector + reverse proxy.]
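To make the chunking concrete, here is a minimal sketch of one chunk fetch as an HTTP range query, the mechanism that lets an unmodified HTTP CDN cache each piece of a large file independently. The chunk size and function name are illustrative assumptions, not values from CoBlitz.

```python
# One chunk fetch as a single HTTP range query. CHUNK_SIZE and the
# function name are illustrative assumptions, not CoBlitz's values.
import urllib.request

CHUNK_SIZE = 1 << 20  # assume 1MB chunks for illustration

def fetch_chunk(url: str, index: int) -> bytes:
    """Fetch chunk `index` of `url` via an HTTP Range request,
    so a plain HTTP proxy can cache each chunk independently."""
    start = index * CHUNK_SIZE
    req = urllib.request.Request(
        url, headers={"Range": f"bytes={start}-{start + CHUNK_SIZE - 1}"})
    with urllib.request.urlopen(req) as resp:  # expect 206 Partial Content
        return resp.read()
```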
Smart Agent
• Preserves HTTP semantics
• Parallel chunk requests
[Diagram: the agent keeps a sliding window of chunks over the file; each chunk request to a CDN node is marked done, waiting, or no action, and the window slides forward over HTTP as the leading chunks complete.]
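A minimal sketch of such a sliding window, reusing the fetch_chunk() helper from the previous sketch; the window size is an assumed value, and CoBlitz's actual agent logic is more involved.

```python
# Sliding chunk window: keep up to WINDOW range queries in flight,
# but hand chunks to the client strictly in order, preserving plain
# HTTP semantics.
from concurrent.futures import ThreadPoolExecutor

WINDOW = 8  # assumed window size

def download(url: str, num_chunks: int, out) -> None:
    with ThreadPoolExecutor(max_workers=WINDOW) as pool:
        pending = {}      # chunk index -> Future
        next_issue = 0    # next chunk to request from the CDN
        next_write = 0    # next chunk owed to the client
        while next_write < num_chunks:
            # Slide the window forward: keep up to WINDOW requests open.
            while next_issue < num_chunks and len(pending) < WINDOW:
                pending[next_issue] = pool.submit(fetch_chunk, url, next_issue)
                next_issue += 1
            # Block on the head of the window to deliver in order.
            out.write(pending.pop(next_write).result())
            next_write += 1
```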
Chunk Indexing: Consistent Hashing
Problem: how to find the node responsible for a specific chunk?

Static hashing: f(x) = some_f(x) % n. But n is dynamic for servers: a node can go down, and a new node can join.

Consistent hashing: F(x) = some_F(x) % N, where N is a large but fixed number. For a chunk request, find a live node k where |F(k) - F(URL)| is minimum.

[Diagram: chunk requests X1, X2, X3 mapped onto the ring 0 ... N-1 of CDN nodes (proxies).]
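A minimal sketch of this chunk indexing, treating the fixed space as a ring; the SHA-1 hash and the space size are illustrative assumptions.

```python
# Chunk indexing by consistent hashing: node names and chunk URLs
# hash into the fixed space [0, N); a chunk belongs to the live node
# whose hash is closest to the chunk's hash, so node churn only
# remaps nearby chunks.
import hashlib

N = 1 << 32  # large but fixed hash space

def F(x: str) -> int:
    return int.from_bytes(hashlib.sha1(x.encode()).digest()[:8], "big") % N

def responsible_node(chunk_url: str, live_nodes: list) -> str:
    """Return the live node k minimizing |F(k) - F(URL)|,
    with wrap-around so the space behaves as a ring."""
    target = F(chunk_url)
    def ring_dist(node):
        d = abs(F(node) - target)
        return min(d, N - d)  # wrap-around distance
    return min(live_nodes, key=ring_dist)
```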
Operation & Challenges
• Has provided a public service for over 2.5 years
  • http://coblitz.codeen.org:3125/URL
• Challenges
  • Scalability & robustness
  • Peering set difference
  • Load on the origin server
Unilateral Peering
• Independent proximity-aware peering
  • Pick "n" close nodes around me
  • Cf. BitTorrent picks "n" nodes randomly
• Motivation
  • Partial network connectivity
    • Internet2, CANARIE nodes
  • Routing disruption
  • Isolated nodes
• Benefits
  • No synchronized maintenance problem
  • Improves both scalability & robustness
Peering Set Difference
• No perfect clustering by design
• Assumption
  • Close nodes share common peers

[Diagram: Venn diagram of two nearby nodes' peer sets: peers both can reach vs. peers only one or the other can reach.]
Peering Set Difference
• Highly variable app-level RTTs
  • 10x the variance of ICMP
• High rate of change in the peer set
• Close nodes share less than 50% of peers
  • Low cache hit rate
  • Low memory utility
  • Excessive load on the origin
Peering Set Difference
• How to fix?
  • Use min RTT rather than avg RTT
  • Increase # of samples
  • Increase # of peers
  • Hysteresis in peer selection
• Result: close nodes share more than 90% of peers (see the sketch below)
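A minimal sketch of proximity-aware peer selection with these fixes: track the per-node minimum application-level RTT over many samples (robust to the 10x variance), and apply hysteresis so a current peer is only replaced when a candidate is clearly closer. The 10% margin and function names are illustrative assumptions.

```python
# Assumes every known node has at least one RTT sample in min_rtts.
def update_rtt(min_rtts: dict, node: str, sample_ms: float) -> None:
    """Fold one app-level RTT sample into the per-node minimum."""
    min_rtts[node] = min(min_rtts.get(node, float("inf")), sample_ms)

def refresh_peers(min_rtts: dict, peers: set, n: int,
                  margin: float = 0.10) -> set:
    """One maintenance round: fill the peer set up to n, then swap
    the farthest current peer for the closest outsider only if the
    outsider is at least `margin` closer (hysteresis)."""
    peers = set(peers)
    outsiders = sorted((x for x in min_rtts if x not in peers),
                       key=lambda x: min_rtts[x])
    while len(peers) < n and outsiders:
        peers.add(outsiders.pop(0))       # fill empty slots greedily
    if peers and outsiders:
        worst = max(peers, key=lambda x: min_rtts[x])
        best = outsiders[0]
        if min_rtts[best] < (1 - margin) * min_rtts[worst]:
            peers.remove(worst)           # clearly closer: replace
            peers.add(best)
    return peers
```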
Reducing Origin Load
• Peering set difference remains
  • Critical for traffic to the origin
• Proximity-based routing
  • Converges exponentially fast
  • 3-15% of requests take one more hop
  • Forms an implicit overlay tree
• Result
  • Origin load reduced by 5x

[Diagram: on a miss, a proxy reruns the hashing over its own peer set and forwards the request, so chunk traffic funnels through an implicit tree toward the origin server.]
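A sketch of this multi-hop routing: a proxy that misses on a chunk reruns the consistent hash over its own peer set and forwards the request there rather than going straight to the origin; because nearby nodes mostly agree on peers, the hop chain converges and only the root of the implicit tree contacts the origin. responsible_node() is the earlier consistent-hashing sketch; the two fetch callables are injected and purely illustrative.

```python
def handle_chunk_request(self_name: str, chunk_url: str, cache: dict,
                         peers: set, fetch_from_origin, forward_to):
    if chunk_url in cache:
        return cache[chunk_url]                 # local hit
    # Rerun the hash over THIS node's own peer set.
    owner = responsible_node(chunk_url, [self_name, *peers])
    if owner == self_name:
        data = fetch_from_origin(chunk_url)     # root of the implicit tree
    else:
        data = forward_to(owner, chunk_url)     # one more hop toward owner
    cache[chunk_url] = data                     # cache for later requests
    return data
```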
Scale Experiments
• Use all live PlanetLab nodes as clients
  • 380~400 live nodes at any time
  • Simultaneous fetch of a 50MB file
• Test scenarios
  • Direct
  • BitTorrent total/core
  • CoBlitz uncached/cached/staggered
  • Out-of-order numbers in the paper
Throughput Distribution
[Figure: CDF of per-node throughput (Kbps) for Direct, BT-total, BT-core, and CoBlitz in-order uncached/staggered/cached and out-of-order staggered; CoBlitz is 55-86% faster than BT-Core.]
Downloading Times
[Figure: CDF of download time (sec) for in-order cached/staggered/uncached, BT-core, BT-total, and Direct; at the 95th percentile, CoBlitz is 1000+ seconds faster.]
Why Is BitTorrent Slow?
• In the experiments
  • No locality: peers are chosen randomly
  • Chunk indexing requires extra communication
    • Trackerless BitTorrent: Kademlia DHT
• In practice
  • Upload capacity of typical peers is low
    • 10 to a few hundred Kbps for cable/DSL users
  • Tit-for-tat may not be fair
    • A few high-capacity uploaders help the most
    • BitTyrant [NSDI'07]
Synchronized Workload Congestion
[Diagram: synchronized chunk requests from many proxies converging on the origin server.]
Addressing Congestion
• Proximity-based multi-hop routing
  • Overlay tree for each chunk
• Dynamic chunk-window resizing
  • Increase by 1/log(x), where x is the window size, if a chunk finishes faster than average
  • Decrease by 1 if a retry kills the first chunk
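A minimal sketch of the window-resizing rule as stated on the slide: grow by 1/log(x) when a chunk beats the average completion time, shrink by 1 when a retry kills the head chunk. The initial size, the floor, and the running-average scheme are assumptions.

```python
import math

class ChunkWindow:
    def __init__(self, size: float = 4.0):
        self.size = size   # fractional growth accumulates between integers
        self.avg = None    # running average chunk completion time (sec)

    def on_chunk_done(self, elapsed: float) -> None:
        # Faster-than-average chunks grow the window by 1/log(size),
        # so growth slows as the window widens.
        if self.avg is not None and elapsed < self.avg:
            self.size += 1.0 / math.log(self.size)
        # Exponentially weighted running average (assumed weight).
        self.avg = (elapsed if self.avg is None
                    else 0.875 * self.avg + 0.125 * elapsed)

    def on_head_retry_killed(self) -> None:
        # A retry killing the head chunk signals congestion: back off
        # by one, down to an assumed floor of 2.
        self.size = max(2.0, self.size - 1.0)
```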
Number of Failures
[Figure: failure percentage by system: Direct 4.3%, BitTorrent 5.7%, CoBlitz 2.1%.]
Performance After Flash Crowds
[Figure: CCDF of per-node throughput (Kbps) after flash crowds. BitTorrent: 20% of nodes > 5Mbps; CoBlitz: 70+% of nodes > 5Mbps.]
Data Reuse
7 fetches for 400 nodes, 98% cache hit
[Figure: utility (# of nodes served per copy): Shark 7.7, BitTorrent 35, CoBlitz 55.]
Real-world Usage
• 1-2 Terabytes/day
• Fedora Core official mirror
  • US-East/West, England, Germany, Korea, Japan
• CiteSeer repository (50,000+ links)
• University Channel (podcast/video)
• Public lecture distribution by PU OIT
• Popular game patch distribution
• PlanetLab researchers
  • Stork (U of Arizona) + ~10 others
Fedora Core 6 Release
• October 24th, 2006; release point 10am
• Peak throughput: 1.44 Gbps

[Figure: aggregate CoBlitz throughput over time, peaking at 1.44 Gbps (above the 1 Gbps line) while the origin server sees only 30-40 Mbps.]
On Fedora Core Mirror List
• Many people complained about I/O
  • Mirrors peaked at 500Mbps out of 2Gbps
  • 2 Sun x4200s w/ dual Opterons, 2GB mem
  • 2.5TB SATA-based SAN
  • All ISOs in disk cache or in-memory FS
• CoBlitz uses 100MB mem per node
  • Many PlanetLab node disks are IDEs
  • Most nodes are BW-capped at 10Mbps
Conclusion
• Scalable large-file transfer service
  • Evolved under real traffic
  • Up and running 24/7 for over 2.5 years
  • Unilateral peering, multi-hop routing, window size adjustment
• Better performance than P2P
  • Better throughput and download time
  • Far less origin traffic
Thank you!
More information: http://codeen.cs.princeton.edu/coblitz/

How to use: http://coblitz.codeen.org:3125/URL*

*Some content restrictions apply. See the Web site for details. Contact me if you want full access!