Dynamic Replica Placement for Scalable Content Delivery
Yan Chen, Randy H. Katz, John D. Kubiatowicz
{yanchen, randy, kubitron}@CS.Berkeley.EDU
EECS Department, UC Berkeley
Motivation Scenario
[Figure: data plane over network plane. A Web content server acts as the data source; replicas on CDN servers are always updated, caches use adaptive coherence, and clients are served from nearby replicas or caches.]
Goal and Challenges
Goal: provide content distribution to clients with good Quality of Service (QoS) while retaining efficient and balanced resource consumption of the underlying infrastructure
• Dynamic choice of the number and location of replicas
  – Clients' QoS constraints
  – Servers' capacity constraints
• Efficient update dissemination
  – Delay
  – Bandwidth consumption
• Scalability: millions of objects, clients and servers
• No global network topology knowledge
Previous Work (Replica Placement)
• Focused on static replica placement
  – Clients' distributions and access patterns known in advance
  – Assumes global IP network topology
• Data location via DNS redirection
  – Highly inefficient (this is a "hack")
  – Centralized CDN name server cannot record replica locations
Previous Work (Info Dissemination)
• No inter-domain IP multicast
• Application-level multicast (ALM) unscalable
  – Root maintains state for all children (Narada, Overcast, ALMI, RMX)
  – Root handles all "join" requests (Bayeux)
  – Root splitting is a common solution, but it suffers consistency overhead
Solutions for Dissemination Tree
• Peer-to-Peer Overlay Location Services with Good Scalability & Locality
• Simultaneous Replica Placement and Tree Construction
Peer-to-peer Routing and Location Services
• Properties Needed by Tree-Building Algorithms
  – Distributed, scalable location with guaranteed success
  – Search with locality
• P2P Routing and Location Service Used: Tapestry
  – CAN, Chord and Pastry offer insufficient locality or flexibility for placing objects
• http://www.cs.berkeley.edu/~ravenben/tapestry
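Tapestry locates objects by resolving one more digit of the destination ID at each overlay hop, which yields O(log N) lookups that stay in the local region early in the route. A minimal sketch of that digit-by-digit resolution, assuming a hypothetical per-node routing table (not the real Tapestry API):

```python
# Sketch of Tapestry-style prefix routing: each hop resolves one more
# leading digit of the destination ID. The routing_table layout and
# node IDs below are illustrative assumptions.

def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits two node IDs share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def next_hop(current: str, dest: str, routing_table: dict) -> str:
    """Pick a neighbor matching dest in one more leading digit.

    routing_table maps (prefix_len, digit) -> neighbor ID, i.e. the
    neighbor whose ID extends the shared prefix by that digit.
    """
    p = shared_prefix_len(current, dest)
    return routing_table.get((p, dest[p]), current)  # stay put if no entry

# Toy example with 4-digit hex IDs:
table_of_4227 = {(0, 'a'): 'a1f0', (1, '2'): '42b7'}
print(next_hop('4227', 'a2ff', table_of_4227))  # -> 'a1f0'
```

Because each hop fixes one digit, a route over IDs of length d takes at most d hops regardless of network size.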
Simultaneous Replica Placement and Tree Construction
• Static Replica Placement + IP Multicast
  – Modeled as a global optimization problem
  – Design a greedy algorithm with a log N approximation
  – Optimal case for comparison
• Dynamic Replica Placement + Application-level Multicast
  – Search for qualified local replicas first
  – Place new replicas on the Tapestry overlay path
  – Two approaches: naïve and smart
• Soft-state Tree Maintenance
  – Each node only maintains state for its parent and direct children
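The static formulation above can be attacked with a set-cover-style greedy heuristic, which is the classic way to obtain a log N approximation. A hedged sketch under simplifying assumptions (known clients, known pairwise latencies; function and variable names are illustrative, not the paper's exact algorithm):

```python
# Greedy static replica placement sketch: repeatedly open the server
# that covers the most still-unserved clients within the latency bound,
# respecting a per-server load capacity. This is the standard set-cover
# greedy, which gives an O(log N) approximation.

def greedy_place(servers, clients, latency, bound=50, capacity=200):
    """servers, clients: lists of IDs; latency[(s, c)]: ms between them."""
    unserved = set(clients)
    placement = {}  # server -> list of clients assigned to it
    while unserved:
        best, best_cover = None, []
        for s in servers:
            if s in placement:
                continue  # server already opened
            cover = [c for c in unserved
                     if latency[(s, c)] <= bound][:capacity]
            if len(cover) > len(best_cover):
                best, best_cover = s, cover
        if best is None:
            break  # some client cannot meet the bound at any server
        placement[best] = best_cover
        unserved -= set(best_cover)
    return placement

# Toy instance: each server is within the 50 ms bound of one client.
lat = {('s1', 'c1'): 10, ('s1', 'c2'): 60,
       ('s2', 'c1'): 70, ('s2', 'c2'): 20}
print(greedy_place(['s1', 's2'], ['c1', 'c2'], lat))
```

The defaults mirror the evaluation constraints later in the deck (50 ms latency bound, 200 clients/server).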
Dynamic Replica Placement: naïve
[Figure: client c joins via its proxy on the Tapestry mesh (data plane over network plane); the first placement choice for a new replica is a parent candidate on the Tapestry overlay path from c toward server s.]
Dynamic Replica Placement: smart
• Aggressive search
• Lazy placement
• Greedy load distribution
[Figure: client c contacts its proxy on the Tapestry mesh (data plane over network plane); parent candidates include the current parent, a sibling, a server child and a client child, and the first placement choice lies on the Tapestry overlay path toward server s.]
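The contrast between the two policies, as described on these slides, can be sketched in Python; the Node class, QoS check, and toy path below are illustrative assumptions, not the paper's data structures:

```python
# Naive vs. smart placement, following the slides' description.
# All structures here are hypothetical stand-ins.

class Node:
    def __init__(self, name, has_replica=False, latency_ms=100, load=0):
        self.name, self.has_replica = name, has_replica
        self.latency_ms, self.load = latency_ms, load

def meets_qos(node, bound=50):
    return node.latency_ms <= bound

def place_naive(path):
    """Naive: place a new replica at the first overlay hop that meets QoS."""
    for n in path:
        if meets_qos(n):
            n.has_replica = True
            return n
    return path[-1]  # fall back to the node closest to the source

def place_smart(path, cap=200):
    """Smart: aggressively search for an existing qualified, non-overloaded
    replica and pick the least loaded one (greedy load distribution);
    only lazily place a new replica when the search fails."""
    candidates = [n for n in path
                  if n.has_replica and meets_qos(n) and n.load < cap]
    if candidates:
        return min(candidates, key=lambda n: n.load)
    return place_naive(path)  # lazy placement on the overlay path

# Toy path from client toward source: one hop already holds a replica.
path = [Node('p1', latency_ms=20),
        Node('p2', has_replica=True, latency_ms=40, load=10),
        Node('source', has_replica=True, latency_ms=90)]
print(place_smart(path).name)  # -> 'p2': reuses the qualified replica
print(place_naive(path).name)  # -> 'p1': places a fresh replica early
```

The smart policy's reuse of existing replicas is what drives the far smaller replica count reported in the evaluation.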
Evaluation Methodology
• Network Topology
  – 5000-node network with GT-ITM transit-stub model
  – 500 d-tree server nodes; 4500 clients join in random order
• Dissemination Tree Server Deployment
  – Random d-tree
  – Backbone d-tree (choose backbone routers and subnet gateways first)
• Constraints
  – 50 ms latency bound and 200 clients/server load bound
Four Approaches for Comparison
• Overlay Dynamic Naïve Placement (overlay_naïve)
• Overlay Dynamic Smart Placement (overlay_smart)
• Static Placement on Overlay Network (overlay_static)
• Static Placement on IP Network (IP_static)
Number of Replicas Deployed and Load Distribution
• Overlay_smart uses far fewer replicas than overlay_naïve, and is very close to IP_static
• Overlay_smart has better load distribution than overlay_naïve and overlay_static, and is very close to IP_static
Multicast Performance
• 85% of overlay_smart's Relative Delay Penalties (RDP) are less than 4
• Bandwidth consumed by overlay_smart is very close to IP_static and much less than overlay_naive
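Relative Delay Penalty is the standard ALM metric: the ratio of the delay a client experiences over the dissemination tree to the direct unicast IP delay. A quick illustrative computation (the delay values are made up):

```python
# RDP = (delay over the overlay dissemination tree) / (direct IP delay).
# An RDP of 1 means the overlay path is as fast as unicast; the slide
# reports that 85% of overlay_smart clients see RDP < 4.

def rdp(overlay_delay_ms: float, unicast_delay_ms: float) -> float:
    return overlay_delay_ms / unicast_delay_ms

print(rdp(120.0, 40.0))  # -> 3.0, within the reported RDP < 4 range
```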
Tree Construction Traffic
Includes "join" requests, "ping" messages, replica placement, and parent/child registration
• Overlay_smart consumes three to four times as much traffic as overlay_naïve, and overlay_naïve's traffic is quite close to IP_static
• Tree construction is a far less frequent event than update dissemination
Conclusions
• Peer-to-peer networks can be used to construct CDNs
• Dissemination Tree: a dynamic Content Distribution Network with good QoS, efficiency and load balancing
  – P2P location service improves scalability and locality
  – Simultaneous dynamic replica placement and tree construction
• In particular
  – Use Tapestry to contact a nearby region of the tree to select a parent
  – Lazy placement of new replicas on the Tapestry overlay path
  – Close to the optimal number of replicas, good load distribution, and low multicast delay and bandwidth penalty, at the price of reasonable construction traffic
Future Work
• Evaluate with more diverse topologies and real workload
• Dynamic replica deletion/migration to adapt to the shift of users’ interests
• Implementation for OceanStore, a global-scale persistent data storage system