15-829A/18-849B/95-811A/19-729A Internet-Scale Sensor Systems: Design and Policy
Lecture 9 – Lookup Algorithms (DNS & DHTs)
Lecture #9 2 2-11-03
Overview
Lookup Overview
Hierarchical Lookup: DNS
Routed Lookups: Chord, CAN
The Lookup Problem
[Diagram: nodes N1–N6 connected across the Internet. A publisher inserts (key="title", value=MP3 data…); a client issues Lookup("title"): which node has the item?]
This problem arises in DNS, P2P systems, routing, etc. In sensor networks: how do we find sensor readings?
Centralized Lookup (Napster)
[Diagram: nodes N1–N9 with a central DB. The publisher N4 registers SetLoc("whale", N4) with the DB; a client asks the DB Lookup("whale") and fetches (key="whale", value=image data…) directly from N4.]
Simple, but O(N) state and a single point of failure
Hierarchical Queries (DNS)
[Diagram: nodes N1–N9 under a root server. The publisher N4 holds (key="whale", value=image data…); the client's Lookup("whale") is resolved down the hierarchy from the root to N4.]
Scalable (log n), but could be brittle
Flooded Queries (Gnutella, /etc/hosts)
[Diagram: nodes N1–N9. The publisher N4 holds (key="whale", value=image data…); the client's Lookup("whale") is flooded to every reachable node until it finds N4.]
Robust, but worst case O(N) messages per lookup
Routed Queries (DHTs)
[Diagram: nodes N1–N9. The publisher N4 holds (key="whale", value=image data…); the client's Lookup("whale") is routed node by node toward the one responsible for the key.]
Scalable (log n), but limited search styles
Overview
Lookup Overview
Hierarchical Lookup: DNS
Routed Lookups: Chord, CAN
Obvious Solutions (1)
Objective: provide name-to-address translations
Why not centralize DNS?
• Single point of failure
• Traffic volume
• Distant centralized database
• Single point of update
Doesn’t scale!
Obvious Solutions (2)
Why not use /etc/hosts?
• Original name-to-address mapping: a flat namespace in /etc/hosts; SRI kept the main copy, which hosts downloaded regularly
• The count of hosts was increasing, from a machine per domain to a machine per user, meaning many more downloads and many more updates
Domain Name System Goals
Basically building a wide-area distributed database
• Scalability
• Decentralized maintenance
• Robustness
• Global scope: names mean the same thing everywhere
Don't need: atomicity, strong consistency
DNS Records
RR format: (name, ttl, class, type, value)
The DB contains tuples called resource records (RRs)
• Classes = Internet (IN), Chaosnet (CH), etc.
• Each class defines the values associated with each type
For the IN class:
• Type=A: name is a hostname, value is its IP address
• Type=NS: name is a domain (e.g. foo.com), value is the name of the authoritative name server for this domain
• Type=CNAME: name is an alias for some "canonical" (the real) name, value is the canonical name
• Type=MX: value is the hostname of the mail server associated with name
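As an illustration, the four types might appear together in a hypothetical zone file like this; example.com and the 192.0.2.x addresses are reserved documentation names, not real data:

```
$TTL 3600
example.com.        IN  NS     ns1.example.com.    ; authoritative server for the zone
example.com.        IN  MX  10 mail.example.com.   ; mail server for the domain
ns1.example.com.    IN  A      192.0.2.1           ; hostname -> IP address
mail.example.com.   IN  A      192.0.2.2
www.example.com.    IN  CNAME  host7.example.com.  ; alias -> canonical name
host7.example.com.  IN  A      192.0.2.7
```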
DNS Design: Hierarchy Definitions
[Diagram: naming hierarchy. root has children edu, net, org, uk, com; edu has children gwu, ucb, cmu, bu, mit; cmu has children cs, ece; cs has child cmcl.]
• Each node in the hierarchy stores a list of names that end with the same suffix
• Suffix = path up the tree
• E.g., given this tree, where would the following be stored: Fred.com, Fred.edu, Fred.cmu.edu, Fred.cmcl.cs.cmu.edu, Fred.cs.mit.edu?
DNS Design: Zone Definitions
[Diagram: the same hierarchy (root; edu, net, org, uk, com, ca; gwu, ucb, cmu, bu, mit; cs, ece; cmcl) with example zones marked: a single node, a subtree, and the complete tree.]
• Zone = contiguous section of the name space
• E.g., the complete tree, a single node, or a subtree
• A zone has an associated set of name servers
DNS Design: Cont.
Zones are created by convincing the owner node to create/delegate a subzone
• Records within a zone are stored on multiple redundant name servers
• The primary/master name server is updated manually
• Secondary/redundant servers are updated by a zone transfer of the name space
• A zone transfer is a bulk transfer of the "configuration" of a DNS server; it uses TCP to ensure reliability
Example: CS.CMU.EDU was created by the CMU.EDU administrators
Servers/Resolvers
Each host has a resolver
• Typically a library that applications can link to
• Local name servers are hand-configured (e.g. /etc/resolv.conf)
Name servers
• Either responsible for some zone, or…
Local servers
• Do lookups of distant host names for local hosts
• Typically answer queries about the local zone
DNS: Root Name Servers
Responsible for the "root" zone
• Approximately a dozen root name servers worldwide, currently {a-m}.root-servers.net
• Local name servers contact root servers when they cannot resolve a name
• Local servers are configured with the well-known root servers
Lookup Methods
Recursive query:
• Server goes out and searches for more info on the client's behalf (recursive)
• Only returns the final answer or "not found"
Iterative query:
• Server responds with as much as it knows (iterative): "I don't know this name, but ask this server"
Workload impact on choice? The local server typically does recursive lookups; root/distant servers do iterative.
[Diagram: requesting host surf.eurecom.fr resolves gaia.cs.umass.edu via local name server dns.eurecom.fr, the root name server, intermediate name server dns.umass.edu, and authoritative name server dns.cs.umass.edu (steps 1–8, iterated query).]
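The contrast can be sketched with a toy delegation table (the server names and the 192.0.2.80 address below are invented placeholders, not real DNS data):

```python
# Each toy "server" knows either final answers or referrals.
# A referral means: "I don't know this name, but ask this server."
SERVERS = {
    "root":           {"referral": {"edu.": "edu-server"}},
    "edu-server":     {"referral": {"cmu.edu.": "ns1.cmu.edu"}},
    "ns1.cmu.edu":    {"referral": {"cs.cmu.edu.": "ns1.cs.cmu.edu"}},
    "ns1.cs.cmu.edu": {"answer":   {"www.cs.cmu.edu.": "192.0.2.80"}},
}

def match(table, name):
    # Find the entry whose suffix matches the queried name, if any.
    for suffix, value in table.items():
        if name.endswith(suffix):
            return value
    return None

def iterative_resolve(name, server="root"):
    """Iterative: the client itself follows referrals, server by server."""
    hops = [server]
    while True:
        info = SERVERS[server]
        answer = match(info.get("answer", {}), name)
        if answer is not None:
            return answer, hops
        referral = match(info.get("referral", {}), name)
        if referral is None:
            return None, hops              # "not found"
        server = referral                  # the client asks the next server
        hops.append(server)

def recursive_resolve(name, server="root"):
    """Recursive: each server chases the referral on the client's behalf."""
    info = SERVERS[server]
    answer = match(info.get("answer", {}), name)
    if answer is not None:
        return answer
    referral = match(info.get("referral", {}), name)
    if referral is None:
        return None                        # "not found"
    return recursive_resolve(name, referral)
```

Either way the answer is the same; what differs is who does the work, which is why busy root servers prefer the iterative mode.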
Typical Resolution
[Diagram: the client asks its local DNS server for www.cs.cmu.edu. The root & edu DNS server refers the local server to ns1.cmu.edu (NS record); ns1.cmu.edu refers it to ns1.cs.cmu.edu (NS record); ns1.cs.cmu.edu returns the A record (www = IP addr).]
Workload and Caching
What workload do you expect for different servers/names? Why might this be a problem? How can we solve this problem?
DNS responses are cached
• Quick response for repeated translations
• Other queries may reuse some parts of a lookup, e.g. NS records for domains
DNS negative queries are cached
• Don't have to repeat past mistakes, e.g. misspellings or search strings from resolv.conf
Cached data periodically times out
• The lifetime (TTL) of data is controlled by the owner of the data
• The TTL is passed with every record
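The caching behavior above (positive answers, negative answers, TTL expiry) can be sketched as a minimal in-memory cache; this is an illustration, not actual resolver code:

```python
import time

class DNSCache:
    """Minimal TTL cache for lookup results; a value of None records
    a negative answer (name known not to exist)."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.entries = {}            # name -> (value_or_None, expiry_time)

    def put(self, name, value, ttl):
        # The TTL is chosen by the data's owner and shipped with the record.
        self.entries[name] = (value, self.clock() + ttl)

    def get(self, name):
        """Return (hit, value); expired entries count as misses."""
        if name in self.entries:
            value, expiry = self.entries[name]
            if self.clock() < expiry:
                return True, value
            del self.entries[name]   # timed out: drop the stale entry
        return False, None
```

A fake clock makes the expiry behavior easy to see: after the TTL elapses, the same name misses and must be re-resolved.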
Prefetching
Name servers can add additional data to any response
Typically used for prefetching
• CNAME/MX/NS records typically point to another host name
• Responses include the address of the host referred to in the "additional section"
Subsequent Lookup Example
[Diagram: the client asks its local DNS server for ftp.cs.cmu.edu. Using cached NS records, the local server bypasses the root & edu and cmu.edu servers and queries the cs.cmu.edu DNS server directly, which returns ftp = IP addr.]
Reliability
DNS servers are replicated
• Name service is available if ≥ 1 replica is up
• Queries can be load-balanced between replicas
UDP is used for queries
• Need reliability: must implement it on top of UDP! Why not just use TCP?
• Try alternate servers on timeout; exponential backoff when retrying the same server
• Same identifier for all queries: don't care which server responds
Is it still a brittle design?
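The retry policy can be sketched as follows; `send` is a hypothetical stand-in for one UDP request/response attempt, not a real DNS API:

```python
def query_with_retry(servers, send, max_rounds=3, base_timeout=1.0):
    """Try each replica in turn; double the timeout after each full round.

    `send(server, timeout)` performs one request attempt: it returns an
    answer or raises TimeoutError. Because the client does not care which
    replica responds, it simply moves to the next server on timeout.
    """
    timeout = base_timeout
    for _ in range(max_rounds):
        for server in servers:          # try alternate servers on timeout
            try:
                return send(server, timeout)
            except TimeoutError:
                continue
        timeout *= 2                    # exponential backoff before retrying
    raise TimeoutError("no replica answered")
```

The backoff keeps a sea of retransmitting resolvers from hammering an already-overloaded server.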
Overview
Lookup Overview
Hierarchical Lookup: DNS
Routed Lookups: Chord, CAN
Routing: Structured Approaches
Goal: make sure that an identified item (file) is always found in a reasonable number of steps
Abstraction: a distributed hash table (DHT) data structure
• insert(id, item); item = query(id)
• Note: the item can be anything: a data object, document, file, pointer to a file…
Proposals: CAN (ICIR/Berkeley), Chord (MIT/Berkeley), Pastry (Rice), Tapestry (Berkeley)
Routing: Chord
Associate to each node and item a unique id in a one-dimensional space
Properties:
• Routing table size O(log N), where N is the total number of nodes
• Guarantees that a file is found in O(log N) steps
Aside: Consistent Hashing [Karger 97]
[Diagram: circular 7-bit ID space with nodes N32, N90, N105 and keys K5, K20, K80 placed on the ring (legend: "Key 5", "Node 105").]
A key is stored at its successor: the node with the next-higher ID
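The successor rule can be sketched with a sorted node list and a binary search; this is a local sketch of the placement rule, not the distributed protocol:

```python
import bisect

RING = 2 ** 7                 # 7-bit circular ID space, as in the figure

def successor(node_ids, key):
    """Return the node that stores `key`: the first node at or after the
    key's position on the circle, wrapping around past the top."""
    nodes = sorted(node_ids)
    i = bisect.bisect_left(nodes, key % RING)
    return nodes[i % len(nodes)]   # i == len(nodes) wraps to the smallest node
```

With the figure's nodes {32, 90, 105}: key 20 lands on node 32, key 80 on node 90, and key 110 wraps around to node 32.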
Routing: Chord Basic Lookup
[Diagram: ring with nodes N10, N32, N60, N90, N105, N120. A lookup "Where is key 80?" is forwarded from successor to successor around the ring until N90 answers "N90 has K80".]
Routing: Finger table - Faster Lookups
[Diagram: node N80's finger table: successive fingers point ½, ¼, 1/8, 1/16, 1/32, 1/64, and 1/128 of the way around the ring.]
Routing: Chord Summary
Assume the identifier space is 0…2^m − 1
Each node maintains:
• A finger table: entry i in the finger table of node n is the first node that succeeds or equals n + 2^i (mod 2^m)
• Its predecessor node
An item identified by id is stored on the successor node of id
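These rules can be sketched in a few lines, assuming global knowledge of the node set (real Chord builds its fingers incrementally through joins and stabilization); the constants match the 0..7 example on the following slides:

```python
M = 3                      # identifier space 0..2^M - 1, as in the slide example
RING = 2 ** M

def successor(nodes, key):
    """First node at or after `key` on the ring, wrapping around."""
    for n in sorted(nodes):
        if n >= key % RING:
            return n
    return min(nodes)

def finger_table(nodes, n):
    """Entry i is the first node that succeeds or equals n + 2^i."""
    return [successor(nodes, (n + 2 ** i) % RING) for i in range(M)]

def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b

def in_open(x, a, b):
    """True if x lies in the circular interval (a, b)."""
    if a < b:
        return a < x < b
    if a > b:
        return x > a or x < b
    return x != a          # a == b covers the whole ring except a

def lookup(nodes, start, key):
    """Route from `start` to the node storing `key`; return (owner, hops)."""
    key %= RING
    fingers = {n: finger_table(nodes, n) for n in nodes}
    n, hops = start, [start]
    while True:
        succ_n = fingers[n][0]                 # entry 0 = immediate successor
        if in_half_open(key, n, succ_n):
            if succ_n != n:
                hops.append(succ_n)
            return succ_n, hops                # key's successor stores it
        nxt = n
        for f in reversed(fingers[n]):         # closest preceding finger
            if in_open(f, n, key):
                nxt = f
                break
        if nxt == n:
            nxt = succ_n                       # no better finger: step forward
        hops.append(nxt)
        n = nxt
```

On nodes {0, 1, 2, 6}, node 1's finger table comes out as [2, 6, 6] and a query for id 7 routes 1 → 6 → 0, matching the worked example.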
Routing: Chord Example
Assume an identifier space 0..7 (i.e. m = 3). Node n1 (id 1) joins; all entries in its finger table are initialized to itself.

Succ. table for node 1 (id+2^i → succ):
  i=0: 2 → 1
  i=1: 3 → 1
  i=2: 5 → 1

[Diagram: ring with positions 0–7; only node 1 present.]
Routing: Chord Example
Node n2 (id 2) joins.

Succ. table for node 1 (id+2^i → succ):
  i=0: 2 → 2
  i=1: 3 → 1
  i=2: 5 → 1

Succ. table for node 2 (id+2^i → succ):
  i=0: 3 → 1
  i=1: 4 → 1
  i=2: 6 → 1

[Diagram: ring with positions 0–7; nodes 1 and 2 present.]
Routing: Chord Example
Nodes n3 (id 0) and n4 (id 6) join.

Succ. table for node 1 (id+2^i → succ):
  i=0: 2 → 2
  i=1: 3 → 6
  i=2: 5 → 6

Succ. table for node 2 (id+2^i → succ):
  i=0: 3 → 6
  i=1: 4 → 6
  i=2: 6 → 6

Succ. table for node 0 (id+2^i → succ):
  i=0: 1 → 1
  i=1: 2 → 2
  i=2: 4 → 6

Succ. table for node 6 (id+2^i → succ):
  i=0: 7 → 0
  i=1: 0 → 0
  i=2: 2 → 2

[Diagram: ring with positions 0–7; nodes 0, 1, 2, and 6 present.]
Routing: Chord Examples
Nodes: n1 (id 1), n2 (id 2), n3 (id 0), n4 (id 6). Items: f1 (id 7), f2 (id 2).
Each item is stored on the successor of its id: item 7 on node 0, item 2 on node 2.

[Diagram: the same ring and successor tables as on the previous slide, with the items marked at the nodes that store them.]
Routing: Query
Upon receiving a query for item id, a node:
• Checks whether it stores the item locally
• If not, forwards the query to the largest node in its successor table that does not exceed id

[Diagram: the same ring and successor tables as on the previous slide, tracing query(7): the query is forwarded by this rule until it reaches node 0, the successor of id 7, which stores the item.]
Overview
Lookup Overview
Hierarchical Lookup: DNS
Routed Lookups: Chord, CAN
CAN
Associate to each node and item a unique id in a d-dimensional space (a virtual Cartesian coordinate space)
• The entire space is partitioned amongst all the nodes
• Every node "owns" a zone in the overall space
Abstraction
• Can store data at "points" in the space
• Can route from one "point" to another (point = the node that owns the enclosing zone)
Properties
• Routing table size O(d)
• Guarantees that a file is found in at most d·n^(1/d) steps, where n is the total number of nodes
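For intuition about these bounds (the comparison numbers below are my arithmetic, not from the slides), compare CAN's worst case with Chord's for a million nodes: about 20 hops for Chord, about 40 for CAN at d = 10, and far more at small d:

```python
import math

def can_hops(n, d):
    """CAN's worst-case path length: d * n^(1/d) hops."""
    return d * n ** (1 / d)

def chord_hops(n):
    """Chord's O(log N) path length."""
    return math.log2(n)
```

Raising d shrinks the path length quickly, at the cost of a larger (but still O(d)) routing table per node.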
CAN E.g.: Two Dimensional Space
The space is divided between the nodes; together they cover the entire space. Each node covers either a square or a rectangular area with a 1:2 or 2:1 aspect ratio.
Example: assume a space of size 8 × 8. Node n1:(1, 2) is the first node to join and covers the entire space.
[Diagram: 8 × 8 grid owned entirely by n1.]
CAN E.g.: Two Dimensional Space
Node n2:(4, 2) joins; the space is divided between n1 and n2.
[Diagram: 8 × 8 grid split into two zones, one owned by n1 and one by n2.]
CAN E.g.: Two Dimensional Space
Node n3:(3, 5) joins; the space is further divided among n1, n2, and n3.
[Diagram: 8 × 8 grid with three zones owned by n1, n2, and n3.]
CAN E.g.: Two Dimensional Space
Nodes n4:(5, 5) and n5:(6, 6) join.
[Diagram: 8 × 8 grid with five zones owned by n1, n2, n3, n4, and n5.]
CAN E.g.: Two Dimensional Space
Nodes: n1:(1, 2); n2:(4, 2); n3:(3, 5); n4:(5, 5); n5:(6, 6)
Items: f1:(2, 3); f2:(5, 1); f3:(2, 1); f4:(7, 5)
[Diagram: the 8 × 8 space partitioned among n1–n5, with items f1–f4 placed at their coordinates.]
CAN E.g.: Two Dimensional Space
Each item is stored by the node that owns the zone containing the item's point.
[Diagram: the same space; each of f1–f4 sits inside the zone of the node that stores it.]
CAN: Query Example
Each node knows its neighbors in the d-dimensional space; a query is forwarded to the neighbor closest to the query id.
Example: assume n1 queries f4.
[Diagram: the query for f4:(7, 5) travels from n1 through neighboring zones until it reaches the node that owns f4's point.]
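The greedy forwarding can be sketched on the example's node coordinates. Two simplifications to flag: the neighbor lists below are my assumption about the figure's zone adjacency, and routing is by distance between node points rather than true zone geometry:

```python
import math

# Node coordinates from the example; NEIGHBORS is an assumed adjacency.
COORDS = {"n1": (1, 2), "n2": (4, 2), "n3": (3, 5), "n4": (5, 5), "n5": (6, 6)}
NEIGHBORS = {
    "n1": ["n2", "n3"],
    "n2": ["n1", "n3", "n4"],
    "n3": ["n1", "n2", "n4"],
    "n4": ["n2", "n3", "n5"],
    "n5": ["n4"],
}

def route(start, point):
    """Greedily forward to the neighbor closest to the query point;
    deliver locally once no neighbor is any closer."""
    path = [start]
    node = start
    while True:
        here = math.dist(COORDS[node], point)
        nxt = min(NEIGHBORS[node], key=lambda m: math.dist(COORDS[m], point))
        if math.dist(COORDS[nxt], point) >= here:
            return path            # no closer neighbor: this node answers
        node = nxt
        path.append(node)
```

Under these assumptions, n1's query for f4 at (7, 5) hops n1 → n3 → n4 → n5, always moving toward the target point.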
Routing: Concerns
Each hop in a routing-based lookup can be expensive
• There is no correlation between neighbors and their location: a query can repeatedly jump from Europe to North America, even though both the initiator and the node that stores the item are in Europe!
• Solutions: Tapestry takes care of this implicitly; CAN and Chord maintain multiple copies for each entry in their routing tables and choose the closest in terms of network distance
What type of lookups? Only exact match! No "beatles.*"
Summary
The key challenge in building wide-area P2P systems is a scalable and robust location service.
Solutions covered in this lecture:
• Napster: centralized location service
• Gnutella: broadcast-based decentralized location service
• Freenet: intelligent-routing decentralized solution
• DHTs: structured intelligent-routing decentralized solution