15-829A/18-849B/95-811A/19-729A Internet-Scale Sensor Systems: Design and Policy

Lecture 9 – Lookup Algorithms (DNS & DHTs)

2-11-03

Overview

Lookup Overview

Hierarchical Lookup: DNS

Routed Lookups: Chord, CAN


The Lookup Problem

[Figure: nodes N1–N6 connected via the Internet; a publisher stores (key="title", value=MP3 data...) at one node, and a client issues Lookup("title"): which node holds the data?]

This problem appears in DNS, P2P, routing, etc. In sensor networks: how do we find sensor readings?

Centralized Lookup (Napster)

[Figure: publisher N4 registers SetLoc("whale", N4) with a central database; the client's Lookup("whale") queries the DB, which returns N4, the node holding (key="whale", value=image data...).]

Simple, but O(N) state and a single point of failure.

Hierarchical Queries (DNS)

[Figure: the client's Lookup("whale") starts at a root server and descends the hierarchy to the publisher node holding (key="whale", value=image data...).]

Scalable (log n), but could be brittle.

Flooded Queries (Gnutella, /etc/hosts)

[Figure: the client floods Lookup("whale") to its neighbors, which forward it until it reaches the publisher node holding (key="whale", value=image data...).]

Robust, but worst case O(N) messages per lookup.

Routed Queries (DHTs)

[Figure: the client's Lookup("whale") is forwarded hop by hop along a structured overlay route to the publisher node holding (key="whale", value=image data...).]

Scalable (log n), but limited search styles.

Overview

Lookup Overview

Hierarchical Lookup: DNS

Routed Lookups: Chord, CAN


Obvious Solutions (1)

Objective: provide name-to-address translation.

Why not centralize DNS?
• Single point of failure
• Traffic volume
• Distant centralized database
• Single point of update

It doesn't scale!

Obvious Solutions (2)

Why not use /etc/hosts?

The original name-to-address mapping:
• Flat namespace in /etc/hosts
• SRI kept the main copy; hosts downloaded it regularly

The count of hosts was increasing, from one machine per domain toward one machine per user, which meant:
• Many more downloads
• Many more updates

Domain Name System Goals

Basically, this is building a wide-area distributed database:
• Scalability
• Decentralized maintenance
• Robustness
• Global scope: names mean the same thing everywhere

We don't need:
• Atomicity
• Strong consistency

DNS Records

RR format: (class, name, value, type, ttl)

The DB contains tuples called resource records (RRs). Classes = Internet (IN), Chaosnet (CH), etc. Each class defines the value associated with each type.

For the IN class:
• Type=A: name is a hostname, value is an IP address
• Type=NS: name is a domain (e.g. foo.com), value is the name of the authoritative name server for this domain
• Type=CNAME: name is an alias for some "canonical" (the real) name, value is the canonical name
• Type=MX: value is the hostname of the mail server associated with name
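To make the record format concrete, here is a toy sketch (not real DNS library code; the names and addresses are invented for illustration) that stores RRs as (class, name, value, type, ttl) tuples and filters them by type:

```python
# Toy resource-record store using the (class, name, value, type, ttl)
# tuple format from the slide. All records and addresses are invented.

records = [
    ("IN", "foo.com",     "ns.foo.com",   "NS",    86400),
    ("IN", "www.foo.com", "1.2.3.4",      "A",      3600),
    ("IN", "ftp.foo.com", "www.foo.com",  "CNAME",  3600),
    ("IN", "foo.com",     "mail.foo.com", "MX",     3600),
]

def lookup(name: str, rtype: str, rclass: str = "IN") -> list[str]:
    """Return the values of all RRs matching (class, name, type)."""
    return [value for (c, n, value, t, ttl) in records
            if c == rclass and n == name and t == rtype]

print(lookup("www.foo.com", "A"))  # ['1.2.3.4']
print(lookup("foo.com", "NS"))     # ['ns.foo.com']
```

A real server also interprets the value per (class, type), e.g. parsing A values as IP addresses, but the tuple shape is the same.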

DNS Design: Hierarchy Definitions

[Figure: a name tree rooted at "root" with children edu, net, org, uk, com; under edu: gwu, ucb, cmu, bu, mit; under cmu: cs, ece; under cs: cmcl.]

• Each node in the hierarchy stores a list of names that end with the same suffix
• Suffix = path up the tree
• E.g., given this tree, where would the following be stored?
  • Fred.com
  • Fred.edu
  • Fred.cmu.edu
  • Fred.cmcl.cs.cmu.edu
  • Fred.cs.mit.edu

DNS Design: Zone Definitions

[Figure: the same name tree (now also including ca under the root), with three example zones highlighted: a single node, a subtree, and the complete tree.]

• Zone = contiguous section of the name space
• E.g., the complete tree, a single node, or a subtree
• A zone has an associated set of name servers

DNS Design: Cont.

Zones are created by convincing the owner node to create/delegate a subzone. Example: CS.CMU.EDU was created by CMU.EDU administrators.

Records within a zone are stored on multiple redundant name servers:
• The primary/master name server is updated manually
• Secondary/redundant servers are updated by a zone transfer of the name space
• A zone transfer is a bulk transfer of the "configuration" of a DNS server; it uses TCP to ensure reliability

Servers/Resolvers

Each host has a resolver:
• Typically a library that applications can link to
• Local name servers are hand-configured (e.g. /etc/resolv.conf)

Name servers are either responsible for some zone, or local servers that:
• Do lookups of distant host names for local hosts
• Typically answer queries about the local zone

DNS: Root Name Servers

Root name servers are responsible for the "root" zone:
• Approx. a dozen root name servers worldwide, currently {a-m}.root-servers.net
• Local name servers contact root servers when they cannot resolve a name
• Local name servers are configured with the well-known root servers

Lookup Methods

Recursive query:
• The server goes out and searches for more info on the client's behalf (recursive)
• Only returns the final answer or "not found"

Iterative query:
• The server responds with as much as it knows (iterative): "I don't know this name, but ask this server"

Workload impact on the choice? The local server typically does recursive lookups; root and other distant servers do iterative.

[Figure: requesting host surf.eurecom.fr resolves gaia.cs.umass.edu via its local name server dns.eurecom.fr, which issues iterated queries to the root name server, the intermediate name server dns.umass.edu, and the authoritative name server dns.cs.umass.edu (steps 1–8).]
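The iterative flow above can be sketched as a toy simulation. The server tables and the final address are invented for illustration; a real resolver speaks the DNS wire protocol over UDP rather than reading a dictionary:

```python
# Toy iterative resolution: each server either answers with an A record
# or refers the resolver to a more specific name server. All data invented.

# Each server maps a name to ("A", address) or ("NS", next_server_to_ask).
SERVERS = {
    "root":             {"gaia.cs.umass.edu": ("NS", "dns.umass.edu")},
    "dns.umass.edu":    {"gaia.cs.umass.edu": ("NS", "dns.cs.umass.edu")},
    "dns.cs.umass.edu": {"gaia.cs.umass.edu": ("A", "128.119.245.12")},
}

def resolve_iteratively(name: str, server: str = "root") -> str:
    """Follow NS referrals (iterative queries) until an A record appears."""
    while True:
        rtype, value = SERVERS[server][name]
        if rtype == "A":
            return value   # final answer
        server = value     # referral: "I don't know, ask this server"

print(resolve_iteratively("gaia.cs.umass.edu"))  # 128.119.245.12
```

In a recursive lookup the same referral-chasing loop runs inside the first server instead of the client's resolver.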

Typical Resolution

[Figure: the client asks its local DNS server for www.cs.cmu.edu. The local server queries the root & edu DNS server, which answers "NS ns1.cmu.edu"; then the ns1.cmu.edu DNS server, which answers "NS ns1.cs.cmu.edu"; then the ns1.cs.cmu.edu DNS server, which answers "A www=IPaddr".]

Workload and Caching

What workload do you expect for different servers/names? Why might this be a problem? How can we solve this problem?

DNS responses are cached:
• Quick response for repeated translations
• Other queries may reuse some parts of a lookup, e.g. the NS records for domains

DNS negative responses are cached too:
• We don't have to repeat past mistakes, e.g. misspellings or search strings from resolv.conf

Cached data periodically times out:
• The lifetime (TTL) of data is controlled by the owner of the data
• The TTL is passed with every record
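The TTL-based expiry described above can be sketched as a minimal cache (not a full resolver; the names, address, and TTLs below are invented for illustration):

```python
import time

# Minimal TTL cache for DNS answers: each entry expires after the
# lifetime (TTL) that the record's owner attached to it.

class DnsCache:
    def __init__(self):
        self._store = {}  # name -> (value, expiry_timestamp)

    def put(self, name, value, ttl):
        self._store[name] = (value, time.monotonic() + ttl)

    def get(self, name):
        entry = self._store.get(name)
        if entry is None:
            return None
        value, expiry = entry
        if time.monotonic() >= expiry:  # TTL elapsed: entry is stale
            del self._store[name]
            return None
        return value

cache = DnsCache()
cache.put("www.cs.cmu.edu", "128.2.0.1", ttl=60)  # address invented
print(cache.get("www.cs.cmu.edu"))  # 128.2.0.1
cache.put("stale.example", "10.0.0.1", ttl=0)
print(cache.get("stale.example"))   # None (already expired)
```

A production cache would also store negative answers, as the slide notes, with their own (usually shorter) TTL.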

Prefetching

Name servers can add additional data to any response. This is typically used for prefetching:
• CNAME/MX/NS records typically point to another host name
• Responses include the address of the host referred to in the "additional section"


Subsequent Lookup Example

[Figure: the client asks its local DNS server for ftp.cs.cmu.edu. Using the cached NS record for cs.cmu.edu, the local server skips the root & edu and cmu.edu servers and queries the cs.cmu.edu DNS server directly, which answers "ftp=IPaddr".]

Reliability

DNS servers are replicated:
• The name service is available if ≥ one replica is up
• Queries can be load-balanced between replicas

UDP is used for queries:
• Need reliability? Must implement it on top of UDP! Why not just use TCP?
• Try alternate servers on timeout
• Exponential backoff when retrying the same server
• Same identifier for all queries: we don't care which server responds

Is it still a brittle design?

Overview

Lookup Overview

Hierarchical Lookup: DNS

Routed Lookups: Chord, CAN


Routing: Structured Approaches

Goal: make sure that an identified item (file) is always found in a reasonable number of steps.

Abstraction: a distributed hash table (DHT) data structure:
• insert(id, item); item = query(id);
• Note: an item can be anything: a data object, document, file, pointer to a file...

Proposals:
• CAN (ICIR/Berkeley)
• Chord (MIT/Berkeley)
• Pastry (Rice)
• Tapestry (Berkeley)

Routing: Chord

Associate to each node and each item a unique id in a one-dimensional space.

Properties:
• Routing table size is O(log(N)), where N is the total number of nodes
• Guarantees that a file is found in O(log(N)) steps

Aside: Consistent Hashing [Karger 97]

[Figure: a circular 7-bit ID space with nodes N32, N90, N105 and keys K5, K20, K80; K80 is stored at its successor N90.]

A key is stored at its successor: the node with the next-higher ID.
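The successor rule can be sketched in a few lines, using the node IDs from the figure (the 7-bit space and the node set are the slide's example; the extra key values are assumptions for illustration):

```python
# Minimal sketch of consistent hashing on a circular 7-bit ID space.

def successor(key: int, nodes: list[int], bits: int = 7) -> int:
    """Return the node responsible for `key`: the first node whose ID
    is >= key, wrapping around the circle if no node ID is larger."""
    key %= 2 ** bits
    ring = sorted(nodes)
    for n in ring:
        if n >= key:
            return n
    return ring[0]  # wrap around to the lowest node ID

# The figure's example: nodes N32, N90, N105
nodes = [32, 90, 105]
print(successor(80, nodes))   # K80 -> N90, as in the figure
print(successor(20, nodes))   # K20 -> N32
print(successor(110, nodes))  # wraps past N105 -> N32
```

The wraparound case is why the space is drawn as a circle: a key larger than every node ID belongs to the lowest node.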

Routing: Chord Basic Lookup

[Figure: a ring with nodes N10, N32, N60, N90, N105, N120; the query "Where is key 80?" is forwarded from node to node until the answer "N90 has K80" comes back.]

Routing: Finger table - Faster Lookups

[Figure: node N80 keeps fingers to the nodes ½, ¼, 1/8, 1/16, 1/32, 1/64, and 1/128 of the way around the ring, so each hop can roughly halve the remaining distance to the target.]

Routing: Chord Summary

Assume the identifier space is 0...2^m. Each node maintains:
• A finger table: entry i in the finger table of node n is the first node that succeeds or equals n + 2^i
• A pointer to its predecessor node

An item identified by id is stored on the successor node of id.
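A sketch of this summary on a tiny ring, using the node IDs {0, 1, 3, 6} from the examples that follow. It applies the standard rule stated above (entry i points at the successor of n + 2^i); note that the slide tables label entries slightly differently, so the numbers below are derived from the rule, not copied from the slides:

```python
# Sketch of Chord finger tables and lookup on a tiny m = 3 bit ring.

M = 3
SPACE = 2 ** M

def successor(ident: int, nodes: list[int]) -> int:
    """First node whose ID >= ident on the circle, wrapping around."""
    ident %= SPACE
    ring = sorted(nodes)
    for n in ring:
        if n >= ident:
            return n
    return ring[0]

def finger_table(n: int, nodes: list[int]) -> list[tuple[int, int]]:
    """Entry i is (n + 2^i, successor(n + 2^i))."""
    return [((n + 2 ** i) % SPACE, successor(n + 2 ** i, nodes))
            for i in range(M)]

def lookup(start: int, key: int, nodes: list[int]) -> int:
    """Walk the ring by successors until reaching the node storing `key`.
    (Real Chord uses the fingers to cut this to O(log N) hops.)"""
    node = start
    while successor(key, nodes) != node:
        node = successor(node + 1, nodes)
    return node

nodes = [0, 1, 3, 6]           # the lecture's example ring
print(finger_table(1, nodes))  # [(2, 3), (3, 3), (5, 6)]
print(lookup(0, 7, nodes))     # key 7 wraps past node 6 to node 0
```

The walk in `lookup` is the naive O(N) route; replacing it with "jump to the largest finger not exceeding the key" gives the O(log N) bound claimed earlier.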

Routing: Chord Example

Assume an identifier space 0..2^3 (IDs 0–7). Node n1:(1) joins; all entries in its finger table are initialized to itself.

[Figure: ring positions 0–7 with node 1 present.]

Succ. Table of n1:(1):
  i  id+2^i  succ
  0    2      1
  1    3      1
  2    5      1

Routing: Chord Example

Node n2:(3) joins.

[Figure: ring positions 0–7 with nodes 1 and 3.]

Succ. Table of n1:(1):
  i  id+2^i  succ
  0    2      2
  1    3      1
  2    5      1

Succ. Table of n2:(3):
  i  id+2^i  succ
  0    3      1
  1    4      1
  2    6      1

Routing: Chord Example

Nodes n3:(0) and n4:(6) join.

[Figure: ring positions 0–7 with nodes 0, 1, 3, 6.]

Succ. Table of n1:(1):
  i  id+2^i  succ
  0    2      2
  1    3      6
  2    5      6

Succ. Table of n2:(3):
  i  id+2^i  succ
  0    3      6
  1    4      6
  2    6      6

Succ. Table of n3:(0):
  i  id+2^i  succ
  0    1      1
  1    2      2
  2    4      0

Succ. Table of n4:(6):
  i  id+2^i  succ
  0    7      0
  1    0      0
  2    2      2

Routing: Chord Examples

Nodes: n1:(1), n2:(3), n3:(0), n4:(6). Items: f1:(7), f2:(2).

[Figure: the same ring, with each node's Succ. Table as on the previous slide; item f1 (id 7) is stored at node 0, the successor of 7, and item f2 (id 2) at the successor of 2.]

Routing: Query

Upon receiving a query for item id, a node:
• Checks whether it stores the item locally
• If not, forwards the query to the largest node in its successor table that does not exceed id

[Figure: query(7), issued at node 1, is routed along successor-table entries to node 0, which stores item 7; the nodes' Succ. Tables are as on the previous slides.]

Overview

Lookup Overview

Hierarchical Lookup: DNS

Routed Lookups: Chord, CAN


CAN

Associate to each node and each item a unique id in a d-dimensional space, a virtual Cartesian coordinate space:
• The entire space is partitioned amongst all the nodes
• Every node "owns" a zone in the overall space

Abstraction:
• Can store data at "points" in the space
• Can route from one "point" to another
• Point = the node that owns the enclosing zone

Properties:
• Routing table size is O(d)
• Guarantees that a file is found in at most d*n^(1/d) steps, where n is the total number of nodes
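The "route toward the enclosing zone" idea can be sketched as greedy forwarding on a 2-D space. The zone boundaries and neighbor lists below are assumptions loosely modeled on the lecture's 8 x 8 example, not the actual CAN protocol state:

```python
# Sketch of greedy CAN-style routing on an assumed 2-D partition.

def owner(point, zones):
    """Return the node whose zone (x0, y0, x1, y1) encloses `point`."""
    x, y = point
    for node, (x0, y0, x1, y1) in zones.items():
        if x0 <= x < x1 and y0 <= y < y1:
            return node
    raise ValueError("point outside the space")

def route(start, target, zones, neighbors):
    """Greedily forward toward the neighbor whose zone center is
    closest to the target point; returns the path of nodes visited."""
    def center(node):
        x0, y0, x1, y1 = zones[node]
        return ((x0 + x1) / 2, (y0 + y1) / 2)

    def dist2(p, q):
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

    path, node = [start], start
    while node != owner(target, zones):
        node = min(neighbors[node], key=lambda nb: dist2(center(nb), target))
        path.append(node)
    return path

# An assumed 8x8 space split into three zones (invented for illustration)
zones = {"n1": (0, 0, 4, 8), "n2": (4, 0, 8, 4), "n4": (4, 4, 8, 8)}
neighbors = {"n1": ["n2", "n4"], "n2": ["n1", "n4"], "n4": ["n1", "n2"]}
print(route("n1", (7, 5), zones, neighbors))  # ['n1', 'n4']
```

Each node only needs its O(d) neighbors, which is where the O(d) routing-table size in the properties above comes from.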

CAN E.g.: Two Dimensional Space

The space is divided between the nodes, and together they cover the entire space. Each node covers either a square or a rectangle with side ratio 1:2 or 2:1.

Example: assume a space of size 8 x 8. Node n1:(1, 2) is the first node to join and covers the entire space.

[Figure: an 8 x 8 coordinate grid (axes 0–7) owned entirely by n1.]

CAN E.g.: Two Dimensional Space

Node n2:(4, 2) joins; the space is divided between n1 and n2.

[Figure: the grid split into a left half owned by n1 and a right half owned by n2.]

CAN E.g.: Two Dimensional Space

Node n3:(3, 5) joins; the zone containing its coordinates is split between n1 and n3.

[Figure: the left half now divided between n1 (lower part) and n3 (upper part), with n2 still owning the right half.]

CAN E.g.: Two Dimensional Space

Nodes n4:(5, 5) and n5:(6, 6) join.

[Figure: the right half is subdivided further; n2 keeps the lower portion while n4 and n5 share the upper-right region.]

CAN E.g.: Two Dimensional Space

Nodes: n1:(1, 2); n2:(4, 2); n3:(3, 5); n4:(5, 5); n5:(6, 6)
Items: f1:(2, 3); f2:(5, 1); f3:(2, 1); f4:(7, 5)

[Figure: the partitioned 8 x 8 grid with the five node zones and the four items plotted at their coordinates.]

CAN E.g.: Two Dimensional Space

Each item is stored by the node that owns the zone enclosing the item's point in the space.

[Figure: the same grid; e.g. f1:(2, 3) and f3:(2, 1) fall inside n1's zone, and f2:(5, 1) inside n2's zone.]

CAN: Query Example

Each node knows its neighbors in the d-dimensional space and forwards a query to the neighbor closest to the query id.

Example: assume n1 queries f4.

[Figure: the query travels hop by hop from n1's zone across intermediate zones toward f4:(7, 5).]

Routing: Concerns

Each hop in a routing-based lookup can be expensive: there is no correlation between overlay neighbors and their physical locations. A query can repeatedly jump from Europe to North America, even though both the initiator and the node that stores the item are in Europe!

Solutions: Tapestry takes care of this implicitly; CAN and Chord maintain multiple copies for each entry in their routing tables and choose the closest in terms of network distance.

What types of lookups are supported? Only exact match! No beatles.*

Summary

The key challenge of building wide area P2P systems is a scalable and robust location service

Solutions covered in this lecture:
• Napster: centralized location service
• Gnutella: broadcast-based decentralized location service
• Freenet: intelligent-routing decentralized solution
• DHTs: structured intelligent-routing decentralized solution
