1 Routing Outline Algorithms Scalability


1

Routing

OutlineAlgorithms

Scalability

2

Overview

• Forwarding vs Routing
– forwarding: selecting an output port based on the destination address and the routing table
– routing: the process by which the routing table is built
• Network as a Graph
• Problem: find the lowest-cost path between two nodes
• Factors
– static: topology
– dynamic: load

[Figure: the network as a graph, with nodes A through F connected by weighted edges]

3

Distance Vector

• Each node maintains a set of triples
– (Destination, Cost, NextHop)
• Directly connected neighbors exchange updates
– periodically (on the order of several seconds)
– whenever the table changes (called a triggered update)
• Each update is a list of pairs:
– (Destination, Cost)
• Update the local table on receiving a “better” route
– one with a smaller cost
– one that came from the current next hop
• Refresh existing routes; delete them if they time out
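The update rule above can be sketched in a few lines of Python. This is a minimal illustration, not a full protocol implementation: the function name `merge_update`, the table layout (destination mapped to a `(cost, next_hop)` pair), and the choice of 16 as the "unreachable" cost are assumptions made for the example. Route timeouts are left out.

```python
INFINITY = 16  # assumed cost meaning "unreachable"


def merge_update(table, neighbor, link_cost, update):
    """Merge a neighbor's list of (Destination, Cost) pairs into `table`.

    `table` maps destination -> (cost, next_hop). Returns True if the
    table changed, which is when a triggered update would be sent.
    """
    changed = False
    for dest, cost in update:
        new_cost = min(cost + link_cost, INFINITY)
        current = table.get(dest)
        if current is None:
            # Unknown destination: accept it if it is reachable at all.
            accept = new_cost < INFINITY
        else:
            cur_cost, cur_hop = current
            # Accept a strictly smaller cost, or any change reported by
            # the neighbor we already use as next hop for this destination.
            accept = new_cost < cur_cost or (
                cur_hop == neighbor and new_cost != cur_cost
            )
        if accept:
            table[dest] = (new_cost, neighbor)
            changed = True
    return changed


# Hypothetical usage: node B starts with its two direct neighbors,
# then hears from A and from C (the advertised costs are assumed).
table_b = {"A": (1, "A"), "C": (1, "C")}
merge_update(table_b, "A", 1, [("E", 1), ("F", 1), ("G", 2)])
merge_update(table_b, "C", 1, [("D", 1), ("G", 2)])
print(table_b)
```

Note that C's route to G is not adopted here: it costs the same as the route already learned from A, so it is not "better".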

4

Example

[Figure: example network with nodes A through G]

Routing Table at B

Destination  Cost  NextHop
A            1     A
C            1     C
D            2     C
E            2     A
F            2     A
G            3     A

5

Routing Loops

• Example 1
– F detects that the link to G has failed
– F sets its distance to G to infinity and sends an update to A
– A sets its distance to G to infinity, since it uses F to reach G
– A receives a periodic update from C with a 2-hop path to G
– A sets its distance to G to 3 and sends an update to F
– F decides it can reach G in 4 hops via A
• Example 2
– the link from A to E fails
– A advertises a distance of infinity to E
– B and C advertise a distance of 2 to E
– B decides it can reach E in 3 hops and advertises this to A
– A decides it can reach E in 4 hops and advertises this to C
– C decides that it can reach E in 5 hops…

6

Loop-Breaking Heuristics

• Set infinity to 16
– routes cannot loop forever
• Split horizon
– B does not send the update (D, h, A) to A, since it learned that route from A
– prevents two-node loops
• Split horizon with poison reverse
– B sends the update (D, inf., A) to A, since it learned that route from A
– prevents two-node loops
• These heuristics do not scale to large networks
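As a rough sketch of the two split-horizon variants, the function below builds the list of (Destination, Cost) pairs that a node would send to one neighbor. The name `build_update`, the table layout, and the sample table are assumptions for illustration only.

```python
INFINITY = 16  # "infinity" as set by the first heuristic


def build_update(table, neighbor, poison_reverse=False):
    """Return the (Destination, Cost) pairs to advertise to `neighbor`.

    `table` maps destination -> (cost, next_hop).
    """
    update = []
    for dest, (cost, next_hop) in sorted(table.items()):
        if next_hop == neighbor:
            # This route was learned from `neighbor` itself.
            if poison_reverse:
                update.append((dest, INFINITY))  # advertise as unreachable
            continue  # plain split horizon: say nothing about it
        update.append((dest, cost))
    return update


# Hypothetical table at node B (destination -> (cost, next hop)).
table_b = {
    "A": (1, "A"), "C": (1, "C"), "D": (2, "C"),
    "E": (2, "A"), "F": (2, "A"), "G": (3, "A"),
}
print(build_update(table_b, "A"))
print(build_update(table_b, "A", poison_reverse=True))
```

With plain split horizon, B only tells A about C and D. With poison reverse, B additionally reports every route it learned from A with a cost of 16.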

7

Link State

• Strategy
– send to all nodes (not just neighbors) information about directly connected links (not the entire routing table)
• Link State Packet (LSP)
– id of the node that created the LSP
– cost of the link to each directly connected neighbor
– sequence number (SEQNO)
– time-to-live (TTL) for this packet

8

Link State (cont)

• Reliable flooding
– store the most recent LSP from each node
– forward the LSP to all nodes except the one that sent it
– generate a new LSP periodically
   • increment SEQNO
– start SEQNO at 0 after a reboot
– decrement the TTL of each stored LSP
   • discard when TTL=0
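The flooding rules above might look like this in Python. It is a simplified sketch: the `LSP` class, `receive_lsp`, and `age_store` are illustrative names, acknowledgments and retransmission are omitted, and sequence-number wraparound is ignored.

```python
from dataclasses import dataclass


@dataclass
class LSP:
    node_id: str     # id of the node that created the LSP
    neighbors: dict  # neighbor id -> link cost
    seqno: int       # sequence number
    ttl: int         # time-to-live


def receive_lsp(store, lsp, from_neighbor, my_neighbors):
    """Keep `lsp` if it is newer than the stored copy.

    Returns the list of neighbors the LSP should be forwarded to
    (empty if the LSP is a duplicate or older than what we have).
    """
    stored = store.get(lsp.node_id)
    if stored is not None and stored.seqno >= lsp.seqno:
        return []
    store[lsp.node_id] = lsp
    # Forward to every neighbor except the one that sent it.
    return [n for n in my_neighbors if n != from_neighbor]


def age_store(store):
    """Decrement the TTL of each stored LSP; discard at TTL 0."""
    for node_id in list(store):
        store[node_id].ttl -= 1
        if store[node_id].ttl <= 0:
            del store[node_id]


# Hypothetical usage at a node with neighbors A, C and D.
store = {}
first = LSP("X", {"A": 1}, seqno=1, ttl=2)
print(receive_lsp(store, first, "A", ["A", "C", "D"]))  # forwarded on
print(receive_lsp(store, first, "C", ["A", "C", "D"]))  # duplicate
```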

9

Route Calculation

• Dijkstra’s shortest path algorithm
• Let
– N denote the set of nodes in the graph
– l(i, j) denote the non-negative cost (weight) of edge (i, j)
– s denote this node
– M denote the set of nodes incorporated so far (labeled set)
– C(n) denote the cost of the path from s to node n

    M = {s}
    for each n in N - {s}
        C(n) = l(s, n)
    while (N != M)
        M = M union {w} such that C(w) is the minimum for all w in (N - M)
        for each n in (N - M)
            C(n) = MIN(C(n), C(w) + l(w, n))
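The pseudocode above translates almost line for line into Python. The sketch below keeps the slide's names (N, M, C, l, s, w) and assumes that a missing edge has infinite cost. The four-node graph is a made-up example, not the network from the earlier figure.

```python
import math


def dijkstra(nodes, link_cost, s):
    """Return C, the cost of the cheapest path from s to every node.

    `link_cost` maps a directed edge (i, j) to its non-negative cost.
    """
    def l(i, j):
        return link_cost.get((i, j), math.inf)  # no edge: infinite cost

    N = set(nodes)
    M = {s}
    C = {n: l(s, n) for n in N - {s}}
    C[s] = 0
    while M != N:
        # Pick the cheapest node not yet incorporated.
        w = min(N - M, key=lambda n: C[n])
        M.add(w)
        # Relax the cost of every remaining node through w.
        for n in N - M:
            C[n] = min(C[n], C[w] + l(w, n))
    return C


# Hypothetical undirected graph, stored as two directed edges per link.
links = {("A", "B"): 1, ("B", "C"): 2, ("A", "C"): 5, ("C", "D"): 1}
link_cost = dict(links)
link_cost.update({(j, i): c for (i, j), c in links.items()})

costs = dijkstra("ABCD", link_cost, "A")
print(costs)
```

This version recomputes the minimum with a linear scan, as the pseudocode does. A priority queue would be the usual choice for larger graphs.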

10

Metrics

• Original ARPANET metric
– measured the number of packets queued on each link
– took neither latency nor bandwidth into consideration
• New ARPANET metric
– stamp each incoming packet with its arrival time (AT)
– record its departure time (DT)
– when the link-level ACK arrives, compute
      Delay = (DT - AT) + Transmit + Latency
– on a timeout, reset DT to the departure time of the retransmission
– link cost = average delay over some time period
• Fine Tuning
– compressed the dynamic range
– replaced Delay with link utilization
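As a small worked example of the delay formula, the sketch below computes the per-packet delay and averages it into a link cost. The function names and the sample numbers (in milliseconds) are invented for illustration.

```python
def packet_delay(arrival_time, departure_time, transmit, latency):
    """Delay = (DT - AT) + Transmit + Latency, all in the same time unit."""
    return (departure_time - arrival_time) + transmit + latency


def link_cost(delays):
    """Link cost as the average delay over the measurement period."""
    return sum(delays) / len(delays)


# Two hypothetical packets on a link with 8 ms transmit time, 2 ms latency.
delays = [
    packet_delay(arrival_time=100, departure_time=130, transmit=8, latency=2),
    packet_delay(arrival_time=200, departure_time=210, transmit=8, latency=2),
]
print(delays, link_cost(delays))
```

The first packet waited 30 ms in the queue and the second 10 ms, so the two delays are 40 ms and 20 ms and the link cost is their average.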

11

Routing Table at Routers

Forwarding table at router R1

Subnet Number    Subnet Mask        Next Hop
128.96.34.0      255.255.255.128    interface 0
128.96.34.128    255.255.255.128    interface 1
128.96.33.0      255.255.255.0      R2

[Figure: three subnets joined by routers R1 and R2]
– Subnet 128.96.34.0, mask 255.255.255.128: H1 (128.96.34.15), R1 (128.96.34.1)
– Subnet 128.96.34.128, mask 255.255.255.128: R1 (128.96.34.130), H2 (128.96.34.139), R2 (128.96.34.129)
– Subnet 128.96.33.0, mask 255.255.255.0: R2 (128.96.33.1), H3 (128.96.33.14)

12

Forwarding Algorithm

    D = destination IP address
    for each entry (SubnetNum, SubnetMask, NextHop)
        D1 = SubnetMask & D
        if D1 = SubnetNum
            if NextHop is an interface
                deliver datagram directly to D
            else
                deliver datagram to NextHop

• Use a default router if nothing matches
• The 1s in a subnet mask need not be contiguous
• Multiple subnets can be put on one physical network
• Subnets are not visible from the rest of the Internet
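A runnable version of the forwarding loop might look like the following. The table reproduces the three R1 entries from the previous slide. The function name `forward`, the string return values, and the "default router" placeholder are assumptions made for the sketch.

```python
import ipaddress

# Forwarding table at R1: (SubnetNum, SubnetMask, NextHop)
TABLE_R1 = [
    ("128.96.34.0", "255.255.255.128", "interface 0"),
    ("128.96.34.128", "255.255.255.128", "interface 1"),
    ("128.96.33.0", "255.255.255.0", "R2"),
]


def to_int(dotted):
    """Convert a dotted-decimal IPv4 address to a 32-bit integer."""
    return int(ipaddress.IPv4Address(dotted))


def forward(dest, table, default="default router"):
    """Return a description of what the router does with a datagram."""
    d = to_int(dest)
    for subnet_num, subnet_mask, next_hop in table:
        d1 = d & to_int(subnet_mask)
        if d1 == to_int(subnet_num):
            if next_hop.startswith("interface"):
                return f"deliver directly to {dest} on {next_hop}"
            return f"deliver to {next_hop}"
    return f"deliver to {default}"  # nothing matched


for dest in ("128.96.34.15", "128.96.34.139", "128.96.33.14", "10.0.0.1"):
    print(dest, "->", forward(dest, TABLE_R1))
```

Because the comparison is a plain bitwise AND, the same code works for masks whose 1 bits are not contiguous.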

13

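Internet Structure: Recent Past

[Figure: the NSFNET backbone connecting regional networks, each serving several sites: BARRNET regional (Stanford, Berkeley, PARC), Westnet regional (NCAR, UA, UNM), MidNet regional (UNL, KU, ISU), and others]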

14

Internet Structure: Today

[Figure: backbone service providers interconnected at peering points, with large corporations, small corporations, and “consumer” ISPs attached to them]

15

How to Make Routing Scale

• Still Too Many Networks
– routing tables do not scale
– route propagation protocols do not scale

16

CIDR: Classless Inter-Domain Routing

• CIDR (RFC 1519) assigns variable-sized address blocks, without regard to classes, to ease the IPv4 address shortage.
– An IP address is accompanied by a network mask that indicates the boundary between the network part and the host part. It is usually written as 128.131.0.0/22 (first IP address + number of bits in the network part).
• Longest prefix match and address aggregation make routing scalable.

17

Longest Prefix Match and Address Aggregation

• A: 11000010 00011000 00000000 00000000 /21   host bits: 11
• B: 11000010 00011000 00001000 00000000 /22   host bits: 10
• C: 11000010 00011000 00001100 00000000 /22   host bits: 10
• D: 11000010 00011000 00010000 00000000 /20   host bits: 12

• If a packet comes in with destination address 11000010 00011000 00010001 00000100 (194.24.17.4), the only entry that produces a match is D.
• The above 4 entries can be further aggregated into 1 if the router has the same next hop for the 4 destinations, in the form of 194.24.0.0/19, or 11000010 00011000 00000000 00000000 /19.
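Both claims above can be checked with Python's standard `ipaddress` module. In this sketch the four entries are written in dotted-decimal form (converted from the binary above), and `longest_prefix_match` is an illustrative helper, not a real router API.

```python
import ipaddress

# The four entries A-D, converted from binary to dotted decimal.
ROUTES = {
    "A": ipaddress.ip_network("194.24.0.0/21"),
    "B": ipaddress.ip_network("194.24.8.0/22"),
    "C": ipaddress.ip_network("194.24.12.0/22"),
    "D": ipaddress.ip_network("194.24.16.0/20"),
}


def longest_prefix_match(dest, routes):
    """Return the name of the matching entry with the longest prefix."""
    addr = ipaddress.ip_address(dest)
    matches = [
        (net.prefixlen, name) for name, net in routes.items() if addr in net
    ]
    return max(matches)[1] if matches else None


print(longest_prefix_match("194.24.17.4", ROUTES))

# Aggregation: merge adjacent blocks into the fewest covering prefixes.
aggregated = list(ipaddress.collapse_addresses(ROUTES.values()))
print(aggregated)
```

Since A to D do not overlap, each address matches at most one of them. Longest prefix match starts to matter once the aggregate 194.24.0.0/19 sits in the same table as one of its more specific entries.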

18

How Routing Works in the Internet

• Know a smarter router
– hosts know the local router (default router)
– local routers know site routers
– site routers know a core router
– core routers know everything
• Autonomous System (AS)
– corresponds to an administrative domain
– examples: university, company, backbone network
– each AS is assigned a 16-bit number
• Two-level route propagation hierarchy
– interior gateway protocol (each AS selects its own)
– exterior gateway protocol (Internet-wide standard)

19

Popular Interior Gateway Protocols

• RIP: Routing Information Protocol
– developed for XNS
– distributed with Unix
– distance-vector algorithm
– based on hop count
• OSPF: Open Shortest Path First
– recent Internet standard
– uses the link-state algorithm
– supports load balancing
– supports authentication

20

EGP: Exterior Gateway Protocol

• Overview
– designed for a tree-structured Internet
– concerned with reachability, not optimal routes
• Protocol messages
– neighbor acquisition: one router requests that another be its peer; peers exchange reachability information
– neighbor reachability: one router periodically tests whether the other is still reachable; they exchange HELLO/ACK messages; uses a k-out-of-n rule
– routing updates: peers periodically exchange their routing tables (distance-vector)

21

BGP-4: Border Gateway Protocol

• AS Types
– stub AS: has a single connection to one other AS
   • carries local traffic only
– multihomed AS: has connections to more than one AS
   • refuses to carry transit traffic
– transit AS: has connections to more than one AS
   • carries both transit and local traffic
• Each AS has:
– one or more border routers
– one BGP speaker that advertises:
   • local networks
   • other reachable networks (transit AS only)
   • path information

22

BGP Example

• Speaker for AS2 advertises reachability to P and Q
– networks 128.96, 192.4.153, 192.4.32, and 192.4.3 can be reached directly from AS2
• Speaker for the backbone advertises
– networks 128.96, 192.4.153, 192.4.32, and 192.4.3 can be reached along the path (AS1, AS2)
• A speaker can cancel previously advertised paths

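[Figure: example network of autonomous systems]
– Backbone network (AS 1) connects Regional provider A (AS 2) and Regional provider B (AS 3)
– Regional provider A (AS 2) serves Customer P (AS 4: 128.96, 192.4.153) and Customer Q (AS 5: 192.4.32, 192.4.3)
– Regional provider B (AS 3) serves Customer R (AS 6: 192.12.69) and Customer S (AS 7: 192.4.54, 192.4.23)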
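The way path information accumulates in this example can be sketched as follows. This only illustrates the idea of prepending an AS number to a path. The `advertise` function and the tuple representation are assumptions and do not reflect the actual BGP message format.

```python
def advertise(my_as, networks, path=()):
    """Return (networks, AS path) with this speaker's AS number prepended."""
    return (list(networks), (my_as,) + tuple(path))


# Networks of customers P and Q, reachable directly from AS2.
nets = ["128.96", "192.4.153", "192.4.32", "192.4.3"]

from_as2 = advertise("AS2", nets)
print(from_as2)

# The backbone speaker re-advertises what it heard from AS2,
# adding its own AS number to the front of the path.
from_backbone = advertise("AS1", from_as2[0], from_as2[1])
print(from_backbone)
```

The second advertisement carries the path (AS1, AS2), matching the slide.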

23

IP Version 6

• Features:
– Addresses are 16 bytes long (IPv4 addresses are 4 bytes).
– The header is simplified, having only 8 fields (IPv4 has 13).
– Less frequently used features are moved to optional extension headers, which are easier to process.
– Better support for security.
– Better support for QoS.
• Header
– 40-byte “base” header
– extension headers (fixed order, mostly fixed length)
   • fragmentation
   • source routing
   • authentication and security
   • other options

24

The Main IPv6 Header

25

The Main IPv6 Header

• The version field is always 6.
• The traffic class field indicates the QoS treatment required for the packet.
• The flow label field provides a mechanism to implement a virtual circuit, which is uniquely identified by the tuple (source address, destination address, flow label). A virtual circuit makes providing QoS easier.
• The payload length field indicates the number of bytes in the packet, excluding the 40-byte fixed header.
• The next header field indicates which of the six extension headers (options) follows the fixed header; if there is none, it indicates the upper-layer protocol, e.g. TCP or UDP, to pass the data to.
• The hop limit field indicates the maximum number of hops the packet is allowed to traverse. It prevents a packet from looping forever, similar to TTL in IPv4.
• Fields of the IPv4 header that are absent from the IPv6 base header include fragmentation (an extension header in IPv6), the checksum, and HLEN.
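To make the field layout concrete, here is a sketch that unpacks the 40-byte base header with Python's `struct` module. The bit widths follow the standard IPv6 layout (4-bit version, 8-bit traffic class, 20-bit flow label). The function name, the dictionary keys, and the sample values are chosen for this example.

```python
import ipaddress
import struct


def parse_ipv6_header(data):
    """Decode the 40-byte IPv6 base header at the start of `data`."""
    if len(data) < 40:
        raise ValueError("IPv6 base header is 40 bytes")
    # 32-bit word, 16-bit length, two 8-bit fields, two 16-byte addresses.
    word, payload_len, next_header, hop_limit, src, dst = struct.unpack(
        "!IHBB16s16s", data[:40]
    )
    return {
        "version": word >> 28,                # top 4 bits
        "traffic_class": (word >> 20) & 0xFF, # next 8 bits
        "flow_label": word & 0xFFFFF,         # low 20 bits
        "payload_length": payload_len,
        "next_header": next_header,           # e.g. 6 = TCP, 17 = UDP
        "hop_limit": hop_limit,
        "source": ipaddress.IPv6Address(src),
        "destination": ipaddress.IPv6Address(dst),
    }


# Build a hypothetical header and parse it back.
word = (6 << 28) | (0x2E << 20) | 0x12345
raw = struct.pack(
    "!IHBB16s16s", word, 20, 6, 64,
    ipaddress.IPv6Address("8000::123:4567:89ab:cdef").packed,
    ipaddress.IPv6Address("::1").packed,
)
header = parse_ipv6_header(raw)
print(header)
```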

26

IPv6 Address

• The address fields use 16-byte IPv6 addresses. The number of possible IPv6 addresses is 2^128, or about 3.4 × 10^38. A new notation is used: an address is written as eight groups of four hexadecimal digits, with colons between the groups, like this: 8000:0000:0000:0000:0123:4567:89AB:CDEF
• Since many zeros can appear in an address, three optimizations are made
– Leading zeros within a group can be omitted, so 0123 becomes 123.
– One or more consecutive groups of all zeros can be replaced by a pair of colons, so the above address becomes 8000::123:4567:89AB:CDEF.
– IPv4 addresses can be written as a pair of colons followed by the old dotted-decimal number, e.g., ::192.31.20.46.
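The notation rules above can be tried out with Python's standard `ipaddress` module, which applies the same shortening. Note that the module prints hexadecimal digits in lowercase. The variable names are only for this example.

```python
import ipaddress

addr = ipaddress.IPv6Address("8000:0000:0000:0000:0123:4567:89AB:CDEF")

# Shortened form: leading zeros dropped, zero groups replaced by "::".
compressed = addr.compressed
# Full form: eight groups of four hexadecimal digits.
exploded = addr.exploded
print(compressed)
print(exploded)

# An IPv4 address written after a pair of colons.
embedded = ipaddress.IPv6Address("::192.31.20.46")
same_value = int(embedded) == int(ipaddress.IPv4Address("192.31.20.46"))
print(embedded.exploded, same_value)

# Size of the address space.
total = 2 ** 128
print(total)
```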