Chapter 4: Network Layer

4: Network Layer 4a-1

Chapter 4: Network LayerChapter goals: understand principles

behind network layer services: routing (path

selection) dealing with scale how a router works advanced topics: IPv6,

multicast instantiation and

implementation in the Internet

Chapter Overview: network layer services routing principle: path

selection hierarchical routing IP Internet routing protocols

reliable transfer intra-domain inter-domain

what’s inside a router? IPv6 multicast routing


Network layer functions transport packet from

sending to receiving hosts network layer protocols in

every host, routerthree important functions: path determination: route

taken by packets from source to dest. Routing algorithms

switching: move packets from router’s input to appropriate router output

call setup: some network architectures require router call setup along path before data flows

networkdata linkphysical








application

transportnetworkdata linkphysical

application



Network service modelQ: What service model

for “channel” transporting packets from sender to receiver?

guaranteed bandwidth? preservation of inter-

packet timing (no jitter)? loss-free delivery? in-order delivery? congestion feedback to

sender?

? ??virtual circuit

or datagram?

The most important abstraction provided

by network layer:

serv

ice a

bstra

ctio

n


Virtual circuits

call setup, teardown for each call before data can flow each packet carries VC identifier (not destination host OD) every router on source-destination path, s, maintains “state”

for each passing connection transport-layer connection only involved two end systems

link, router resources (bandwidth, buffers) may be allocated to VC to get circuit-like performance

“source-to-destination path behaves much like telephone circuit” performance-wise network actions along source-to-destination path


Virtual circuits: signaling protocols used to setup, maintain teardown VC used in ATM, frame-relay, X.25 not used in today’s Internet

application


application


1. Initiate call 2. incoming call3. Accept call4. Call connected

5. Data flow begins 6. Receive data


Datagram networks: the Internet model no call setup at network layer routers: no state about end-to-end connections

no network-level concept of “connection” packets typically routed using destination host ID

packets between same source-destination pair may take different paths

application


application


1. Send data 2. Receive data


Network layer service models:Network

Architecture

Internet

ATM

ATM

ATM

ATM

ServiceModel

best effort

CBR

VBR

ABR

UBR

Bandwidth

none

constantrateguaranteedrateguaranteed minimumnone

Loss

no

yes

yes

no

no

Order

no

yes

yes

yes

yes

Timing

no

yes

yes

no

no

Congestionfeedback

no (inferredvia loss)nocongestionnocongestionyes

no

Guarantees ?

Internet model being extented: Intserv, Diffserv Chapter 6


Datagram or VC network: why?Internet data exchange among

computers “elastic” service, no strict

timing req. “smart” end systems

(computers) can adapt, perform

control, error recovery simple inside network,

complexity at “edge” many link types

different characteristics uniform service difficult

ATM evolved from telephony human conversation:

strict timing, reliability requirements

need for guaranteed service

“dumb” end systems telephones complexity inside

network


Routing

Graph abstraction for routing algorithms:

graph nodes are routers

graph edges are physical links link cost: delay, $

cost, or congestion level

Goal: determine “good” path

(sequence of routers) thru network from source to

dest.

Routing protocol

A

ED

CB

F2

21 3

1

12

53

5

“good” path: typically means

minimum cost path other def’s possible


Routing IssuesKey question: how are routing tables determined /

updated? who determines table entries?

what info used in determining table entries?

when do routing table entries change?

where is routing info stored?

how to control table size?

why are routing tables determined a particular way. What is the theoretical basis?

Answer these and we are done!


Routing Issues… (more)

scalability: must be able to support large numbers of hosts, routers, networks

adapt to changes in topology or significant changes in traffic, quickly and efficiently self-healing: little or not human intervention

route selection may depend on different criteria

performance: "choose route with smallest delay"

policy: "choose a route that doesn't cross a government network" (equivalently: "let no non-government traffic cross this network")


Classification of Routing Algorithms

Centralized versus decentralized

centralized: central site computes and distributed routes (equivalently: information for computing routes known globally, each router makes same computation)

decentralized: each router sees only local information (itself and physically-connected neighbors) and computes routes on this basis

pros and cons?


Classification

Static versus adaptive static: routing tables change very slowly, often

in response to human intervention dynamic: routing tables change as network

traffic or topology change pros and cons ?

Two basic approaches adopted in practice: link-state routing: centralized, dynamic (periodically

run) distance vector: distributed, dynamic (in direct

response to changes)


Routing Algorithm classificationGlobal or decentralized

information?Global: all routers have complete

topology, link cost info “link state” algorithmsDecentralized: router knows physically-

connected neighbors, link costs to neighbors

iterative process of computation, exchange of info with neighbors

“distance vector” algorithms

Static or dynamic?Static: routes change slowly

over timeDynamic: routes change more

quickly periodic update in response to link

cost changes


A Link-State Routing AlgorithmDijkstra’s algorithm net topology, link costs

known to all nodes accomplished via “link

state broadcast” all nodes have same

info computes least cost paths

from one node (‘source”) to all other nodes gives routing table for

that node iterative: after k iterations,

know least cost path to k dest.’s

Not necessarily the fewest hops !

Notation: c(i,j): link cost from node

i to j. cost infinite if not direct neighbors

D(v): current value of cost of path from source to dest. V

p(v): predecessor node along path from source to v, that is next v

N: set of nodes whose least cost path definitively known


Link-State Routing

flooding : router sends out information on all of its ports; its neighbors then send out information on all of their ports, etc…. Eventually every router ( no exceptions) receives a copy of the same information

Link State : each router periodically shares its knowledge of its neighborhood with all routers in the network


Link-state routing

Link State Packet: When a router floods the network with information about its neighborhood, it is said to be advertising. The basis of this advertising is a short packet called a link state packet (LSP), usually containing 4 fields:

ID of advertiser,ID of destination network affected,the cost, the ID of the neighbor router


Link-State Routing AlgorithmsHow get information about neighbors ?

Periodically sends them a short greeting packet responds : assume ok and functioning well no response: assume a change and alerts

entire network in its next LSP greeting pkt : small and use negligible

resources ( unlike DV routing tables)Initialization

Each router comes up and sends greeting packet to its neighbors; prepares a LSP and floods the network


Link State Database: Every router receives every LSP and puts the information

into a LS database• => every router builds the same database ( as all

receive the same info)

To calculate the routing information, each router applies an algorithm( Dijkstra’s Algorithm) to calculate the shortest path between two points on a network of arcs and nodes

• nodes = networks or routers• arcs = connections between router and network with

costs applied only to the router==> network arc

• network => router are always 0 cost


Dijsktra’s Algorithm1 Initialization: 2 N = {A} 3 for all nodes v 4 if v adjacent to A 5 then D(v) = c(A,v) 6 else D(v) = infty 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N: 12 D(v) = min( D(v), D(w) + c(w,v) ) 13 /* new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v */ 15 until all nodes in N


Dijkstra’s Algorithm1. Begins to build the shortest path tree by identifying its

root - then attaching all nodes that can be reached from the root ( all the neighbors) -- nodes and arcs are temporary at this stage

2. Compares temporary arcs and identifies the one with lowest cumulative cost AND makes this arc and node a permanent connection

3. Examines the database and identifies every node that can be reached from its newly chosen node

4. Repeat steps 2 and 3 until every node in the network has become a permanent part of the tree -- the only permanent arcs will be those representing the shortest route to every node.

Routing Table Each router now uses the shortest path tree to construct its routing table - which are both different for each router


Dijkstra’s algorithm: exampleStep

012345

start NA

ADADE

ADEBADEBC

ADEBCF

D(B),p(B)2,A2,A2,A

D(C),p(C)5,A4,D3,E3,E

D(D),p(D)1,A

D(E),p(E)infinity

2,D

D(F),p(F)infinityinfinity

4,E4,E4,E

A

ED

CB

F2

21 3

1

12

53

5


Dijkstra’s algorithm, discussionAlgorithm complexity: n nodes each iteration: need to check all nodes, w, not in N n*(n+1)/2 comparisons: O(n**2) more efficient implementations possible: O(nlogn)Oscillations possible: e.g., link cost = amount of carried traffic

AD

CB

1 1+e

e0

e1 1

0 0A

DC

B2+e 0

001+e1

AD

CB

0 2+e

1+e10 0

AD

CB

2+e 0

e01+e1

initially … recomputerouting

… recompute … recompute


Routing Protocol Requirements

Minimize routing table space: keep as small as possible – reduces overhead involved with sharing information

Minimize control messages

Robust : protect against corruption of tables; periodically run consistency checks, add sequence numbers

Use optimal paths


Distance Vector Routing Algorithm

Telephone network routing can take advantage of predictable traffic flow and a relatively small network core

Large packet networks are a very different environment Links and routers are unreliable Alternative paths may well be scarce Traffic patterns change unpredictably within minutes


Routing algorithms

Link state and distance vector algorithms both assume a router knows: address of each neighbor the cost of reaching each neighbor

Both allow a router to find / discover global routing information, i.e. the next hop to reach every destination in the network by the shortest path by exchanging routing information with only its neighbors


Routing algorithms

Distance Vector algorithm: A node tells its neighbors its distance to

every other node in the network

Link State algorithm: A node tells every other node in the network

its distance to neighbors

Both are distributed and suitable for large internetworks controlled by multiple administrative entities


Distance Vector Routing Algorithm

Assume each router knows the identity of every other router in the network – but not necessarily the shortest path to it.

Each router maintains a distance vector = list of < destination , cost > tuples , one per destination

• Cost = sum o f the link costs on the shortest path to that destination

• Initially , cost set to a value higher than expected cost of any route in the network ( hence to infinity )


Distance Vector Routing Algorithm Router periodically sends copy of its distance

vector to all of its neighbors

When a distance vector is received, router determines whether its costs to any destination would decrease if it routed packets to that destination through that neighbor

With this asynchronous updating, eventually the routing tables converge Info. Spread one hop at a time … method due to

Bellman and Ford


Distance Vector Routing Algorithmiterative: continues until no

nodes exchange info. self-terminating :

no “signal” to stopasynchronous: nodes need not

exchange info/iterate in lock step!

distributed: each node

communicates only with directly-attached neighbors

Distance Table data structure each node has its own row for each possible

destination column for each directly-

attached neighbor to node example: in node X, for

dest. Y via neighbor Z:

D (Y,Z)X

distance from X toY, via Z as next hop

c(X,Z) + min {D (Y,w)}Zw

=

=


Distance Vector Routing: overviewIterative, asynchronous:

each local iteration caused by:

local link cost change message from neighbor:

its least cost path change from neighbor

Distributed: each node notifies

neighbors only when its least cost path to any destination changes neighbors then notify their

neighbors if necessary

wait for (change in local link cost of msg from neighbor)

recompute distance table

if least cost path to any dest has changed, notify neighbors

Each node:


Distance Vector Routing Each router shares its knowledge about the entire network

by sending all of its collected knowledge to its neighbors. At the outset, this may be very sparse but it is all sent

Each router periodically sends its information only to those routers to which it has a direct connection - this info. is used by neighbors to update their own information

Roughly, every 30 seconds each router sends its information to its neighbors - whether or not network has changed since last exchange recall Link State updates ~every 30 minutes

Distance Vector : each router periodically shares its knowledge of the entire network with each of its neighbors


Distance Vector Routing Packet Cost: Both LS and DV are lowest cost algorithms

DV : cost refers to hop count LS : cost refers a weighted value based on number of

factors - security levels, traffic, state of the link,… costs may well be different for the same 2 networks In determining the route, cost of a hop is applied to each

packet as it leaves a router and enters a network ( outbound cost)

costs are applied only to routers and not any other station on a network ( link from one router to next is a network NOT a point to point connection

costs are applied as packet leaves NOT as it enters a router

most networks are broadcast networks- when a packet enters a network, every station,including the router can pick it up - so don’t assign cost of going from network to router


Net:14

A

B

C

DE

F

Net:14

Net:78

Net:92

Net:23

Net:08

Net:66

Net:55


Routing Table At startup , router’s knowledge is sparse - knows it is connected to some

number of LANs and knows ID of each station. In most systems, a station port ID and network ID share the same prefix

router can discover to which nets it is connected by examining its own logical addresses (has as many such addresses as it has connected ports)

Routing Table : at least 3 pieces of info :Network ID, cost, ID of next router– Network ID = final destination of the packet– cost = # hops packet must take to get there– Next router = the router to which a packet is to be delivered on its way to

destination so for the diagram: initial table at A is

» 14 1 -» 23 1 -» 78 1 -

• only dest. at startup are those attached directly ==> blank in 3rd column • no multihop destinations identified , yet no next routers

These basic tables are sent out on network


Routing Table When A receives info. table from B , it sees that B can get to nets 55 and 14; A knows B

is its neighbor so if A adds 1 more hop to all costs shown in B table, the sum will be cost to A of getting to these other networks

A old table combined A - new table

1 782 551 231 14

1 782 551 232 141 14

1 781 231 14

BB

B

BB

hopone255214

155114

Rec’d from B


Distance table: per-node table recording cost to all other nodes via

each of its neighbors

DE(A,B) gives currently computed minimum cost from E to A given that first node on path is B. DE(A,B) = c(E,B) + min DB(A,*) min DE(A,*) gives E's minimum cost to A routing table derived from distance table

example: DE(A,B) = 14 (note: not 15!) example: DE(C,D) = 4, DE(C,A) = 6

B C

A

E D

71

12

28

DE() A B D A 1 14 5 B 7 8 5 C 6 9 4 D 4 11 2


Distance vector algorithm based on Bellman-Ford algorithm used in many routing protocols: Internet BGP,

ISO IDRP, Novell IPX, original ARPAnet

Algorithm (at node X):

Initialization: for all adjacent nodes v: D(*,v) = infinity D(v,v) = c(X,v)

send shortest path cost to each destination to neighbors Loop:

execute distributed topology update algorithm forever

This is the hardest component


1. wait (until I see a link cost change to neighbor Y or until I receive an update from neighbor W)

2. if (c(X,Y) changes by delta) { /* change my cost to my neighbor Y */ change all column-Y entries in distance table by delta if this changes my least cost path to Z send update wrt Z, DX(Z,*) , to all neighbors } 3. if (update received from W wrt Z) { /* shortest path from W to some Z has changed */ DX(Z,W) = c(X,W) + DW(Z,*) } if this changes my least cost path to Z send update wrt Z, DX(Z,*) , to all neighbors }

Update Algorithm at Node X:


Distance Table: example

A

E D

CB78

12

1

2D ()

A

B

C

D

A

1

7

6

4

B

14

8

9

11

D

5

5

4

2

Ecost to destination via

dest

inat

ion

D (C,D)E

c(E,D) + min {D (C,w)}Dw=

= 2+2 = 4

D (A,D)E

c(E,D) + min {D (A,w)}Dw=

= 2+3 = 5

D (A,B)E

c(E,B) + min {D (A,w)}Bw=

= 8+6 = 14

loop!

loop!


Distance table gives routing table

D ()

A

B

C

D

A

1

7

6

4

B

14

8

9

11

D

5

5

4

2

Ecost to destination via

dest

inat

ion

A

B

C

D

A,1

D,5

D,4

D,4

Outgoing link to use, cost

dest

inat

ion

Distance table Routing table


Distance table: per-node table recording cost to all other nodes via

each of its neighbors DE(A,B) gives currently computed minimum

cost from E to A given that first node on path is B. DE(A,B) = c(E,B) + min DB(A,*) min DE(A,*) gives E's minimum cost to A routing table derived from distance table

example: DE(A,B) = 14 (note: not 15!) example: DE(C,D) = 4, DE(C,A) = 6

B C

A

E D

7

1

12

28

DDEE() A B D() A B D A A 11 14 5 14 5 B 7 8 B 7 8 55 C 6 9 C 6 9 44 D 4 11 D 4 11 22


Distance Vector Routing: overviewIterative, asynchronous:

each local iteration caused by:

local link cost change message from neighbor:

its least cost path change from neighbor

Distributed: each node notifies

neighbors only when its least cost path to any destination changes neighbors then notify their

neighbors if necessary

wait for (change in local link cost of msg from neighbor)

recompute distance table

if least cost path to any dest has changed, notify neighbors

Each node:


Distance Vector Algorithm:

1 Initialization: 2 for all adjacent nodes v: 3 D (*,v) = infty /* the * operator means "for all

rows" */ 4 D (v,v) = c(X,v) 5 for all destinations, y 6 send min D (y,w) to each neighbor /* w over all X's

neighbors */

XX

Xw

At all nodes, X:


Distance Vector Algorithm (cont.):8 loop 9 wait (until I see a link cost change to neighbor V 10 or until I receive update from neighbor V) 11 12 if (c(X,V) changes by d) 13 /* change cost to all dest's via neighbor v by d */ 14 /* note: d could be positive or negative */ 15 for all destinations y: D (y,V) = D (y,V) + d 16 17 else if (update received from V wrt destination Y) 18 /* shortest path from V to some Y has changed */ 19 /* V has sent a new value for its min DV(Y,w) */ 20 /* call this received new value is "newval" */ 21 for the single destination y: D (Y,V) = c(X,V) + newval 22 23 if we have a new min D (Y,w)for any destination Y 24 send new value of min D (Y,w) to all neighbors 25 26 forever

w

XX

XX

X

ww


Distance Vector Algorithm: example

X Z12

7

Y

D (Y,Z)X c(X,Z) + min {D (Y,w)}w=

= 7+1 = 8

Z

D (Z,Y)X c(X,Y) + min {D (Z,w)}w=

= 2+1 = 3

Y


Distance Vector Algorithm: example

X Z12

1

Y

7


Distance Vector: link cost changesLink cost changes: node detects local link cost

change updates distance table (line 15) if cost change in least cost path,

notify neighbors (lines 23,24)

X Z14

50

Y1

algorithmterminates“good

news travelsfast”


Distance Vector: link cost changes

Link cost changes: good news travels fast bad news travels slow -

“count to infinity” problem! X Z14

50

Y60

algorithmcontinues

on!

Y Y Y


Topology Changes suppose that A becomes aware that F has failed.

How is this possible? The mechanism used in RIP is the following:

Every router is supposed to send an update vector to its neighbors every 30 seconds.

If a node K has an entry in its routing table for network i with the next hop to router N, and if K receives no updates from N within 180 seconds, it marks the route invalid.

The assumption is made that either router N has crashed or that the network connecting K to N has become unstable. When K hears any neighbor has a valid route to i, the valid route replaces the invalid one.

The way in which a route is marked invalid is to set the value of the dist to the route to infinity. In practice, RIP uses a value of 16 to equal infinity.


Counting to Infinity One of the more significant problems with RIP is its slow

convergence to change in topology. Consider the configuration below, with all links equal to one. B maintains a distance to network 5 of 2, with a next hop of D. A and C both maintain a distance to network 5 of 3, with a next hop to B. Now pose that router D fails. Then the following scenario could occur:


Counting to Infinity 1. B determines that network 5 is no

longer reachable via D and sets its distance count to 4, based on a report from A or C. This happens because B has recently received a report from both A and C that D is reachable with a dist of 3. At the next reporting interval, B advertises this information in distance vectors to A and C.

2. A and C receive this increased distance information from D and increment their reachability information to network 5 to a distance of 5 (4 from B plus I to reach B).

3. B receives the distance count

of 5 and assumes that network 5 is now a distance 6 away.


Counting to Infinity This pattern continues until the distance

value reaches infinity, which is only 16 in RIP. Once this value is obtained, a node determines that the target network is no longer reachable. With a reporting interval of 30 seconds, this type of condition can take from 8 to 16 minutes to resolve itself.


Distance Vector Routing: Solving the Looping

ProblemCount to infinity problem: loops will exist in Count to infinity problem: loops will exist in

tables until table values "count up" to cost of tables until table values "count up" to cost of alternate routealternate route

Split Horizon Algorithm:Split Horizon Algorithm: rule: if A routes traffic to Z via B then A tells B rule: if A routes traffic to Z via B then A tells B

its distance to Z is infinity its distance to Z is infinity example: B will never route its traffic to Z via A example: B will never route its traffic to Z via A does not solve the count to infinity problem does not solve the count to infinity problem

(why)? (why)?


Split Horizon Technique Concept : When a node sends routing updates to

its neighbors, it does not send those routes it learned from each neighbor back to that neighbor.

E.g. if B has the route ( E,2,A) in its table, then it knows it must have learned this route from A whenever B sends a routing update to A, it does not include the route ( E,2) in that update.

Split Horizon with poison reverse is a stronger variation :

B actually sends the route back to A but puts in negative information in the route to ensure that A will not eventually use B to get to E.

E.g. B sends the route ( E, infinity) to A


More problems: Oscillations

A reasonable scenario cost of link depends on amount of traffic carried nodes exchange link costs every T suppose:

A is destination for all traffic B,D send 1 unit of traffic to A C sends e units of traffic (e<<1) to A

Entire network may "oscillate"

Possible solutions: avoid periodic exchange (randomization) don't let link costs be increasing functions of load


Distance Vector Oscillations


Distance Vector: poisoned reverseIf Z routes through Y to get to X : Z tells Y its (Z’s) distance to X is infinite (so Y

won’t route to X via Z) will this completely solve count to infinity

problem? X Z

14

50

Y60

algorithmterminates


Split Horizon with Poisoned Reverse The counting‑to‑infinity problem is caused by a mutual

misunderstanding between B and A (and between B and C. Each thinks that it can reach a network via the other. The split horizon rule in RIP states that it is never useful to send information about a route back in the direction from which it came, as the router sending you the information is nearer to the destination than you are. The split horizon rule does speed things up; an erroneous route will be eliminated within the interval of the 180‑second timeout.

At a small increase in message size, RIP provides even faster

response by the using poisoned reverse. This rule differs from simple split horizon by sending updates to neighbors with a hop count of 16 for routes learned from those neighbors. If two routers have routes pointing at each other, advertising reverse routes with a metric of 16 breaks the loop immediately.


Link State Example: OSPFFlooding Link‑state routing uses a simple technique known as flooding.

This technique requires no network topology information and works as follows : A packet is sent by a source router to every one of its neighbors.

At each router, an incoming packet is retransmitted on all outgoing links except for the link on which it arrived. After the transmission, all routers within one hop of the source have received the packet. the second transmission, all routers within two hops of the source have received packet, and so on.

Unless something is done to stop the incessant retransmission of packets the number of packets in circulation just from a single source packet grows without bound.

One way to prevent this is for each node to remember packets it has already retransmitted. When duplicate copies of the packet are received, they are discarded.

This is the method used in OSPF.


Link State Example: OSPF The flooding technique has three remarkable properties:

All possible routes between source and destination are tried. No matter what link or node outages have occurred, a packet will always get through if at least one path between source and destination exists.

Because all routes are tried, at least one copy of the packet to arrive at the destination will have used a minimum‑delay route.

All nodes that are directly or indirectly connected to the source node are visited.

Because of the first property, the flooding technique is highly robust. Because of the second property, flooded information reaches all routers quickly. Because of the third property, all routers receive the information needed to create a routing table.


Link State Example: OSPF The principal disadvantage of flooding is the high

traffic load that it generates, which is directly proportional to the connectivity of the network.

OSPF Overview

Each router maintains descriptions of the state of its local links to subnetworks and from time to time transmits updated state information to all of the routers of which it is aware.

Every router receiving an update packet must acknowledge it to the sender. Such updates produce a fair amount of routing traffic because the link descriptions, though small, often need to be sent.


Link State Example: OSPF Each router maintains a database that reflects the known

topology of the configuration. The topology is expressed as a directed graph. The graph consists of the following:

Edges of two types:

Graph edges that connect two router vertices when the corresponding routers are connected to each other by a direct point‑to‑point link.

Graph edges that connect a router vertex to a network vertex when the router is directly connected to the network.

Vertices, or nodes, of two types:

A. Router B. Network, which is, in turn,

of two types: Transit if it can carry data

the neither originates nor terminates on an end system attached to this network

Stub if it is not a transit network


Link State Example: OSPFOSPF Packet Format Whereas RIP packets are sent using UDP over IP,

OSPF functions directly over IP. Thus, an OSPF packet is sent as the payload of an IP packet. On a point‑to- point network or a broadcast network (e.g., a LAN), the IP packet containing an OSPF packet contains a standard multicast IP address of 224.0.0.5, which refers to OSPF routers. On non- broadcast networks, specific destination IP addresses are used, with the addresses configured into the router ahead of time.

All OSPF packets use the same 24‑byte header.


Comparison of LS and DV algorithmsMessage complexity LS: with n nodes, E links,

O(nE) msgs sent each DV: exchange between

neighbors only convergence time varies

Speed of Convergence LS: O(n**2) algorithm

requires O(nE) msgs may have oscillations

DV: convergence time varies may be routing loops count-to-infinity problem

Robustness: what happens if router malfunctions?

LS: node can advertise

incorrect link cost each node computes only

its own tableDV:

DV node can advertise incorrect path cost

each node’s table used by others

• error propagate thru network


Hierarchical Routing

scale: with 50 million destinations:

can’t store all dest’s in routing tables!

routing table exchange would swamp links!

administrative autonomy

internet = network of networks

each network admin may want to control routing in its own network

Our routing study thus far - idealization all routers identical network “flat”… not true in practice


Hierarchical Routing aggregate routers into

regions, “autonomous systems” (AS)

routers in same AS run same routing protocol “inter-AS” routing

protocol routers in different AS

can run different inter-AS routing protocol

special routers in AS run inter-AS routing

protocol with all other routers in AS

also responsible for routing to destinations outside AS run intra-AS routing

protocol with other gateway routers

gateway routers


Intra-AS and Inter-AS routingGateways:

•perform inter-AS routing amongst themselves•perform intra-AS routers with other routers in their AS

inter-AS, intra-AS routing in

gateway A.c

network layerlink layer

physical layer

a

b

ba

aC

A

Bd

A.aA.c

C.bB.a

cb

c


Intra-AS and Inter-AS routing

Host h2a

b

ba

aC

A

Bd c

A.aA.c

C.bB.a

cb

Hosth1

Intra-AS routingwithin AS A

Inter-AS routingbetween A and B

Intra-AS routingwithin AS B

We’ll examine specific inter-AS and intra-AS Internet routing protocols shortly


The Internet Network layer

routingtable

Host, router network layer functions:

Routing protocols•path selection•RIP, OSPF, BGP

IP protocol•addressing conventions•datagram format•packet handling conventions

ICMP protocol•error reporting•router “signaling”

Transport layer: TCP, UDP

Link layer

physical layer

Networklayer


IP Addressing IP address: 32-bit

identifier for host, router interface

interface: connection between host, router and physical link router’s typically have

multiple interfaces host may have multiple

interfaces IP addresses associated

with interface, not host, router

223.1.1.1

223.1.1.2

223.1.1.3

223.1.1.4 223.1.2.9

223.1.2.2

223.1.2.1

223.1.3.2223.1.3.1

223.1.3.27

223.1.1.1 = 11011111 00000001 00000001 00000001

223 1 11


IP Addressing IP address:

network part (high order bits)

host part (low order bits)

What’s a network ? (from IP address perspective) device interfaces with

same network part of IP address

can physically reach each other without intervening router

223.1.1.1

223.1.1.2

223.1.1.3

223.1.1.4 223.1.2.9

223.1.2.2

223.1.2.1

223.1.3.2223.1.3.1

223.1.3.27

network consisting of 3 IP networks(for IP addresses starting with 223, first 24 bits are network address)

LAN


IP Addressing

How to find the networks?

Detach each interface from router, host

create “islands of isolated networks

223.1.1.1

223.1.1.3

223.1.1.4

223.1.2.2223.1.2.1

223.1.2.6

223.1.3.2223.1.3.1

223.1.3.27

223.1.1.2

223.1.7.0

223.1.7.1223.1.8.0223.1.8.1

223.1.9.1

223.1.9.2

Interconnected system consisting

of six networks


IP Addresses

0network host

10 network host

110 network host

1110 multicast address

A

B

C

D

class1.0.0.0 to127.255.255.255128.0.0.0 to191.255.255.255192.0.0.0 to239.255.255.255240.0.0.0 to247.255.255.255

32 bits1111 Reserved for future use E

byte 1 | byte 2 | byte 3 | byte 4


IP Addressing Class A address 24 bits to define hostid

2 24 = 16,777,216 hosts Class B address 16 bits for netid & 16 bits for

hostid ( netid begins 10 ) networks and 2 16 hosts ( 65,536

) Class C address 24 bits for network and 8 bits

for host more networks , each with fewer hosts 2,097, 152 networks with up to 256 hosts each)

Routers belong to more than one network – else cannot connect distinct networks


Special Addresses Network Address: class A, B, C :

no host is assigned a hostid with all 0’s – this is reserved to define the network address

If hostid is all 1’s , then address is called a direct broadcast address – used by router to send a packet to all hosts on that network and can only be used in a destination address

If first byte is 127 –used a a loopback address to test system software. In this case , packet never leaves the machine• E.g. ping to see if capable of receiving and

processing a packet IP addressing for Classes A,B and C have 2 levels

of hierarchy


Subnetting in IP Addresses Subnetting : a network is divided into smaller subnets – each

having its own subnet address. once a datagram arrives at specific router, the interpretation

of the IP address changes. E.g. datagram destined for 141.14.2.21 arrives at a router

that knows 141.14 ( a network) is divided into 3 subnets this datagram must be going to subnet 2 and host 21 on that subnet

Adding subnets adds an intermediate hierarchy to the IP addressing system : 1st level : network id – defines the site 2nd level : subnetid – defines the physical subnet 3rd level : hostid – defines the connection of host to subnet

Routing in IP now requires 3 steps : delivery to the site, delivery to the subnet, delivery to the host.


Masking Masking : a process that extracts the address of a physical

network from an IP address Masking can be used with or without subnetting

With mask of 255.255.0.0 / 255.255.255.0 and IP of 141.14.2.21• Without subnets, get 141.14.0.0 for network address• With subnetting, get 141.14.2.0 for subnet address• Part of mask with all 1’s corresponds to netid /netid&subnetid ; 0’s to

hostid 141.14.2.21 ( no subnetting) 10001101 00001110 00000010 0010101 IP address 11111111 11111111 00000000 00000000 mask Network

address

141.14.2.21 ( no subnetting) 10001101 00001110 00000010 0010101 IP address 11111111 11111111 11111111 00000000 mask 10001101 00001110 00000010 00000000 subnetwork

address

10001101 0001110 00000000 00000000


Subnetting …. An Example An organization with a class A address needs at least 1,000

subnetworks. What is an appropriate subnet mask and the configuration of each network ? Need at least 1,000 subnets need at least 1002 networks

because of reserved ( all 1’s and all 0’s subnets) minimum number of bits allocated for subnetting must be

10• ( 2 9 < 1002 < 2 10 ) 14 bits remain to define host ID’s

Mask

255. 255. 192. 0

11111111

11111111 000000 0000000011

11

This divides the original address range available for hostid’s into 1024 ranges, with 2 ranges reserved for special

addresses

There are 16,384 addresses available in each range – with some reserved


Example (cont.)

X.0.0.0 X.0.0.1 … X.0.63.254 X.0.63.255

X.0.64.0 X.0.0.1 … X.0.127.254 X.0.127.255

X.0.128.0 X.0.128.1 … X.0.191.254 X.0.191.255

...X.255.128.0 X.255.128.1 … X.255.191.254 X.255.191.255

X.255.192.0 X.255.192.1 … X.255.255.254 X.255.255.255


Broadcast RoutingBroadcasting: sending a packet to all N receivers Why concerned with this issue ?

routing updates in LS routing service/request advertisement in application layer (e.g., Novell)

How to do Broadcasting ?

Broadcast algorithm 1: N point-to-point sends send packet to every destination, point-to-point wasteful of bandwidth requires knowledge of all destinations

Broadcast algorithm 2: flooding when node receives a broadcast packet, send it out on every link node may receive many copies of broadcast packet, hence must

be able to detect duplicates


Broadcast Routing: Reverse Path

Forwarding Goal: avoid flooding duplicates

Assumptions: A wants to broadcast all nodes know predecessor node on shortest path

back to A

Reverse path forwarding: if node receives a broadcast packet

if packet arrived on predecessor on shortest path to A, then flood to all neighbors

otherwise ignore broadcast packet - either already arrived, or will arrive from predecessor


Reverse Path Forwarding flood if packet arrives from source on

link that router would use to send packets to source

otherwise discard

rule avoids flooding loops

uses shortest path tree from destinations to source (reverse tree)


SS xx

yyzz

wwto groupto group

routing tableto use

S y


B C

A

E D

B C

A

E D

B C

A

E D

B C

A

E D


Distributing routing information Q: is a broadcast algorithm like reverse path forwarding

good for distributing Link State updates (in LS routing)?

A: First try (at LS broadcast distribution): each router keeps a copy of most recent LS packet (LSP)

received from every other node upon receiving LSP(R) from router R:

if LSP(R) not identical to stored copy then store LSP(R), update LS info for R, and flood LSP(R) else ignore duplicate

How can this protocol fail?


How could this Fail ? Reverse Path Forwarding uses routing

table information may not want to use routing table info. To update routing table information may want to try a broadcast algorithm

that does not explicitly use this table info.

If there is no ordering then an update arrives followed by an “ old update” if different, store it and propagate it old stuff overwriting more current and

correct state info.


2nd Try (at LS Broadcast Distribution)

Each router puts a sequence number on its LSP's

upon receiving LSP(R) from R if (seq # > seq # of stored copy ) of LSP(R) then store LSP(R), update LS info for R, and

flood LSP(R) else ignore duplicate

How can this protocol fail?


Fail Again ? As earlier, sequence numbers could

wrap around if not properly chosen

Use sufficiently large sequence number space


3rd Try (at LS Broadcast Distribution) use "large" sequence numbers add time-based "age" field

each router decreases age field value as LSP(R) sits in memory

locally timeout (forget) LSR(R) routing info if age is zero

don't flood packet with age zero remove queued (for outgoing

transmission) but unsent LSP(R) before flooding newer LSP(R)


Multicast RoutingGOAL: deliver packet from one sender to many

(but not all) other hosts this is harder than broadcast or unicast

deliver to M hosts in N-host network (M<N) option 1: sender establishes M point-to-point

connections option 2: sender sends one packet, which is

duplicated and forwarded, as needed by routers: router A duplicates packet router B selectively forwards


Multicast Group Address M-cast group address “delivered” to all receivers

in the group Internet uses Class D for m-cast

(224.0.0.0 - 239.255.255.255) M-cast address distribution etc. managed by

IGMP Protocol


2 Protocol “types” for Multicast LAN-to-router

Internet Group Mgmt Protocol (IGMP) router-to-router

Distance Vector Multicast Routing Protocol (DVMRP) Protocol Independent Multicast (PIM)

• sparse mode• dense mode

Core Based Trees (CBT)

Others (MOSPF, etc.)

IGMP

Rtr-2-rtr


IGMP Protocol

IGMP (Internet Group Management Protocol) operates between Router and local Hosts, typically attached via a LAN (e.g., Ethernet)

Router queries the local Hosts for m-cast group membership info

Router “connects” active Hosts to m-cast tree via m-cast protocol


IGMP cont’d

Hosts respond with membership reports randomly delays response first Host which responds (at random) speaks for all

Host issues “leave-group” msg to leave optional since router periodically polls anyway (soft

state concept) “leave-group” causes immediate poll


IGMP message typesIGMP Message type Sent by Purpose

membership query: general router query for current active multicast groups

membership query: specific router query for specific m-cast group

membership report host host wants to join goup

leave group host host leaves the group


Router-to-router m-cast protocols Problem: find the best (e.g., min cost)

tree which interconnects all the members


Multicast Routing

A

BC

E

D

A.2 is multicasting message to D.2 ,D.3 and E.1

send to a router ONLY if it has an attached host that is to receive the message

Host in Group router

Source


Basic Multicast Routing Protocols

Multidestination Addressing

AABB

CC

DD

EE

SourceSource

to (A,B,C,D,E)to (A,B,C,D,E)

to (A,B)to (A,B)

to (A,B)to (A,B)to (B)to (B)

to (C,D,E)to (C,D,E)

to (D,C)to (D,C)

to (C)to (C)


Multicast Abstraction multicast address associated with multicast group

hosts join/leave multicast group

sender sends packet to multicast address (destination)

routers deliver to hosts that joined group address

sender does not have to belong to multicast group


Multicast Routing : Link StateGoal : route packets to all hosts that have joined the

multicast address

Each router uses link state flooding to distribute : cost to neighbors multicast addresses it ( as a router ) wants to receive ( has

attached host joined to that group)

Each router computes: regular unicast routing spanning tree for M nodes joined to each active multicast

address spanning tree : set of links connecting M odes together in a

tree


Flooding Link State updates are flooded to all OSPF routers in the

area Multicast: node receives a packet destined for a multicast

destination • 1st time seen transmit on all outgoing interfaces• No refuse packet

How decide ? • OSPF looks at link state database and compares update with

current information – if newer accept, update and forward; else drop

Process is greedy as to memory resources and transmission resources ( uses all available paths)


Multicast Routing :Link State (cont.) A possible spanning tree algorithm :

find node with the “smallest” host address among the M nodes

use Dijkstra’s algorithm to build a tree, rooted at that host that reaches all M nodes

A

BC

E

D1

1

1

2


Spanning Trees More efficient than Flooding Mark some links as ones to use and others as not in use

Create an overlay network Selected links form a loop free subgraph that spans all

nodes in graph In this model:

When a multicast packet arrives immediately set to all outgoing interfaces that are part of the spanning tree (except the incoming link)

Drawbacks ?• Does not look at Group Membership• Concentrates traffic on a “ small “ number of network

nodes


Spanning Tree ForwardingSpanning Tree Forwarding Shared or Source-Based

Basic Multicast Routing Protocols

AABB

CC

DDEE

SourceSource


Shared Tree VS Source-Based Tree

RPF routes over source-based treesource-based tree good delay properties per source overhead

spanning tree forwarding uses shared shared treetree per group overhead higher delays more Traffic Concentration


Multicast Routing:Distance Vector Distance Vector Multicast Routing Protocol

an enhancement of Reverse Path Forwarding that :• uses Distance Vector Routing Packets for building tree• prunes broadcast tree links that are not used (non-membership

reports) Packet arrives with multicast address m

look at the sender address use reverse path forwarding to flood m to downstream routers

that want this multicast packet (i.e. have this mcast address)

Pruning : how to know if “downstream” router wants the multicast packet

• host informs nearest router of the mcast addresses it wants ( to receive)

• router X tells router Y• if X wants mcast address and Y is on X’s shortest path to source ,

then Y wants that address


Multicast Routing:Distance Vector

A

BC

E

D1

1

1

2

•E will forward A’s mcast pkts to D•A will not forward E’s mcast pkts to B•B will not forward any mcast pkts to C


Reverse Path Forwarding When a multicast packet is received ( i.e. headed to a

multicast destination):1. Note the Source S and interface I2. IF I belongs to Shortest Path from S then forward on oll

interfaces except I3. Else, refuse the packet

Uses no more “ resources” than unicast• But do recompute shortest paths

Each router stores the state information on a per-multicast-group per-source basis

different RPF trees at A,C and E ( example)

i. Fast , as always using shortest pathsii. Different trees packets spread over multiple links and

better network utilization


Multicast Forwarding in DVMRP

1. check incoming interface: discard if not on shortest path to source

2. forward to all outgoing interfaces

3. don’t forward if interface has been pruned A leaf ( after flooding) with no attached subscribers sends

a “ prune me” message to node that sent it the packet Do not send me anything from Source S to group G on

Interface I

4. prunes time out every minute


DVMRP Forwarding (cont.)

Basic idea is to flood and prune

R

R

router

S

noreceiver

packet


DVMRP Forwarding (cont.)

Prune branches where no members and branches not on shortest paths

R

R

S

prune

2nd packet


Link State Multicast Routing

Link-State Multicast Routing

routers maintain topology DBs

group-membership / link-state broadcast by routers to advertise links with members

routers compute and cache pruned SPTs

Documents

Chapter 4: Network Layer