CSCI 547 Network Layer 4-1
Chapter 4Network Layer
Read the paper (IP Addressing) http://www.3com.com/other/pdfs/infra/corpinfo/en_US/501302.pdf#search=%22understanding%20ip%20addressing%20everything%22
A excellent source for IP addressing and subnetting & routing
Also read Chapter 3 of http://www.redbooks.ibm.com/redbooks/pdfs/gg243376.pdf
Computer Networking: A Top Down Approach ,4th edition. Jim Kurose, Keith RossAddison-Wesley, July 2007.
CSCI 547 Network Layer 4-2
Chapter 4: Network Layer
Chapter goals: understand principles behind network
layer services: network layer service models forwarding versus routing how a router works routing (path selection) dealing with scale advanced topics: IPv6, mobility
instantiation, implementation in the Internet
CSCI 547 Network Layer 4-3
Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and
datagram networks 4.3 What’s inside a
router 4.4 IP: Internet Protocol
Datagram format IPv4 addressing ICMP IPv6
4.5 Routing algorithms Link state Distance Vector Hierarchical routing
4.6 Routing in the Internet RIP OSPF BGP
4.7 Broadcast and multicast routing
CSCI 547 Network Layer 4-4
Network layer: Network layer functions
networkdata linkphysical
networkdata linkphysical
networkdata linkphysical
networkdata linkphysical
networkdata linkphysical
networkdata linkphysical
networkdata linkphysical
networkdata linkphysical
application
transportnetworkdata linkphysical
application
transportnetworkdata linkphysical
Application-layer protocolsdefine when and howmessages are sent Transport-layer protocolsdeliver data between processes on different end-systems
o Transport protocols execute only on end systems
Network-layer protocolsdeliver data from one end-system to another
oHop-to-hop rather than end-to-endoNetwork layer protocols execute on every end-systems and routers
Virtual end-to-end transport
CSCI 547 Network Layer 4-5
The Network Layer: Network layer functions(Packet switching)
The network-layer provides four important functions: Addressing : the means by which end systems identify each other Path determination (routing) : the route taken by packets from source to destination Switching (forwarding) : the movement of packets from an input interface to an appropriate output interface Call setup & termination : The establishment of a virtual circuit from sender to receiver in case of a connection-oriented service-- ATM, frame relay, X.25
CSCI 547 Network Layer 4-6
Interplay between routing and forwarding
1
23
0111
value in arrivingpacket’s header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Routing algorithm updates the forwarding table
Routing = ?
Forwarding = ?
Act of updating routing table usually via talking to other routers
Act of looking up routing table and sending(forwarding) to another router
CSCI 547 Network Layer 4-7
Connection setup
Connection-oriented network architectures: ATM, frame relay, X.25
before datagrams flow, two end hosts and intervening routers establish virtual connection routers get involved in setting up the connection
network vs transport layer connection service: network: between two hosts (may also involve
intervening routers in case of VCs) transport: between two processes
CSCI 547 Network Layer 4-8
Network service model
What service model should be provided for transporting packets from sender to receiver? In other words, what aspects should we consider in designing network layer?
Some possible elements of a service model: Guaranteed bandwidth Guaranteed delay Preservation of inter-packet timing (guaranteed maximum jitter)—end-to-end—geared toward time-sensitive traffic Loss-free delivery In-order delivery Congestion feedback to senderDoes IP protocol provide any of the above? NO !
CSCI 547 Network Layer 4-9
Network layer service models:
NetworkArchitecture
Internet(IP)
ATM
ATM
ATM
ATM
ServiceModel
best effort
CBR
VBR
ABR
UBR
Bandwidth
none
constantrateguaranteedrateguaranteed minimumnone
Loss
no
yes
yes
no
no
Order
no
yes
yes
yes
yes
Timing
no
yes
yes
no
no
Congestionfeedback
no (inferredvia loss)nocongestionnocongestionyes
no
Guarantees ?
Connection-oriented protocols can do all of these!
Constant Bit Rate (CBR)Variable Bit Rate (VBR)Available Bit Rate (ABR)Unspecified Bit Rate (UBR)
CSCI 547 Network Layer 4-10
Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and
datagram networks 4.3 What’s inside a
router 4.4 IP: Internet Protocol
Datagram format IPv4 addressing ICMP IPv6
4.5 Routing algorithms Link state Distance Vector Hierarchical routing
4.6 Routing in the Internet RIP OSPF BGP
4.7 Broadcast and multicast routing
CSCI 547 Network Layer 4-11
Connection-Oriented or Connectionless ?
Datagram(DG) network provides network-layer connectionless service--IP
Virtual Circuit(VC) network provides network-layer connection-oriented service—Frame Relay, X.25
analogous to the transport-layer services, but: service: should provide host-to-host—same
with transport layer in this respect no choice: network either provides or not—
transport layer has choice between TCP or UDP
implementation: in network core
CSCI 547 Network Layer 4-12
Connection-oriented: Virtual Circuits
call setup for each call before data can flow, and teardown when done
each packet carries VC identifier (not destination host address)
every router on source-dest path maintains “state” for each passing connection—connection management—flow control, windowing, etc.
link, router resources (bandwidth, buffers) may be allocated to VC (dedicated resources = predictable service) QOS
“source-to-dest path behaves much like telephone circuit” performance-wise network actions along source-to-dest path
CSCI 547 Network Layer 4-13
VC implementation
a VC consists of:1. path from source to destination2. VC numbers, one number for each link along
path—number changes hop-to-hop3. entries in forwarding tables in routers along
path—an entry fixed until VC teardown packet belonging to VC carries VC number
(rather than full destination address) VC number can be changed on each link.
VC number changed according to the entry in forwarding table
CSCI 547 Network Layer 4-14
ATM cell format: Virtual Circuit identifier
GFC--Generic flow control (000=uncontrolled access). VPI--Virtual path identifier. VCI--Virtual channel identifier. Together, the VPI and VCI comprise the VPCI. These fields represent the routing information within the ATM cell.PTI--Payload Type Indication. CLP--Cell Loss Priority.HEC--Header Error Control.
GFC VPI
8 bits
VPI VCI
VCI
VCI PTI(3 bits) CLP
HEC
48 BYTESPAYLOAD
VC identifier
CSCI 547 Network Layer 4-15
Forwarding table
12 22 32
1 23
VC number
interfacenumber
Incoming interface Incoming VC # Outgoing interface Outgoing VC #
1 12 3 222 63 1 18 3 7 2 171 97 3 87… … … …
Forwarding table innorthwest router:
Routers maintain connection state information for each Virtual circuit and also do windowing—large overhead!
CSCI 547 Network Layer 4-16
Virtual circuits: signaling protocols
used to setup, maintain, teardown VC using control packets used in ATM, frame-relay, X.25 VC & signaling protocol are used only along parts of Internet
Used primarily near the network core where a connection-oriented service (e.g. ATM) is used
The signaling protocols are not part of IP
application
transportnetworkdata linkphysical
application
transportnetworkdata linkphysical
1. Initiate call 2. incoming call
3. Accept call4. Call connected5. Data flow begins 6. Receive data7. Discconnect request 8. Discconnect confirm
What happens when a packet is hit by noise
on a link?
CSCI 547 Network Layer 4-18
Datagram networks--connectionless
no call setup at network layer routers: no state about hop-to-hop or end-to-end
connections no network-level concept of “connection”
packets forwarded using destination host address Packets routed independently each other packets between same source-dest pair may take different
paths
application
transportnetworkdata linkphysical
application
transportnetworkdata linkphysical
1. Send data 2. Receive data
Advantage: Much less overhead than VC Disadvantages?unreliable, no
flow control, no error control
CSCI 547 Network Layer 4-19
Forwarding table
Destination Address Range Link Interface
11001000 00010111 00010000 00000000 through 0 11001000 00010111 00010111 11111111
11001000 00010111 00011000 00000000 through 1 11001000 00010111 00011000 11111111
11001000 00010111 00011001 00000000 through 2 11001000 00010111 00011111 11111111
otherwise 3
For 32 bit addresses, 4 billion possible entries to have an entry for every address for every router, so better way is
Above will have the same effect as
CSCI 547 Network Layer 4-20
Longest prefix matching: the idea used in Internet(glimpse of how routing works)
Prefix Match Link Interface 11001000 00010111 00010 0 11001000 00010111 00011000 1 11001000 00010111 00011 2 otherwise 3
DA: 11001000 00010111 00011000 10101010
Examples
DA: 11001000 00010111 00010110 10100001 Which interface?
Which interface?
CSCI 547 Network Layer 4-21
Datagram or VC network: why?
Internet (datagram) data exchange among
computers “elastic” service, no
strict timing required “smart” end systems
(computers) can adapt, perform
control, error recovery simple inside network,
complexity at “edge” many link types
different characteristics uniform service difficult
ATM (VC) evolved from telephony--
BISDN human conversation:
strict timing, reliability requirements
need for guaranteed service
“dumb” end systems telephones complexity inside
network
CSCI 547 Network Layer 4-22
Connection-oriented(VC) vs Conncectionless(DG)
ISSUES Connection-oriented Connectionless-------------------------------------------------------------------------------------------------------------------Initial setup Required Noand termination
Routing Routing only done on Each packet routedinitial VC setup independently
Connection state Routers keep state info. Router do not hold state info.
for each connection
Need for Needed during initial setup Full address neededFull address Afterwards only VC# always
needed
Packet Guaranteed Not guaranteedSequencing Error recovery Handles error conditions Left to a higher layer
CSCI 547 Network Layer 4-23
ISSUES Connection-oriented Connectionless-------------------------------------------------------------------------------------------------------------Congestion control Easy Difficult
QOS Easy Difficult
Flow Control Handles Not done
Overhead High Low
Examples: TCP UDP, IP, IPX, ISO-IP
Connection-oriented(VC) vs Conncectionless(DG)—Cont’d
CSCI 547 Network Layer 4-24
Chapter 4: Network Layer
4. 1 Introduction 4.2 Virtual circuit and
datagram networks 4.3 What’s inside a
router 4.4 IP: Internet Protocol
Datagram format IPv4 addressing ICMP IPv6
4.5 Routing algorithms Link state Distance Vector Hierarchical routing
4.6 Routing in the Internet RIP OSPF BGP
4.7 Broadcast and multicast routing
CSCI 547 Network Layer 4-25
Router Architecture Overview
Two key router functions:
run routing algorithms/protocol (RIP, OSPF, BGP) forwarding datagrams from incoming to outgoing link
Explore this first
CSCI 547 Network Layer 4-26
Input Port Functions: Centralized switching
Centralized switching: Input port forward to Routing processor Routing processor perform forwarding
table lookup Forwards the packet to output port This approach used for low end routers,
workstations or servers acting as routers
Physical layer:bit-level reception
Data link layer:e.g., Ethernetsee chapter 5
Routing processor
Routing Table
Output port
CSCI 547 Network Layer 4-27
Input Port Functions: Decentralized switching
Decentralized switching: given datagram’s dest., input port
processor lookup output port using forwarding table(a shadow copy in each input port) in input port memory—some routers use Content Addressable Memory for faster lookups—e.g. Cisco 8500 has 64K CAM for each input port
goal: complete input port processing at ‘line speed’—needs parallel processors
queuing: if datagrams arrive faster than forwarding rate into switch fabric
Physical layer:bit-level reception
Data link layer:e.g., Ethernetsee chapter 5
CSCI 547 Network Layer 4-28
Three types of switching fabrics
1: 2:
3:
CSCI 547 Network Layer 4-29
Switching Via Memory
First generation routers: earlier routers (often just a computer) with switching under direct control of CPU
packet copied to system’s memory speed limited by memory bandwidth (2 system bus crossings per datagram)
many modern routers still use but in a shared memory multiprocessor mode—a processor for each input port
InputPort
OutputPort
Memory
System Bus
e.g. CISCO Catalyst 8500
CSCI 547 Network Layer 4-30
Switching Via a Bus
datagram from input port memory
to output port memory via a shared bus—one packet at a time to output port
bus contention: switching speed limited by bus bandwidth
1 Gbps bus, Cisco 1900: sufficient speed for access and enterprise routers (not fast enough for regional or backbone routers)
CSCI 547 Network Layer 4-31
Switching Via An Interconnection Network
overcome bus bandwidth limitations Banyan networks & other
interconnection nets initially developed to connect processors in multiprocessor
Advanced design: fragmenting datagram into fixed length cells, switch cells through the fabric and reassemble at receiving end
Fastest, most expensive Cisco 12000 series: switches up to 1.28
tera bps through the interconnection network
Horizontal bus—no contention
Vertical bus—contention
. ..
.
.
.
See this
CSCI 547 Network Layer 4-32
Cisco 12000 Crossbar switch
CSCI 547 Network Layer 4-33
Output Ports
Buffering required when datagrams arrive from switching fabric faster than the transmission rate
Scheduling discipline chooses among queued datagrams for transmission—depends upon protocols;
If IP, then First-In-First-OutIf ATM, then depending on the QOS
Constant Bit Rate (CBR)Variable Bit Rate (VBR)Available Bit Rate (ABR)Unspecified Bit Rate (UBR)
CSCI 547 Network Layer 4-34
Output port queueing
buffering when arrival rate via switch exceeds output line speed queueing (delay) and loss due to output port buffer overflow!—
Assuming Switch operates 3 times the speed of line speed
CSCI 547 Network Layer 4-35
Input Port Queuing
Fabric slower than input ports combined -> queueing may occur at input queues
Head-of-the-Line (HOL) blocking: queued datagram at front of queue prevents others in queue from moving forward
queueing delay and loss due to input buffer overflow!
CSCI 547 Network Layer 4-36
Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and
datagram networks 4.3 What’s inside a
router 4.4 IP: Internet Protocol
Datagram format IPv4 addressing ICMP IPv6
4.5 Routing algorithms Link state Distance Vector Hierarchical routing
4.6 Routing in the Internet RIP OSPF BGP
4.7 Broadcast and multicast routing
CSCI 547 Network Layer 4-37
TCP/IP Protocol Suite
Defined in T
CP/IP Protocol S
uiteU
nd
efine
d
7
6
5
4
3
2
1
OS
I l
ay
ers
TC
P/I
P l
ay
ers
Ap
plic
atio
n
DomainName
System
DNSDDNS
FileSharing
NFS
BOOTP
DHCP
HostConfigura
-tion
NetworkManage-
ment
RMON
SNMP
FileTransfer
FTP
TFTP
E-Mail &News
MIME
SMTP
POP/IMAP
NNTP
WWW &Gopher
Interactiveaccess
HTTPHTTPS
Telnet
“r” Com-mands(rlogin,rsh, ,,)
Gopher
IRC
Transport TCP(Transmission Control Protocol)
UDP(User Datagram Protocol)
RARP(Reverse Address Resolution Protocol)
ARP(Address Resolution Protocol)
InternetIPv4IPv6
IP NAT
IP Sec
Mobile IP
IP RoutingProtocols
ICMPv4ICMPv6
ND(NeighborDiscovery)
IP supportProtocols
RIP, OSPF, GGP,HELLO, IGRP, EIGRP,
BGP, EGP
NetworkInterface
(Data Link)
SLIP(Serial Line
Interface Protocol)
PPP(Point-to-Point
Protocol)
LAN/WLAN/WANHardware Drivers
(Ethernet, FDDI, Wireless,ATM, Frame Relay, …)
PhysicalMedium
UTP/Coax/Wireless/Infrared/Fiber Optics/Microwave/Satellite
CSCI 547 Network Layer 4-38
IP
3 InternetIPv4IPv6
IP NAT
IP Sec
Mobile IP
IP RoutingProtocols
ICMPv4ICMPv6
ND(NeighborDiscovery)
IP supportProtocols
RIP, OSPF, GGP,HELLO, IGRP, EIGRP,
BGP, EGP
CSCI 547 Network Layer 4-39
The IP (Internet Protocol) RFC 791
routingtable
Network layer functions of hosts & routers :
Routing protocols•path selection•RIP, OSPF, BGP
IP protocol•addressing conventions•datagram format•packet handling conventions
ICMP protocol•error reporting•router “signaling”
Transport layer: TCP, UDP
Link layer
physical layer
Networklayer
CSCI 547 Network Layer 4-40
Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and
datagram networks 4.3 What’s inside a
router 4.4 IP: Internet Protocol
Datagram format IPv4 addressing ICMP IPv6
4.5 Routing algorithms Link state Distance Vector Hierarchical routing
4.6 Routing in the Internet RIP OSPF BGP
4.7 Broadcast and multicast routing
CSCI 547 Network Layer 4-41
IP datagram format: RFC 791
ver length
32 bits
data (variable length,typically a TCP
or UDP segment)
16-bit identifier
header checksum
time tolive
32 bit source IP address
IP protocol versionNumber: IPv4
header length (32 bit words)
max numberremaining hops
(decremented at each router)
Used forfragmentation/reassembly
total datagramlength (bytes)
upper layer protocolto deliver payload to
head.len
type ofservice
“type” of data flgsfragment
offsetProto-
col
32 bit destination IP address
Options (if any) E.g. timestamp,record routetaken, specifylist of routers to visit.
how much overhead with TCP?
20 bytes of TCP 20 bytes of IP = 40 bytes + app
layer overhead
44 8
16
13
8816
For error checking
CSCI 547 Network Layer 4-42
Table from IANA: Assigned Numbers
The Protocol numbers are Service Access Points on IP layer
CSCI 547 Network Layer 4-43
Service Access Points
ICMP
1
TCP
6
UDP
17
IGMP
2
IP
Protocol numbers
x0800
ARP
x0806Ether Types
23
Telnet
80
http
53
DNS
Port numbers
Ethernet
CSCI 547 Network Layer 4-44
MTU (Maximum Transmission Unit)
network links have MTU (Maximum Transmission Unit) - largest possible link-level frame. different link types different MTUs
Given 1500 Byte MTU(Ethernet), what is MSS(Maximum Segment Size) for TCP? 1500 – 20(TCP) -20(IP) = 1460
IP datagram
Maximum Transmission UnitLink layer header Link layer trailer
1500 bytes for Ethernet
Transport layer(TCP/UDP) has MSS—IP layer has MTU(determined by Link layer)
headers
CSCI 547 Network Layer 4-45
Link layer protocol MTU (bytes)
Internet Path MTU At least 576
Hyperchannel 65,535
Token ring(16Mbps) 17,914
FDDI 4,500
Ethernet 1,500
Dial-up, X.25 576
PPPoE 1,492
ATM 48
802.11 2272
MTU (Maximum Transmission Unit)
CSCI 547 Network Layer 4-46
IP Fragmentation & Reassembly
large IP datagram divided (“fragmented”) within net By a router one datagram
becomes several datagrams
“reassembled” only at final destination host—why?
IP header bits used to identify, reorder related fragments
fragmentation: in: one large datagramout: 3 smaller datagrams
reassembly
CSCI 547 Network Layer 4-47
Fragmentation example:
C:\Documents and Settings\Administrator>ping www.google.com -f -l 1400
Pinging www.l.google.com [66.102.7.104] with 1400 bytes of data:
Packet needs to be fragmented but DF set.Packet needs to be fragmented but DF set.Packet needs to be fragmented but DF set.Packet needs to be fragmented but DF set.
Ping statistics for 66.102.7.104: Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),
Ping from a Windows workstation
-f == Set Don't Fragment flag
-l == Send buffer size.Screen captured on a
Windows system
CSCI 547 Network Layer 4-48
ver length
32 bits
data (variable length,typically a TCP
or UDP segment)
16-bit identifier header
checksumtime to
live
32 bit source IP address
head.len
type ofservice
flgsfragment
offsetProto-
col
32 bit destination IP address
Options (if any)
Fields needed for fragmentation
R DF MFFlags—3 bits
R: Reserved—not used
DF: Don’t Fragment
MF: More Fragment
Fragmentation
CSCI 547 Network Layer 4-49
IP Fragmentation and Reassembly
ID=x
offset=0
moreflag=0
length=4000
ID=x
offset=0
moreflag=1
length=1500
ID=x
offset=185
moreflag=1
length=1500
ID=x
offset=370
moreflag=0
length=1040
One large datagram becomesseveral smaller datagrams
Example 4000 byte
datagram MTU = 1500 bytes
1480 bytes in data field + 20bytesof IP header
offset =1480/8
Fragments are counted in units of 8 octets(bytes).
CSCI 547 Network Layer 4-50
Internal modules of IP Layer: for routers & hosts
Header-composing module
Reassembly module
Routingmodule
Processing module
Fragmentation module
Reassembly table
Routing table
MTU table
To upper layer protocol
DataData & dest. addr.
To data link layer
IP packet, next hop addr.
From data link layer
IP packet
IP packet
IP packet, next hop addr., interface
IP packet
IP packet IP
From upper layer protocol
CSCI 547 Network Layer 4-51
Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and
datagram networks 4.3 What’s inside a
router 4.4 IP: Internet Protocol
Datagram format IPv4 addressing ICMP IPv6
4.5 Routing algorithms Link state Distance Vector Hierarchical routing
4.6 Routing in the Internet RIP OSPF BGP
4.7 Broadcast and multicast routing
CSCI 547 Network Layer 4-52
IP packet format—IPv4
CSCI 547 Network Layer 4-53
IP packet format—IPv4—from RFC791
CSCI 547 Network Layer 4-54
IP Addressing: introduction
IP address: 32-bit identifier for host, router interface
interface: connection between host/router and physical link router’s typically have
multiple interfaces host typically has one
interface IP addresses
associated with each interface
223.1.1.1
223.1.1.2
223.1.1.3
223.1.1.4 223.1.2.9
223.1.2.2
223.1.2.1
223.1.3.2223.1.3.1
223.1.3.27
223.1.1.1 = 11011111 00000001 00000001 00000001
223 1 11
CSCI 547 Network Layer 4-55
IP Addressing
Classful Addressing
CIDR
VLSM
NAT
1992
Inefficient division into 5 classes—A, B, C, D, EAddress space running out
IPv6—128 bitsIPv4—32bits
Huge address space
More streamlined for efficiency
More Auto-configuration
Accommodates QOS
When?
interim solutions
CSCI 547 Network Layer 4-56
IP Addressing: Classful or Classless
Older IP addressing (and routing) called “Classful IP addressing” which fixes the size of a block to one of classes (A, B, C, D, E)
Out of 5 classes, only 3 classes are assignable to computers as IP addresses—only 3 sizes to fit all organizations of the world?
More elastic (size-wise) scheme is called “CIDR(Classless Inter Domain Routing)”—IP address blocks can be any size using prefix subnet masks—later
Both used but Classless mostly used by ISP level or above
CSCI 547 Network Layer 4-57
IP Addressing: Classful Addressing
Byte 1 Byte 2 Byte 3 Byte 4
Class A
Class B
Class C
Class D
Class E
0
1 0
1 1 0
1 1 1 0
1 1 1 1
Multicast address
Reserved for future addressing modes
netid
netid
netid
hostid
hostid
hostid
7
1614
24
21 8
CSCI 547 Network Layer 4-58
IP Addressing: Dotted Decimal Notation
32 bit IP addresses are often represented in the human-friendly 4 groups of 8 bits and written in decimal number with dots separating them “dotted decimal notation”
1 0 0 0 0 1 0 0
1 1 1 1 0 0 0 1
0 0 1 0 0 0 1 11 0 0 1 1 1 1 0
128
64
32
16
8 4 2 1 128
64
32
16
8 4 2 1 128
64
32
16
8 4 2 1 128
64
32
16
8 4 2 1
132 241 158 35. . .
CSCI 547 Network Layer 4-59
Private IP addresses
The Internet Assigned Numbers Authority (IANA) under ICANN has reserved the following three blocks of the IP address space for private networks: Read here
10.0.0.0 - 10.255.255.255 172.16.0.0 - 172.31.255.255 192.168.0.0 - 192.168.255.255
Also, IP addresses in the range of 169.254.0.0 -169.254.255.255 are reserved for Automatic Private IP Addressing (Zero Configuration Networking).Used for ad hoc or isolated networks
The above IP addresses should not be used on the Internet. Internet routers will not route packets with those addresses—only usable for private networks or Intranets
CSCI 547 Network Layer 4-60
Address ranges
Class Start End Default Subnet Mask
CIDR prefix notation
A 1.0.0.0 127. 255.255.255 255.0.0.0 /8
B 128.0.0.0 191.255. 255.255 255.255.0.0 /16
C 192.0.0.0 223.255.255.255 255.255.255.0 /24
D 224.0.0.0 239.255.255.255 NA NA
E 240.0.0.0 255.255.255.255 NA NA
CSCI 547 Network Layer 4-61
Special Addresses
Some parts of classes A, B, C are used for special addresses:
-----------------------------------------------------------------------------------------------------------------Special Addresses Netid Hostid Source or Destination-----------------------------------------------------------------------------------------------------------------Network address Specific All 0s None(e.g. 132.241.0.0 is the network address for the CSU, Chico LAN)-----------------------------------------------------------------------------------------------------------------Direct broadcast address Specific All 1s Destination(This is used by a router to send a packet to all hosts in a subnet, e.g. 132.241.255.255)-----------------------------------------------------------------------------------------------------------------Limited broadcast address All 1s All 1s Destination(A broadcast address for a subnet--It is used when a host wants to send a message toall the hosts in the local subnet--routers will not pass this to other subnets)-----------------------------------------------------------------------------------------------------------------This host on this network All 0s All 0s Source(All 0s designate “this host on this network”--used by a host when it does not knowit’s own IP address)-----------------------------------------------------------------------------------------------------------------Specific host on this network All 0s Specific Destination(It is used by a host to send a message to another host on the same subnet-- Routers willnot process this kind)-----------------------------------------------------------------------------------------------------------------Loopback address 127 Any Destination(It is used to test the health of TCP/IP protocol on a host, e.g. ping 127.0.0.1)
CSCI 547 Network Layer 4-62
The Needs for subnetting
Given a chunk of addresses (e.g. a class B), an organization usually need to sub-divide the address space in a hierarchical fashion
Just as an organization is structured hierarchically, IP addresses are divided as needed
From http://www.cisco.com/univercd/cc/td/doc/cisintwk/ito_doc/ip.htm “Subnetting provides the network administrator with several benefits, including extra flexibility, more efficient use of network addresses, and the capability to contain broadcast traffic (a broadcast will not cross a router). Subnets are under control of local administration. As such, the outside world sees an organization as a single network and has no detailed knowledge of the organization's internal structure. “
Also read http://www.support.psi.net/support/common/routers/files/SUBNET-Desc.html
CSCI 547 Network Layer 4-63
Without Subnetting?
Analogous to: One person in mailroom delivering all mails of the organization
Without subnetting, The entire network (e.g. 132.241.0.0) is connected as one LAN--All workstations should be connected directly to the router—either directly to the router ports or the LAN is connected only through hubs and switches—this is not feasible except for a very small network—less than 100 computers
CSCI 547 Network Layer 4-64
Without Subnetting?
.
.
.
. . .
To Internet
This router should be very fast, should have a large number of ports. Also long cables needed-- Not feasible except for very small network—less than 100 computers?!
CSCI 547 Network Layer 4-65
Subnets
IP address divided: subnetid part (high
order bits) hostid part (low order
bits)
What’s a subnet ? device interfaces with
same subnet part of IP address
Hosts within a subnet can physically reach each other without intervening router
223.1.1.1
223.1.1.2
223.1.1.3
223.1.1.4 223.1.2.9
223.1.2.2
223.1.2.1
223.1.3.2223.1.3.1
223.1.3.27
network consisting of 3 subnets
subnet
subnetid hostid
CSCI 547 Network Layer 4-66
Subnets223.1.1.0/24
223.1.2.0/24
223.1.3.0/24
Recipe To determine the
subnets, detach each interface from its host or router, creating islands of isolated networks. Each isolated network is called a subnet.
The host are usually connected to a hub or a switch
Subnet mask: /24
CSCI 547 Network Layer 4-67
Subnets
Decisions to make: How many subnets? How big each
subnet? How many levels of
subnetting?
All depends upon the organizational structure and requirements
223.1.1.1
223.1.1.3
223.1.1.4
223.1.2.2223.1.2.1
223.1.2.6
223.1.3.2223.1.3.1
223.1.3.27
223.1.1.2
223.1.7.0
223.1.7.1223.1.8.0223.1.8.1
223.1.9.1
223.1.9.2
CSCI 547 Network Layer 4-68
Subnetting After acquiring a block of network addresses, e.g. a Class B
address, divide it according to needs Let’s say, we need 6 large divisions first, then each division
may be divided as needed Each division must be equal sizes
1st division
2nd 3rd
CSCI 547 Network Layer 4-69
Subnetting: Classful--example
Given class B address of 132.241.0.0, we need to divide the networks into 255 equal-sized subnets—variable sizing covered later (VLSM)
With a class B address, we are given the last 16 bits to play with(divide)
132 . 241 . 0 . 0
1 0 0 0 0 1 0 0 1 1 1 1 0 0 0 1 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0
128
64
32
16
8 4 2 1 128
64
32
16
8 4 2 1 128
64
32
16
8 4 2 1 128
64
32
16
8 4 2 1
132 241 0 0. . .netid hostid
Fixed Can be subnetted
CSCI 547 Network Layer 4-70
Subnetting: Classful--example
132 . 241 . 0 . 0
1 0 0 0 0 1 0 0 1 1 1 1 0 0 0 1 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0
128
64
32
16
8 4 2 1 128
64
32
16
8 4 2 1 128
64
32
16
8 4 2 1 128
64
32
16
8 4 2 1
Fixed Can be subnetted. . .
Where should we put the divider?
# of subnets? # of hosts in a subnet?
The decision should be based upon:
# of subnets needed
# of hosts on each subnets
Future needs
Routing protocol (RIPv1, RIPv2, or OSPF)
CSCI 547 Network Layer 4-71
Subnetting: Classful--example
132 . 241 . 0 . 0
1 0 0 0 0 1 0 0 1 1 1 1 0 0 0 1 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0
128
64
32
16
8 4 2 1 128
64
32
16
8 4 2 1 128
64
32
16
8 4 2 1 128
64
32
16
8 4 2 1
Fixed Can be subnetted. . .
Where should we put the divider?
# of subnets? # of hosts in a subnet?
The division is indicated by “subnet mask”—done by putting 1’s until the division point
For example: Let’s put it after 19th bit
1 0 0 0 0 1 0 0 1 1 1 1 0 0 0 1 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 01 1 1 0 0 0 0 0
Then the subnet mask should be
255 255 ? 0224
CSCI 547 Network Layer 4-72
Subnetting: Classful
1 0 0 0 0 1 0 0 1 1 1 1 0 0 0 1 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 01 1 1 0 0 0 0 0
Then the subnet mask should be
255 255 0224
00241132
netid subnetid hostid
To be precise, we have netid, subnetid, hostid
But the (netid + subnetid) is often called as subnetid
Notation for subnet mask; 2 ways
1) Dotted decimal notation; e.g. 255.255.224.0
2) Prefix notation; e.g. /19 –called slash notation also--19bits are subnet mask
subnetid hostid
subnet mask
CSCI 547 Network Layer 4-73
Example for classful subnetting
Given a class B address of 132.241.0.0/16 Let’s say we decided to divide using next 8
bits1 0 0 0 0 1 0 0 1 1 1 1 0 0 0 1 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 01 1 1 1 1 1 1 1
Then the subnet mask should be
00241132
subnetid hostid
subnet mask
255 255 255 0
In prefix notation ? 132.241.0.0/24
CSCI 547 Network Layer 4-74
Example for classful subnetting, cont’d Restrictions: legacy routers following rfc 950, do
not recognize all zero’s and all one’s subnet 132.241.0.0/24 & 132.241.255.0/24 --wasted space
In rfc 1878, all zero’s and all one’s subnets are allowed by defaultCan be turned off by no ip subnet-zero command for CISCO routersRead http://www.cisco.com/en/US/tech/tk648/tk361/technologies_tech_note09186a0080093f18.shtml
In hostid portion, all zero’s and all one’s are not allowed—they are reserved for special purpose—all zero’s represent the subnetid & all one’s represent subnet broadcast address; e.g. 132.241.0.0 is the subnetid for subnet 132.241.0.0 and 132.241.0.255 is the broadcast address for the 132.241.0.0 subnet—also see slide 4.61
CSCI 547 Network Layer 4-75
Example for classful subnetting, cont’dLet’s write down all the subnet addresses and host addresses
Subnet addresses Host addresses
132.241.0.0/24 132.241.0.1 -- 132.241.0.254(note that 132.241.0.0 is subnet’s address and 132.241.0.255 is subnet broadcast address)
132.241.1.0/24 132.241.1.1 -- 132.241.1.254
. . . . . .132.241.254.0/24 132.241.254.1 -- 132.241.254.254
132.241.255.0/24 132.241.255.1 -- 132.241.255.254
CSCI 547 Network Layer 4-76
Another example for classful subnetting:Let’s subnet a subnet— Mini Lab
A class B subnet 132.241.0.0 was divided using 8 bit(/24) division as previous example
Now, we are assigned one of the subnets; 132.241.158.0—we want to subnet this subnet
Requirements: We need at least 4 subnets and each subnet should accommodate at least 20 hosts
CSCI 547 Network Layer 4-78
Problems with Classful addressing & subnetting
Only 3 classes useable (A, B, C) Only 3 sizes to satisfy all organizations Address spaces are depleted—not much left
—especially class B (most comfortable fit) Some predictions say the address space will
be exhausted—one predicts in 2008 and the other in 2018—to see current assignments, see http://bgp.potaroo.net/index-ale.html
In classful addressing, the assignment of class C addresses result in a large number of entries in routing table for Internet backbone routers--- http://bgp.potaroo.net/
CSCI 547 Network Layer 4-79
IP adddress assignmentFrom: http://bgp.potaroo.net/ipv4-stats/allocated-all.html
CSCI 547 Network Layer 4-80
Routing table size of Internet backbone routers
CSCI 547 Network Layer 4-81
Solutions for IPv4 address depletion
Short term (interim) solutions: CIDR(Classless InterDomain Routing)—by not
sticking to 3 classes (A,B,C) and their fixed sizes, we can accommodate better fittings to different size organizations – a.b.c.d/x
NAT(Network Address Translation) —Small block of addresses can be timeshared by large number of connections
VLSM(Variable Length Subnet Mask) —Allows intranets use variable sizes for distributing address spaces(rather than the fixed size divisions we saw in the classful subnetting examples)
Long term solution: IPv6 (128 bit IP address)
CSCI 547 Network Layer 4-82
IP Addressing
Classful Addressing
CIDR
VLSM
NAT
1992
Inefficient division into 5 classes—A, B, C, D, EAddress space running out
IPv6—128 bitsIPv4—32bits
Huge address space
More streamlined for efficiency
More Auto-configuration
Accommodates QOS
When?
interim solutions
CSCI 547 Network Layer 4-83
CIDR: Motivation
Observation: Many organizations need larger address than one class C(254), but less than 1000 (<< class B)—they need multiple class C addresses but not class B(remember class B space is depleted)
Assigning multiple class C addresses as a block(aggregation) helps to reduce the effects of the explosion of Internet Backbone routers
Therefore, eliminate the restriction of classes!
RFC 1517, 1518, 1519, 1817
CSCI 547 Network Layer 4-84
IP addressing: CIDR
CIDR: Classless InterDomain Routing Around 1993, CIDR replaced Classful addressing “CIDR is principally a bitwise, prefix-based standard for the
interpretation of IP addresses.”--wikipedia IP address space can have many different sizes—not only 3
sizes! Uses address format: a.b.c.d/x, where x is # bits in subnet
portion of address—prefix notation Class has no meaning!
11001000 00010111 00010000 00000000
subnetpart
hostpart
200.23.16.0/23
subnetpart
hostpart
200.23.18.0/23
11001000 00010111 00010010 00000000
Is 200.23.17.0/23 possible?
CSCI 547 Network Layer 4-85
Commonly used CIDR prefixes
CIDR Prefix # of former class C nets
/28 1/16 (of class C)
/27 1/8
/26 1/4
/25 1/2
/24 1 (class C)
/23 2
/22 4
/21 8
/20 16
/19 32
/18 64
/17 128
/16 256 = 1 class B
/15 512
/14 1024
/13 2048
/12 4096
CSCI 547 Network Layer 4-86
IP addressing: CIDR
CIDR is also called as “supernetting” since we aggregate classful addresses—mostly class B and class C addresses
Example: Let’s say we have the following 4 class C addresses:
• 200.168.4.0 11001000 10101000 00000100 00000000• 200.168.5.0 11001000 10101000 00000101 00000000• 200.168.6.0 11001000 10101000 00000110 00000000• 200.168.7.0 11001000 10101000 00000111 00000000With 255.255.255.0 as subnet mask = prefix ?
Let’s compare with CIDR address of• 200.168.4.0/22
What is the difference? List Class C address blocks given 200.168.8.0/21
CSCI 547 Network Layer 4-87
IP addresses: how to get one?
Q: How does a host get IP address?
Static addressing: hand-coded by system admin in a file Wintel: control-panel->network->configuration-
>tcp/ip->properties UNIX: /etc/rc.config
Dynamic addressing: DHCP: Dynamic Host Configuration Protocol: dynamically get address from as a DHCP server “plug-and-play”
(more in next chapter)
CSCI 547 Network Layer 4-88
IP addresses: how to get one? Using CIDR
Q: How does a network get subnet part of IP addr?
A: gets allocated portion of its provider ISP’s address space
ISP's block 11001000 00010111 00010000 00000000 200.23.16.0/20
Organization 0 11001000 00010111 00010000 00000000 200.23.16.0/23 Organization 1 11001000 00010111 00010010 00000000 200.23.18.0/23 Organization 2 11001000 00010111 00010100 00000000 200.23.20.0/23 ... ….. …. ….
Organization 7 11001000 00010111 00011110 00000000 200.23.30.0/23
CSCI 547 Network Layer 4-89
Hierarchical addressing: route aggregation:using CIDR
“Send me anythingwith addresses beginning 200.23.16.0/20”
200.23.16.0/23
200.23.18.0/23
200.23.30.0/23
Fly-By-Night-ISP
Organization 0
Organization 7Internet
Organization 1
ISPs-R-Us“Send me anythingwith addresses beginning 199.31.0.0/16”
200.23.20.0/23Organization 2
...
...
Hierarchical addressing allows efficient advertisement of routing information:
CSCI 547 Network Layer 4-90
route aggregation using CIDR:Reduces size of routing tables for Internet Backbone routers
Growth of BGP table 1994 to present
CSCI 547 Network Layer 4-91
Hierarchical addressing: more specific routes
ISPs-R-Us has a more specific route to Organization 1:This may happen when Organization 1 used to subscribe to Fly-By-Night-ISP but now moved to ISPs-R-Us. Organization 1 wants to keep 200.23.18.0/23! – this is one possible scenarioIn our text, you see another possible scenario—page 337
“Send me anythingwith addresses beginning 200.23.16.0/20”
200.23.16.0/23
200.23.18.0/23
200.23.30.0/23
Fly-By-Night-ISP
Organization 0
Organization 7 Internet
Organization 1
ISPs-R-Us“Send me anythingwith addresses beginning 199.31.0.0/16or 200.23.18.0/23”
200.23.20.0/23Organization 2
...
...
199.31.0.0/16
CSCI 547 Network Layer 4-92
IP addressing: the last word...
Q: How does an ISP get a block of addresses?A: From ICANN: Internet Corporation for Assigned
Names and Numbers allocates addresses manages DNS assigns domain names, resolves disputes
Also see http://www.pch.net/resources/data/WoN/itu-seminar-20040211-1.ppt#480,1,Internet Addressing and the RIR system
CSCI 547 Network Layer 4-93
Solutions to IP address depletion
Short term (interim) solutions: CIDR NAT(Network Address Translation) VLSM(Variable Length Subnet Mask)
Long term solution: IPv6 (128 bits)
CSCI 547 Network Layer 4-94
NAT: Network Address Translation
10.0.0.1
10.0.0.2
10.0.0.3
10.0.0.4
138.76.29.7
local network(e.g., home network)
10.0.0/24
Intranet Datagrams use10.0.0/24 address
Internet Datagrams aresent to the router
( 10.0.0.4 )
All datagrams leaving localnetwork have same single source
NAT IP address: 138.76.29.7,But different source port
numbers
Internet
NAT table here
CSCI 547 Network Layer 4-95
NAT: Network Address Translation
Motivation: local network uses just one IP address as far as outside world is concerned: range of addresses not needed from ISP: just
one IP address for all devices can change addresses of devices in local network
without notifying outside world can change ISP without changing addresses of
devices in local network devices inside local net not explicitly
addressable/ visible by outside world (a security plus).
popularly used by home networks connections to Internet
CSCI 547 Network Layer 4-96
NAT: Network Address Translation
Implementation: NAT router must:
outgoing datagrams: replace (source IP address, port #) of every outgoing datagram to (NAT IP address, new port #). . . remote clients/servers will respond using
(NAT IP address, new port #) as destination addr.
remember (in NAT translation table) every (source IP address, port #) to (NAT IP address, new port #) translation pair
incoming datagrams: replace (NAT IP address, new port #) in dest fields of every incoming datagram with corresponding (source IP address, port #) stored in NAT table
CSCI 547 Network Layer 4-97
NAT: Network Address Translation
10.0.0.1
10.0.0.2
10.0.0.3
S: 10.0.0.1, 3345D: 128.119.40.186, 80
1
10.0.0.4
138.76.29.7
1: host 10.0.0.1 sends datagram to 128.119.40.186, 80
NAT translation tableWAN side addr LAN side addr
138.76.29.7, 5001 10.0.0.1, 3345…… ……
S: 128.119.40.186, 80 D: 10.0.0.1, 3345
4
S: 138.76.29.7, 5001D: 128.119.40.186, 80
2
2: NAT routerchanges datagramsource addr from10.0.0.1, 3345 to138.76.29.7, 5001,updates table
S: 128.119.40.186, 80 D: 138.76.29.7, 5001
3
3: Reply arrives dest. address: 138.76.29.7, 5001
4: NAT routerchanges datagramdest addr from138.76.29.7, 5001 to 10.0.0.1, 3345
CSCI 547 Network Layer 4-98
NAT: Network Address Translation
16-bit port-number field: 60,000 simultaneous connections with a single
LAN-side address! Plenty! NAT is controversial:
routers should only process up to layer 3 violates end-to-end argument
• NAT possibility must be taken into account by app designers, eg, P2P applications
address shortage should instead be solved by IPv6
CSCI 547 Network Layer 4-99
NAT: Another form
The NAT described so far is also called as NAPT (Network Address Port Translation)
Another form is “Basic NAT” or “Static NAT”—involves only IP address translation--not ports Router is configured with a pool of IP addresses When a computer having private IP address
wants to connect to Internet, router assigns an IP address from the pool until disconnected
Example usage: An ISP with 1000 users and only maximum 20% are on-line at a time. ISP uses NAT with a class C address(254 IP addresses) to serve all users
CSCI 547 Network Layer 4-100
NAT: Another form
10.0.0.1
10.0.0.2
10.0.0.3
S: 10.0.0.1, 3345D: 128.119.40.186, 80
1
10.0.0.4
138.76.29.7
1: host 10.0.0.1 sends datagram to 128.119.40.186, 80
NAT translation tableWAN side addr LAN side addr
138.76.29.7 10.0.0.1…… ……
S: 128.119.40.186, 80 D: 10.0.0.1, 3345
4
S: 138.76.29.7, 3345D: 128.119.40.186, 80
2
2: NAT routerchanges datagramsource addr from10.0.0.1 to138.76.29.7,updates table
S: 128.119.40.186, 80 D: 138.76.29.7, 3345
3
3: Reply arrives dest. address: 138.76.29.7, 5001
4: NAT routerchanges datagramdest addr from138.76.29.7 to 10.0.0.1Only IP addresses are changed!
Notice that only IP addresses are changed by router
CSCI 547 Network Layer 4-101
Solutions to IP address depletion
Short term (interim) solutions: CIDR NAT(Network Address Translation) VLSM(Variable Length Subnet Mask)
Long term solution: IPv6 (128 bits)
CSCI 547 Network Layer 4-102
VLSM(Variable Length Subnet Mask): rfc 1817
Classful subnetting divides a network into equal sizes at a given level=“one size fits all”
With CIDR, the VLSM was introduced—we can divide a network into different sizes at a given level—more flexible & save addresses
1st division
2nd 3rd
CSCI 547 Network Layer 4-103
VLSM: an example
Uses all 0’s & all 1’s subnets to fully utilze address space
A company is assigned a class C address space and needs the following:
# of subnets subnet size purpose
2 60+ hosts For 2 main offices
4 10+ hosts For 2 branch offices & 2 server farms
The rest 2 hosts PPP (for telecommuting)
1st subnet mask255.255.255.192
/26
X.Y.Z.0/2662 hosts
X.Y.Z.64/2662 hosts
2nd subnet mask255.255.255.240
/28
X.Y.Z.128/2814 hosts
X.Y.Z.144/2814 hosts . .
.
3rd subnet mask255.255.255.252
/30
X.Y.Z.192/302 hosts
X.Y.Z.196/302 hosts
X.Y.Z.252/302 hosts. .
.
X.Y.Z.176/2814 hosts
For another example, visit here
CSCI 547 Network Layer 4-104
Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and
datagram networks 4.3 What’s inside a
router 4.4 IP: Internet Protocol
Datagram format IPv4 addressing ICMP IPv6
4.5 Routing algorithms Link state Distance Vector Hierarchical routing
4.6 Routing in the Internet RIP OSPF BGP
4.7 Broadcast and multicast routing
CSCI 547 Network Layer 4-105
ICMP: Internet Control Message Protocol
used by hosts & routers to communicate network-level control information error reporting:
unreachable host, network, port, protocol
echo request/reply (used by ping)
network-layer but “above” IP: ICMP msgs carried in IP
datagrams—”horizontal layering”
ICMP message: type, code plus first 8 bytes of IP datagram causing error
Type Code description0 0 echo reply (ping)3 0 dest. network unreachable3 1 dest host unreachable3 2 dest protocol unreachable3 3 dest port unreachable3 6 dest network unknown3 7 dest host unknown4 0 source quench (congestion control - not used)8 0 echo request (ping)9 0 route advertisement10 0 router discovery11 0 TTL expired12 0 bad IP header
CSCI 547 Network Layer 4-106
ICMP
3 InternetIPv4IPv6
IP NAT
IP Sec
Mobile IP
IP RoutingProtocols
ICMPv4ICMPv6
ND(NeighborDiscovery)
IP supportProtocols
RIP, OSPF, GGP,HELLO, IGRP, EIGRP,
BGP, EGP
ICMP uses the service of IP to send a control message
CSCI 547 Network Layer 4-107
Traceroute and ICMP: an example usage of ICMP
Source sends series of UDP segments (in Unix) to dest First has TTL =1 Second has TTL=2, etc. Unlikely port number(33434
and up) in unix implementations
Windows systems use ICMP Echo request not UDP
When nth datagram arrives to nth router: Router discards datagram And sends to source an ICMP
message (type 11, code 0) Message includes name of
router & IP address
When ICMP message arrives, source calculates RTT
Traceroute does this 3 times
Stopping criterion UDP segment eventually
arrives at destination host Destination returns ICMP
“host unreachable” packet (type 3, code 3)
When source gets this ICMP, stops.
http://kb.pert.switch.ch/cgi-bin/twiki/view/PERTKB/VanJacobsonTraceroute
CSCI 547 Network Layer 4-108
C:\Documents and Settings\Administrator> tracert www.csuchico.edu
Tracing route to calypso.csuchico.edu [132.241.82.62]over a maximum of 30 hops:
1 6 ms 5 ms 11 ms 208-53-80-5.chico.ca.digitalpath.net [208.53.80.5] 2 7 ms 4 ms 4 ms 198-69-248-1.chico.ca.digitalpath.net [198.69.248.1] 3 9 ms 8 ms 8 ms sl-gw26-stk-5-0-TS9.sprintlink.net [144.232.195.169] 4 8 ms 10 ms 10 ms sl-bb20-stk-8-0.sprintlink.net [144.232.4.114] 5 13 ms 10 ms 9 ms sl-bb20-sj-9-0.sprintlink.net [144.232.20.99] 6 11 ms 9 ms 10 ms sl-bb21-sj-15-0.sprintlink.net [144.232.3.158] 7 10 ms 12 ms 10 ms sl-st20-sj-13-0.sprintlink.net [144.232.9.58] 8 118 ms 11 ms 14 ms so-7-1.car4.SanJose1.Level3.net [209.245.146.245] 9 13 ms 10 ms 12 ms ge-11-0.ipcolo3.SanJose1.Level3.net [4.68.123.43] 10 12 ms 14 ms 10 ms 4.79.44.6 11 14 ms 18 ms 14 ms dc-svl-dc1--isp-1-ge.cenic.net [137.164.22.58] 12 21 ms 22 ms 14 ms dc-oak-dc1--svl-dc1-10ge.cenic.net [137.164.22.31] 13 23 ms 22 ms 24 ms dc-csac-dc1--oak-dc1-ge.cenic.net [137.164.22.111] 14 26 ms 24 ms 24 ms dc-cor-dc1--sac-dc1-ge.cenic.net [137.164.22.153] 15 31 ms 26 ms 25 ms dc-cor-dc2--cor-dc1-df-iconn-1.cenic.net [137.164.22.199] 16 29 ms 27 ms 26 ms dc-csuchico-egm--cor-dc2.cenic.net [137.164.41.26] 17 32 ms 29 ms 35 ms chi-mocha-ge0-0-132.net.CSUChico.EDU [132.241.95.74] 18 40 ms 46 ms 46 ms calypso.CSUChico.EDU [132.241.82.62]
Trace complete.
Result of “tracert www.csuchico.edu” on a Windows host
Round trip time of 3 probes
CSCI 547 Network Layer 4-109
Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and
datagram networks 4.3 What’s inside a
router 4.4 IP: Internet Protocol
Datagram format IPv4 addressing ICMP IPv6
4.5 Routing algorithms Link state Distance Vector Hierarchical routing
4.6 Routing in the Internet RIP OSPF BGP
4.7 Broadcast and multicast routing
CSCI 547 Network Layer 4-110
IPv6 (previously known as IPng (IP next generation))RFC 2460
Initial motivation: 32-bit address space soon to be completely allocated.
Additional motivation: header format helps speed
processing/forwarding header changes to facilitate QoS IPv6 datagram format: fixed-length 40 byte header no fragmentation allowed
CSCI 547 Network Layer 4-111
IPv6 Header (Cont)
Priority: identify priority among datagrams in flowFlow Label: identify datagrams in same “flow.” (concept of“flow” not well defined).Next header: identify upper layer protocol for data
With 128 bits, you can assign over 3.7x10**21 addresses per square inch of the earth's surface.
CSCI 547 Network Layer 4-112
Other Changes from IPv4
Checksum: removed entirely to reduce processing time at each hop
Options: allowed, but outside of header, indicated by “Next Header” field
ICMPv6: new version of ICMP additional message types, e.g. “Packet Too
Big” multicast group management functions
CSCI 547 Network Layer 4-113
Differences Between IPv4 and IPv6
Category IPv4 IPv6
Address length 32 bits 128 bits
Header size 20-60 bytes 40 bytes—fixed size
IPSec support Optional Required
QoS support Limited Better
Fragmentation Done by hosts and routers Done by hosts only
Is a header checksum present? Yes No
Does the header include options? Yes No
Link-layer address resolution Broadcast ARP frames Multicast Neighbor Solicitation
messages
Error reporting and diagnostic protocol ICMP (for IPv4) ICMPv6
Multicast group membership protocol IGMP MLD
Router discovery support Optional Required
Network layer broadcast addresses? Yes No
Host configuration DHCP or manual Automatic, DHCP, or manual
DNS record type for name resolution A record AAAA record
DNS record type and location for reverse name resolution
PTR records in IN-ADDR.ARPA domain
PTR records in IP6.INT domain
CSCI 547 Network Layer 4-114
Transition From IPv4 To IPv6
Not all routers can be upgraded simultaneous no “flag days” feasible How will the network operate with mixed IPv4
and IPv6 routers? Two main approaches:
Dual Stack: A host or router implements both IPv4 and IPv6.
Tunneling: IPv6 carried as payload in IPv4 datagram among IPv4 routers
CSCI 547 Network Layer 4-115
Tunneling
A B E F
IPv6 IPv6 IPv6 IPv6
tunnelLogical view:
Physical view:A B E F
IPv6 IPv6 IPv6 IPv6IPv4 IPv4
CSCI 547 Network Layer 4-116
Tunneling
A B E F
IPv6 IPv6 IPv6 IPv6
tunnelLogical view:
Physical view:A B E F
IPv6 IPv6 IPv6 IPv6
C D
IPv4 IPv4
Flow: XSrc: ADest: F
data
Flow: XSrc: ADest: F
data
Flow: XSrc: ADest: F
data
Src:BDest: E
Flow: XSrc: ADest: F
data
Src:BDest: E
A-to-B:IPv6
E-to-F:IPv6
B-to-C:IPv6 inside
IPv4
B-to-C:IPv6 inside
IPv4
CSCI 547 Network Layer 4-117
Tunneling: another view
From http://www.cisco.com/univercd/cc/td/doc/product/software/ios123/123cgcr/ipv6_c/sa_tunv6.htm
CSCI 547 Network Layer 4-118
IPv6 Deployment?
To see the current deployment, visit
http://bgp.potaroo.net/index-
v6.html
Not likely to happen in the foreseeable future
CSCI 547 Network Layer 4-119
Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and
datagram networks 4.3 What’s inside a
router 4.4 IP: Internet Protocol
Datagram format IPv4 addressing ICMP IPv6
4.5 Routing algorithms Link state Distance Vector Hierarchical routing
4.6 Routing in the Internet RIP OSPF BGP
4.7 Broadcast and multicast routing
CSCI 547 Network Layer 4-120
1
23
0111
value in arrivingpacket’s header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing, forwarding
Difference?
CSCI 547 Network Layer 4-121
u
yx
wv
z2
2
13
1
1
2
53
5
Graph: G = (N,E)
N = set of routers = { u, v, w, x, y, z }
E = set of links ={ (u,v), (u,x), (v,x), (v,w), (x,w), (x,y), (w,y), (w,z), (y,z) }
Graph abstraction—routing can be analyzed as a graph problem
Remark: Graph abstraction is useful in other network contexts
Example: P2P, where N is set of peers and E is set of TCP connections
nodes edges
CSCI 547 Network Layer 4-122
Graph abstraction: costs
u
yx
wv
z2
2
13
1
1
2
53
5
• c(x,x’) = cost of link (x,x’) - e.g., c(w,z) = 5• cost could always be set to 1(all links has same cost—hop count), or inversely related to bandwidth,or inversely related to Congestion, or …Cost of path (x1, x2, x3,…, xp) = c(x1,x2) + c(x2,x3) + … + c(xp-1,xp)
Question: What’s the least-cost path between u and z ?
Routing algorithm: algorithm that finds least-cost path
CSCI 547 Network Layer 4-123
Routing algorithm design
u
yx
wv
z
2
2
13
1
1
2
5
3
5
Assuming that we can decide the cost of the links
How would routers learn about the weights of the links other than the directly connected links?
How often routers advertise the weights?
The scope of advertisement?
CSCI 547 Network Layer 4-124
Routing Algorithm classification
Global or decentralized information?
Global: all routers have complete
topology, link cost info “link state” algorithmsDecentralized: router knows physically-
connected neighbors, link costs to neighbors
iterative process of computation, exchange of info with neighbors
“distance vector” algorithms
Static or dynamic?Static: routes change slowly
over timeDynamic: routes change more
quickly periodic update in response to link
cost changes
CSCI 547 Network Layer 4-125
IGP(Intra AS) vs EGP(Inter AS)
A group of networks and routers under the authority of a single administration
AS: Autonomous System
To see AS numbers, visit here
CSCI 547 Network Layer 4-126
AS numbers from http://bgp.potaroo.net/cidr/autnums.html
AS3895 AMEDD-EUR - DoD Network Information Center AS3896 AMEDD-EUR - DoD Network Information Center AS3897 AMEDD-EUR - DoD Network Information Center AS3898 UCSF-HISD - University of Calif. S.F. - Hospital Info Sys AS3899 CHICO-NET - California State University, Chico AS3900 TEXASNET-ASN - Yokubaitis Holding Corporation AS3901 ARRAKIS - Higher Technology Services AS3902 GLAXOCA-1 - Glaxo Canada Inc. AS3903 NAG-AS - Network Ananlysis Group AS3904 ASTHOUGHTPRT - ThoughtPort inc.
CSCI 547 Network Layer 4-127
Popular Routing Algorithms
Distance vector
Link state Path vector
CSCI 547 Network Layer 4-128
Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and
datagram networks 4.3 What’s inside a
router 4.4 IP: Internet Protocol
Datagram format IPv4 addressing ICMP IPv6
4.5 Routing algorithms Link state Distance Vector Hierarchical routing
4.6 Routing in the Internet RIP OSPF BGP
4.7 Broadcast and multicast routing
CSCI 547 Network Layer 4-129
Routing Algorithms
Interior Routing(within one AS) vs Exterior Routing (between AS's)
Current Routing AlgorithmsTwo common Routing Algorithms(Both are adaptive(dynamic) algorithms)
(a) "Link State Routing"-Each router shares its knowledge about its links to the neighbors(link state) with every routers in the area.-Example: OSPF***LINK STATES to EVERYBODY*** (b) "Distance Vector Routing"--Each router shares its knowledge(distance to destination networks) about the entire network(in an AS) with its neighbors-Example: RIPv2***EVERYTHING to NEIGHBORS ONLY***
CSCI 547 Network Layer 4-130
A Link-State Routing Algorithm
Dijkstra’s algorithm net topology, link costs
known to all nodes accomplished via “link
state broadcast” all nodes have same info
computes least cost paths from one node (‘source”) to all other nodes gives forwarding table
for that node iterative: after k iterations,
know least cost path to k dest.’s
Notation: c(x,y): link cost from
node x to y; = ∞ if not direct neighbors
D(v): current value of cost of path from source to dest. v
p(v): predecessor node along path from source to v
N': set of nodes whose least cost path definitively known
CSCI 547 Network Layer 4-131
Dijsktra’s Algorithm
1 Initialization: 2 N' = {u} 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(u,v) 6 else D(v) = ∞ 7 8 Loop 9 find w not in N' such that D(w) is a minimum 10 add w to N' 11 update D(v) for all v adjacent to w and not in N' : 12 D(v) = min( D(v), D(w) + c(w,v) ) 13 /* new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v */ 15 until all nodes in N'
CSCI 547 Network Layer 4-132
Dijkstra’s algorithm: example
Step012345
N'u
uxuxy
uxyvuxyvw
uxyvwz
D(v),p(v)2,u2,u2,u
D(w),p(w)5,u4,x3,y3,y
D(x),p(x)1,u
D(y),p(y)∞
2,x
D(z),p(z)∞ ∞
4,y4,y4,y
u
yx
wv
z2
2
13
1
1
2
53
5
CSCI 547 Network Layer 4-133
Dijkstra’s algorithm: example (2)
u
yx
wv
z
Resulting shortest-path tree from u:
vx
y
w
z
(u,v)(u,x)
(u,x)
(u,x)
(u,x)
destination link
Resulting forwarding table in u:
CSCI 547 Network Layer 4-134
Dijkstra’s algorithm, discussion
Algorithm complexity: n nodes each iteration: need to check all nodes, w, not in N n(n+1)/2 comparisons: O(n2) more efficient implementations possible: O(nlogn)
Oscillations possible: e.g., link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initially… recompute
routing… recompute … recompute
CSCI 547 Network Layer 4-135
Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and
datagram networks 4.3 What’s inside a
router 4.4 IP: Internet Protocol
Datagram format IPv4 addressing ICMP IPv6
4.5 Routing algorithms Link state Distance Vector Hierarchical routing
4.6 Routing in the Internet RIP OSPF BGP
4.7 Broadcast and multicast routing
CSCI 547 Network Layer 4-136
Link State
Link State Routing--Example: OSPF(Open Shortest Path First) http://www.cisco.com/univercd/cc/td/doc/cisi twk/ito_doc/ospf.htm
OSPF is an IGP(Interior Gateway Protocol) developed by IETF Currently Version 2--RFC 1583--also has a version for IPv6— RFC 2740Open(spec. on public domain)OSPF has features not in RIP and is suitable for a large networkBased on Dijkstra's algorithmOSPF is an intra-AS protocol
CSCI 547 Network Layer 4-137
OSPF
Autonomous System: A group of networks and routers under the authority of a single administration
CSCI 547 Network Layer 4-138
-Autonomous System(A group of networks and routers under the authority of a single administration) is divided into Areas(A collection of networks, hosts, and routers all contained within an autonomous system). An AS is made up of several Areas.-3 kinds of routers in OSPF--(1) Intra-Area routers (2) Area Border routers (3) AS Boundary routers-Routers inside an Area flood the Area with routing information--information is exchanged to all routers in an Area by Flooding(actually multicasting)--all routers in an area maintain the same topology database-This area concept limits the size of the topology database that must be held by all routers in the area-At the border of an area, special routers called Area border routers collect and summarize the information about each area it is attached to and send the summary to other areas.
OSPF
CSCI 547 Network Layer 4-139
-One of the areas into which all areas have a connection is called Backbone(all OSPF networks must contain at least one area, the Backbone) which is assigned an area identifier of 0.0.0.0(meaning Area 0--not an IP address)-Routers inside a backbone are called Backbone routers--Backbone routers operate identically to other Intra-Area routers and maintain full topology databases for the backbone area-Autonomous Systems are connected via AS Boundary Routers--they exchange reachability information with routers in other ASs using an Exterior Gateway Protocol.-OSPF is "Link State Routing"-OSPF protocol allows the administrator to assign a cost(metric) to each route:
OSPF
CSCI 547 Network Layer 4-140
OSPF Cost--from http://www.cisco.com/warp/public/104/2.html#1.0 "The cost (also called metric) of an interface in OSPF is an indication of the overhead required to send packets across a certain interface. The cost of an interface is inversely proportional to the bandwidth of that interface. A higher bandwidth indicates a lower cost. There is more overhead (higher cost) and time delays involved in crossing a 56k serial line than crossing a 10M ethernet line. The formula used to calculate the cost is:cost= 10000 0000/bandwith in bps For example, it will cost 10 EXP8/10 EXP7 = 10 to cross a 10Mbps Ethernet line and will cost 10 EXP8/1544000 = 64 to cross a T1 line.By default, the cost of an interface is calculated based on the bandwidth; you can force the cost of an interface by using the ip ospf cost <value> interface sub-command."-The cost can also be based on a type of service(minimum delay, maximum throughput, …)
OSPF
CSCI 547 Network Layer 4-141
OSPF Cost calculation
CSCI 547 Network Layer 4-142
Distance vector
Link state Path vector
CSCI 547 Network Layer 4-143
Distance Vector: RIP
RIP(Rourting Information Protocol)- An IGP(Interior Gateway Protocol) developed by XEROX(XNS-Xerox Network Systems)- RIP widely adopted in Internet with 1982 4BSD UNIX even before the standard RFC 1058(1988)-later revised with RFC1388(RIP V.2)- Popular in small networks- Also popular for PC networking- AppleTalk's routing = version of RIP- Novell, 3COM, Banyan- RIP is a Distance Vector Routing---Hop Count is used as the metric--How many hops(each router is a hop) to the destination network?
CSCI 547 Network Layer 4-144
Operation- Each router initializes with a distance vector table containing zero(0) for itself, one(1) for directly attached networks, and infinity for every other destination.- Each router periodically (typically every 30 seconds) transmits its distance vector table to each of its neighbors. A router sends it's knowledge about the entire Autonomous system(=it's entire routing table(Netid, Hop Count entries)) to its neighboring routers( only to its neighbors). It can also transmit the table when a link first comes up or when the table changes.- Each router saves the most recent table it receives from each neighbor and uses the information to calculate its own distance vector table. (=Each router maintains the distance from itself to every known destination network in a "distance vector table")- Hop count is used as the metric--distance=how many hops
RIP
CSCI 547 Network Layer 4-145
RIP routing table
CSCI 547 Network Layer 4-146
RIP Updating AlgorithmReceive: A RIP message from a neighbor1. Add one hop to the hop count for each advertised destination2. Repeat the following steps for each advertised destination: 1. If(destination not in the routing table), then Add the advertised information to the table 2. Else 1. If(next-hop field is the same as the sender's address), then Replace entry in the table with the advertised one 2. Else 1. If(advertised hop count smaller than one in the table), then Add it to the routing table 2. Else do nothing3. Return
C
F
B
AE
CSCI 547 Network Layer 4-147
Distance Vector Algorithm
Bellman-Ford Equation (dynamic programming)
Definedx(y) := cost of least-cost path from x to y
Then
dx(y) = min {c(x,v) + dv(y) }
where min is taken over all neighbors v of x
v
CSCI 547 Network Layer 4-148
Bellman-Ford example
u
yx
wv
z2
2
13
1
1
2
53
5Clearly, dv(z) = 5, dx(z) = 3, dw(z) = 3
du(z) = min { c(u,v) + dv(z), c(u,x) + dx(z), c(u,w) + dw(z) } = min {2 + 5, 1 + 3, 5 + 3} = 4
Node that achieves minimum is nexthop in shortest path ➜ forwarding table
B-F equation says:
CSCI 547 Network Layer 4-149
Distance Vector Algorithm
Dx(y) = estimate of least cost from x to y Node x knows cost to each neighbor v: c(x,v) Node x maintains distance vector Dx =
[Dx(y): y є N ] Node x also maintains its neighbors’
distance vectors For each neighbor v, x maintains
Dv = [Dv(y): y є N ]
CSCI 547 Network Layer 4-150
Distance vector algorithm (4)
Basic idea: Each node periodically sends its own
distance vector estimate to neighbors When a node x receives new DV estimate
from neighbor, it updates its own DV using B-F equation:
Dx(y) ← minv{c(x,v) + Dv(y)} for each node y ∊ N
Under minor, natural conditions, the estimate Dx(y) converge to the actual least cost dx(y)
CSCI 547 Network Layer 4-151
Distance Vector Algorithm (5)
Iterative, asynchronous: each local iteration caused by:
local link cost change DV update message from
neighbor
Distributed: each node notifies
neighbors only when its DV changes neighbors then notify
their neighbors if necessary
wait for (change in local link cost or msg from neighbor)
recompute estimates
if DV to any dest has
changed, notify neighbors
Each node:
CSCI 547 Network Layer 4-152
x y z
xyz
0 2 7
∞ ∞ ∞∞ ∞ ∞
from
cost to
from
from
x y z
xyz
0
from
cost to
x y z
xyz
∞ ∞
∞ ∞ ∞
cost to
x y z
xyz
∞ ∞ ∞7 1 0
cost to
∞2 0 1
∞ ∞ ∞
2 0 17 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = min{c(x,y) + Dy(y), c(x,z) + Dz(y)} = min{2+0 , 7+1} = 2
Dx(z) = min{c(x,y) + Dy(z), c(x,z) + Dz(z)} = min{2+1 , 7+0} = 3
32
CSCI 547 Network Layer 4-153
x y z
xyz
0 2 7
∞ ∞ ∞∞ ∞ ∞
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
∞ ∞
∞ ∞ ∞
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
∞ ∞ ∞7 1 0
cost to
∞2 0 1
∞ ∞ ∞
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = min{c(x,y) + Dy(y), c(x,z) + Dz(y)} = min{2+0 , 7+1} = 2
Dx(z) = min{c(x,y) + Dy(z), c(x,z) + Dz(z)} = min{2+1 , 7+0} = 3
CSCI 547 Network Layer 4-154
Advantages of RIP-----Simplicity=low overhead
Disadvantages of RIP:(a) The limit to the size of a network imposed by maximum hop
count(15).
(b) Slow Convergence problemChanges in routes propagates slowly.For example, suppose there is a change in local LAN to which a router is connected. The router updates itself immediately. Then it will advertise the change in 30 seconds, so the next neighboring routers will know, then they will advertise within 30 seconds, ……Assuming a remote router is n hops away, then it could take 30*n seconds. This is why RIP limits the maximum hop count to 15. RIP is designed for small networks
(c) Instability problem(Also called "count to infinity problem")Counting to infinity occurs when a network becomes unreachable, but erroneous routes to that network persist because of the time for the distance vector tables to converge.
CSCI 547 Network Layer 4-155
Distance Vector: link cost changes
To solve the Slow Convergence problem, whe link cost changes notify neighbors ASAP:
node detects local link cost change updates routing info, recalculates distance
vector if DV changes, notify neighbors immediately
“goodnews travelsfast”
x z14
50
y1
At time t0, y detects the link-cost change, updates its DV, and informs its neighbors.
At time t1, z receives the update from y and updates its table. It computes a new least cost to x and sends its neighbors its DV.
At time t2, y receives z’s update and updates its distance table. y’s least costs do not change and hence y does not send any message to z.
But this does not solve Instability problem
CSCI 547 Network Layer 4-156
Instability problem for RIP(Also called as "count to infinity problem")
Net1 goes down
When B sends its table, A thinks that B can deliver to Net1
When A sends its table, B updates the hop count( thru A ! )
1
2
3
B finally learns that Net1 is down hop count 16 means the net is down
CSCI 547 Network Layer 4-157
Comparison of LS and DV algorithms
Message complexity LS: with n nodes, E links,
O(nE) msgs sent DV: exchange between
neighbors only convergence time varies
Speed of Convergence LS: O(n2) algorithm requires
O(nE) msgs may have oscillations
DV: convergence time varies may be routing loops count-to-infinity problem
Robustness: what happens if router malfunctions?
LS: node can advertise
incorrect link cost each node computes only
its own table
DV: DV node can advertise
incorrect path cost each node’s table used by
others • error propagate thru
network
CSCI 547 Network Layer 4-158
Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and
datagram networks 4.3 What’s inside a
router 4.4 IP: Internet Protocol
Datagram format IPv4 addressing ICMP IPv6
4.5 Routing algorithms Link state Distance Vector Hierarchical routing
4.6 Routing in the Internet RIP OSPF BGP
4.7 Broadcast and multicast routing
CSCI 547 Network Layer 4-159
Hierarchical Routing
scale: with 200 million destinations:
can’t store all dest’s in routing tables!
routing table exchange would swamp links!
administrative autonomy internet = network of
networks each network admin may
want to control routing in its own network
Our routing study thus far - idealization all routers identical network “flat”… not true in practice
CSCI 547 Network Layer 4-160
Hierarchical Routing
aggregate routers into regions, “autonomous systems” (AS)
routers in same AS run same routing protocol “intra-AS” routing
protocol routers in different AS
can run different intra-AS routing protocol
Gateway router Direct link to router in
another AS
CSCI 547 Network Layer 4-161
3b
1d
3a
1c2aAS3
AS1
AS21a
2c2b
1b
Intra-ASRouting algorithm
Inter-ASRouting algorithm
Forwardingtable
3c
Interconnected ASes
Forwarding table is configured by both intra- and inter-AS routing algorithm Intra-AS sets entries
for internal dests Inter-AS & Intra-AS
sets entries for external dests
CSCI 547 Network Layer 4-162
AS numbers from http://bgp.potaroo.net/cidr/autnums.html
AS3895 AMEDD-EUR - DoD Network Information Center AS3896 AMEDD-EUR - DoD Network Information Center AS3897 AMEDD-EUR - DoD Network Information Center AS3898 UCSF-HISD - University of Calif. S.F. - Hospital Info Sys AS3899 CHICO-NET - California State University, Chico AS3900 TEXASNET-ASN - Yokubaitis Holding Corporation AS3901 ARRAKIS - Higher Technology Services AS3902 GLAXOCA-1 - Glaxo Canada Inc. AS3903 NAG-AS - Network Ananlysis Group AS3904 ASTHOUGHTPRT - ThoughtPort inc.
CSCI 547 Network Layer 4-163
3b
1d
3a
1c2aAS3
AS1
AS21a
2c2b
1b
3c
Inter-AS tasks
Suppose a router in AS1 receives datagram for which dest is outside of AS1 Router should forward
packet towards one of the gateway routers, but which one?
AS1 needs:1. to learn which dests
are reachable through AS2 and which through AS3
2. to propagate this reachability info to all routers in AS1
Job of inter-AS routing!Gateway routers
CSCI 547 Network Layer 4-164
Example: Setting forwarding table in router 1d
Suppose AS1 learns (via inter-AS protocol) that subnet x is reachable via AS3 (gateway 1c) but not via AS2.
Inter-AS protocol propagates reachability info to all internal routers.
Router 1d determines from intra-AS routing info that its interface I is on the least cost path to 1c.
Puts in forwarding table entry (x,I).
3b
1d
3a
1c2aAS3
AS1AS2
1a
2c2b
1b
3c
CSCI 547 Network Layer 4-165
Example: Choosing among multiple ASes
Now suppose AS1 learns from the inter-AS protocol that subnet x is reachable via AS3 and via AS2.
To configure forwarding table, router 1d must determine towards which gateway it should forward packets for dest x.
This is also the job on inter-AS routing protocol!
3b
1d
3a
1c2aAS3
AS1
AS21a
2c2b
1b
3c
CSCI 547 Network Layer 4-166
Learn from inter-AS protocol that subnet x is reachable via multiple gateways
Use routing infofrom intra-AS
protocol to determine
costs of least-cost paths to each
of the gateways
Hot potato routing:Choose the
gatewaythat has the
smallest least cost
Determine fromforwarding table the interface I that leads
to least-cost gateway. Enter (x,I) in
forwarding table
Example: Choosing among multiple ASes
Now suppose AS1 learns from the inter-AS protocol that subnet x is reachable from AS3 and from AS2.
To configure forwarding table, router 1d must determine towards which gateway it should forward packets for dest x.
This is also the job on inter-AS routing protocol! Hot potato routing: send packet towards closest
of two routers.
CSCI 547 Network Layer 4-167
Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and
datagram networks 4.3 What’s inside a
router 4.4 IP: Internet Protocol
Datagram format IPv4 addressing ICMP IPv6
4.5 Routing algorithms Link state Distance Vector Hierarchical routing
4.6 Routing in the Internet RIP OSPF BGP
4.7 Broadcast and multicast routing
CSCI 547 Network Layer 4-168
Intra-AS Routing
Also known as Interior Gateway Protocols (IGP) Most common Intra-AS routing protocols:
RIP: Routing Information Protocol
OSPF: Open Shortest Path First
IGRP: Interior Gateway Routing Protocol (Cisco proprietary—modified version of RIP)
CSCI 547 Network Layer 4-169
Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and
datagram networks 4.3 What’s inside a
router 4.4 IP: Internet Protocol
Datagram format IPv4 addressing ICMP IPv6
4.5 Routing algorithms Link state Distance Vector Hierarchical routing
4.6 Routing in the Internet RIP OSPF BGP
4.7 Broadcast and multicast routing
CSCI 547 Network Layer 4-170
RIP ( Routing Information Protocol)
Distance vector algorithm Included in BSD-UNIX Distribution in 1982 Distance metric: # of hops (max = 15 hops)
DC
BA
u v
w
x
yz
destination hops u 1 v 2 w 2 x 3 y 3 z 2
From router A to subsets:
CSCI 547 Network Layer 4-171
RIP advertisements
Distance vectors: exchanged among neighbors every 30 sec via Response Message (also called advertisement)
Each advertisement: list of up to 25 destination nets within AS
CSCI 547 Network Layer 4-172
RIP: Example
Destination Network Next Router Num. of hops to dest. w A 2
y B 2 z B 7
x -- 1…. …. ....
w x y
z
A
C
D B
Routing table in D
CSCI 547 Network Layer 4-173
RIP: Example
Destination Network Next Router Num. of hops to dest. w A 2
y B 2 z B A 7 5
x -- 1…. …. ....Routing table in D
w x y
z
A
C
D B
Dest Next hops w - 1 x - 1 z C 4 …. … ...
Advertisementfrom A to D
CSCI 547 Network Layer 4-174
RIP: Link Failure and Recovery
If no advertisement heard after 180 sec --> neighbor/link declared dead routes via neighbor invalidated new advertisements sent to neighbors neighbors in turn send out new advertisements
(if tables changed) link failure info quickly (?) propagates to entire
net poison reverse used to prevent ping-pong
loops (infinite distance = 16 hops)
CSCI 547 Network Layer 4-175
RIP Table processing
RIP routing tables managed by application-level process called route-d (daemon)
advertisements sent in UDP packets, periodically repeated
physical
link
network forwarding (IP) table
Transprt (UDP)
routed
physical
link
network (IP)
Transprt (UDP)
routed
forwardingtable
CSCI 547 Network Layer 4-176
Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and
datagram networks 4.3 What’s inside a
router 4.4 IP: Internet Protocol
Datagram format IPv4 addressing ICMP IPv6
4.5 Routing algorithms Link state Distance Vector Hierarchical routing
4.6 Routing in the Internet RIP OSPF BGP
4.7 Broadcast and multicast routing
CSCI 547 Network Layer 4-177
OSPF (Open Shortest Path First)
“open”: publicly available Uses Link State algorithm
LS packet dissemination Topology map at each node Route computation using Dijkstra’s algorithm
OSPF advertisement carries one entry for each connection to neighbor router
Advertisements disseminated to entire AS (via flooding) Carried in OSPF messages directly over IP (rather than
TCP or UDP) via IP protocol number 89.
CSCI 547 Network Layer 4-178
OSPF “advanced” features (not in RIP)
Security: all OSPF messages authenticated (to prevent malicious intrusion) using MD5 (Message-
Digest algorithm 5) Multiple same-cost paths allowed (only one path
in RIP) For each link, multiple cost metrics for different
TOS (e.g., satellite link cost set “low” for best effort; high for real time)
Integrated uni- and multicast support: Multicast OSPF (MOSPF) uses same topology
data base as OSPF Hierarchical OSPF in large domains.
CSCI 547 Network Layer 4-179
Hierarchical OSPF
CSCI 547 Network Layer 4-180
Hierarchical OSPF
Two-level hierarchy: local area, backbone. Link-state advertisements only in area each nodes has detailed area topology; only know
direction (shortest path) to nets in other areas. Area border routers: “summarize” distances to
nets in own area, advertise to other Area Border routers.
Backbone routers: run OSPF routing limited to backbone.
Boundary routers: connect to other AS’s.
CSCI 547 Network Layer 4-181
Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and
datagram networks 4.3 What’s inside a
router 4.4 IP: Internet Protocol
Datagram format IPv4 addressing ICMP IPv6
4.5 Routing algorithms Link state Distance Vector Hierarchical routing
4.6 Routing in the Internet RIP OSPF BGP
4.7 Broadcast and multicast routing
CSCI 547 Network Layer 4-182
Internet inter-AS routing: BGP
BGP (Border Gateway Protocol): the de facto (Latin: “In practice) standard
BGP provides each AS a means to:1. Obtain subnet reachability information from
neighboring ASs.2. Propagate reachability information to all AS-
internal routers.3. Determine “good” routes to subnets based on
reachability information and policy. allows subnet to advertise its existence to
rest of Internet: “I am here”
CSCI 547 Network Layer 4-183
BGP basics Pairs of routers (BGP peers) exchange routing info
over semi-permanent TCP connections: BGP sessions BGP sessions need not correspond to physical links.
When AS2 advertises a prefix to AS1, AS2 is promising it will forward any datagrams destined to that prefix towards the prefix. AS2 can aggregate prefixes in its advertisement
3b
1d
3a
1c2aAS3
AS1
AS21a
2c
2b
1b
3c
eBGP session
iBGP session
CSCI 547 Network Layer 4-184
Distributing reachability info With eBGP session between 3a and 1c, AS3 sends prefix
reachability info to AS1. 1c can then use iBGP to distribute this new prefix
reachability info to all routers in AS1 1b can then re-advertise new reachability info to AS2
over 1b-to-2a eBGP session When router learns of new prefix, creates entry for prefix
in its forwarding table.
3b
1d
3a
1c2aAS3
AS1
AS21a
2c
2b
1b
3c
eBGP session
iBGP session
CSCI 547 Network Layer 4-185
Path attributes & BGP routes
When advertising a prefix, includes BGP attributes. prefix + attributes = “route”
Two important attributes: AS-PATH: contains ASs through which prefix
advertisement has passed: AS 67 AS 17 –- “path vector” algorithm
NEXT-HOP: Indicates specific internal-AS router to next-hop AS. (There may be multiple links from current AS to next-hop-AS.)
When gateway router receives route advertisement, uses import policy to accept/decline.
CSCI 547 Network Layer 4-186
BGP route selection
Router may learn about more than 1 route to some prefix. Router must select route.
Elimination rules:1. Local preference value attribute: policy decision2. Shortest AS-PATH 3. Closest NEXT-HOP router: hot potato routing4. Additional criteria
CSCI 547 Network Layer 4-187
BGP messages
BGP messages exchanged using TCP. BGP messages:
OPEN: opens TCP connection to peer and authenticates sender
UPDATE: advertises new path (or withdraws old)
KEEPALIVE keeps connection alive in absence of UPDATES; also ACKs OPEN request
NOTIFICATION: reports errors in previous msg; also used to close connection
CSCI 547 Network Layer 4-188
BGP routing policy
Figure 4.5-BGPnew: a simple BGP scenario
A
B
C
W X
Y
legend:
customer network:
provider network
A,B,C are provider networks X,W,Y are customer (of provider networks) X is dual-homed: attached to two networks
X does not want to route from B via X to C .. so X will not advertise to B a route to C
CSCI 547 Network Layer 4-189
BGP routing policy (2)
Figure 4.5-BGPnew: a simple BGP scenario
A
B
C
W X
Y
legend:
customer network:
provider network
A advertises to B the path AW B advertises to X the path BAW Should B advertise to C the path BAW?
No way! B gets no “revenue” for routing CBAW since neither W nor C are B’s customers
B wants to force C to route to w via A B wants to route only to/from its customers!
CSCI 547 Network Layer 4-190
Why different Intra- and Inter-AS routing ?
Policy: Inter-AS: admin wants control over how its traffic
routed, who routes through its net. Intra-AS: single admin, so no policy decisions
needed
Scale: hierarchical routing saves table size, reduced
update trafficPerformance: Intra-AS: can focus on performance Inter-AS: policy may dominate over performance
CSCI 547 Network Layer 4-191
Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and
datagram networks 4.3 What’s inside a
router 4.4 IP: Internet Protocol
Datagram format IPv4 addressing ICMP IPv6
4.5 Routing algorithms Link state Distance Vector Hierarchical routing
4.6 Routing in the Internet RIP OSPF BGP
4.7 Broadcast and multicast routing
CSCI 547 Network Layer 4-192
R1
R2
R3 R4
sourceduplication
R1
R2
R3 R4
in-networkduplication
duplicatecreation/transmissionduplicate
duplicate
Broadcast Routing
Deliver packets from source to all other nodes Source duplication is inefficient:
Source duplication: how does source determine recipient addresses?
CSCI 547 Network Layer 4-193
In-network duplication
Flooding: when node receives brdcst pckt, sends copy to all neighbors Problems: cycles & broadcast storm
Controlled flooding: node only brdcsts pkt if it hasn’t brdcst same packet before Node keeps track of pckt ids already brdcsted Or reverse path forwarding (RPF): only forward
pckt if it arrived on shortest path between node and source
Spanning tree No redundant packets received by any node
CSCI 547 Network Layer 4-194
A
B
G
DE
c
F
A
B
G
DE
c
F
(a) Broadcast initiated at A (b) Broadcast initiated at D
Spanning Tree
First construct a spanning tree Nodes forward copies only along spanning
tree
CSCI 547 Network Layer 4-195
A
B
G
DE
c
F1
2
3
4
5
(a) Stepwise construction of spanning tree
A
B
G
DE
c
F
(b) Constructed spanning tree
Spanning Tree: Creation—one of many algorithms
Center node is predefined—node E in this example Each node sends unicast join message to center
node Message forwarded until it arrives at a node already
belonging to spanning tree Once the spanning tree is made, any node can broadcast
CSCI 547 Network Layer 4-196
Modes of message transfer
anycast
broadcast
multicast
unicast
Added in IPv6
CSCI 547 Network Layer 4-197
Multicasting on Internet
IP Multicast Internet Relay Chat NNTP Network News Transfer Protocol PSYC Protocol for SYnchronous Conferencing WWCP World Wide Conferencing Network XCAST Explicit Multi-Unicast Streaming mediaInteractive games
Send to group members only
CSCI 547 Network Layer 4-198
Two main problems in multicasting
How to identify group members?What address should be used?Class Start End Default Subnet
MaskCIDR prefix notation
A 1.0.0.0 127. 255.255.255 255.0.0.0 /8
B 128.0.0.0 191.255. 255.255 255.255.0.0 /16
C 192.0.0.0 223.255.255.255 255.255.255.0 /24
D 224.0.0.0 239.255.255.255 NA NA
E 240.0.0.0 255.255.255.255 NA NA
CSCI 547 Network Layer 4-199
mcast
CSCI 547 Network Layer 4-200
Multicast Routing: sending to a Group
Goal: find a tree (or trees) connecting routers having local mcast group members (Tree is ideal for this: not all paths between routers used)
Two approaches: source-based: different tree from each sender to rcvrs shared-tree: same tree used by all group members
Shared tree Source-based trees
CSCI 547 Network Layer 4-201
Approaches for building mcast trees
Approaches: source-based tree: one tree per source—
same as spanning-tree algorithm shortest path trees reverse path forwarding
group-shared tree: group uses one tree minimal spanning (Steiner) center-based trees
…we first look at basic approaches, then specific protocols adopting these approaches
CSCI 547 Network Layer 4-202
Shortest Path Tree
mcast forwarding tree: tree of shortest path routes from source to all receivers Dijkstra’s algorithm
R1
R2
R3
R4
R5
R6 R7
21
6
3 4
5
i
router with attachedgroup member
router with no attachedgroup member
link used for forwarding,i indicates order linkadded by algorithm
LEGENDS: source
CSCI 547 Network Layer 4-203
Reverse Path Forwarding
if (mcast datagram received on incoming link on shortest path back to center)
then flood datagram onto all outgoing links else ignore datagram
rely on router’s knowledge of unicast shortest path from it to sender
each router has simple forwarding behavior:
CSCI 547 Network Layer 4-204
Reverse Path Forwarding: example
• result is a source-specific reverse SPT– may be a bad choice with asymmetric links
R1
R2
R3
R4
R5
R6 R7
router with attachedgroup member
router with no attachedgroup member
datagram will be forwarded
LEGENDS: source
datagram will not be forwarded
CSCI 547 Network Layer 4-205
Reverse Path Forwarding: pruning
forwarding tree contains subtrees with no mcast group members no need to forward datagrams down subtree “prune” msgs sent upstream by router with no
downstream group members
R1
R2
R3
R4
R5
R6 R7
router with attachedgroup member
router with no attachedgroup member
prune message
LEGENDS: source
links with multicastforwarding
P
P
P
CSCI 547 Network Layer 4-206
Shared-Tree: Steiner Tree
Steiner Tree: minimum cost tree connecting all routers with attached group members
problem is NP-complete("non-deterministic polynomial
time") excellent heuristics exists not used in practice:
computational complexity information about entire network needed monolithic: rerun whenever a router needs to
join/leave
CSCI 547 Network Layer 4-207
Center-based trees
single delivery tree shared by all one router identified as “center” of tree to join:
edge router sends unicast join-msg addressed to center router
join-msg “processed” by intermediate routers and forwarded towards center
join-msg either hits existing tree branch for this center, or arrives at center
path taken by join-msg becomes new branch of tree for this router
CSCI 547 Network Layer 4-208
Center-based trees: an example
Suppose R6 chosen as center:
R1
R2
R3
R4
R5
R6 R7
router with attachedgroup member
router with no attachedgroup member
path order in which join messages generated
LEGEND
21
3
1
CSCI 547 Network Layer 4-209
Internet Multicasting Routing: DVMRP
DVMRP: Distance Vector Multicast Routing Protocol, RFC1075
flood and prune: reverse path forwarding, source-based tree RPF tree based on DVMRP’s own routing tables
constructed by communicating DVMRP routers no assumptions about underlying unicast initial datagram to mcast group flooded
everywhere via RPF routers not wanting group: send upstream prune
msgs
CSCI 547 Network Layer 4-210
DVMRP: continued…
soft state: DVMRP router periodically (1 min.) “forgets” branches are pruned: mcast data again flows down unpruned branch downstream router: reprune or else continue
to receive data routers can quickly regraft to tree
following IGMP join at leaf odds and ends
commonly implemented in commercial routers Mbone routing done using DVMRP
CSCI 547 Network Layer 4-211
Tunneling
Q: How to connect “islands” of multicast routers in a “sea” of unicast routers?
mcast datagram encapsulated inside “normal” (non-multicast-addressed) datagram
normal IP datagram sent thru “tunnel” via regular IP unicast to receiving mcast router
receiving mcast router unencapsulates to get mcast datagram
physical topology logical topology
CSCI 547 Network Layer 4-212
PIM: Protocol Independent Multicast
PIM is a family of multicast routing protocol
not dependent on any specific underlying unicast routing algorithm (works with all)
two different multicast distribution scenarios :
Dense: group members
densely packed, in “close” proximity.
bandwidth more plentiful
Sparse: # networks with group
members small wrt # interconnected networks
group members “widely dispersed”
bandwidth not plentiful
CSCI 547 Network Layer 4-213
Consequences of Sparse-Dense Dichotomy:
Dense group membership by
routers assumed until routers explicitly prune
data-driven construction on mcast tree (e.g., RPF)
bandwidth and non-group-router processing profligate
Sparse: no membership until
routers explicitly join receiver- driven
construction of mcast tree (e.g., center-based)
bandwidth and non-group-router processing conservative
CSCI 547 Network Layer 4-214
PIM- Dense Mode
flood-and-prune RPF, similar to DVMRP but
underlying unicast protocol provides RPF info for incoming datagram
less complicated (less efficient) downstream flood than DVMRP reduces reliance on underlying routing algorithm
has protocol mechanism for router to detect it is a leaf-node router
CSCI 547 Network Layer 4-215
PIM - Sparse Mode center-based approach router sends join msg
to rendezvous point (RP) intermediate routers
update state and forward join
after joining via RP, router can switch to source-specific tree increased performance:
less concentration, shorter paths
R1
R2
R3
R4
R5
R6R7
join
join
join
all data multicastfrom rendezvouspoint
rendezvouspoint
CSCI 547 Network Layer 4-216
PIM - Sparse Mode
sender(s): unicast data to RP,
which distributes down RP-rooted tree
RP can extend mcast tree upstream to source
RP can send stop msg if no attached receivers “no one is listening!”
R1
R2
R3
R4
R5
R6R7
join
join
join
all data multicastfrom rendezvouspoint
rendezvouspoint
CSCI 547 Network Layer 4-217
Chapter 4: summary 4. 1 Introduction 4.2 Virtual circuit and
datagram networks 4.3 What’s inside a
router 4.4 IP: Internet Protocol
Datagram format IPv4 addressing ICMP IPv6
4.5 Routing algorithms Link state Distance Vector Hierarchical routing
4.6 Routing in the Internet RIP OSPF BGP
4.7 Broadcast and multicast routing
For more info. for routing, visit here