Routers and routing
Olof Hagsand KTHNOC/NADA
DD2490 p4 2008
What is a router?
• Host (end system)
  • One or many network interfaces
  • Cannot forward packets between them
• Router
  • Can forward packets between multiple interfaces
  • Forwarding on Layer 3
Connecting devices
[Figure: connecting devices by layer. Networking devices: hub/repeater (L1) and bridge/switch (L2). Internetworking devices: router (L3) and application gateway (L4-L7).]
IEEE 802 vs IPv4 addresses
[Figure: address formats.
IEEE 802 address: six octets, a vendor code followed by a vendor-assigned part; the first octet carries the Group/Individual bit and the Global/Local bit. Example: 00:0E:35:64:E9:E7
IPv4 address: four octets, a netid followed by a hostid. Example: 192.36.125.18]
Routing vs bridging
● Bridging – forwarding on layer 2
– A MAC address/ID has a flat structure
    ● many nodes -> large forwarding tables
    ● broadcast reaches all nodes
– Simple to configure and manage, cheaper
– Loops prevented by the spanning tree protocol
● Routing – forwarding on layer 3
– The netid of the IP addresses can be aggregated
    ● many nodes -> smaller forwarding tables than bridging
    ● routers partition broadcast domains
– Routing is more difficult to configure
– Loops detected by routing protocols and limited by TTL decrementation
What does a router do?
• Packet forwarding
  • Not only IPv4: IPv6, MPLS, Bridging/VLAN, Tunneling, ...
• Filter packets
  • Access lists
• Metering/Shaping
• Compute routes: build forwarding table
• In the background: routing
• In real time: forwarding
Router Components
[Figure: router components.
– CPU module ("Control Processor", "Routing Engine") with CPU, routing table, and memory: executes routing protocols, computes the routing table, configures line cards
– Line cards, each with MAC, memory, and packet processing, attached to external links: examine headers, make the routing decision, buffer on input (waiting for access to the output port), buffer on output (waiting for transmission), QoS scheduling
– Interconnect between the line cards and the CPU module]
Fast path, slow path
● Fast path
– If line cards can determine outgoing port
● Slow path
– Control processor must determine outgoing port
[Figure: a control processor and four line cards; the fast path stays on the line cards, the slow path goes via the control processor]
Inside a router, 1st Generation
● Late 80s, early 90s
● Every packet goes twice over the shared bus
● Capacity < 0.5 Gb/s
[Figure: line cards and a CPU with RIB and buffer memory, all attached to a shared bus backplane]
[Figure: four line cards, each with its own buffer memory and forwarder]
Inside a router, 3rd Generation
● Late 90s
● Multiple simultaneous transfers over the backplane
● Specialized hardware: ASICs (Application-Specific ICs)
● Capacity 100 Gb/s
[Figure: a CPU card (CPU, RIB) and line cards connected by a switched backplane]
Crossbar Architecture
● Space division approach
● Switched interconnection between input and output
● Centralized controller
– coordinates input/output ports
– activates paths between ports
● Multiple transfers can proceed simultaneously
● Crossbar is nonblocking
[Figure: crossbar switching fabric connecting input ports 1..N to output ports 1..M through interface logic, under a central controller]
Shared Bus Architecture
● Relies on time division – internal data path is shared
● Address, control, and data lines and a bus protocol
● Granularity
– Packet granularity: simple, but may result in delay problems
– Block granularity: more overhead, but avoids long delays
[Figure: input ports 1..N and output ports 1..M attached to a shared bus]
Routing table lookup
● Longest prefix match: the longest matching prefix wins
● Divide the table into "buckets", one for each netmask length (0 to 32)
● Match destination with longest prefixes first
● SW algorithms: trees, binary trees, tries (different data structures)
● HW support: TCAMs – Ternary Content-Addressable Memory
[Figure: the destination IP address is compared against buckets of netids, indexed by mask length 0, 1, ..., 31, 32]
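The bucket scheme on this slide can be sketched in Python. This is only an illustration under assumptions: the class name BucketTable, the next-hop strings, and the example prefixes are made up, and dictionaries stand in for whatever per-length structure a real router would use.

```python
import ipaddress


class BucketTable:
    """Routing table with one hash bucket per prefix length (0..32)."""

    def __init__(self):
        # buckets[masklen] maps a masked netid (as an int) to a next hop
        self.buckets = [dict() for _ in range(33)]

    def add(self, prefix, next_hop):
        net = ipaddress.ip_network(prefix)
        self.buckets[net.prefixlen][int(net.network_address)] = next_hop

    def lookup(self, destination):
        addr = int(ipaddress.ip_address(destination))
        # Try the longest mask first; the first hit is the best match.
        for masklen in range(32, -1, -1):
            mask = (0xFFFFFFFF << (32 - masklen)) & 0xFFFFFFFF
            hit = self.buckets[masklen].get(addr & mask)
            if hit is not None:
                return hit
        return None


# Hypothetical example routes
table = BucketTable()
table.add("0.0.0.0/0", "default")
table.add("192.36.0.0/16", "if1")
table.add("192.36.125.0/24", "if2")
```

Here table.lookup("192.36.125.18") finds the /24 entry before the /16 and the default route are ever tried. The worst case is still 33 probes, which is why tries and TCAMs are used in practice.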
Linear Search on Length Using a Trie
● Binary tree
– Nodes are prefixes
– Left branch represents '0' in the string
– Right branch represents '1'
Prefixes stored in the trie:
a *
b 10*
c 01*
d 110*
e 0010
f 0110
g 0111
[Figure: the binary trie for these prefixes, rooted at * (a), with leaves e = 0010, f = 0110, g = 0111]
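A minimal sketch of a binary trie lookup, using the seven example prefixes a to g from this slide and 4-bit addresses written as bit strings. The names TrieNode, build_trie, and longest_match are illustrative, not taken from any particular implementation.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # '0' or '1' -> TrieNode
        self.label = None    # set when this node is a stored prefix


def build_trie(prefixes):
    """prefixes maps a label to its prefix bits, e.g. {'b': '10'}."""
    root = TrieNode()
    for label, bits in prefixes.items():
        node = root
        for bit in bits:
            node = node.children.setdefault(bit, TrieNode())
        node.label = label
    return root


def longest_match(root, address_bits):
    """Walk the trie bit by bit, remembering the last prefix seen."""
    node, best = root, root.label
    for bit in address_bits:
        node = node.children.get(bit)
        if node is None:
            break
        if node.label is not None:
            best = node.label
    return best


# The example prefixes from the slide ('' is the default prefix *)
PREFIXES = {"a": "", "b": "10", "c": "01", "d": "110",
            "e": "0010", "f": "0110", "g": "0111"}
ROOT = build_trie(PREFIXES)
```

For example, longest_match(ROOT, "0100") passes node 01* (c), finds no child for the next bit, and returns c; an address such as 1110 matches nothing longer than the default prefix and returns a.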
Elimination of Internal Prefixes
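● No overlapping prefixes
● Prefix expansion with "leaf pushing"
● Simplifies lookup at expense of larger memory
[Figure: expanded trie with root *; the first two bits select a subtree, the next two bits (00, 01, 10, 11) select a leaf]
00*   a a e a
01*   c c f g
10*   b b b b
11*   d d a a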
a *
b 10*
c 01*
d 110*
e 0010
f 0110
g 0111
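The expansion can be reproduced with a short Python sketch. It assumes 4-bit addresses and flattens the result into one 16-entry table instead of the two-level structure drawn on the slide; the function name expand is made up for the example.

```python
# The example prefixes from the slide ('' is the default prefix *)
PREFIXES = {"a": "", "b": "10", "c": "01", "d": "110",
            "e": "0010", "f": "0110", "g": "0111"}


def expand(prefixes, width=4):
    """Expand every prefix to full width so that no prefixes overlap.

    Shorter prefixes are written first and longer ones overwrite them,
    which pushes each prefix down to the leaves it still covers.
    """
    table = [None] * (1 << width)
    for label, bits in sorted(prefixes.items(), key=lambda kv: len(kv[1])):
        free = width - len(bits)
        base = (int(bits, 2) if bits else 0) << free
        for i in range(1 << free):
            table[base + i] = label
    return table


TABLE = expand(PREFIXES)
```

A lookup is now a single array index, TABLE[int(address_bits, 2)], at the cost of 16 entries for 7 prefixes. Reading TABLE four entries at a time gives the leaf rows of the figure.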
Linear Search on Values—TCAM
● Ternary Content-Addressable Memory
– Fully associative memory
● Three values for each bit: '0', '1', and 'x' (don't care)
● Compare input with all words in parallel
– First match gives the result
● Up to 100 million searches per second
a *
b 10*
c 01*
d 110*
e 0010
f 0110
g 0111
[Figure: TCAM contents. The input is compared with every word in parallel; the first (topmost) match gives the result]
0010  e
0110  f
0111  g
110x  d
01xx  c
10xx  b
xxxx  a
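The TCAM behaviour can be modelled in software, with the caveat that the hardware compares all words at the same time while this Python sketch walks them in order. The entries are the example prefixes a to g, longest first; the function names are illustrative.

```python
# TCAM words, longest prefix first; 'x' is the don't-care value
TCAM = [
    ("0010", "e"),
    ("0110", "f"),
    ("0111", "g"),
    ("110x", "d"),
    ("01xx", "c"),
    ("10xx", "b"),
    ("xxxx", "a"),
]


def word_matches(word, key):
    """A ternary word matches if every non-'x' bit equals the key bit."""
    return all(w == "x" or w == k for w, k in zip(word, key))


def tcam_lookup(entries, key):
    """Return the result of the first matching word (priority order)."""
    for word, result in entries:
        if word_matches(word, key):
            return result
    return None
```

Because the words are stored longest prefix first, the first match is also the longest match, so tcam_lookup(TCAM, "0100") returns c rather than the default a.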
Prefix Search in TCAM
● Route lookup in one memory access
● Prefixes ordered by length
[Figure: TCAM entries ordered by prefix length: 32-bit prefixes first, then 31-bit prefixes, ..., 24-bit prefixes, ..., 8-bit prefixes]
Packet classification
● Map a packet to a class
● Class defined by filters, usually a 5-tuple:
– <source IP, destination IP, source port, destination port, protocol>
● For example, all packets:
– From subnet N
– To TCP port 80 on webserver S
– From subnet N to port 666 on subnet M
● Applications:
– Firewall & NAT
– Blocking
– Accounting
– Policy routing
– QoS—metering, policing, DiffServ marking, ...
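A minimal first-match classifier over 5-tuples might look like the Python sketch below. Everything concrete in it is an assumption for illustration: the addresses chosen for subnet N (10.1.0.0/16), webserver S (10.2.0.80), and subnet M (10.3.0.0/16), the class names, and the linear search, which real classifiers replace with TCAMs or specialised data structures.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    sport: int
    dport: int
    proto: str


@dataclass(frozen=True)
class Filter:
    name: str
    src: Optional[str] = None     # source prefix; None means wildcard
    dst: Optional[str] = None     # destination prefix
    sport: Optional[int] = None
    dport: Optional[int] = None
    proto: Optional[str] = None

    def matches(self, p):
        def in_net(addr, net):
            return net is None or (
                ipaddress.ip_address(addr) in ipaddress.ip_network(net))
        return (in_net(p.src, self.src) and in_net(p.dst, self.dst)
                and self.sport in (None, p.sport)
                and self.dport in (None, p.dport)
                and self.proto in (None, p.proto))


def classify(filters, packet, default="best-effort"):
    """Return the class of the first filter that matches the packet."""
    for f in filters:
        if f.matches(packet):
            return f.name
    return default


# Hypothetical filters for the three examples, most specific first
FILTERS = [
    Filter("n-to-m-666", src="10.1.0.0/16", dst="10.3.0.0/16", dport=666),
    Filter("web", dst="10.2.0.80/32", dport=80, proto="tcp"),
    Filter("from-n", src="10.1.0.0/16"),
]
```

The order of FILTERS matters: a packet from subnet N to port 80 on S matches both "web" and "from-n", and the first one in the list wins.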
Cisco 12816
Port density examples
● 30 x OC-192 (10 Gb/s) ports
● 120 x OC-48 (2.5 Gb/s) ports
● 15 x 10 Gigabit Ethernet ports
● 60 x 1 Gigabit Ethernet ports
Capacity: 1.28 Tb/s
Power: 4.7 kW
[Figure: chassis photo with dimensions 6 ft, 19", and 2 ft]
Cisco CRS-1
Cisco's current flagship: Carrier Routing System
● 3-stage multistage switching plane
● >50% of cost
● Trie prefix lookup
● 7.5 kW
● Each slot has 40 Gb/s
● 32 Tb/s raw bandwidth
● Distributed RP
● Several logical routers
● Optical-electrical transitions: OEOEOEO
Juniper Routers
● M-series
– Shipping started 1998
– M5, M10, M20, M40e, M160, M320
– 8 x OC-192 or 32 x OC-48 ports in an M160
● T-series
– Shipping started 2002
– T320, T640
– 32 x OC-192 or 128 x OC-48 ports in a T640
Juniper M160
Capacity: 80 Gb/s
Power: 2.6 kW
[Figure: chassis photo with dimensions 3 ft, 2.5 ft, and 19"]
Juniper J-series
● J-series
– Software, PC-based
– Emulates M/T series
– Full software
– Ideal for research and education
Open source routing
● Linux, BSD platforms
● Most routing protocols exist as open source projects
● PC hardware has traditionally been a limiting factor
● But now dual/quad-core CPUs, new buses (PCI Express), and 10 Gb/s NICs enable gigabit forwarding speeds
● Example: the Bifrost open source router (UU/KTH)
Routing
Levels of abstraction
● The Internet is huge
– Necessary to divide the routing problem into subproblems
– There are several layers of abstraction
● The Internet is partitioned into Autonomous Systems (AS)
– An independent administrative domain
– Routing between AS:s is called interdomain routing / external routing
– Based on commercial agreements: policies, service-level agreements
● Routing within an AS
– Routing inside an AS: intradomain routing / internal routing
– Best path based on hop/bw metrics
Autonomous systems RFC1930
An Autonomous system is generally administered by a single entity.
Operators, ISPs (Internet Service Providers)
An AS contains an arbitrarily complex substructure.
Each autonomous system selects the routing protocol to be used within the AS.
Policies or updates within an AS are not propagated to other AS:s.
An AS number is (currently) a 16-bit unique identifier
Interconnection between AS:s
– Service Level Agreements (SLA:s)
– Internet Exchange Points (IX:s)/ Network Access Points (NAPs)
– Direct connections
Internet structure
● Ideally, there is a well-defined hierarchy in the Internet – a tree.
1 A few large "Tier 1" backbone providers – the core of the Internet (Sprint, Level3, Telstra, ...)
    ● Provide transit for everyone else
2 Tier 2 regional ISPs, or NSPs (Network Service Providers)
3 Smaller ISPs
4 Customers
● A well-defined hierarchy is nice for address aggregation -> smaller IP tables
● However, the hierarchy has broken down due to market forces:
– Peering at IXs, direct connections.
● The Internet structure is now more in the form of a graph -> larger routing tables
AS graph and peering relations
[Figure: AS graph of AS1 to AS9 in three tiers (Tier 1: full Internet connectivity; NSPs/ISPs; stubs/customers), with links marked as transit, peer, or customer relations]
IGP/EGP
● EGP – Exterior Gateway Protocol
– Runs between networks/domains (interdomain)
– Examples: BGP, static routing
● IGP – Interior Gateway Protocol
– Runs within a network/domain (intradomain)
– Examples: RIP, OSPF, IS-IS
[Figure: a customer network and an ISP network each run an IGP internally; an EGP runs between them]
Internal vs external routing
● The Internet is huge
– Necessary to divide the routing problem into subproblems
– The "Internet" is divided into Autonomous Systems (ASs)
– Each AS is independently managed
● Interdomain routing / external routing
– Routing between AS:s
– Based on commercial agreements: policies, service-level agreements
● Intradomain routing / internal routing
– Routing inside an AS
– An AS may be further divided into areas
– Best path based on hop metric
● Static vs dynamic routing
Static vs dynamic routing
● Static routing
– Manually configure routing table
– Typically for small networks
– Single-homed, default route
– Hosts are (almost) always statically routed
● Dynamic routing
– As soon as the network is nontrivial, it is too difficult to configure it manually (see lab1)
– Need dynamic routing protocol
– Only routers participate in dynamic routing
The routing table
● Currently, backbone IP tables are around 250000 entries.
– The RIB may be much larger
● With virtual private networks (many customer routing tables) the tables are even larger
● Also, a "routing table" is actually many data structures:
– Many different protocols
– Forwarding information bases (FIBs)
– Routing information bases (RIBs)
[Figure: announced networks over time. From Geoff Huston, 2008, http://www.cidrreport.org]
Load balancing
● The routing protocol gives several routes to a network
● Either select the best
● Or load-balance between several links
– Unequal-cost multipath
– Equal-cost multipath
● The forwarding decides how to balance actual traffic:
– random (but this breaks TCP flows)
– load balance per flow
– load balance per address pair
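Per-flow load balancing is commonly done by hashing the flow identifier, so that all packets of one flow take the same link while different flows spread out. The Python sketch below assumes a 5-tuple flow key and uses SHA-256 only because it is in the standard library; forwarding hardware uses much cheaper hash functions, and the function name pick_link is made up.

```python
import hashlib


def pick_link(flow, links):
    """Choose an outgoing link for a flow.

    flow is a (src, dst, sport, dport, proto) tuple. The same flow always
    hashes to the same link, so packets within a flow are not reordered.
    """
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    return links[int.from_bytes(digest[:4], "big") % len(links)]
```

Load balancing per address pair follows from the same function by passing only (src, dst) as the flow key. Random per-packet selection would spread load more evenly but can deliver the packets of one TCP connection out of order.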
Example: load-balancing
● IS-IS/OSPF load balancing with two 3ms paths, one slow 20 ms path.
● Hosts from the same LAN (or different flows from same host) may take different routes.
[Figure: three parallel paths with delays 3 ms, 3 ms, and 20 ms]
Asymmetric Routing
● A rule rather than an exception:
– Traffic to and traffic from a destination take different paths
● Hot-potato routing
– Send traffic out of your AS as soon as possible
● Cold-potato routing
– Try to keep your traffic in your AS as long as possible
Aggregation
● Also called summarization
● The netid part of IPv4 addresses can be aggregated (summarized) into shorter prefixes.
● Summarization is often done manually
● Leads to smaller routing tables (fewer prefixes)
● Threats: multihoming and load-balancing
[Figure: example networks 199.1.0.0/24, 199.1.1.0/24, 199.1.2.0/24, 199.1.3.0/24, and 199.1.4.0/24]
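The five example networks can be summarized with Python's standard ipaddress module. This sketch only merges prefixes that combine exactly; a manually configured aggregate could instead be a shorter prefix that also covers address space not in use. The helper name aggregate is made up.

```python
import ipaddress


def aggregate(prefixes):
    """Collapse adjacent or overlapping prefixes into as few as possible."""
    networks = (ipaddress.ip_network(p) for p in prefixes)
    return [str(n) for n in ipaddress.collapse_addresses(networks)]


NETWORKS = ["199.1.0.0/24", "199.1.1.0/24", "199.1.2.0/24",
            "199.1.3.0/24", "199.1.4.0/24"]
SUMMARY = aggregate(NETWORKS)
```

The first four /24s share their leading 22 bits and collapse into 199.1.0.0/22, while 199.1.4.0/24 has no partner and stays as it is, so five routes become two.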
Metrics
● A fundamental function of a dynamic routing protocol:
– Find the "best path" to a destination
● But what is the best path?
– Interior routing: typically number of hops, or bandwidth
– Exterior routing: business relations – peering
● Metrics
– Number of "hops" (most common)
– Bandwidth, Delay, Cost, Load, "Policies"
Routing algorithms
● How does a router find a best path?
● Most solutions are based on SPF (Shortest Path First) algorithms that are well known in graph theory.
– Bellman-Ford
– Dijkstra
● Apart from that, there are also other algorithms in
– Multicast routing
– Ad hoc routing
    ● Sensor networks
– Delay-tolerant networks
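As an illustration of the shortest-path computation, here is a compact Dijkstra sketch in Python that also records the first hop towards each destination, which is what ends up in a forwarding table. The four-node topology and its link costs are invented for the example.

```python
import heapq


def dijkstra(graph, source):
    """Return (dist, first_hop) for every node reachable from source.

    graph maps node -> {neighbour: link cost}; costs must be non-negative.
    """
    dist = {source: 0}
    first_hop = {}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry, a shorter path was found already
        for neigh, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(neigh, float("inf")):
                dist[neigh] = nd
                # Directly attached neighbours are their own first hop;
                # everything else inherits the first hop of its parent.
                first_hop[neigh] = neigh if node == source else first_hop[node]
                heapq.heappush(heap, (nd, neigh))
    return dist, first_hop


# Hypothetical topology: router names and link costs are made up
GRAPH = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}
DIST, FIRST_HOP = dijkstra(GRAPH, "A")
```

From A, the cheapest route to C goes via B (cost 3 instead of 4), and D is reached via B and C at cost 4, so B is the first hop for every destination. Link-state protocols run this kind of computation on each router; Bellman-Ford gets to the same distances by exchanging distance vectors between neighbours.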
Routing protocol classes
● Almost all unicast routing protocols can be classified into one of two groups:
– Link-state protocols (OSPF, IS-IS)
– Distance-vector protocols (RIP, IGRP, BGP)
● They are also classified into
– Exterior (interdomain) routing protocols
    ● Between autonomous systems
– Interior (intradomain) routing protocols
    ● Within an autonomous system
Popular Unicast Routing Protocols
[Figure: routing protocols divided into interior (RIP, OSPF, IS-IS, IGRP (Cisco)) and exterior (BGP, EGP)]
Routes may come from many “protocols”
● Direct
– Networks on directly connected interfaces
● Local
– Example: 127.0.0.1
● Static
– Configured static routes
● Aggregate
– Manually aggregated routes
● RIP, OSPF, IS-IS, BGP, RSVP, ...
Route preference / Administrative distance
● Several protocols may include the same prefix. How do you decide which route to install in your routing table?
● Default preference (on Juniper) is:
– Direct > Local > Static > OSPF > IS-IS > RIP > Aggregate > BGP
● Can be changed or overridden with policies
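Selecting the active route by preference can be sketched in Python as below. The ordering follows the slide; the numeric values are an assumption based on commonly cited Junos defaults (for OSPF internal and IS-IS level 1 internal routes) and should be checked against the documentation for a given release. Lower values are preferred.

```python
# Assumed default preference values; lower is better.
DEFAULT_PREFERENCE = {
    "direct": 0,
    "local": 0,
    "static": 5,
    "ospf": 10,
    "isis": 15,
    "rip": 100,
    "aggregate": 130,
    "bgp": 170,
}


def select_active(candidates, preference=DEFAULT_PREFERENCE):
    """Pick the route to install for one prefix.

    candidates is a list of (protocol, next_hop) pairs for the same prefix.
    """
    return min(candidates, key=lambda route: preference[route[0]])
```

With the defaults, a prefix learnt from both OSPF and BGP is installed from OSPF. Passing a modified table, for instance with a higher value for "static", models overriding the preference by configuration.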
Redistribution of routing information
● If several protocols are running on the same router
– E.g., OSPF as interior and BGP as exterior
– E.g., static routes into a dynamic routing protocol
● The router can distribute routes from one protocol to another
– Interior routes need to be advertised to the Internet
    ● Typically these routes are aggregated
– Exterior routes may need to be injected into the interior network
    ● But only a subset – the backbone tables are very large
    ● Necessary for a domain carrying transit traffic
    ● Not necessary for a domain using only a default route
● Typically, redistributed routes are filtered in different ways due to routing policies
The routing process
[Figure: on the CPU, routing protocols 1, 2, and 3 each feed a RIB (Routing Information Base); the routing process combines them into the FIB (Forwarding Information Base), which is copied to the FIBs on the line cards]
Routing instances and tables
[Figure: routing protocols feed the RIBs of a routing instance; the main routing instance and other routing instances each hold their own set of RIBs]
RIBs in a routing instance:
inet.0   IPv4 unicast routes
inet6.0  IPv6 unicast routes
inet.1   IPv4 multicast forwarding cache
inet.2   IPv4 multicast RPF table
inet.3   IPv4 routes learnt from MPLS-TE path exploration
inet.4   MSDP routes
mpls.0   MPLS label-switch table
Example: main.inet.0, __juniper_private1__.inet.0
Logical routers, VPNs, virtual routers, etc., use routing instances.
Routing policies
[Figure: routes from neighbours and protocols pass through import policies into the RIB; routes from the RIB pass through export policies to neighbours and protocols; the RIB feeds the FIB]
Note: Export policies may be applied only to active routes!
Protocol            Default import action       Default export action
direct and static   accept all                  N/A
RIP                 accept all RIP routes       reject all
BGP                 accept all BGP routes       export all active BGP routes
IS-IS               accept all IS-IS routes     reject all (IS-IS uses LSPs)
OSPF                accept all OSPF routes      reject all (OSPF uses LSAs)
MPLS                accept all MPLS routes      export all active MPLS routes
Example routing policy: Redistribution
● In JunOS, policies are made up of match/action pairs
– Example: announce an aggregated prefix in BGP
– Note: First declare policy, then export
    policy-options {
        policy-statement MYNETWORK {
            term 1 {
                from {                                  # match
                    protocol aggregate;
                    route-filter 192.168.2.0/24 exact;
                }
                then accept;                            # action
            }
        }
    }
    protocols bgp {
        export MYNETWORK;                               # apply policy
    }
Routing policy: syntax and flow
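● Changing the default routing policy
● Syntax:
    policy-options {
        policy-statement name {
            term term-name {
                from {
                    match;
                }
                then {
                    action;
                }
            }
        }
    }
[Figure: a route is evaluated by Policy 1 (term1, term2, term3), then Policy 2 (term1, term2, term3), then the default policy. A term can end with accept or reject; after a verdict, evaluation moves on to the next route]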
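The evaluation flow through a policy chain can be modelled with a few lines of Python. This is a simplified sketch, not the JunOS implementation: routes are plain dictionaries, each term is a (match function, action) pair, side effects such as changing a metric are left out, and the names are made up.

```python
def evaluate(route, policies, default_action="reject"):
    """Run a route through a chain of policies and return the verdict.

    policies is a list of policies; each policy is a list of terms, and
    each term is a (match_function, action) pair. Supported actions are
    "accept", "reject", "next term", and "next policy".
    """
    for policy in policies:
        for match, action in policy:
            if not match(route):
                continue                  # term does not apply, try the next
            if action in ("accept", "reject"):
                return action             # a verdict ends the whole chain
            if action == "next policy":
                break                     # skip the remaining terms
            # "next term": fall through to the following term
    return default_action                 # no verdict: protocol default
```

The default_action argument stands in for the protocol's default policy, for example accept on BGP import or reject on OSPF export.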
Applying policies
● Export policy evaluation order: p4 > p2 > p0
● If a policy reaches a verdict (accept or reject), the policy chain is terminated
● Side effects may still apply
    protocols bgp {
        export p0;                      # Global properties
        import p1;
        group external-peers {
            type external;              # Group properties
            export p2;
            import p3;
            neighbor 192.168.200.14 {
                export p4;              # Peer properties
                import p5;
            }
        }
    }
More match statements
A from clause can match on many conditions, for example:
– metric
– route-filter (next slide)
– protocol
– family
– as-path
– community
– local-preference
– neighbor
– next-hop
– origin
– preference
– prefix-list
– ...
Route filters
● Route-filter match types
– route-filter 192.168.0.0/16 exact;
– route-filter 192.168.0.0/16 orlonger;
– route-filter 192.168.0.0/16 longer;
– route-filter 192.168.0.0/16 upto /24;
– route-filter 192.168.0.0/16 through 192.168.16.0/20;
– route-filter 192.168.0.0/16 prefix-length-range /20-/24;
[Figure: the address space under 192.168.0.0/16 drawn as a tree, with prefix lengths /20, /24, and /32 marked]
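The six match types can be expressed as predicates on a route's prefix. The Python sketch below reflects my reading of their semantics and is not JunOS code; the function name and the argument conventions (an integer for upto, a pair for prefix-length-range, a prefix string for through) are assumptions made for the example.

```python
import ipaddress


def route_filter(route, prefix, match_type, arg=None):
    """Return True if route matches 'route-filter prefix match_type arg'."""
    r = ipaddress.ip_network(route)
    p = ipaddress.ip_network(prefix)
    inside = r.subnet_of(p)   # same leading bits, route at least as long
    if match_type == "exact":
        return r == p
    if match_type == "orlonger":
        return inside
    if match_type == "longer":
        return inside and r.prefixlen > p.prefixlen
    if match_type == "upto":                 # arg: maximum length, e.g. 24
        return inside and r.prefixlen <= arg
    if match_type == "prefix-length-range":  # arg: (min, max), e.g. (20, 24)
        low, high = arg
        return inside and low <= r.prefixlen <= high
    if match_type == "through":              # arg: the longer end prefix
        end = ipaddress.ip_network(arg)
        # The route must lie on the path from prefix down to the end prefix.
        return inside and end.subnet_of(r)
    raise ValueError("unknown match type: " + match_type)
```

For instance, with "192.168.0.0/16 through 192.168.16.0/20" the routes 192.168.0.0/16, /17, /18, /19 and 192.168.16.0/20 match, because each of them contains the end prefix, while 192.168.128.0/17 does not.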
More actions
● accept
● reject
● next policy
● next term
● trace
Combined with accept:
– as-path-expand
– as-path-prepend
– community
– color
– external
– load-balance per-packet
– local-preference
– metric
– next-hop
– origin
– preference
The CLI
● Two major modes:
– Operational mode: monitor and troubleshoot network connectivity, hardware
– Configure mode: configuration of interfaces, routing protocols, authentication, logging, etc.
● Completion and query
– As you would expect, <TAB> and <?>
● Line editing
– Emacs operations: <ctrl-b>, <ctrl-f>, <ctrl-a>, <ctrl-e>, <ctrl-p>, <ctrl-n>, ...
● Online help:
– help reference
– help topic
Operations commands
● show
– show system storage
– show system users
– show chassis hardware detail
– show interfaces
– show route
– show route protocol direct
– show route table inet.0
– show route receive-protocol
– show route advertising-protocol
– show log
● configure
● file
– file list
– file compare
● help
– help topic
– help reference
● request
– request system reboot
● restart
– restart routing gracefully
● set
– set cli
● monitor
● clear
● test
● ping
● traceroute
● start shell
Extending commands
Level of detail:
– terse
– brief
– detail
– extensive
Example:
    > show route protocol ospf extensive
Pipe commands:
– | compare
– | count
– | display
– | except
– | find
– | match
– | resolve
– | save
– | trim
Examples:
    > show route | display xml
    > show route | match 10.0
    > show route | save output
    # show | compare rollback 0
Configure mode: Tree-based editing
    protocols {
        bgp {
            export default;
            group external {
                family inet {
                    unicast;
                }
            }
        }
        ospf {
            area 0.0.0.0 {
                interface lo0.0;
                interface fe-0/0/0.0;
            }
        }
    }
[Figure: the configuration drawn as a tree. top > protocols > bgp (export default; group external > family inet > unicast) and ospf (area 0.0.0.0 > interface lo0.0, interface fe-0/0/0.0). The commands up and top move towards the root]
    # set protocols bgp group external family inet unicast
    [edit]
    # edit protocols bgp group external
    [edit protocols bgp group external]
    # set family inet unicast
    [edit protocols bgp group external]
    # show
    family inet {
        unicast;
    }
    # top
    [edit]
    #
More configuration
● Alternative output (set):
    # show | display set
    set protocols bgp export default_route
    set protocols bgp group external family inet unicast
    set protocols ospf area 0.0.0.0 interface lo0.0
    set protocols ospf area 0.0.0.0 interface fe-0/0/0.0
    set policy-options policy-statement default then accept
● Loading from file
– load override
– load merge
– load relative
– load override terminal
– load set
Committing configurations
● Changing the state of the router – candidate configuration.
● Commit semantics – you need to explicitly commit for changes to take effect. Variants:
– commit confirmed
– commit and-quit
– commit check
– commit comment <string>
– commit sync
● You can make rollbacks to previous commits
– rollback 0 – the state before editing
– rollback 1 – previous commit
● Comparing changes
– show | compare
– show | compare rollback 2
● Displaying configurations in different formats
– show | display set
– show | display xml