Migration to an MPLS/VPLS based platform
Henk Steenman, Henk.Steenman@ams-ix.net
APRICOT 2010, March 2, 2010
AMS-IX version 4
Monday, March 1, 2010
Overview
‣ AMS-IX version 3
‣ Short overview
‣ Bottlenecks and limitations
‣ AMS-IX version 4
‣ The MPLS/VPLS platform
‣ AMS-IX v3 to v4 migration
‣ Current status
June 2009 situation before start of migration
AMS-IX version 3
[Figure: AMS-IX v3 topology across six sites: GlobalSwitch, NIKHEF, Telecity, SARA, euNetworks and Equinix, each with 10/100/1000 and 10GE customer ports.
PXCs: pxc-sar-101, pxc-sar-102, pxc-nik-103, pxc-nik-104, pxc-tel-105, pxc-tel-106, pxc-glo-107, pxc-nik-108, pxc-glo-109, pxc-tel-110, pxc-sar-111, pxc-nik-112, pxc-eun-113, pxc-sar-114, pxc-tel-115, pxc-nik-116, pxc-glo-117, pxc-eqx-118.
10GE access (stub) switches: stub-nik-211, stub-nik-311, stub-nik-214, stub-nik-314, stub-eun-215, stub-eun-315, stub-eqx-218, stub-eqx-318, stub-glo-219, stub-glo-319 (MLX16); stub-tel-220, stub-tel-320, stub-sar-222, stub-sar-322 (MLX32).
Core switches: core-glo-201, core-eun-301 (MLX32).
Edge switches: edge-sar-001, edge-nik-003, edge-tel-005, edge-glo-007 (BI15000); edge-eun-014, edge-eqx-015 (MLX8).]
Characterization
AMS-IX version 3
‣ Two networks: one active at any moment in time
‣ Selection of active network by VSRP (Brocade proprietary)
‣ Inactive network core-switch blocks ports to prevent loops
‣ PSCD, photonic switch control daemon
‣ AMS-IX developed software to act on VSRP traps and manage PXCs
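The interplay between VSRP and PSCD described above can be pictured with a minimal sketch. It assumes a simplified model in which each PXC patches customer ports to either the red or the blue network (the colour names used in the migration slides). The `PhotonicSwitch` class and `handle_vsrp_master_change` function are hypothetical and are not taken from the actual PSCD code.

```python
from dataclasses import dataclass, field
from typing import Dict, List

NETWORKS = ("red", "blue")


@dataclass
class PhotonicSwitch:
    """Hypothetical model of a PXC: customer port -> network it is patched to."""
    name: str
    ports: Dict[str, str] = field(default_factory=dict)

    def patch_all(self, network: str) -> None:
        # Re-point every customer port at the given network.
        for port in self.ports:
            self.ports[port] = network


def handle_vsrp_master_change(new_master: str, pxcs: List[PhotonicSwitch]) -> None:
    """React to a (simulated) VSRP trap: move all customer ports to the new master network."""
    if new_master not in NETWORKS:
        raise ValueError(f"unknown network: {new_master}")
    for pxc in pxcs:
        pxc.patch_all(new_master)


pxc = PhotonicSwitch("pxc-sar-101", {"cust-a": "red", "cust-b": "red"})
handle_vsrp_master_change("blue", [pxc])
```

In this model every customer port moves at once, which matches the platform-wide nature of a v3 failover.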
Characterization
AMS-IX version 3
‣ E, FE and (N *) GE connections on BI-15k or RX8 switches
‣ (N *) 10GE connections resiliently connected to the switching platform (MLX16 or MLX32) via PXCs
‣ Brocade “port security” on customer interface to enforce one MAC per port rule for loop prevention
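As an illustration of the one-MAC-per-port rule, here is a minimal sketch. It assumes a simple model in which the first source MAC seen on a port becomes the only one accepted. The `OneMacPerPort` class and the port name are hypothetical, and the real Brocade feature may be configured differently (for example with statically configured addresses).

```python
from typing import Dict


class OneMacPerPort:
    """Hypothetical model: the first source MAC seen on a port is the only one allowed."""

    def __init__(self) -> None:
        self.bound: Dict[str, str] = {}

    def accept(self, port: str, src_mac: str) -> bool:
        mac = src_mac.lower()
        # Bind the port to this MAC if it has no binding yet.
        allowed = self.bound.setdefault(port, mac)
        return allowed == mac


guard = OneMacPerPort()
results = [
    guard.accept("eth1/1", "00:11:22:33:44:55"),  # first frame binds the MAC
    guard.accept("eth1/1", "00:11:22:33:44:55"),  # same MAC is accepted
    guard.accept("eth1/1", "66:77:88:99:AA:BB"),  # a second MAC is rejected
]
```

A frame that loops back through a customer network arrives with some other source MAC on that port and is therefore dropped in this model.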
Daily and yearly traffic Aug 2009
AMS-IX customer traffic
Long Term 10 and 40/100G customer Port Predictions
Traffic and Port Prognoses
[Figure: "Historic and predicted Customer ports". Y-axis: # Ports, 0 to 500 in steps of 50. X-axis: dates from 1-Jan-99 to 28-Dec-12. Series: FE, GE, 10GE, 10GE Long term prediction, 40/100GE.]
Bottlenecks and Limitations
AMS-IX version 3
‣ Core switches (MLX32, 128 10GE line rate) fully utilized
‣ Limits ISL upgrade
‣ Summer 2009: no substantially bigger switches on the market
‣ Platform failover introduces a short link flap on all 10GE customer ports. In a few (but increasing) cases this leads to BGP flapping
‣ With more and more 10GE customer ports (268 in June 2009), the impact on overall platform stability keeps growing
‣ Growth in the number of 10GE connections and in 10GE customer LAG size requires larger 10GE access switches
‣ Smaller switches => less local switching => larger ISL trunks
AMS-IX version 4
An MPLS/VPLS based Exchange platform
Requirements
AMS-IX version 4
‣ Scale the core to at least double the number of ports (Q2/3 2009)
‣ Keep resilience in platform and 10GE access but reduce the impact of failover
‣ Increase the number of 10GE customer ports on access switches
‣ More local switching
‣ Migrate to single architecture platform
‣ Reduce management overhead
‣ Use future proof (3 to 5 years) hardware that allows upscaling to high-density 10GE (2010) and 40/100GE (end 2010, early 2011)
Complete MPLS/VPLS topology
AMS-IX version 4
[Figure: AMS-IX v4 topology across six sites: GlobalSwitch, SARA, Telecity, NIKHEF, euNetworks and Equinix, each with 10/100/1000 and 10GE customer ports.
PXCs: pxc-sar-101, pxc-sar-102, pxc-nik-103, pxc-nik-104, pxc-tel-105, pxc-tel-106, pxc-glo-107, pxc-nik-108, pxc-glo-109, pxc-tel-110, pxc-sar-111, pxc-nik-112, pxc-eun-113, pxc-sar-114, pxc-tel-115, pxc-nik-116, pxc-glo-117, pxc-eqx-118.
Core switches: core-glo-201, core-glo-202, core-eun-301, core-eun-302 (MLX32).
Stub switches: stub-nik-221, stub-nik-321, stub-tel-220, stub-tel-320, stub-sar-222, stub-sar-322 (MLX32); stub-eun-215, stub-eun-315, stub-glo-213, stub-glo-313, stub-eqx-218, stub-eqx-318 (MLX16).
Edge switches: edge-sar-001, edge-nik-003, edge-tel-005, edge-glo-007, edge-eun-008, edge-eqx-009 (MLX8).]
Overview
AMS-IX version 4
‣ MPLS/VPLS-based peering platform
‣ Scaling of core switches by adding extra switches in parallel
‣ 4 LSPs between each pair of access switches
‣ Load balancing of traffic over 4 LSPs between each pair of access switches
‣ Retain 10GE access switch resilience
‣ Keep 10GE customer connection on PXC
‣ No need for complete platform failover anymore
‣ Local impact only (single pair of access switches on a site)
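The load balancing over 4 LSPs can be sketched as a per-flow hash. The slides do not say which fields or hash function the hardware uses, so the source/destination MAC pair and CRC32 below are assumptions, and the LSP names are hypothetical labels built from the four core switch names.

```python
import zlib
from typing import Sequence


def pick_lsp(src_mac: str, dst_mac: str, lsps: Sequence[str]) -> str:
    """Map a flow (assumed here to be a MAC pair) onto one of the parallel LSPs."""
    key = f"{src_mac.lower()}>{dst_mac.lower()}".encode("ascii")
    # The same flow always hashes to the same LSP, so frames stay in order.
    return lsps[zlib.crc32(key) % len(lsps)]


# Hypothetical LSP labels, one per core switch.
LSPS = ["via-core-glo-201", "via-core-glo-202", "via-core-eun-301", "via-core-eun-302"]

first = pick_lsp("00:11:22:33:44:55", "66:77:88:99:aa:bb", LSPS)
again = pick_lsp("00:11:22:33:44:55", "66:77:88:99:AA:BB", LSPS)
```

Hashing per flow rather than per frame keeps each conversation on one path while spreading different conversations over all cores.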
Characterization
AMS-IX version 4
‣ OSPF
‣ BFD for fast detection of link failures
‣ RSVP-TE signalled LSPs over predefined paths
‣ primary and secondary (backup) paths defined
‣ VPLS instance per VLAN
‣ Statically defined VPLS peers (LDP signalled)
‣ Load balanced over parallel LSPs over all core routers
‣ Layer 2 ACLs instead of Port Security
‣ Manual adjustment for now
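The primary and secondary path idea can be sketched as follows. This is a simplified model, not the RSVP-TE state machine: a path is considered usable when none of its links has failed, and the LSP falls back to the secondary path otherwise. The `Lsp` class and `link` helper are hypothetical, and the hop lists are assumptions built from node names in the slides.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Set

Link = FrozenSet[str]


def link(a: str, b: str) -> Link:
    """An undirected link between two nodes."""
    return frozenset((a, b))


@dataclass
class Lsp:
    name: str
    primary: List[str]    # hop list of the predefined primary path
    secondary: List[str]  # hop list of the predefined backup path

    @staticmethod
    def _path_up(path: List[str], failed: Set[Link]) -> bool:
        return all(link(a, b) not in failed for a, b in zip(path, path[1:]))

    def active_path(self, failed: Set[Link]) -> Optional[List[str]]:
        # Prefer the primary path; fall back to the secondary one.
        for path in (self.primary, self.secondary):
            if self._path_up(path, failed):
                return path
        return None


lsp = Lsp(
    name="stub-tel-220_to_stub-nik-214",
    primary=["stub-tel-220", "core-glo-201", "stub-nik-214"],
    secondary=["stub-tel-220", "core-glo-202", "stub-nik-214"],
)
no_failure = lsp.active_path(set())
after_failure = lsp.active_path({link("stub-tel-220", "core-glo-201")})
```

In the real platform the failure would be detected by BFD on the link and signalled through RSVP-TE; the sketch only shows the path preference.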
MPLS/VPLS setup
Physical Interconnection
[Figure: nodes shown: stub-tel-220 (10.10.2.20/32), stub-nik-214 (10.10.2.14), stub-tel-320 (10.10.3.20/32), core-glo-201 (10.10.2.1), core-glo-202 (10.10.2.2), core-eun-301 (10.10.3.1/32), core-eun-302 (10.10.3.2) and pxc-sar-101. Pre-defined paths between PEs over each core router.]
MPLS/VPLS setup
Resilience
[Figure: same nodes. LSP over primary path and LSP over backup path. Resilience in access connection by means of PXC.]
Requirement: No customer impact!
AMS-IX v3 to v4 migration
[Figure: the AMS-IX v3 topology and the AMS-IX v4 MPLS/VPLS topology, with a "?" marking the migration path between them.]
Migration steps: Initial situation
AMS-IX v3 to v4 migration
[Figure: the AMS-IX v3 topology, as in the June 2009 situation.]
Preparation
Platform Migration
‣ Build new version of PSCD (Photonic Switch Control Daemon)
‣ Acts on LSP state in the MPLS cloud instead of VSRP traps
‣ Develop configuration automation
‣ Describe network in XML, generate configurations from this
‣ Move non-MPLS-capable access switches behind MPLS routers and PXCs as a 10GE customer connection
‣ Upgrade all non-MPLS-capable 10GE access switches
‣ Define migration scenario that would have no customer impact
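The "describe network in XML, generate configurations" step might look roughly like the sketch below. The XML schema, the `role` attribute and the output lines are all invented for illustration; the output is pseudo-configuration, not Brocade CLI syntax, and treating stub switches as PE routers and core switches as P routers is an assumption.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Hypothetical network description; names and loopbacks are taken from the slides.
NETWORK_XML = """
<network>
  <router name="stub-tel-220" loopback="10.10.2.20" role="pe"/>
  <router name="stub-nik-214" loopback="10.10.2.14" role="pe"/>
  <router name="core-glo-201" loopback="10.10.2.1" role="p"/>
  <router name="core-glo-202" loopback="10.10.2.2" role="p"/>
</network>
"""


def generate_lsp_lines(xml_text: str) -> Dict[str, List[str]]:
    """Return, per PE router, one pseudo-config line per (remote PE, core) combination."""
    root = ET.fromstring(xml_text)
    routers = [r.attrib for r in root.findall("router")]
    pes = [r for r in routers if r["role"] == "pe"]
    cores = [r for r in routers if r["role"] == "p"]
    configs: Dict[str, List[str]] = {}
    for src in pes:
        lines = []
        for dst in pes:
            if dst["name"] == src["name"]:
                continue
            for core in cores:
                lsp_name = f"{src['name']}_to_{dst['name']}_via_{core['name']}"
                lines.append(f"lsp {lsp_name} to {dst['loopback']} via {core['loopback']}")
        configs[src["name"]] = lines
    return configs


configs = generate_lsp_lines(NETWORK_XML)
```

Keeping a single description as the source of truth means adding a router is one XML change, with every affected router's configuration regenerated from it.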
Migration steps: Initial situation simplified
AMS-IX v3 to v4 migration
[Figure: Colocation 1 and Colocation 2 with 10GE access switches, GE access switches, two core switches, PXCs and customer routers.]
2 Co-location sites only for simplicity
Double L2 network
VSRP for master slave selection and loop protection
Migration steps: move GE access behind PXC
AMS-IX v3 to v4 migration
[Figure: Colocation 1 and Colocation 2 with 10GE access switches, core switches, and PXCs in front of the customer routers and GE access switches.]
Not possible to connect GE access switch to both MPLS/VPLS cloud and basic L2 network
Brocade BI-15K (end of life) not MPLS capable
Migration steps: Migrate one half to MPLS/VPLS
AMS-IX v3 to v4 migration
[Figure: Colocation 1 and Colocation 2. One network still consists of 10GE access switches and a core switch; the other consists of PE routers and P routers. GE access switches and customer routers connect via PXCs.]
Production on L2 network (red)
Migrate blue network to MPLS/VPLS
Traffic between two PE routers load balanced over 2 LSPs, one over each P router
Test functionality and connections using test traffic sent by Anritsu traffic generators
Migration steps: Production on MPLS/VPLS, L2 backup
AMS-IX v3 to v4 migration
[Figure: Colocation 1 and Colocation 2. L2 network with 10GE access switches and a core switch, MPLS/VPLS network with PE routers and P routers. GE access switches and customer routers connect via PXCs.]
Move production traffic to MPLS/VPLS cloud
• Use PXCs for failover
• New PSCD
Run production on MPLS/VPLS cloud for 6 weeks
Migration steps: Two MPLS/VPLS platforms
AMS-IX v3 to v4 migration
[Figure: Colocation 1 and Colocation 2. Both networks now consist of PE routers and P routers. GE access switches and customer routers connect via PXCs.]
Migrate second half of the platform to MPLS/VPLS
Test functionality and connections using test traffic sent by Anritsu traffic generators
Migration steps: production on second MPLS/VPLS platform
AMS-IX v3 to v4 migration
[Figure: Colocation 1 and Colocation 2. Two networks of PE routers and P routers. GE access switches and customer routers connect via PXCs.]
Move production traffic to red MPLS/VPLS cloud using the newly developed version of PSCD to manage the PXCs
Still two separate networks, both MPLS/VPLS based
Migration steps: integration to single MPLS/VPLS cloud
AMS-IX v3 to v4 migration
[Figure: Colocation 1 and Colocation 2. PE routers and P routers form one cloud. GE access switches and customer routers connect via PXCs.]
All PE routers connected to all P routers
• Between each pair of PE routers, 4 LSPs, one over each P router
• Traffic between each pair of PE routers load balanced over the 4 LSPs
10GE customer connections distributed over local PE routers
• Resilience in 10GE customer connection to local PE router by means of PXCs
• Failover of customer connection to other PE router independent of other connections
Final situation
AMS-IX v3 to v4 migration
[Figure: the complete AMS-IX v4 MPLS/VPLS topology, as in the "Complete MPLS/VPLS topology" slide.]
Results
‣ Increased stability
‣ Backbone failures handled by MPLS (not seen by customers)
‣ Access switch failures handled for a single pair of switches
‣ Phased relocation of traffic streams
‣ Looped traffic filtered by L2 ACL => No effect on linecard CPU
Results
‣ Easier debugging of customer ports
‣ Simply swap to different, active switch using Glimmerglass PXC
‣ Config generation
‣ Absolute necessity due to size of MPLS/VPLS configuration
‣ Fairly simple because of single hardware platform
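A quick calculation illustrates why the configuration grows large. The sketch assumes one LSP per PE pair per P router, as described for the single MPLS/VPLS cloud, and counts each direction separately because MPLS LSPs are unidirectional. Taking the 12 stub switches of the final topology as the PE routers is an assumption; the slides do not say whether the edge switches also act as PEs.

```python
from math import comb


def lsp_count(pe_routers: int, p_routers: int, per_direction: bool = True) -> int:
    """Number of LSPs in a full mesh of PE routers with one LSP per P router."""
    pairs = comb(pe_routers, 2)   # unordered PE pairs
    count = pairs * p_routers     # one LSP per pair per P router
    return 2 * count if per_direction else count


pairs_view = lsp_count(12, 4, per_direction=False)  # 66 pairs * 4 P routers
directed_view = lsp_count(12, 4)                    # both directions counted
```

The count grows quadratically with the number of PE routers, before primary and secondary paths and VPLS peers are added on top.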
Results
‣ Scalability (future options)
‣ Bigger core devices
‣ Do not need to be MPLS-capable
‣ Load-sharing over > 4 cores
‣ Pending feature request
‣ Use of different cores for sets of PEs
‣ Multiple layers of P-routers
Questions?
Backup Slides
Resilience in 10GE customer access connection
AMS-IX Photonic Cross Connects (PXC)
[Figure: customer connection of X * 10GE (X >= 1) attached through a PXC to the 10GE access switches.]
Topology Failover
AMS-IX Version 3 Platform
[Figure: two states of the v3 platform, each showing Core Red, Core Blue, 10GE Access Red, 10GE Access Blue, Edge, PXC (pxc-sar-101), a customer 10GE router, a customer 1GE router, and VSRP HELLO messages. State 1: Core Red is MASTER, Core Blue is BACKUP, with the backup ports BLOCKED for traffic. State 2: Core Red is BACKUP, Core Blue is MASTER, with the backup ports BLOCKED for traffic.]
Problem or maintenance in red network
VSRP priority of the red master lower than the blue priority