AT&T Labs Research
Evolution of IP/OL Performance Management
Robert Doverspike, Jennifer Yates, Jorge Pastor, Martin Birk – AT&T Labs Research
Outline
• Key Takeaways
  – Performance Management: must consider interlayer (focus IP)
• Evolution story for IP/OL
  – Architecture for Long Haul Networks
• Example problems
• Next chapter in evolution
  – Let’s get it right this time
Key Takeaways
• Optical PM goals should focus on use in IP layer
  – Links in the IP layer form connections in the optical layer
  – Virtually all high rate connections are IP links (between either routers or Ethernet switches)
• Perfect optical layer detection is a lofty goal, but
  – will fall short if architected in isolation
    • E.g., need to have strong inter-layer coordination
• Why do we stress this for OL?
  – Inter-layer fault management has many flaws in practice, even after 15 years of SONET perfecting
  – Need adequate mechanisms across layers to handle scenarios when things go wrong or confusion reigns
Evolution Story for Long Haul Networks
[Figure: layered long-haul network, 1st Generation. Layers: IP Layer, SONET Ring Layer, Pt-Pt WDM Layer. Legend: Router, ADM, DCS/Intelligent Optical Switch, Degree-n OADM/WXC, WDM Terminal.]
Evolution Story for Long Haul Networks
[Figure: layered long-haul network, 1st Generation. Layers: IP Layer, SONET Ring Layer, DCS Layer, Pt-Pt WDM Layer. Legend: Router, ADM, DCS/Intelligent Optical Switch, Degree-n OADM/WXC, WDM Terminal.]
Evolution Story for Long Haul Networks
[Figure: layered long-haul network, 1st and 2nd Generation. Layers: IP Layer, SONET Ring Layer, Pt-Pt WDM Layer, ULH/WXC Layer. Legend: Router, ADM, DCS/Intelligent Optical Switch, Degree-n OADM/WXC, WDM Terminal.]
Evolution Story for Long Haul Networks
[Figure: layered long-haul network, 1st, 2nd and 3rd Generation. Layers: IP Layer, SONET Ring Layer, Pt-Pt WDM Layer, ULH/WXC Layer. Legend: Router, ADM, DCS/Intelligent Optical Switch, Degree-n OADM/WXC, WDM Terminal.]
Some of the problems we’ve encountered
Ring switching impact on higher layers
• Upper layer has timer – waits for lower layer to restore – Done!
• Wrong! – not a simple decision on when to take IP link up and down
[Figure: IP Layer over SONET Ring Layer, with an X marked on the diagram.]
Some of the problems we’ve encountered
1st Generation of IP/OL
• SONET alarms received by upper layer are ambiguous and conflicting
  • Many error types in SONET: BER, AIS, P-LOS, clear during protection switching
  • Arrive at different times
• Software bugs: routers don’t behave as expected
• Inconsistencies in calculation of BER and IP layer holddown timer
[Figure: IP Layer over SONET Ring Layer, with an X marked on the diagram. Alarm labels shown: AIS-P, BER-P, CLR; LOS-L; PPP ACK; OSPF ping.]
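One way to see why the timer decision is not simple is to replay an alarm sequence through a naive hold-down timer. The sketch below is illustrative only: the function `link_down_times`, the 50 ms hold-down, and all timestamps are assumptions, not values from the slides or from any router implementation. It treats `CLR` as a clear and any other alarm name as a raise.

```python
def link_down_times(events, holddown_ms, end_ms):
    """Naive hold-down: declare the IP link down once an alarm has persisted
    for holddown_ms without a clear.

    events: list of (time_ms, alarm_name) sorted by time; 'CLR' clears,
    any other name raises. end_ms closes the observation window.
    Returns the times (ms) at which the link would be declared down.
    """
    down_at = []
    alarm_start = None   # time the current alarm condition began
    declared = False     # already declared down for this alarm condition?
    # A sentinel event lets a timer that expires after the last alarm fire.
    for t, name in list(events) + [(end_ms, "END")]:
        if alarm_start is not None and not declared and t - alarm_start >= holddown_ms:
            down_at.append(alarm_start + holddown_ms)
            declared = True
        if name == "CLR":
            alarm_start = None
            declared = False
        elif name != "END" and alarm_start is None:
            alarm_start = t
    return down_at

# Hypothetical timeline: a clear during protection switching, then a new alarm.
flapping = [(0, "AIS-P"), (20, "BER-P"), (45, "CLR"), (60, "AIS-P")]
print(link_down_times(flapping, holddown_ms=50, end_ms=200))
```

With these assumed timings the transient clear at 45 ms restarts the timer, so the link is declared down 110 ms after the first alarm rather than 50 ms. The same sequence without the final AIS-P never takes the link down.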
What is the source of these problems?
• No standards for inter-layer interaction
  – Physical layer: testers need requirement scripts to test; no standard, no script
  – No industry requirement often means no testing, no sharing of behavior
  – Historically, L1 and L3 labs have been separate
    • Some members of Telecom community have integrated their labs
  – Software bugs: routers don’t behave as expected
  – No specification of common parameters and metrics
    • Example: Router measures BER in fixed timer intervals
    • Router takes link down upon TCA (threshold exceeded)
    • Protection switching results in a VERY short but high burst of errors
      Crosses router threshold even though it is << 10 ms!
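The fixed-interval BER example on this slide can be worked through numerically. This is a minimal sketch with assumed numbers: a 10 Gb/s link, a 1-second measurement interval, a TCA threshold of 1e-6, and a 5 ms burst at BER 1e-3. None of these values, nor the helper name `interval_ber`, come from the slides.

```python
def interval_ber(line_rate_bps, interval_s, burst_s, burst_ber, background_ber=0.0):
    """Average BER over one fixed measurement interval containing one error burst."""
    total_bits = line_rate_bps * interval_s
    burst_bits = line_rate_bps * burst_s
    errored_bits = burst_bits * burst_ber + (total_bits - burst_bits) * background_ber
    return errored_bits / total_bits

# All values below are illustrative assumptions.
LINE_RATE_BPS = 10e9     # 10 Gb/s link
INTERVAL_S = 1.0         # router averages BER over fixed 1 s intervals
TCA_THRESHOLD = 1e-6     # threshold crossing alert level
BURST_S = 0.005          # 5 ms burst during protection switching
BURST_BER = 1e-3         # error rate while the burst lasts

ber = interval_ber(LINE_RATE_BPS, INTERVAL_S, BURST_S, BURST_BER)
tca_raised = ber > TCA_THRESHOLD
print(f"interval BER = {ber:.1e}, TCA raised: {tca_raised}")
```

Under these assumptions the interval average is 5e-6, five times the threshold, so a router that acts on the TCA would take the link down for an event lasting only 5 ms.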
What is the source of these problems?
[Figure: the IP (logical) layer and the physical (fibre) layer, each drawn over LA, SF, Washington and NY, with a common SRLG marked.]
• Shared Risk Groups still not well modeled
  – Single failure at lower layer results in multiple, scattered link failures at higher layer; network unprepared to restore
  – Example: portions of dual IP access links routed over same ring; both links taken down due to previous confusion
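The dual-access-link example suggests a simple check: given a model of which risk groups each IP link rides on, flag every pair of links that shares one. The sketch below assumes such a model exists. The link names, SRLG names, and the function `shared_risk_pairs` are hypothetical.

```python
from itertools import combinations

# Hypothetical model: IP link -> lower-layer risk groups it is routed over.
LINK_SRLGS = {
    "access-A": {"ring-7", "conduit-12"},
    "access-B": {"ring-7", "conduit-30"},
    "backbone-1": {"conduit-44"},
}

def shared_risk_pairs(link_srlgs):
    """Return {(link_a, link_b): common SRLGs} for every pair of links
    that shares at least one risk group."""
    result = {}
    for a, b in combinations(sorted(link_srlgs), 2):
        common = link_srlgs[a] & link_srlgs[b]
        if common:
            result[(a, b)] = common
    return result

pairs = shared_risk_pairs(LINK_SRLGS)
for (a, b), common in pairs.items():
    print(f"{a} and {b} share {sorted(common)}")
```

Here the two access links meant to back each other up both ride `ring-7`, which is the situation the slide describes.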
Identity Crisis
2nd Generation of IP/OL
• High speed (2.5/10/40 Gb/s) IP links skip SONET ring/xconnect layer and instead route over long sequences of point-to-point WDM systems, interconnected by O/E/O optical transponders
  – Should the Optical Path pretend to be transparent (like dark fiber)?
    • E.g., no AIS/BER TCA; re-transmit all LOS/LOP to Path Termination Points
    • How does one isolate faults for repair (OTs, Amplifiers, WDM Terms)?
  – OR: Should it display characteristics of SONET Section/Line/Path Fault Management Architecture?
    • However: then similar 1st Gen IP/SONET Ring confusion occurs
• Practicality dictated that industry implemented a combination of both approaches
ULH/WSC: The Final Solution?
3rd Generation of IP/OL
• Use long-term model of all-optical path to IP layer link
• Two major issues to resolve
  – What if intermediate OEO exists in near-term?
  – How do we model restoration at OL and how does IP layer interact?
• IP layer responsible for deciding link health
  – Fast link layer detection (LOS)
  – GIGE and other signals are going to be transported over the 3rd Gen OL
    • Is the set of PM alarms and TCAs we inherit from SONET appropriate for 3rd Gen OL?
    • If not, which ones or new ones should we define?
Some potential approaches
• OL only passes simple alarms to the upper layer, e.g., LOS.
  – Upper layer makes its assessments of BER, packet coding violations, ACK failures
  – OL still does fault isolation for OEO components or amplifiers (e.g., to WDM Term or EMS or Fault OSS), but NOT passed up to IP layer
  – Where/how do we do this? Standards, Fora, vendor interactions, carrier requirements?
• Repair process:
  – Need to correlate what fails in the OL with what fails in the IP layer (1 to many map)
  – Network discovery of IP/OL relationships (e.g., SRLG) across layers would facilitate fault correlation process
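The 1-to-many map in the repair process can be sketched as a lookup from each OL component to the IP links that traverse it, used in reverse to rank likely fault locations. This is a sketch under assumptions: the inventory, the component and link names, and the ranking rule in `candidate_faults` are all hypothetical, and a real discovery process would have to populate the map.

```python
# Hypothetical discovered inventory: OL component -> IP links that traverse it.
OL_TO_IP = {
    "amp-site-3": {"LA-SF", "LA-NY"},
    "ot-17": {"LA-NY"},
    "wdm-term-9": {"SF-WAS", "NY-WAS"},
}

def candidate_faults(ol_to_ip, links_down):
    """Rank OL components by how well a single failure there explains links_down."""
    down = set(links_down)
    candidates = []
    for component, links in ol_to_ip.items():
        explained = links & down
        if not explained:
            continue  # a failure here would not explain any observed outage
        still_up = links - down  # links this failure predicts down but that are up
        candidates.append((component, len(explained), len(still_up)))
    # Prefer components that explain more outages and predict fewer false ones.
    candidates.sort(key=lambda c: (-c[1], c[2], c[0]))
    return [c[0] for c in candidates]

ranked = candidate_faults(OL_TO_IP, {"LA-SF", "LA-NY"})
print(ranked)
```

With both LA links down, the amplifier site that carries both ranks ahead of the transponder that carries only one, which is the kind of correlation the repair process needs.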