Connect. Communicate. Collaborate
GÉANT2 monitoring
Otto Kreiter, DANTE
Navneet Daga, DANTE
LHC Monitoring Workshop, Munich, 19.07.2006
Agenda
• Extraction of monitoring information from the GÉANT2 network
• External application developed by DANTE for JRA-4
• Demonstration of a home-grown weather-map
• Conclusion
Network Element Manager (NM)
• All network elements (NEs) communicate with the NM separately
• The NM's task is to configure and monitor each NE one by one
• It is not service-aware: it has no knowledge of the intra-domain e2e path status
Regional Network Manager (RM)
[Diagram labels: Topology; Services; Correlation; “User” interface]
Alarm content
• From the NM:
– Information about interfaces and associated signal status, SDH timing problems
– NE and ILA status
• From the RM:
– Information related to services
– Information related to paths, trails and physical connections at all layers
One hop case: NMS vs JRA-4
[Diagram labels: Path – gen_mil_CERN; OCH trail; Phys-link (×2); Domain link; P. ID link (×2); BOL-CERN-LHC-001]
Multiple hop case: NMS vs JRA-4
[Diagram labels: Path – gen_mil_CERN; OCH trail (×2); Phys-link (×3); Domain link; P. ID link (×2); CERN-SARA-LHC-001]
Alarm processing
• SNMP traps from the Alcatel IOO module
• Alcatel Enterprise v1/v2c MIB
• SNMP traps received by a Linux station
– snmptrapd picks up all alarms
– For each trap, a bash script is called which performs:
• Analysis
• Selection
• Action
Alarm type & information
Alarm Raise:
– friendlyName
– probableCause
– perceivedSeverity
– currentAlarmId
– eventTime
– acknowledgementStatus
– additionalInformation
– eventType
– snmpTrapAddress
Alarm Clear:
– friendlyName
– probableCause
– currentAlarmId
– eventTime
– snmpTrapAddress
Used alarm information
Alarm Raise:
– friendlyName
– probableCause
– perceivedSeverity
– currentAlarmId
– eventTime
– acknowledgementStatus
– additionalInformation
– eventType
– snmpTrapAddress
Alarm Clear:
– friendlyName
– probableCause
– currentAlarmId
– eventTime
– snmpTrapAddress
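As a rough illustration of how these fields might be pulled out of a trap, the sketch below parses the text that snmptrapd pipes to a traphandle program: host name on the first line, transport address on the second, then one "OID value" varbind per line. It is written in Python for brevity rather than the bash used by the actual collector. The MIB prefix `EXAMPLE-MIB`, the sample values and the helper name `parse_trap` are assumptions for illustration, not taken from the Alcatel MIB.

```python
# Hypothetical trap text, in the layout snmptrapd hands to a traphandle program.
SAMPLE_TRAP = """\
nms.example.net
UDP: [192.0.2.10]:162->[192.0.2.20]:162
EXAMPLE-MIB::friendlyName.0 "gen_mil_CERN"
EXAMPLE-MIB::probableCause.0 lossOfSignal
EXAMPLE-MIB::currentAlarmId.0 4711
EXAMPLE-MIB::snmpTrapAddress.0 192.0.2.10
"""


def parse_trap(text):
    """Split trap text into (host, transport, {field name: value})."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    host, transport = lines[0], lines[1]
    fields = {}
    for ln in lines[2:]:
        oid, _, value = ln.partition(" ")
        # "EXAMPLE-MIB::friendlyName.0" -> "friendlyName"
        name = oid.split("::")[-1].split(".")[0]
        fields[name] = value.strip().strip('"')
    return host, transport, fields


if __name__ == "__main__":
    host, transport, fields = parse_trap(SAMPLE_TRAP)
    print(host, fields["friendlyName"], fields["currentAlarmId"])
```

This assumes the MIB is loaded so that varbind names arrive in symbolic form; with numeric OIDs a lookup table would be needed instead.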
Alarm analyzer process
[Flowchart labels]
• SNMP trap received → snmpTrapAddress must be registered → Check for type of alarm
• Raise → Additional Info (path, clientpath, ochtrail, omstrail, physicallink) → recordAlarm → Call External Program
• Clear → alarmID → Read recordAlarm → Call External Program → delete recordAl
• Record all traps
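The raise/clear flow above can be sketched as follows. This is a minimal Python illustration, not the bash script used in production. Several points are assumptions: the caller tells the handler whether the trap is a raise or a clear, the link type appears as a keyword inside additionalInformation, and the names `handle_trap`, `REGISTERED_SOURCES`, `records` and `notify` are hypothetical.

```python
# Hypothetical set of NMS addresses allowed to send traps.
REGISTERED_SOURCES = {"192.0.2.10"}

# "clientpath" is listed before "path" so the longer keyword matches first.
LINK_TYPES = ("clientpath", "ochtrail", "omstrail", "physicallink", "path")


def handle_trap(kind, fields, records, trap_log, notify):
    """Process one trap; return the affected link type, or None if ignored."""
    trap_log.append((kind, fields))  # record all traps
    if fields.get("snmpTrapAddress") not in REGISTERED_SOURCES:
        return None
    alarm_id = fields.get("currentAlarmId")
    if kind == "raise":
        info = fields.get("additionalInformation", "").lower()
        link_type = next((t for t in LINK_TYPES if t in info), None)
        if link_type is None:
            return None
        # Remember the alarm so the matching clear can be resolved later.
        records[alarm_id] = (link_type, fields["friendlyName"])
        notify(link_type, fields["friendlyName"], "DOWN")
        return link_type
    if kind == "clear":
        record = records.get(alarm_id)  # read the stored alarm
        if record is None:
            return None
        notify(record[0], record[1], "UP")  # call the external program
        del records[alarm_id]  # then delete the record
        return record[0]
    return None
```

Here `notify` stands in for the external program call, and `records` for whatever persistent store holds open alarms between script invocations.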
Alarm analyzer
• Called every time a trap is received
• Written in bash
• Each trap is analyzed separately; if a new trap arrives in the meantime, it waits in the queue (snmptrapd)
– Possible problem: if an external program gets stuck, the script hangs and the alarms remain unprocessed in the queue
• Must maintain state
– SNMP traps may get lost, so a program needs to check from time to time whether the monitoring station is in sync with the NMS
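One common way to guard against the hang described above is to run each external program under a time limit, so that a stuck child is killed and the queue keeps draining. The sketch below shows the idea in Python; the function name `run_external` and the 10-second default are assumptions, and in a bash script the coreutils `timeout` command would serve the same purpose.

```python
import subprocess
import sys


def run_external(argv, timeout_s=10.0):
    """Run an external program; True on exit code 0, False on failure or timeout."""
    try:
        result = subprocess.run(
            argv,
            timeout=timeout_s,  # child is killed if it exceeds the limit
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return False  # stuck program: give up so the next trap can be processed
    except OSError:
        return False  # program missing or not executable
    return result.returncode == 0


if __name__ == "__main__":
    # A child that sleeps longer than the limit is reported as failed.
    stuck = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_external(stuck, timeout_s=1.0))
```

A failed or timed-out call would still need to be logged and retried or reconciled later, which ties in with the periodic sync check against the NMS.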
External applications
• JRA-4 monitoring (XML file generation)
• perfSONAR DB feeder
• Project weather-map: LHC
E2E Data transformation
• Prototype applications developed in Java:
– E2EXMLWriter
– XMLGenerator
• E2EXMLWriter takes a template XML and produces an XML file containing live e2e path status information conforming to the JRA4 e2e data model
– Triggered by a script listening to SNMP alarms
– Parameters passed:
• Trail ID
• Status
• XMLGenerator produces the template XML that E2EXMLWriter uses to export the domain’s e2e information
Design of E2EXMLWriter
• Relies on 2 configuration files to produce live XML status information
– Properties file (links.properties)
• Contains key = value entries
• Each key is one e2e path name
• The value of each key is a comma-separated list of the trails that form the path
• Currently manually maintained
– Alarm register
• A simple CSV file
• Application maintained
• An “alarm raise” registers the associated path
• An “alarm clear” de-registers the associated path
(contd.)
Design (contd.)
• The application sets every path’s default status to UP with admin state NORMALOPERATION
• Only the paths “registered” in the alarm-register CSV file are set to DOWN with admin state MAINTENANCE
• The status DEGRADED is not implemented at the moment
• Other admin states are not implemented at the moment
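The status logic described on these two slides can be sketched roughly as below. This is an illustrative Python version, not the Java prototype: the function names, the trail identifiers `trail_a` to `trail_c`, and the "DOWN"/"UP" values of the status parameter are assumptions, and the alarm register is modelled as an in-memory set rather than a CSV file.

```python
# Hypothetical links.properties content: path name = comma-separated trails.
SAMPLE_LINKS = """\
BOL-CERN-LHC-001 = trail_a
CERN-SARA-LHC-001 = trail_b, trail_c
"""


def parse_links(text):
    """Parse 'key = value' lines into {path name: [trail, ...]}."""
    paths = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        paths[key.strip()] = [t.strip() for t in value.split(",") if t.strip()]
    return paths


def update_register(register, paths, trail_id, status):
    """Register paths using the trail on a raise, de-register them on a clear."""
    affected = {p for p, trails in paths.items() if trail_id in trails}
    if status == "DOWN":  # alarm raise
        return register | affected
    return register - affected  # alarm clear


def path_states(paths, register):
    """Default every path to UP/NORMALOPERATION; registered paths are DOWN/MAINTENANCE."""
    return {
        p: ("DOWN", "MAINTENANCE") if p in register else ("UP", "NORMALOPERATION")
        for p in paths
    }
```

Because the register stores path names only, a clear on one trail de-registers the path even if another of its trails is still in alarm; tracking alarms per trail would avoid that at the cost of a slightly richer register.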
Design of XMLGenerator
• Relies on 3 configuration files:
– Properties file (init.properties)
• Contains a key = value entry
• Key = DOMAIN
• Value = <domain_name>
• Enables on-the-fly domain name configuration
– Config file (config.csv)
• A simple CSV file
• Contains node-link-node information
– A sample XML file containing “pieces of XML” to be replicated for each node and link in the final output “template XML”
• All configuration files are currently manually maintained
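To make the node-link-node idea concrete, the sketch below turns CSV rows into a small XML template. It is only an illustration in Python: the element and attribute names (`domain`, `node`, `link`, `start`, `end`) are invented and do not follow the JRA4 e2e data model, the node and link names are made up, and elements are built directly rather than replicated from a sample XML file as the real XMLGenerator does.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical config.csv content: node, link, node.
SAMPLE_CONFIG = """\
GEN,gen_mil,MIL
MIL,mil_fra,FRA
"""


def build_template(domain, config_csv):
    """Build a template XML string with one element per node and per link."""
    root = ET.Element("domain", name=domain)
    seen = set()
    for row in csv.reader(io.StringIO(config_csv)):
        if len(row) != 3:
            continue  # skip blank or malformed rows
        node_a, link, node_b = (cell.strip() for cell in row)
        for node in (node_a, node_b):
            if node not in seen:  # emit each node only once
                seen.add(node)
                ET.SubElement(root, "node", name=node)
        ET.SubElement(root, "link", name=link, start=node_a, end=node_b)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_template("GEANT2", SAMPLE_CONFIG))
```

The domain name is passed in as a parameter here, mirroring the DOMAIN entry in init.properties.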
Data Provision
• Currently, the final XML containing live e2e path status information is written to a URL for export
– http://unix.dante.org.uk/~otto/jra4-cbf.xml
• Later, possibly integration with the perfSONAR framework
perfSONAR feeder
• Enters data into the perfSONAR MA
• Takes as input:
– Type of logical link: trunk, trail, physical link or path
– Name: friendlyName
– Time: the time when the event occurred
– Status: UP/DOWN
– Alarm ID
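The feeder's input can be pictured as a small validated record. The sketch below is a Python illustration under stated assumptions: the class name `FeederEvent` and its field names are hypothetical, and nothing here touches the perfSONAR MA itself, whose interface is not described in these slides.

```python
from dataclasses import dataclass
from datetime import datetime

LINK_TYPES = {"trunk", "trail", "physical link", "path"}
STATUSES = {"UP", "DOWN"}


@dataclass(frozen=True)
class FeederEvent:
    """One status change as handed to the feeder (field names are illustrative)."""
    link_type: str   # trunk, trail, physical link or path
    name: str        # friendlyName from the trap
    time: datetime   # when the event occurred
    status: str      # UP or DOWN
    alarm_id: str    # alarm ID from the trap

    def __post_init__(self):
        # Reject values outside the sets listed on the slide.
        if self.link_type not in LINK_TYPES:
            raise ValueError(f"unknown link type: {self.link_type!r}")
        if self.status not in STATUSES:
            raise ValueError(f"unknown status: {self.status!r}")


if __name__ == "__main__":
    event = FeederEvent("path", "gen_mil_CERN", datetime(2006, 7, 19, 12, 0), "DOWN", "4711")
    print(event)
```

Validating at construction time keeps malformed events out of the archive, whatever storage call follows.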
LHC weather-map live demonstration
1. CERN user-side down
2. CERN user-side up
3. GEN-MIL lambda down
4. GARR user-side down
5. Back-to-back interconnection in DE broken
6. AMS-FRA lambda down
7. DE interconnection up
8. AMS-FRA lambda up
9. GARR user-side up
10. GEN-MIL lambda up
Conclusion
• Status monitoring via alarms is at an advanced stage and well understood
– Once the characteristics of the equipment, alarms and faults were understood, development was easy
• The alarm collector can be reused by NRENs using Alcatel equipment
• XMLGenerator and the perfSONAR feeder are not bound to specific equipment