Network Monitoring for SCIC

Les Cottrell, SLAC
ICFA/SCIC meeting, August 24, 2005
www.slac.stanford.edu/grp/scs/net/talk05/icfa-netmon-aug05.ppt

Initially funded by DoE Field Work proposal. Currently partially funded by US Department of State/Pakistan Ministry of Science & Technology

Coverage

• Measure the network performance for developing regions
  – From developed to developing & vice versa
  – Between developing regions & within developing regions
• Originated in High Energy Physics, now focused on the Digital Divide (DD)
  – Adding monitoring sites in: Africa, S. America, Russia, Pakistan, India
  – Working with Turkey but ISP blocks pings
• http://www-iepm.slac.stanford.edu/pinger/pingerworld/
  – Interactive: zoom/pan, mouseover, clickable

[Figure: PingER coverage, Aug 2005. Map legend: monitoring site, remote site]

PingER Management

• No funding for PingER ongoing operational management (40% FTE at the moment)
  – Develop tools to simplify, automate, reduce manual effort
  – New installation procedures for monitor sites
  – Assistance in producing executive plots
  – Provide alerts for unreachable remote sites
  – Provide alerts if unable to gather data from monitor sites
  – Check sanity of data and the configuration database
  – Check hosts are where we think they are…
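
The alert for unreachable remote sites could be sketched roughly as below. This is an illustration only, not PingER's actual tooling: the data layout (a site name mapped to a list of per-ping RTTs in ms, with None marking a lost ping) and the function names `loss_percent` and `unreachable_sites` are assumptions made for the example.

```python
from typing import Dict, List, Optional

def loss_percent(rtts: List[Optional[float]]) -> float:
    """Percentage of pings with no reply (None marks a lost ping).

    An empty sample is treated as 100% loss, since nothing was heard.
    """
    if not rtts:
        return 100.0
    lost = sum(1 for r in rtts if r is None)
    return 100.0 * lost / len(rtts)

def unreachable_sites(samples: Dict[str, List[Optional[float]]]) -> List[str]:
    """Return, in sorted order, the sites for which every ping was lost."""
    return sorted(site for site, rtts in samples.items()
                  if loss_percent(rtts) == 100.0)

# Hypothetical sample: one site answers some pings, the other answers none.
sample = {
    "host-a.example": [312.4, None, 305.9],
    "host-b.example": [None, None, None],
}
to_alert = unreachable_sites(sample)
```

A production check would presumably require several consecutive failed samples before alerting, so that a single lossy interval does not raise a false alarm.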

Triangulation 1/2

• Web hosts with TLDs in many developing countries have proxies in developed countries
  – E.g. 50% of initially chosen Pakistan universities had web proxies outside Pakistan
  – Use IP2Location.com & traceroute to verify location
  – Working on triangulation
    • Make RTTmin measurements to a given host from known landmarks
    • Estimate distance from landmark using d = aL * RTTmin + bL
      – Initial aL ~ 50 km/ms (speed of light in fiber, factor of 2 for right-of-way paths, non-great-circle-route hop locations), bL = 0
    • Optimize aL, bL using RTTmin for known PingER pairs
    • Locate host lat/long with confidence estimates
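
The distance estimate and the per-landmark calibration above can be sketched as follows. The slides do not say how aL and bL are optimized; ordinary least squares is assumed here purely for illustration, and the function names are hypothetical. The default aL follows from the slide's reasoning: light travels about 200 km/ms in fiber, an RTT covers the path twice (100 km per ms of RTT), and a further factor of 2 for indirect routes gives about 50 km/ms.

```python
from typing import List, Tuple

def estimate_distance_km(rtt_min_ms: float,
                         a_l: float = 50.0, b_l: float = 0.0) -> float:
    """d = aL * RTTmin + bL, defaulting to the initial aL = 50 km/ms, bL = 0."""
    return a_l * rtt_min_ms + b_l

def fit_landmark(pairs: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Fit (aL, bL) for one landmark by ordinary least squares.

    pairs holds (RTTmin in ms, known distance in km) for hosts whose
    location is trusted. Least squares is an assumption of this sketch.
    """
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two (RTTmin, distance) pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    if sxx == 0:
        raise ValueError("RTTmin values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    a_l = sxy / sxx
    b_l = mean_y - a_l * mean_x
    return a_l, b_l

# Made-up calibration pairs (not measured data) that lie exactly on d = 50 * RTTmin.
a_l, b_l = fit_landmark([(10.0, 500.0), (20.0, 1000.0), (30.0, 1500.0)])
d_km = estimate_distance_km(100.0, a_l, b_l)
```

With distance estimates from three or more landmarks, the host position can then be narrowed to the region where the corresponding circles overlap.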

Triangulation 2/2

• Landmarks:
  – Using Looking Glass servers (provide pings)
  – Install web-accessible on-demand ping tool at PingER monitoring sites
  – Use GeoLIM landmarks (for US & W. Europe)
    • Installing GeoLIM landmark at NIIT
• Will build tool to validate where PingER nodes are really located and fix the database entry or replace the node
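
One check such a validation tool might perform is sketched below: a claimed location is rejected when it lies farther from a landmark than light in fiber could cover within the measured RTTmin. This is a sketch of the idea, not the planned tool. The 100 km per ms of RTT bound (about 200 km/ms in fiber, halved for the round trip), the function names, and the approximate coordinates are assumptions.

```python
import math
from typing import Tuple

EARTH_RADIUS_KM = 6371.0     # mean Earth radius
MAX_KM_PER_RTT_MS = 100.0    # ~200 km/ms in fiber, halved because RTT is a round trip

def great_circle_km(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Haversine distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(h)))

def location_is_plausible(claimed: Tuple[float, float],
                          landmark: Tuple[float, float],
                          rtt_min_ms: float) -> bool:
    """True if the claimed location is reachable within rtt_min_ms."""
    return great_circle_km(claimed, landmark) <= MAX_KM_PER_RTT_MS * rtt_min_ms

# Approximate, illustrative coordinates: a host claimed to be near Islamabad,
# measured from a landmark near SLAC.
claimed_site = (33.7, 73.1)
landmark_site = (37.42, -122.20)
proxy_suspected = not location_is_plausible(claimed_site, landmark_site, 20.0)
```

A 20 ms RTTmin from a California landmark cannot reach a host in Pakistan, so a result like this would point to a proxy outside the country.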

Integrate with MonALISA

• Mainly to look at closer to real-time displays
  – Code is ready, looking for host and disk space to save data

Case study on Pakistan

• Two sites to join LCG (NUST, QEA/NCP): is connectivity adequate?
• Prompted by two outages of SEA-ME-WE 3
  – Fiber cut off Karachi causes 12 day outage Jun-Jul ’05
    • Huge losses of confidence and business

Fiber Outage Jun 27-Jul 8 ’05

• Looked at 9 sites in Pakistan measured from within and outside Pakistan
  – Saw big (300=>600ms) increase in min-RTT as some sites switched to satellite
  – Losses 2-3% => >10%
  – Unreachability 1-2% => 20%
  – Effect varied by site

[Figure: Pakistan loss from SLAC. Loss % (0 to 14), Jan04 to Jun05, with 75%, median and 25% curves]
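
The jump in min-RTT is consistent with a geostationary satellite hop, as a rough propagation-delay calculation suggests. The figures below are assumptions not taken from the slides: a geostationary altitude of about 35,786 km, propagation at the vacuum speed of light, and ground stations directly beneath the satellite (the most favourable case).

```python
GEO_ALTITUDE_KM = 35786.0     # approximate geostationary altitude above the equator
C_KM_PER_MS = 299.792458      # speed of light in vacuum, km per millisecond

def geo_rtt_floor_ms(legs: int = 4) -> float:
    """Lower bound on RTT added by a geostationary hop.

    legs counts the up or down traversals: 4 when both the ping and its
    reply cross the satellite, 2 when only one direction does.
    """
    return legs * GEO_ALTITUDE_KM / C_KM_PER_MS

both_ways_ms = geo_rtt_floor_ms(4)   # request and reply both via satellite
one_way_ms = geo_rtt_floor_ms(2)     # only one direction via satellite
```

A path that crosses the satellite in both directions cannot have an RTT below about 477 ms, before any terrestrial delay is added, so a min-RTT near 600 ms fits sites that had switched to satellite.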

Longer term

• Infrastructure appears fragile
• Losses to QEA & NIIT are 3-8% averaged over month
• Typically once a month losses go to 20%

[Figure: RTT (ms) and loss (%), Feb05 to Jul05, with the Jun/Jul outage marked. Annotation: Another fiber outage, this time of 3 hours! Power cable dug up by excavators of Karachi Water & Sewage Board]

Pakistan: Next steps

• Established contacts with PERN (manages E&R net connections), NTC (carrier, government monopoly) and PIE (Pakistan Internet Exchange - international carrier interface)
  – Monitoring PIE backbone router in Karachi
    • NTC router deprecates pings so can’t monitor it
  – Establishing PingER monitors in PERN and NTC
    • Already have one at NIIT
    • Want to pin-point causes of poor performance (losses, unreachability)
  – Monitoring to NIIT via NTC and Broadband/DSL provider to compare providers

First results from S. Africa

• Host at Tertiary Education Network (TENET) site at Rondebosch
  – TENET secures for ZA universities & technical colleges management of service contracts, operational functions, other value-added services
• Monitoring about 45 beacon sites worldwide
• Land line links to world, min-RTTs:
  – Europe: ~215ms; US: ~250ms; Russia: ~235ms
  – L. America: ~415ms; E. Asia: ~450ms; Pakistan: ~465ms; Australia: ~480ms
• Evaluating what sites in Africa to monitor

Collaborations/funding

• Good news:
  – Active collaboration with NIIT Pakistan to develop network monitoring including PingER (in particular management)
    • Travel funded by US State Department & Pakistan MOST for 1 year
    • Have submitted a follow-on proposal to USAID
  – FNAL & SLAC continue support for PingER management and coordination
• Bad news (currently unfunded, could disappear):
  – DoE funding for PingER terminated
  – Harder to cover from SLAC HEP budget, given new project-oriented budgeting
  – Proposal to EC 6th framework with ICTP, ICT Cambridge UK, CONAE Argentina, Usikov Inst Ukraine, STAC Vietnam, VUB Belgium rejected; also proposal to IDRC/Canada February ’04 rejected
  – Working with ICTP proposal
• Hard to get funding for operational needs (~0.4 FTE)
  – For quality data need constant vigilance (hosts disappear/move, security blocks pings, need to update remote host lists …), harder as more/remoter hosts

Overall Situation

• Performance from U.S. & Europe is improving all over, for losses, RTT & throughput
• Performance to developed countries is orders of magnitude better than to developing countries
• Poorer regions 5-10 years behind
• Poorest regions: Africa, Central & S. Asia
• Some regions are:
  – catching up (SE Europe, Russia)
  – keeping up (Latin America, Mid East, China)
  – falling further behind (e.g. India, Africa)

Future Foci

• First view of Africa from within Africa
• Impact of Gloriad for Russian connectivity
• Impact of new RNP initiatives for Brazil
• More on India (preparation for CHEP06)
• Finish off the study of Pakistan
• Impact of new connectivity in E. Asia
• Others (suggestions welcome…)

Further Information

• PingER project home site
  – www-iepm.slac.stanford.edu/pinger/
• PingER methodology (presented at I2 Apr 22 ’04)
  – www.slac.stanford.edu/grp/scs/net/talk03/i2-method-apr04.ppt
• ICFA/SCIC Network Monitoring report
  – www.slac.stanford.edu/xorg/icfa/icfa-net-paper-jan05/20050206-netmon.doc
• ICFA/SCIC home site
  – http://icfa-scic.web.cern.ch/ICFA-SCIC/
• SLAC/NIIT collaboration
  – http://maggie.niit.edu.pk/
• Pakistan outage: www.slac.stanford.edu/grp/scs/net/case/pakjul05/jun-july.htm