Page 1: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Prepared by Les Cottrell, SLAC

for the NIIT, February 22, 2006

Page 2: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Stanford University

• Location

Page 3: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Some facts

• Founded in the 1890's by Governor Leland Stanford & wife Jane, in memory of their son Leland Stanford Jr.
  – Apocryphal story of the foundation
• Movies invented at Stanford
• 1,600 freshman entrants/year (12% acceptance), 7:1 student:faculty ratio, students from 53 countries
• 169K living Stanford alumni

Page 4: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Some alumni

• Sports: Tiger Woods, John McEnroe
• Sally Ride, astronaut
• Vint Cerf, "father of the Internet"
• Industry: Hewlett & Packard, Steve Ballmer (CEO, Microsoft), Scott McNealy (Sun) …
• Ex-presidents: Ehud Barak (Israel), Alejandro Toledo (Peru)
• US politics: Condoleezza Rice, George Shultz, President Hoover

Page 5: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Some startups

• Founded Silicon Valley (turned orchards into companies):
  – Started by providing land and encouragement (investment) for companies founded by Stanford alumni, such as HP & Varian
  – More recently: Sun (Stanford University Network), Cisco, Yahoo, Google

Page 6: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Excellence

• 17 Nobel prizewinners
• Stanford Hospital
• Stanford Linear Accelerator Center (SLAC) – my home:
  – National lab operated by Stanford University, funded by the US Department of Energy
  – Roughly 1,400 staff, plus contractors & outside users => ~3,000, ~2,000 on site at a given time
  – Fundamental research in:
    • Experimental particle physics
    • Theoretical physics
    • Accelerator research
    • Astrophysics
    • Synchrotron light research
  – Has faculty to pursue the above research and awards degrees; 3 Nobel prizewinners

Page 7: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Work with NIIT

• Co-supervision of students, build research capacity, publish etc., for example:
  – Quantify the Digital Divide:
    • Develop a measurement infrastructure to provide information on the extent of the Digital Divide:
      – Within Pakistan, and between Pakistan & other regions
    • Improve understanding, provide planning information and expectations, identify needs
    • Provide and deploy tools in Pakistan
• MAGGIE-NS collaboration – projects:
  – TULIP – Faran
  – Network weather forecasting – Fawad, Fareena
  – Anomaly – Fawad, Adnan, Muhammad Ali
    • Detection, diagnosis and alerting
  – PingER management – Waqar
  – MTBF/MTTR of networks – not assigned
  – Federating network monitoring infrastructures – Asma, Abdullah
    • Smokeping, PingER, AMP, MonALISA, OWAMP …
  – Digital Divide – Aziz, Akbar, Rabail

Page 8: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Quantifying the Digital Divide: A scientific overview of the connectivity of South Asian and African countries

Les Cottrell (SLAC), Aziz Rehmatullah (NIIT), Jerrod Williams (SLAC), Arshad Ali (NIIT)

Presented at the CHEP06 meeting, Mumbai, India, February 2006
www.slac.stanford.edu/grp/scs/net/talk05/icfa-chep06.ppt

Page 9: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Introduction

• The PingER project started in 1995 to measure network performance for the US, European and Japanese HEP community
• Extended this century to measure the Digital Divide for the academic & research community
• Last year added monitoring sites in S. Africa, Pakistan & India
• Will report on network performance to these regions from the US and Europe – trends, comparisons
• Plus early results within and between these regions

Page 10: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Why does it matter?

• Scientists cannot collaborate as equal partners unless they have connectivity to share data, results, ideas etc.

• Distance education needs good communication for access to libraries, journals, educational materials, video, access to other teachers and researchers.

Page 11: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

PingER coverage

• ~120 countries (99% of the world's connected population), 35 monitoring sites in 14 countries
• New monitoring sites in Cape Town, Rawalpindi, Bangalore
• Monitor 25 African countries, containing 83% of the African population

Page 12: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Minimum RTT from the US

• Indicates the best possible, i.e. no queuing
• >600 ms probably indicates a geo-stationary satellite (see the sketch below)
• Only a few places still using satellite, mainly in Africa
• Between developed regions, min-RTT is dominated by distance
  – Little improvement possible

[Maps: minimum RTT from the US, Jan 2000 and Dec 2003]
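A minimal Python sketch (not from the talk) of the rule of thumb above: flagging paths whose minimum RTT exceeds ~600 ms as likely geo-stationary satellite links. The threshold use and the sample values are illustrative assumptions.

    # Classify paths from their minimum (uncongested) RTT, using the slide's
    # rule of thumb that >600 ms usually indicates a geo-stationary satellite hop.
    SATELLITE_THRESHOLD_MS = 600  # heuristic from the slide

    def classify_path(min_rtt_ms: float) -> str:
        """Return a coarse label for a path based on its minimum RTT."""
        return "likely satellite" if min_rtt_ms > SATELLITE_THRESHOLD_MS else "likely terrestrial"

    # Hypothetical minimum RTTs (ms) measured from a US monitoring host
    min_rtts = {"example-host-europe": 160, "example-host-africa": 650}
    for host, rtt in min_rtts.items():
        print(f"{host}: min RTT {rtt} ms -> {classify_path(rtt)}")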

Page 13: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

World throughput seen from the US

• Derived throughput ~ MSS / (RTT * sqrt(loss)) – Mathis formula (see the sketch below)

[Chart: derived throughput trends by region. Years behind Europe: Russia & Latin America 6, Mid-East & SE Asia 7, South Asia 10, Central Asia 11, Africa 12. South Asia, Central Asia and Africa are in danger of falling even farther behind; many sites in the Digital Divide have less connectivity than a residence in the US or Europe.]
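A small Python sketch (not from the talk) of the Mathis et al. estimate above, throughput ≈ MSS / (RTT × √loss); the MSS, RTT and loss values below are illustrative assumptions, not measurements.

    import math

    def mathis_throughput_bps(mss_bytes: float, rtt_s: float, loss_fraction: float) -> float:
        """Mathis et al. approximation: TCP throughput ~ MSS / (RTT * sqrt(loss))."""
        return (mss_bytes * 8) / (rtt_s * math.sqrt(loss_fraction))

    # Illustrative values: 1460-byte MSS, 300 ms RTT, 1% packet loss
    bps = mathis_throughput_bps(1460, 0.300, 0.01)
    print(f"Derived throughput: {bps / 1e6:.2f} Mbit/s")  # ~0.39 Mbit/s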

Page 14: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

S. Asia & Africa from the US

• Data are very noisy, but there are noticeable trends
• India may be holding its own
• Africa & Pakistan are falling behind

Page 15: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Compare to a US residence

• Sites in many countries have bandwidth < a US residence

Page 16: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

India to India

• Monitoring host in Bangalore since Oct '05
  – Too early to tell much; also need more sites, have some good contacts
• 3 remote hosts (need to increase):
  – R&E sites in Mumbai & Hyderabad
  – Government site in AP
• Lots of difference between sites; the Government site sees heavy congestion

[Chart: average − minimum RTT (ms) from Bangalore, 9-Oct to 6-Feb, for the Govt. of AP, Hyderabad (AP) and Mumbai (Maharashtra) hosts]

Page 17: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

PERN: Network Architecture

[Diagram: PERN backbone with core ATM/routers in Karachi, Islamabad and Lahore interconnected by 2x2 Mbps links; universities (12, 22 and 23 per core) connect via access routers, LAN switches, DXX/DRS and optical fibre, with core access capacities of 33–65 Mbps; each core has an international link of 2–4 Mbps]

• HEC will invest $4M in the backbone
• 3 to 9 Points-of-Presence (core nodes)
• $2.4M from HEC to public universities for last-mile costs
• Possible dark fibre initiative

Page 18: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Pakistan to Pakistan

• 3 monitoring sites in Islamabad/Rawalpindi
  – NIIT via NTC, NIIT via Micronet, NTC (the PERN supplier)
  – All monitor 7 universities in Islamabad, Lahore, Karachi and Peshawar
• Careful: many university sites have proxies in the US & Europe
• Minimum RTTs: best NTC 6 ms, NIIT/NTC 10 ms (extra 4 ms for the last mile), NIIT/Micronet 60 ms (slower links, different routes)
• Queuing = Avg(RTT) − Min(RTT) (see the sketch below)
  – NIIT/NTC heavily congested: 200–400 ms of queuing
  – Better when students are on holiday
  – NIIT/Micronet & NTC OK
  – Outages show fragility
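A minimal Python sketch (not part of PingER) of the queuing estimate used above, avg(RTT) − min(RTT) over a set of ping samples; the RTT values are made up for illustration.

    def queuing_delay_ms(rtt_samples_ms):
        """Estimate queuing delay as average RTT minus minimum RTT (as on the slide)."""
        return sum(rtt_samples_ms) / len(rtt_samples_ms) - min(rtt_samples_ms)

    # Hypothetical ping RTTs (ms) on a congested path
    samples = [10, 240, 310, 12, 405, 298]
    print(f"Queuing estimate: {queuing_delay_ms(samples):.0f} ms")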

[Chart: median average − minimum RTT (ms) from the 3 monitoring sites in Pakistan (NIIT N2, NTC, NIIT N4) to hosts in Pakistan, 1-Dec to 30-Jan, with an NIIT outage and the student holiday marked. The PingER project: http://www-iepm.slac.stanford.edu/pinger/]

Page 19: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Pakistan network fragility

[Charts: time series for NIIT/Micronet, NIIT/NTC and NTC. NIIT/NTC is heavily congested, other sites are OK; annotations mark an NIIT outage and remote host outages]

Page 20: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Pakistan international fragility

• The infrastructure appears fragile
• Losses to QEA & NIIT are 3–8% averaged over a month
• Typically once a month losses go to 20%

[Charts: RTT (ms) and loss (%), Feb '05 to Jul '05. A fibre cut off Karachi caused a 12-day outage in Jun–Jul '05, with huge losses of confidence and business. Another fibre outage, this time of 3 hours: a power cable dug up by excavators of the Karachi Water & Sewerage Board]

Page 21: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Many systemic factors: electricity, import duties, skills (M. Jensen)

Page 22: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Bandwidth per networked computer (n=73): minimum 0.32, maximum 36.57, mean 3.36 kbps per networked computer

Users per networked computer by region (n=66):
• Southern Africa: 11
• Central Africa: 171
• East Africa: 50
• West Africa: 63
• North Africa: 15
• Average: 55

Average cost: $11/kbps/month
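As a rough worked example (my own arithmetic, not from the survey), the mean figures above imply a monthly bandwidth bill per networked computer of roughly mean kbps × cost per kbps, and a very small share of bandwidth per user:

    # Illustrative arithmetic from the survey means above (assumptions, not survey output)
    mean_kbps_per_computer = 3.36    # mean bandwidth per networked computer (kbps)
    cost_per_kbps_per_month = 11.0   # average cost, $ per kbps per month
    users_per_computer = 55          # average users per networked computer

    monthly_cost = mean_kbps_per_computer * cost_per_kbps_per_month
    per_user_bps = mean_kbps_per_computer / users_per_computer * 1000  # if all users shared at once

    print(f"~${monthly_cost:.0f}/month per networked computer")  # ~$37
    print(f"~{per_user_bps:.0f} bps per user")                   # ~61 bps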

Page 23: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Routing in Africa

• Seen from ZA (South Africa)
• Only Botswana & Zimbabwe are reached directly
• Most routes go via Europe or the USA
• This wastes costly international bandwidth

Page 24: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Loss within Africa

Page 25: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Satellites vs terrestrial

• Terrestrial links via SAT3 & SEAMEW (Mediterranean)
• Terrestrial not available to all within countries; EASSy will help

[Chart: PingER min-RTT measurements from the S. African TENET monitoring station]

Page 26: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Between regions

• Red ellipses show within-region measurements
• Blue = min(RTT); red = min−avg RTT
• India/Pakistan shown with green ellipses
• ZA sees heavy congestion
  – Botswana, Argentina, Madagascar, Ghana, BF
• India is better off than Pakistan

Page 27: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Overall

• Sorted by median throughput
• Within-region performance is better (blue ellipses)
• Europe, N. America, E. Asia, Russia: generally good
• M. East, Oceania, S.E. Asia, L. America: acceptable
• Africa, C. Asia, S. Asia: poor

Page 28: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Examples

• India got Internet connectivity in 1989, China in 1994
  – India has 34 Mbits/s backbones, with one possible 622 Mbits/s link
  – China is deploying multiple 10 Gbits/s links
• Brazil and India had similar international connectivity in 2001; now Brazil is at multi-Gbits/s
• Pakistan's PERN backbone is 50 Mbits/s, and end sites are ~1 Mbits/s
• Growth in # Internet users (2000-2005): Brazil 420%, China 393%, Pakistan 5000%, India 900%; demand is outstripping growth
  – www.internetworldstats.com/stats.htm

Page 29: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Conclusions

• S. Asia and Africa are ~10 years behind and falling further behind, creating a Digital Divide within the Digital Divide
• India appears better off than Africa or Pakistan
• Last-mile problems, and network fragility
• Decreasing use of satellites, though still needed for many remote countries in Africa and C. Asia
  – The EASSy project will bring fibre to E. Africa
• Growth in # users 2000-2005 (400% Africa, 5000% Pakistan); networks are not keeping up
• Need more sites in developing regions and longer time periods of measurement

Page 30: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

More information

• Thanks to: Harvey Newman & ICFA for encouragement & support; Anil Srivastava (World Bank) & N. Subramanian (Bangalore) for India; NIIT, NTC and PERN for Pakistan monitoring sites; FNAL for PingER management support; Duncan Martin & TENET (ZA).
• Future: work with VSNL & ERnet for India, Julio Ibarra & Eriko Porto for L. America, NIIT & NTC for Pakistan
• Also see:
  – ICFA/SCIC Monitoring report: www.slac.stanford.edu/xorg/icfa/icfa-net-paper-jan06/
  – Paper on Africa & S. Asia: www.slac.stanford.edu/grp/scs/net/papers/chep06/paper-final.pdf
  – PingER project: www-iepm.slac.stanford.edu/pinger/

Page 31: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

SC|05 Bandwidth Challenge

ESCC Meeting

9th February ‘06

Yee-Ting Li

Stanford Linear Accelerator Center

Page 32: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

LHC Network Requirements

[Diagram: LHC tiered computing model. Experiment → online system at ~PByte/sec; online system → Tier 0+1 CERN centre (PBs of disk, tape robot) at ~150–1500 MBytes/sec; Tier 1 centres (FNAL, IN2P3, INFN, RAL) at 10–40 Gbps; Tier 2 centres at ~10 Gbps; Tier 3 institutes at 1 to 10 Gbps; Tier 4 workstations. CERN/outside resource ratio ~1:2; Tier0 : Tier1 : Tier2 ~1:1:1. Tens of petabytes by 2007-8; an exabyte ~5–7 years later.]

Page 33: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Overview

• Bandwidth Challenge
  – 'The Bandwidth Challenge highlights the best and brightest in new techniques for creating and utilizing vast rivers of data that can be carried across advanced networks.'
  – Transfer as much data as possible using real applications over a 2-hour window
• We did…
  – Distributed TeraByte Particle Physics Data Sample Analysis
  – 'Demonstrated high speed transfers of particle physics data between host labs and collaborating institutes in the USA and worldwide. Using state of the art WAN infrastructure and Grid Web Services based on the LHC Tiered Architecture, they showed real-time particle event analysis requiring transfers of Terabyte-scale datasets.'

Page 34: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Overview

• In detail, during the bandwidth challenge (2 hours):
  – 131 Gbps measured by the SCInet BWC team on 17 of our waves (15-minute average)
  – 95.37 TB of data transferred (see the quick rate check below)
    • (3.8 DVDs per second)
  – 90–150 Gbps (peak 150.7 Gbps)
• On the day of the challenge:
  – Transferred ~475 TB 'practising' (waves were shared, still tuning applications and hardware)
  – Peak one-way USN utilisation observed on a single link was 9.1 Gbps (Caltech) and 8.4 Gbps (SLAC)
• Also wrote to StorCloud
  – SLAC: wrote 3.2 TB in 1649 files during the BWC
  – Caltech: 6 GB/sec with 20 nodes
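As a quick sanity check (my own arithmetic, not from the talk), 95.37 TB moved in the 2-hour window corresponds to an average of roughly 106 Gbps, consistent with the 90–150 Gbps range above (assuming decimal terabytes):

    # Back-of-the-envelope average rate over the 2-hour Bandwidth Challenge window
    data_tb = 95.37                 # terabytes transferred (decimal TB assumed)
    window_s = 2 * 60 * 60          # 2-hour challenge window in seconds

    avg_gbps = data_tb * 1e12 * 8 / window_s / 1e9
    print(f"Average rate: {avg_gbps:.0f} Gbps")  # ~106 Gbps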

Page 35: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Networking overview

• We had 22 x 10 Gbits/s waves to the Caltech and SLAC/FNAL booths. Of these:
  – 15 waves to the Caltech booth (from Florida (1), Korea/GLORIAD (1), Brazil (1 x 2.5 Gbits/s), Caltech (2), LA (2), UCSD, CERN (2), U Michigan (3), FNAL (2))
  – 7 x 10 Gbits/s waves to the SLAC/FNAL booth (2 from SLAC, 1 from the UK, and 4 from FNAL)
• The waves were provided by Abilene, Canarie, Cisco (5), ESnet (3), GLORIAD (1), HOPI (1), Michigan Light Rail (MiLR), National Lambda Rail (NLR), TeraGrid (3) and UltraScienceNet (4).

Page 36: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Network Overview

Page 37: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Hardware (SLAC only)

• At SLAC:
  – 14 x 1.8 GHz Sun v20z (dual Opteron)
  – 2 x Sun 3500 disk trays (2 TB of storage)
  – 12 x Chelsio T110 10Gb NICs (LR)
  – 2 x Neterion/S2io Xframe I (SR)
  – Dedicated Cisco 6509 with 4 x 4x10GB blades
• At SC|05:
  – 14 x 2.6 GHz Sun v20z (dual Opteron)
  – 10 QLogic HBAs for StorCloud access
  – 50 TB storage at SC|05 provided by 3PAR (shared with Caltech)
  – 12 x Neterion/S2io Xframe I NICs (SR)
  – 2 x Chelsio T110 NICs (LR)
  – Shared Cisco 6509 with 6 x 4x10GB blades

Page 38: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Hardware at SC|05

Page 39: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Software

• BBCP 'Babar File Copy'
  – Uses 'ssh' for authentication
  – Multiple-stream capable
  – Features 'rate synchronisation' to reduce byte retransmissions
  – Sustained over 9 Gbps on a single session
• XrootD
  – Library for transparent file access (standard unix file functions)
  – Designed primarily for LAN access (transaction-based protocol)
  – Managed over 35 Gbit/sec (in two directions) on 2 x 10 Gbps waves
  – Transferred 18 TBytes in 257,913 files
• DCache
  – 20 Gbps of production and test cluster traffic

Page 40: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Last year (SC|04)

[Chart: BWC aggregate bandwidth]

Page 41: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

[Chart: cumulative data transferred, with the Bandwidth Challenge period marked]

Page 42: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Component Traffic

Page 43: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

SLAC-FermiLab-UK bandwidth contributions

[Chart: traffic in to and out from the booth for the SLAC-ESnet, FermiLab-HOPI, SLAC-ESnet-USN, FNAL-UltraLight and UKLight waves]

Page 44: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

SLAC cluster contributions

[Chart: traffic in to and out from the booth over ESnet routed and ESnet SDN layer 2 via USN, with the Bandwidth Challenge period marked]

Page 45: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

SLAC/FNAL booth aggregate

[Chart: aggregate bandwidth (Mbps) per wave]

Page 46: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Problems…

• Managerial/PR
  – The initial request for loan hardware took place 6 months in advance!
  – Lots and lots of paperwork to keep account of all the loan equipment
• Logistical
  – Set up and tore down a pseudo-production network and servers in the space of a week!
  – Testing could not begin until the waves were alight
    • Most waves were lit the day before the challenge!
  – Shipping so much hardware is not cheap!
  – Setting up monitoring

Page 47: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Problems…

• Tried to configure hardware and software prior to the show
• Hardware
  – NICs
    • We had 3 bad Chelsios (bad memory)
    • Xframe IIs did not work in UKLight's Boston machines
  – Hard disks
    • 3 dead 10K disks (had to ship in spares)
  – 1 x 4-port 10Gb blade DOA
  – MTU mismatch between domains
  – A router blade died during stress testing the day before the BWC!
  – Cables! Cables! Cables!
• Software
  – Used golden disks for duplication (still takes 30 minutes per disk to replicate!)
  – Linux kernels:
    • Initially used 2.6.14, found severe performance problems compared to 2.6.12
  – (New) router firmware caused crashes under heavy load
    • Unfortunately, this was only discovered just before the BWC
    • Had to manually restart the affected ports during the BWC

Page 48: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Problems

• Most transfers were from memory to memory (ramdisk etc.)
  – Local caching of (small) files in memory
  – Reading and writing to disk will be the next bottleneck to overcome

Page 49: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Conclusion

• Previewed the IT challenges of the next generation of data-intensive science applications (high energy physics, astronomy etc.)
  – Petabyte-scale datasets
  – Tens of national and transoceanic links at 10 Gbps (and up)
  – 100+ Gbps aggregate data transport sustained for hours; we reached a petabyte/day transport rate for real physics data
• Learned to gauge the difficulty of the global networks and transport systems required for the LHC mission
  – Set up, shook down and successfully ran the systems in < 1 week
  – Understood and optimized the configurations of various components (network interfaces, routers/switches, OS, TCP kernels, applications) for high performance over the wide area network

Page 50: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Conclusion

• Products from this exercise:
  – An optimized Linux (2.6.12 + NFSv4 + FAST and other TCP stacks) kernel for data transport, after 7 full kernel-build cycles in 4 days
  – A newly optimized application-level copy program, bbcp, that matches the performance of iperf under some conditions
  – Extensions of Xrootd, an optimized low-latency file access application for clusters, across the wide area
  – Understanding of the limits of 10 Gbps-capable systems under stress
  – How to effectively utilize 10GE and 1GE connected systems to drive 10 gigabit wavelengths in both directions
  – Use of production and test clusters at FNAL reaching more than 20 Gbps of network throughput
• Significant efforts remain from the perspective of high-energy physics
  – Management, integration and optimization of network resources
  – End-to-end capabilities able to utilize these network resources; this includes applications and IO devices (disk and storage systems)

Page 51: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge
Page 52: Stanford University, SLAC, NIIT, the Digital Divide & Bandwidth Challenge

Press and PR

• 11/8/05 – Brit Boffins aim to Beat LAN speed record, from vnunet.com
• SC|05 Bandwidth Challenge, SLAC Interaction Point
• Top Researchers, Projects in High Performance Computing Honored at SC|05 …, Business Wire (press release), San Francisco, CA, USA
• 11/18/05 – Official Winner Announcement
• 11/18/05 – SC|05 Bandwidth Challenge Slide Presentation
• 11/23/05 – Bandwidth Challenge Results, from Slashdot
• 12/6/05 – Caltech press release
• 12/6/05 – Neterion Enables High Energy Physics Team to Beat World Record Speed at SC05 Conference, CCN Matthews News Distribution Experts
• High energy physics team captures network prize at SC|05, from SLAC
• High energy physics team captures network prize at SC|05, EurekAlert!
• 12/7/05 – High Energy Physics Team Smashes Network Record, from Science Grid this Week
• Congratulations to our Research Partners for a New Bandwidth Record at SuperComputing 2005, from Neterion