PIPE Dreams: Trouble Shooting Network Performance for Production Science Data Grids. Presented by Warren Matthews at CHEP’03, San Diego, March 24-28, 2003

PIPE Dreams

Trouble Shooting Network Performance for Production Science Data Grids

Presented by Warren Matthews at CHEP’03, San Diego March 24-28, 2003

Abstract

The vision of science grids allocating resources to analyze huge quantities of HENP data clearly depends on reliable network performance. Tools developed at SLAC in conjunction with the Internet2 PIPES project will help to ensure this. In this talk, these tools will be discussed, and the procedure for publishing performance data, in particular using the Globus Toolkit's MDS and web services, will be reviewed. The subsequent analysis and troubleshooting methodology will be discussed with real-world examples from the Particle Physics Data Grid (PPDG) and the European DataGrid (EDG).

Overview

• What is the problem?

• What is PIPES?

• Network performance monitoring

• Problem identification

Network Monitoring for the Grid

• The Data Grid consists of many components that must interoperate

[Diagram: requestors, a Resource Broker, three Farms and three Data stores, all connected through The Network]

Allocate Resources

• The resource broker must be fully informed

• Measurement is required!

[Diagram: requestors, a Resource Broker, three Farms and three Data stores connected through The Network, with example link conditions marked: 12% packet loss; OC48 at 80% utilization]
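
A "fully informed" resource broker can be pictured as a function that ranks candidate data sources by measured network conditions. The sketch below is illustrative only: the `PathMeasurement` type, the site names, the capacities, and the scoring rule are all assumptions and are not part of the tools described in this talk. The scoring rule is deliberately crude (spare capacity discounted linearly by loss); real TCP throughput degrades far more sharply with packet loss.

```python
from dataclasses import dataclass


@dataclass
class PathMeasurement:
    """Measured state of the path to one candidate data source (hypothetical type)."""
    site: str
    capacity_mbps: float
    utilization: float  # fraction of capacity in use, 0.0 to 1.0
    packet_loss: float  # fraction of packets lost, 0.0 to 1.0


def estimated_headroom(m: PathMeasurement) -> float:
    """Crude score: unused capacity, discounted linearly by packet loss."""
    return m.capacity_mbps * (1.0 - m.utilization) * (1.0 - m.packet_loss)


def choose_source(measurements):
    """Return the measurement with the highest estimated headroom."""
    if not measurements:
        raise ValueError("no measurements available: the broker is not informed")
    return max(measurements, key=estimated_headroom)


# Made-up example values; only "12% loss" and "OC48 at 80%" echo the slide.
paths = [
    PathMeasurement("site-a", 2488.0, 0.80, 0.0),    # OC48, 80% utilized
    PathMeasurement("site-b", 622.0, 0.10, 0.12),    # 12% packet loss
    PathMeasurement("site-c", 1000.0, 0.30, 0.001),
]
best = choose_source(paths)
print(best.site, round(estimated_headroom(best), 1))
```

Without the measurements, the broker has no basis for preferring one source over another, which is the point of the slide.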

What is PIPES?

• Internet2

• End-to-end performance initiative

• PI Performance Evaluation System (PIPES)

• PIPES Monitoring Platform (PMP)

• Overlap with goals of HENP

• Tremendous resources

IEPM-BW

• Package developed at SLAC
  – Measurement Engine
    • Iperf, bbftp, bbcp, ping, traceroute
    • Abwe, owamp, udpmon, gridftp
  – Job Manager
  – Data Storage and data server
  – Analysis Engine

[Figure: map of monitoring sites and the sites they monitor. Networks labelled include ESnet, Abilene, CalREN, JAnet, Geant, APAN and CESnet. Sites labelled include SLAC, Stanford, CALTECH, SDSC, NERSC, LANL, ORNL, ANL, FNAL, BNL, JLAB, NCSA, UFL, TRIUMF, KEK, RIKEN, CERN, IN2P3, NIKHEF, RAL, UCL, Imperial, UManc, INFN-Roma, INFN-Milan and INFN-Padua. Legend: EDG, PPDG/GriPhyN, Monitoring Site.]

[Figure: BaBar Grid network map. Sites: SLAC, Stanford, Manchester, Bristol, Dresden, IN2P3, RAL. Networks: Calren, Abilene, ESnet, Geant, Renater, DFN, Janet, NNW, TVN, SWERN. Link capacities in the legend: 622 Mbps, 1 Gbps, 2.5 Gbps, 10 Gbps.]

[Figure: Throughput from SLAC to RAL between May 2002 and February 2003. Vertical axis: 0 to 250000 (units not shown). Horizontal axis: dates from 5/13/2002 to 2/17/2003. Series: iperf, bbcpmem, bbcpdisk, bbftp.]

Problem Identification

• Typical Scenario
  – User complains file transfer is slow
  – Net admin runs ping, traceroute, iperf test
  – Complain to upstream provider

• Proactive
  – What do we mean by throughput?
  – How do we know there was a performance hit?
  – Our approach is diurnal changes
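
The "diurnal changes" idea can be sketched as comparing each new measurement against history from the same hour of the day, so that a routine afternoon dip is not mistaken for a fault. This is a minimal sketch under assumptions: the function names, the use of a median baseline, the 50% drop threshold, and the sample numbers are illustrative and are not taken from IEPM-BW.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median


def hourly_baseline(history):
    """Build {hour_of_day: median throughput} from (timestamp, value) pairs."""
    by_hour = defaultdict(list)
    for timestamp, value in history:
        by_hour[timestamp.hour].append(value)
    return {hour: median(values) for hour, values in by_hour.items()}


def is_performance_hit(timestamp, value, baseline, drop_fraction=0.5):
    """True if value is more than drop_fraction below the baseline for that hour."""
    expected = baseline.get(timestamp.hour)
    if expected is None:
        return False  # no history for this hour, so no judgement is made
    return value < expected * (1.0 - drop_fraction)


# Made-up history: throughput is normally high at 03:00 and lower at 15:00.
history = [
    (datetime(2003, 2, 24, 3, 0), 400.0),
    (datetime(2003, 2, 25, 3, 0), 420.0),
    (datetime(2003, 2, 26, 3, 0), 410.0),
    (datetime(2003, 2, 24, 15, 0), 200.0),
    (datetime(2003, 2, 25, 15, 0), 210.0),
    (datetime(2003, 2, 26, 15, 0), 190.0),
]
baseline = hourly_baseline(history)

# The same reading of 190 is a hit at night but normal in the afternoon.
night_hit = is_performance_hit(datetime(2003, 2, 27, 3, 0), 190.0, baseline)
afternoon_hit = is_performance_hit(datetime(2003, 2, 27, 15, 0), 190.0, baseline)
print(night_hit, afternoon_hit)
```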

Alarms

• Too much to keep track of

• Rather not wait for complaints

• Automated Alarms

• Rolling average à la RIPE-TT
  – May not be the best approach

• AMP Automated Detection System
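
A rolling-average alarm of the kind mentioned above can be sketched in a few lines. The window size, the "fraction of the rolling mean" rule, and the readings are assumptions chosen for illustration; this is not the RIPE-TT or AMP algorithm.

```python
from collections import deque


class RollingAverageAlarm:
    """Flag a reading that falls below a fraction of the mean of recent readings."""

    def __init__(self, window=5, fraction=0.6):
        self.samples = deque(maxlen=window)
        self.fraction = fraction

    def update(self, value):
        """Record a reading; return True if it triggers an alarm."""
        alarm = False
        # Only judge once the window is full, to avoid alarms on startup.
        if len(self.samples) == self.samples.maxlen:
            mean = sum(self.samples) / len(self.samples)
            alarm = value < self.fraction * mean
        self.samples.append(value)
        return alarm


detector = RollingAverageAlarm(window=3, fraction=0.6)
readings = [500.0, 510.0, 490.0, 505.0, 250.0, 260.0]  # made-up throughput values
alarms = [detector.update(r) for r in readings]
print(alarms)
```

In this run only the first degraded reading (250) raises an alarm. The second (260) does not, because the earlier low value has already pulled the rolling mean down. That is one reason a plain rolling average may not be the best approach.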

Limitations

• Could be over an hour before alarm is generated

• More frequent measurements impact the network and measurements overlap

• Low impact tools allow finer grained measurement
  – Use NWS multi-variate method
  – Use SCIDAC ABwE tool
  – Use PingER, OWAMP

[Figure: Available Bandwidth Estimate between SLAC and Caltech in February 2003. Vertical axis: Bandwidth in Mbps, 0 to 200. Horizontal axis: 2/25/2003 0:00 to 2/27/2003 0:00.]

Publishing

• There are many monitoring projects; publishing data allows them to interoperate

• MDS
  – EDG NM Schema

• Web Services
  – GLUE NE Schema

• GGF NMWG
  – Hierarchy Doc
  – Tools Doc

./get_data
2003 3 18 6 1 41 1.61 1.601 1.62 0

Net Rat

• Alarm System
  – Multiple tools
  – Multiple measurement points
  – Trigger further measurements
  – Cross reference off site stats

• Informant database

• No measurement is ‘authoritative’
  – Cannot even believe a measurement
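
Because no single measurement is authoritative, one plausible reading of this design is that a drop reported by one tool should be corroborated by others before it counts as an alarm. The sketch below assumes a simple "at least N tools agree" rule. The tool names come from the slides, but the rule, the function name, the thresholds, and the readings are hypothetical.

```python
def corroborated_alarm(readings, thresholds, min_agree=2):
    """Return (alarm, tools_below): alarm is True when at least min_agree
    tools report a value below their own threshold."""
    below = sorted(
        tool
        for tool, value in readings.items()
        if tool in thresholds and value < thresholds[tool]
    )
    return len(below) >= min_agree, below


# Made-up per-tool alarm thresholds and readings, in Mbps.
thresholds = {"iperf": 300.0, "bbcp": 250.0, "bbftp": 200.0}

# Only iperf looks bad: treated as unconfirmed, worth a follow-up measurement.
single = corroborated_alarm({"iperf": 120.0, "bbcp": 400.0, "bbftp": 350.0}, thresholds)

# Two tools agree: treated as a real problem.
confirmed = corroborated_alarm({"iperf": 120.0, "bbcp": 100.0, "bbftp": 350.0}, thresholds)
print(single, confirmed)
```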

Log

03/20/2003 20:13:46 ALARM pcgiga throughput=305.224 ctresh=512.95 athresh=312.91
03/20/2003 20:13:48 TRACE no change in route detected
03/20/2003 20:16:07 CALM Throughput within acceptable limits. ALARM CANCELLED
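
Log entries in this form are straightforward to post-process. The sketch below parses the three entries shown above into structured records. The regular expressions and field names are assumptions inferred from these three lines only, not a documented log format.

```python
import re
from datetime import datetime

# Assumed layout: "MM/DD/YYYY HH:MM:SS LEVEL free-text message"
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) (\w+) (.*)$")
# key=value pairs with numeric values, such as throughput=305.224
KV_RE = re.compile(r"(\w+)=([\d.]+)")


def parse_line(line):
    """Parse one log line into a dict, or return None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    stamp, level, message = match.groups()
    return {
        "time": datetime.strptime(stamp, "%m/%d/%Y %H:%M:%S"),
        "level": level,
        "message": message,
        "values": {key: float(val) for key, val in KV_RE.findall(message)},
    }


log = """\
03/20/2003 20:13:46 ALARM pcgiga throughput=305.224 ctresh=512.95 athresh=312.91
03/20/2003 20:13:48 TRACE no change in route detected
03/20/2003 20:16:07 CALM Throughput within acceptable limits. ALARM CANCELLED
"""
records = [parse_line(line) for line in log.splitlines()]
alarm, trace, calm = records

# Time from the alarm being raised to it being cancelled, in seconds.
alarm_duration = (calm["time"] - alarm["time"]).total_seconds()
print(alarm["values"], alarm_duration)
```

In this example the measured throughput (305.224) is below the `athresh` value (312.91), which is consistent with `athresh` being the alarm threshold, although the log itself does not say so.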

Toward a Monitoring Infrastructure

• MAGGIE
  – Measurement and Analysis package built on NIMI/Akenti

• EDEE
  – production-quality Data Grid for Europe

More Information

• IEPM Home Page

• IEPM-BW

• I2 E2E and PIPES

• RIPE-TT

• AMP Automated Event Detection

• NWS

• ABWE

End

This talk made possible by the IEPM team at SLAC (Les Cottrell, Connie Logg, Jiri Navratil, Jerrod Williams, Fabrizio Coccetti), and the many developers and maintainers around the world.