Transcript
Page 1: End-2-End Network Monitoring – What do we do? What do we use it for?

GNEW2004, CERN, March 2004
Richard Hughes-Jones, Manchester

Many people are involved:

Page 2

Local Network Monitoring: PingER (RIPE TTB), iperf, UDPmon, rTPL, NWS, etc.

Store & Analysis of Data (Access): local LDAP server; backend LDAP script to fetch metrics; monitor process to push metrics.

Access to current and historic data and metrics via the Web, i.e. the WP7 NM pages; access to metric forecasts.

Grid application access via the LDAP schema to monitoring metrics and to the location of monitoring data (Grid Apps, GridFTP).

DataGrid WP7: Network Monitoring Architecture for Grid Sites

Robin Tasker

Page 3

WP7 Network Monitoring Components

[Diagram: probe tools (Ping, Netmon, UDPmon, iPerf, RIPE), each driven by a control cron script; raw results feed plots, tables and LDAP; a scheduler and Web interface serve the clients; outputs go to the Web display, analysis, Grid brokers and predictions.]

Page 4

Grid network monitoring architecture uses LDAP & R-GMA – DataGrid WP7
Central MySQL archive hosting all network metrics and GridFTP logging
Probe Coordination Protocol deployed, scheduling tests
MapCentre also provides site & node fabric health checks
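To make the LDAP access concrete, here is a minimal sketch (not WP7 code) of how a Grid application might query a site's local LDAP server for published metrics. The host name, base DN, object class and attribute names are illustrative assumptions; the real WP7 LDAP schema defines its own.

```python
# Hypothetical sketch: fetch a published network metric from a site's local
# LDAP server, in the spirit of the WP7 LDAP-schema access described above.
# Server name, base DN, object class and attributes are illustrative only.
from ldap3 import Server, Connection, ALL

server = Server("ldap.example-site.org", port=389, get_info=ALL)
with Connection(server, auto_bind=True) as conn:
    # Ask for the latest RTT and throughput metrics for a given path.
    conn.search(
        search_base="mds-vo-name=local,o=grid",              # assumed base DN
        search_filter="(objectClass=NetworkMonitorMetric)",  # assumed class
        attributes=["pathSource", "pathDestination",
                    "rttMean", "tcpThroughput", "measurementTime"],
    )
    for entry in conn.entries:
        print(entry.pathSource, entry.pathDestination,
              entry.rttMean, entry.tcpThroughput)
```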

WP7 MapCentre: Grid Monitoring & Visualisation

Franck Bonnassieux

CNRS Lyon

Page 5

WP7 MapCentre: Grid Monitoring & Visualisation

[Plots: CERN – RAL and CERN – IN2P3 throughput time series, UDP (top) and TCP (bottom).]

Page 6

UK e-Science: Network Monitoring

Technology transfer: DataGrid WP7 (Manchester) → UK e-Science (DL)

DataGrid WP7 (Manchester) architecture

Page 7

UK e-Science: Network Problem Solving

24 Jan to 4 Feb 04: TCP iperf, RAL to HEP sites
Only 2 sites >80 Mbit/s
RAL → DL 250-300 Mbit/s

24 Jan to 4 Feb 04: TCP iperf, DL to HEP sites
DL → RAL ~80 Mbit/s

Page 8

Tools: UDPmon – Latency & Throughput

UDP/IP packets sent between end systems.

Latency
Round-trip times using request-response UDP frames; latency as a function of frame size.
• Slope s given by the sum of the time-per-byte contributions along the data path – mem-mem copy(s) + PCI + Gig Ethernet + PCI + mem-mem copy(s) – i.e. s = Σ (data paths) dt/db
• Intercept indicates processing times + HW latencies
• Histograms of ‘singleton’ measurements

UDP Throughput
Send a controlled stream of UDP frames spaced at regular intervals (n bytes per frame, a chosen number of packets, a set wait time between frames). Vary the frame size and the frame transmit spacing & measure:
• The time of first and last frames received
• The number of packets received, lost, & out of order
• Histogram of the inter-packet spacing of received packets
• Packet loss pattern
• 1-way delay
• CPU load
• Number of interrupts
A minimal sketch of such a throughput probe follows.
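As a rough illustration of the UDPmon technique just described (not the real UDPmon, which is written in C), the following Python sketch sends a sequence-numbered UDP stream at a fixed spacing and derives the received rate, loss, re-ordering and the inter-packet-spacing histogram. The endpoint, port and frame layout are assumptions.

```python
# Minimal UDPmon-style probe sketch: controlled UDP stream + receive statistics.
import socket, struct, time
from collections import Counter

DEST = ("receiver.example.org", 14196)     # assumed endpoint
N_FRAMES, FRAME_SIZE, WAIT_US = 10000, 1472, 10

def busy_wait(dt_s):
    # Spin rather than sleep: microsecond spacing is below timer resolution.
    t0 = time.perf_counter()
    while time.perf_counter() - t0 < dt_s:
        pass

def send_stream():
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    payload = b"\x00" * (FRAME_SIZE - 4)
    for seq in range(N_FRAMES):
        s.sendto(struct.pack("!I", seq) + payload, DEST)   # 4-byte sequence number
        busy_wait(WAIT_US * 1e-6)

def receive_stream():
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind(("", DEST[1]))
    s.settimeout(2.0)
    arrivals, seqs = [], []
    try:
        while len(seqs) < N_FRAMES:
            data, _ = s.recvfrom(65536)
            arrivals.append(time.perf_counter())
            seqs.append(struct.unpack("!I", data[:4])[0])
    except socket.timeout:
        pass                                               # stream ended / frames lost
    elapsed = arrivals[-1] - arrivals[0]                   # first-to-last frame time
    received = len(seqs)
    lost = N_FRAMES - received
    out_of_order = sum(1 for a, b in zip(seqs, seqs[1:]) if b < a)
    rate_mbit = received * FRAME_SIZE * 8 / elapsed / 1e6
    # Histogram of inter-packet spacing in 1 µs bins.
    spacing_hist = Counter(round((b - a) * 1e6) for a, b in zip(arrivals, arrivals[1:]))
    print(f"recv {received}  lost {lost}  out-of-order {out_of_order}  "
          f"user-data rate {rate_mbit:.1f} Mbit/s")
    return spacing_hist
```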

Page 9

UDPmon: Example – 1 Gigabit NIC, Intel PRO/1000

Test system: motherboard Supermicro P4DP6; chipset E7500 (Plumas); CPU dual Xeon 2.2 GHz with 512 kB L2 cache; memory bus 400 MHz; PCI-X 64 bit 66 MHz; HP Linux kernel 2.4.19 SMP; MTU 1500 bytes; NIC Intel PRO/1000 XT in a 64 bit 66 MHz slot.

Latency
[Plot: latency (µs) vs message length (0-3000 bytes); linear fits y = 0.0093x + 194.67 and y = 0.0149x + 201.75.]
[Histograms: N(t) vs latency (µs) for 64, 512, 1024 and 1400 byte packets.]

Throughput
[Plot: gig6-7, Intel, PCI 66 MHz, 27 Nov 02 – received wire rate (Mbit/s, 0-1000) vs transmit time per frame (µs, 0-40) for frame sizes from 50 to 1472 bytes.]

Bus Activity
[Logic-analyser traces: send transfer and receive transfer on the PCI buses.]

Page 10

Tools: Trace-Rate – Hop-by-hop Measurements

A method to measure the hop-by-hop capacity, delay and loss up to the path bottleneck:
Not intrusive
Operates in a high-performance environment
Does not need the cooperation of the destination

Based on the packet-pair method:
Send sets of back-to-back packets with increasing time-to-live
For each set, filter the “noise” from the RTTs
Calculate the spacing – hence the bottleneck bandwidth
Robust to the presence of invisible nodes

[Diagram: effect of the bottleneck on a packet pair; L is the packet size, C is the capacity.]
[Plots: examples of parameters that are iteratively analysed to extract the capacity mode.]
A toy illustration of the packet-pair capacity estimate follows.
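The core packet-pair relation is that after the bottleneck, two back-to-back packets of size L emerge spaced by Δt = L/C, so C = L/Δt. The toy function below (not the INRIA Trace-Rate implementation) applies that relation after crudely filtering dispersion noise by taking the modal spacing.

```python
# Toy illustration of the packet-pair principle: C = L / dt after noise filtering.
from collections import Counter

def capacity_from_pairs(spacings_s, packet_size_bytes, bin_us=1):
    """Estimate bottleneck capacity (bit/s) from packet-pair dispersions (seconds)."""
    # Bin the spacings and keep the most frequent (modal) value: pairs delayed by
    # cross traffic give larger, scattered spacings, while undisturbed pairs
    # cluster at the bottleneck service time.
    bins = Counter(round(s * 1e6 / bin_us) for s in spacings_s)
    modal_spacing_s = bins.most_common(1)[0][0] * bin_us * 1e-6
    return packet_size_bytes * 8 / modal_spacing_s

# Example: 1500-byte pairs whose undisturbed dispersion is 12 µs -> ~1 Gbit/s.
measured = [12e-6, 12e-6, 13e-6, 12e-6, 40e-6, 12e-6, 25e-6]
print(f"{capacity_from_pairs(measured, 1500) / 1e9:.2f} Gbit/s")
```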

Page 11

Tools: Trace-Rate – Some Results

Capacity measurements as a function of load (Mbit/s) from tests on the DataTAG link.
Comparison of the number of packets required.
Validated by simulations in NS-2; Linux implementation, working in a high-performance environment.
Research report: http://www.inria.fr/rrrt/rr-4959.html
Research paper: ICC 2004, International Conference on Communications, Paris, France, June 2004, IEEE Communications Society.

Page 12

Network Monitoring as a Tool to study:

Protocol behaviour
Network performance
Application performance

Tools include:
web100, tcpdump
Output from the test tool: UDPmon, iperf, …
Output from the application: GridFTP, bbcp, Apache

Page 13

Protocol Performance: RUDP
Monitoring from the data-moving application & a network test program – DataTAG WP3 work.

Test setup:
Path: Amsterdam – Chicago – Amsterdam, Force10 loopback
Moving data from the DAS-2 cluster with RUDP – a UDP-based transport
Apply 11 × 11 TCP background streams from iperf

Conclusions:
RUDP performs well
It does back off and share the bandwidth
It rapidly expands when bandwidth is free

Hans Blom

Page 14

Performance of the GÉANT Core Network

Test setup:
Supermicro PCs in the London & Amsterdam GÉANT PoPs
Smartbits in the London & Frankfurt GÉANT PoPs
Long link: UK-SE-DE2-IT-CH-FR-BE-NL
Short link: UK-FR-BE-NL

Network Quality of Service – LBE, IP Premium
High-throughput transfers – standard and advanced TCP stacks
Packet re-ordering effects
(A sketch of how the jitter histograms below can be built appears after the plots.)

Jitter for IPP and BE flows under load
[Histograms: frequency vs packet jitter (µs, 0-150) for three flows – flow BE with background 60% BE 1.4 Gbit + 40% LBE 780 Mbit; flow IPP with background 60% BE 1.4 Gbit + 40% LBE 780 Mbit; flow IPP with no background.]
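As a hedged sketch of how jitter histograms like those above can be produced, the function below bins the deviation of each inter-arrival gap from the nominal transmit spacing; the exact jitter definition used in the GÉANT tests is not stated here and may differ.

```python
# Sketch: packet-jitter histogram from receiver timestamps.
import numpy as np

def jitter_histogram(arrival_times_s, nominal_spacing_us, bin_us=1, max_us=150):
    """Histogram of |inter-arrival gap - nominal spacing| in microseconds."""
    gaps_us = np.diff(np.asarray(arrival_times_s)) * 1e6
    jitter_us = np.abs(gaps_us - nominal_spacing_us)
    bins = np.arange(0, max_us + bin_us, bin_us)
    counts, edges = np.histogram(jitter_us, bins=bins)
    return counts, edges

# Example with synthetic arrivals: frames nominally every 100 µs with ~5 µs noise.
rng = np.random.default_rng(1)
arrivals = np.cumsum(np.r_[0.0, 100 + rng.normal(0, 5, 9999)]) * 1e-6
counts, edges = jitter_histogram(arrivals, nominal_spacing_us=100)
print("modal jitter bin:", edges[counts.argmax()], "µs")
```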

Page 15

Tests GÉANT Core: Packet re-ordering

Effect of LBE background: Amsterdam–London BE test flow; packets at 10 µs spacing – line speed; 10,000 sent; packet loss ~0.1%

Re-order distributions (a small sketch of deriving them from sequence numbers follows the plots):

[Plot: UDP 1472 bytes, NL-UK (lbexxx, 7 Nov 03) – % out of order vs total offered rate (2-3.2 Gbit/s), for hstcp, standard TCP, line speed and 90% line speed.]
[Histograms: packet re-order, 1472-byte and 1400-byte packets, UK-NL 21 Oct 03, 10,000 sent, 10 µs wait – number of packets vs out-of-order length (1-9), for LBE backgrounds of 0% to 80%.]
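A small sketch of deriving re-ordering statistics from received sequence numbers, under the assumption that a packet counts as out of order when packets sent after it arrive first, its re-order length being the number of packets that overtook it; the precise metric used on the slide may differ.

```python
# Sketch: out-of-order percentage and re-order length distribution from sequence numbers.
from collections import Counter

def reorder_stats(seqs):
    """Percentage of out-of-order packets and the distribution of re-order lengths."""
    out_of_order = 0
    lengths = Counter()
    seen = []                                        # sequence numbers already received
    for s in seqs:
        overtakers = sum(1 for p in seen if p > s)   # packets sent after s, received first
        if overtakers:
            out_of_order += 1
            lengths[overtakers] += 1                 # the "length" of this re-ordering event
        seen.append(s)                               # O(n^2) scan; real tools do this incrementally
    return 100.0 * out_of_order / len(seqs), lengths

# Example: packet 3 is overtaken by packet 4 -> 1 out-of-order packet of length 1.
pct, dist = reorder_stats([0, 1, 2, 4, 3, 5, 6])
print(f"{pct:.1f}% out of order, length distribution {dict(dist)}")
```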

Page 16

Application Throughput + Web100

2 Gbyte file transferred from RAID0 disks; Web100 output every 10 ms.
GridFTP: see alternating 600/800 Mbit/s and zero.

MB-NG

Apache web server + curl-based client: see a steady 720 Mbit/s.

Page 17

VLBI Project: Throughput, Jitter, 1-way Delay, Loss

1472-byte packets, Manchester → Dwingeloo (JIVE).

Jitter: FWHM 22 µs (back-to-back 3 µs).
[Histogram: N(t) vs jitter (µs, 0-140); 1472 bytes, w=50, Gnt5-DwMk5, 28 Oct 03.]

1-way delay – note the packet loss (points with zero 1-way delay).
[Plot: 1-way delay (µs) vs packet number (0-5000); 1472 bytes, w=12, Gnt5-DwMk5, 21 Oct 03.]

Throughput:
[Plot: received wire rate (Mbit/s, 0-1200) vs spacing between frames (µs, 0-40); Gnt5-DwMk5 11 Nov 03 and DwMk5-Gnt5 13 Nov 03, 1472 bytes.]

Packet loss: probability density function P(t) = λ e^(−λt), mean λ = 2360 /s [426 µs].
[Histogram: packet loss distribution, 12 µs bins – number in bin vs time between lost frames (12-972 µs); measured vs Poisson.]

Page 18

Passive Monitoring

Time-series data from routers and switches – immediate, but usually historical (MRTG)
Usually derived from SNMP (a polling sketch follows this list)
Mis-configured / infected / misbehaving end systems (or users?) – note Data Protection laws & confidentiality
Site, MAN and backbone topology & load
Helps the user/sysadmin isolate a problem – e.g. a slow TCP transfer
Essential for proof-of-concept tests or protocol testing
Trends used for capacity planning
Control of P2P traffic
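For illustration, a minimal sketch of the SNMP interface-counter polling that MRTG-style monitoring rests on, using the pysnmp high-level API; the router address, community string and interface index are placeholders, and counter wrap-around is ignored.

```python
# Sketch: poll ifInOctets twice and convert the delta into a mean inbound rate.
import time
from pysnmp.hlapi import (SnmpEngine, CommunityData, UdpTransportTarget,
                          ContextData, ObjectType, ObjectIdentity, getCmd)

def if_in_octets(host, community, if_index):
    error_ind, error_stat, _, var_binds = next(getCmd(
        SnmpEngine(),
        CommunityData(community, mpModel=1),        # SNMP v2c
        UdpTransportTarget((host, 161)),
        ContextData(),
        ObjectType(ObjectIdentity("IF-MIB", "ifInOctets", if_index))))
    if error_ind or error_stat:
        raise RuntimeError(error_ind or error_stat.prettyPrint())
    return int(var_binds[0][1])

c1 = if_in_octets("192.0.2.1", "public", 2)         # placeholder router & interface
time.sleep(300)                                      # 5-minute MRTG-style interval
c2 = if_in_octets("192.0.2.1", "public", 2)
print(f"mean inbound rate ~ {(c2 - c1) * 8 / 300 / 1e6:.1f} Mbit/s")
```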

Page 19

Users: The Campus & the MAN [1]

NNW to SJ4 access: 2.5 Gbit PoS – hits 1 Gbit, 50%

Manchester to NNW access: 2 × 1 Gbit Ethernet

Pete White

Pat Myers

Page 20

Users: The Campus & the MAN [2]

LMN to site 1 access: 1 Gbit Ethernet. LMN to site 2 access: 1 Gbit Ethernet.

[Plots of traffic (Mbit/s) in and out vs time, 24/01/2004 to 31/01/2004: a campus access link (0-250 Mbit/s); LMN in from SJ4 / out to SJ4 (0-900 Mbit/s); ULCC-JANET traffic over the day on 30/1/2004, 00:00-24:00 (0-800 Mbit/s); site 1 in/out (0-350 Mbit/s).]

Message:

Not a complaint

Continue to work with your network group

Understand the traffic levels

Understand the Network Topology

Page 21

VLBI Traffic Flows

Manchester – NetNorthWest – SuperJANET access links: two 1 Gbit/s

Access links: SJ4 to GÉANT; GÉANT to SURFnet

Only testing – Could be worse!

Page 22

Network Measurement Working Group

“A Hierarchy of Network Performance Characteristics for Grid Applications and Services”

Document defines terms & relations:
Network characteristics
Measurement methodologies
Observation

Discusses nodes & paths. For each characteristic it:
Defines the meaning
Lists attributes that SHOULD be included
Notes issues to consider when making an observation

Status: originally submitted to GFSG as a Community Practice document (draft-ggf-nmwg-hierarchy-00.pdf, Jul 2003); revised to Proposed Recommendation (http://www-didc.lbl.gov/NMWG/docs/draft-ggf-nmwg-hierarchy-02.pdf, 7 Jan 04); now in 60-day public comment from 28 Jan 04 – 18 days to go.

[Diagram: hierarchy of characteristics by discipline –
Bandwidth: Capacity, Utilized, Available, Achievable
Delay: Round-trip, One-way, Jitter
Loss: Round-trip, One-way, Loss pattern
Forwarding: Forwarding policy, Forwarding table, Forwarding weight
Availability: Availability pattern, MTBF
Others: Hoplist, Closeness, Length, Queue capacity]

GGF: Hierarchy Characteristics Document

Page 23

Request schema: ask for results / ask to make a test. A schema requirements document has been made:
• Use DAMED-style names, e.g. path.delay.oneWay
• Send: characteristic, time, subject = node | path, methodology, statistics

Response schema: interpret results; includes the observation environment.

Much work in progress; common components; drafts almost done.
2 (3) proof-of-concept implementations: 2 implementations using XML-RPC by Internet2 & SLAC; an implementation in progress using Document/Literal by DL & UCL. (An illustrative XML-RPC request sketch follows the diagram.)

[Diagram: skeleton request schema and skeleton publication schema, each assembled from a pool of common components (src & dest, methodology, include x / y / z, include x / b / c); a client sends an XML test request to the Network Monitoring Service and receives XML test results.]

GGF: Schemata for Network Measurements
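Purely illustrative: the slide notes that two proof-of-concept implementations use XML-RPC, but no endpoint or method names are given here, so everything below (URL, method name, request fields) is a hypothetical sketch of what a DAMED-style request might look like from Python's standard xmlrpc client.

```python
# Hypothetical request for a DAMED-style characteristic via XML-RPC.
import xmlrpc.client

service = xmlrpc.client.ServerProxy("http://nm-service.example.org:8080/")  # assumed URL

request = {
    "characteristic": "path.delay.oneWay",      # DAMED-style name
    "subject": {"type": "path",
                "src": "host-a.example.org",
                "dst": "host-b.example.org"},
    "time": {"start": "2004-02-01T00:00:00Z",
             "end":   "2004-02-02T00:00:00Z"},
    "methodology": "udpmon",                     # how the metric was measured
    "statistics": ["mean", "min", "max"],
}

# Hypothetical method name; a real schema-conformant service defines its own.
result = service.getMeasurements(request)
print(result)
```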

Page 24

So What do we Use Monitoring for: A Summary

End2End time series (throughput UDP/TCP, RTT, packet loss):
Detect or cross-check problem reports
Isolate / determine a performance issue
Capacity planning
Publication of data: network “cost” for middleware – RBs for optimised matchmaking, WP2 Replica Manager

Passive monitoring (routers & switches, SNMP, MRTG; historical MRTG):
Capacity planning, SLA verification
Isolate / determine a throughput bottleneck – work with real user problems
Test conditions for protocol/HW investigations

Packet/protocol dynamics (tcpdump, web100):
Protocol performance / development
Hardware performance / development
Application analysis

Output from application tools:
Input to middleware – e.g. GridFTP throughput
Isolate / determine a (user) performance issue
Hardware / protocol investigations

Page 25

More Information – Some URLs

DataGrid WP7 MapCenter: http://ccwp7.in2p3.fr/wp7archive/ & http://mapcenter.in2p3.fr/datagrid-rgma/
UK e-Science monitoring: http://gridmon.dl.ac.uk/gridmon/
MB-NG project web site: http://www.mb-ng.net/
DataTAG project web site: http://www.datatag.org/
UDPmon / TCPmon kit + writeup: http://www.hep.man.ac.uk/~rich/net
Motherboard and NIC tests: www.hep.man.ac.uk/~rich/net
IEPM-BW site: http://www-iepm.slac.stanford.edu/bw

Page 26

Page 27

Network Monitoring to Grid Sites
Network Tools Developed
Using Network Monitoring as a Study Tool
Applications & Network Monitoring – real users
Passive Monitoring
Standards – Links to GGF

Page 28

Data Flow: SuperMicro 370DLE with SysKonnect NIC

Motherboard: SuperMicro 370DLE; chipset: ServerWorks III LE; CPU: PIII 800 MHz; PCI: 64 bit 66 MHz; RedHat 7.1, kernel 2.4.14.

1400 bytes sent, wait 100 µs; ~8 µs for send or receive; stack & application overhead ~10 µs per node.

[Logic-analyser traces: send PCI bus (send CSR setup, send transfer) and receive PCI bus (packet on the Ethernet fibre, receive transfer); ~36 µs.]

Page 29

10 GigEthernet: Throughput

1500-byte MTU gives ~2 Gbit/s; used a 16144-byte MTU (max user length 16080 bytes). A short wire-rate arithmetic sketch follows the plot.
DataTAG Supermicro PCs – dual 2.2 GHz Xeon CPU, FSB 400 MHz, PCI-X mmrbc 512 bytes: wire-rate throughput of 2.9 Gbit/s.
SLAC Dell PCs – dual 3.0 GHz Xeon CPU, FSB 533 MHz, PCI-X mmrbc 4096 bytes: wire rate of 5.4 Gbit/s.
CERN OpenLab HP Itanium PCs – dual 1.0 GHz 64-bit Itanium CPU, FSB 400 MHz, PCI-X mmrbc 4096 bytes: wire rate of 5.7 Gbit/s.

[Plot: an-al 10GE Xsum, 512k buffer, MTU 16114, 27 Oct 03 – received wire rate (Mbit/s, 0-6000) vs spacing between frames (µs, 0-40) for packet sizes from 1472 to 16080 bytes.]
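A back-of-envelope sketch of the wire-rate arithmetic, assuming the usual per-frame overheads (preamble + SFD 8, Ethernet header 14, FCS 4, inter-frame gap 12, IP 20, UDP 8 bytes); whether the quoted UDPmon wire rates use exactly these constants is an assumption.

```python
# Convert a UDP user-payload size and frame spacing into an Ethernet "wire rate".
ETH_OVERHEAD = 8 + 14 + 4 + 12      # bytes on the wire around each frame
IP_UDP_OVERHEAD = 20 + 8            # header bytes inside the frame

def wire_rate_gbit(payload_bytes, spacing_us):
    wire_bytes = payload_bytes + IP_UDP_OVERHEAD + ETH_OVERHEAD
    return wire_bytes * 8 / (spacing_us * 1e-6) / 1e9

# 16080-byte user payloads every 40 µs correspond to ~3.2 Gbit/s on the wire,
# while 1472-byte payloads back-to-back on GigE saturate at ~1 Gbit/s.
print(f"{wire_rate_gbit(16080, 40):.2f} Gbit/s")
print(f"{wire_rate_gbit(1472, 12.3):.2f} Gbit/s")
```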

Page 30

Tuning PCI-X: Variation of mmrbc, IA32

16080-byte packets every 200 µs; Intel PRO/10GbE LR adapter.

[Logic-analyser traces for mmrbc = 512, 1024, 2048 and 4096 bytes, showing CSR access, the PCI-X sequences, the data transfer, and the interrupt & CSR update.]

PCI-X bus occupancy vs mmrbc:
[Plot: measured PCI-X transfer time (µs, 0-50) and expected time based on PCI-X times from the logic analyser, plus the transfer rate from the expected time (Gbit/s, 0-9) and the max PCI-X throughput, vs max memory read byte count (0-5000).]
A toy calculation of why a larger mmrbc helps follows.
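A toy model, not the measured logic-analyser numbers, of why a larger mmrbc raises throughput: each packet needs ceil(16080 / mmrbc) PCI-X read sequences, and each sequence pays a fixed setup overhead. The bus rate and per-sequence overhead below are assumed values for illustration only.

```python
# Toy mmrbc model: fewer PCI-X sequences per packet means less per-sequence overhead.
import math

PKT_BYTES = 16080
BUS_BYTES_PER_US = 64 / 8 * 133          # assumed 64-bit PCI-X at 133 MHz (~8.5 Gbit/s raw)
SEQ_OVERHEAD_US = 0.6                    # assumed per-sequence setup/turnaround overhead

for mmrbc in (512, 1024, 2048, 4096):
    sequences = math.ceil(PKT_BYTES / mmrbc)
    data_time = PKT_BYTES / BUS_BYTES_PER_US
    total_us = data_time + sequences * SEQ_OVERHEAD_US
    rate_gbit = PKT_BYTES * 8 / total_us / 1000
    print(f"mmrbc {mmrbc:4d}: {sequences:2d} sequences, "
          f"{total_us:5.1f} µs per packet, ~{rate_gbit:.1f} Gbit/s")
```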

Page 31

10 GigEthernet at SC2003 BW Challenge

Three server systems with 10 GigEthernet NICs; used the DataTAG altAIMD stack, 9000-byte MTU; sent mem-mem iperf TCP streams from the SLAC/FNAL booth in Phoenix to:

Palo Alto PAIX – rtt 17 ms, window 30 MB; shared with the Caltech booth:
4.37 Gbit/s HS-TCP, I=5%; then 2.87 Gbit/s, I=16% – the fall corresponds to 10 Gbit/s on the link
3.3 Gbit/s Scalable TCP, I=8%; tested 2 flows, sum 1.9 Gbit/s, I=39%

Chicago Starlight – rtt 65 ms, window 60 MB; Phoenix CPU 2.2 GHz:
3.1 Gbit/s HS-TCP, I=1.6%

Amsterdam SARA – rtt 175 ms, window 200 MB; Phoenix CPU 2.2 GHz:
4.35 Gbit/s HS-TCP, I=6.9%, very stable

Both used Abilene to Chicago.

[Plot: 10 Gbit/s throughput from SC2003 to PAIX – throughput (Gbit/s, 0-10) vs date & time, 19 Nov 03 15:59-17:25; router traffic to LA/PAIX, Phoenix-PAIX HS-TCP, Phoenix-PAIX Scalable-TCP, Phoenix-PAIX Scalable-TCP #2.]
[Plot: 10 Gbit/s throughput from SC2003 to Chicago & Amsterdam – throughput (Gbit/s, 0-10) vs date & time, 19 Nov 03; router traffic to Abilene, Phoenix-Chicago, Phoenix-Amsterdam.]

Page 32

Summary & Conclusions

Intel PRO/10GbE LR Adapter and driver gave stable throughput and worked well

Need large MTU (9000 or 16114) – 1500 bytes gives ~2 Gbit/s

PCI-X tuning: mmrbc = 4096 bytes increases throughput by 55% (3.2 to 5.7 Gbit/s); PCI-X sequences clear on transmit, gaps ~950 ns
Transfers: transmission (22 µs) takes longer than receiving (18 µs); Tx rate 5.85 Gbit/s, Rx rate 7.0 Gbit/s (Itanium) (PCI-X max 8.5 Gbit/s)
CPU load is considerable: 60% Xeon, 40% Itanium
BW of the memory system is important – the data crosses it 3 times!
Sensitive to OS / driver updates

More study needed

Page 33

PCI Activity: Read Multiple Data Blocks, 0 wait

Read 999424 bytes. Each data block:
Setup CSRs
Data movement
Update CSRs

For 0 wait between reads:
Data blocks ~600 µs long, take ~6 ms
Then a 744 µs gap; PCI transfer rate 1188 Mbit/s (148.5 Mbytes/s)
Read_sstor rate 778 Mbit/s (97 Mbyte/s)
PCI bus occupancy: 68.44%
Concern about Ethernet traffic: 64 bit 33 MHz PCI needs ~82% occupancy for 930 Mbit/s; expect ~360 Mbit/s

[Logic-analyser trace: CSR access and data transfer; PCI bursts of 4096 bytes; data blocks of 131,072 bytes.]

Page 34

PCI Activity: Read Throughput

Flat, then a 1/t dependence; ~860 Mbit/s for read blocks >= 262144 bytes
CPU load ~20%; concern about the CPU load needed to drive a Gigabit link

Page 35

BaBar Case Study: RAID Throughput & PCI Activity

3Ware 7500-8, RAID5, parallel EIDE; 3Ware forces the PCI bus to 33 MHz; BaBar Tyan to MB-NG SuperMicro.
Network mem-mem: 619 Mbit/s
Disk-to-disk throughput with bbcp: 40-45 Mbytes/s (320-360 Mbit/s)
PCI bus effectively full!

[Logic-analyser traces: read from RAID5 disks; write to RAID5 disks.]

Page 36

BaBar: Serial ATA RAID Controllers – 3Ware 66 MHz PCI and ICP 66 MHz PCI

[Plots: read and write throughput (Mbit/s) vs file size (MBytes, 0-2000) for RAID5 over 4 SATA disks, with 3Ware 66 MHz and ICP 66 MHz controllers, for readahead_max settings of 31, 63, 127, 256, 512 and 1200.]

Page 37

VLBI Project: Packet Loss Distribution

Measure the time between lost packets in the time series of packets sent.
Lost 1410 packets in 0.6 s – is it a Poisson process?
Assume the Poisson process is stationary, λ(t) = λ, and use the probability density function P(t) = λ e^(−λt).
Mean λ = 2360 /s [426 µs].
Plotting the log of the histogram: slope −0.0028, expected −0.0024.
An additional process could be involved. (A small fitting sketch follows the plots.)

[Histogram: packet loss distribution, 12 µs bins – number in bin vs time between lost frames (12-972 µs); measured vs Poisson.]
[Log plot: number in bin vs time between lost frames (0-2000 µs); exponential fits y = 41.832 e^(−0.0028x) (measured) and y = 39.762 e^(−0.0024x) (Poisson).]
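A small sketch of the Poisson check described above: histogram the gaps between loss times, fit a straight line to the log of the counts, and compare the slope with −λ. The loss times below are simulated at the measured mean rate purely to make the example self-contained.

```python
# Fit an exponential to the distribution of times between lost frames.
import numpy as np

def loss_interval_fit(loss_times_us, bin_us=12, max_us=1000):
    gaps = np.diff(np.sort(np.asarray(loss_times_us)))        # µs between losses
    bins = np.arange(0, max_us + bin_us, bin_us)
    counts, edges = np.histogram(gaps, bins=bins)
    centres = edges[:-1] + bin_us / 2
    nz = counts > 0
    # Straight-line fit to log(counts): log N = log A + slope * t
    slope, log_a = np.polyfit(centres[nz], np.log(counts[nz]), 1)
    return slope, np.exp(log_a)

# Simulated Poisson losses at the measured mean rate (426 µs), as a sanity check.
rng = np.random.default_rng(0)
gaps = rng.exponential(scale=426, size=1410)
slope, a = loss_interval_fit(np.cumsum(gaps))
print(f"fitted slope {slope:.4f} /µs, expected {-1/426:.4f} /µs")
```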

