Networking for ATLAS Remote Farms

Page 1: Networking  for ATLAS Remote Farms

Meeting on ATLAS Remote Farms. Copenhagen 11 May 2004. R. Hughes-Jones, Manchester

Networking for ATLAS Remote Farms

Richard Hughes-Jones
The University of Manchester

DataGrid WP7 – Dante Tests on the GÉANT Core
End-2-End Measurements from the 4th Year VLBI Project at Manchester
New TCP stacks – the effect on throughput
Some Simple Network Tests CERN-Manchester

Page 2: Networking  for ATLAS Remote Farms

DataGrid WP7 – Dante Tests on the GÉANT Core

Set-up

Supermicro PC in: London GÉANT PoP, Amsterdam GÉANT PoP

Smartbits in: London GÉANT PoP, Frankfurt GÉANT PoP

Long link: UK-SE-DE2-IT-CH-FR-BE-NL

Short link: UK-FR-BE-NL

Page 3: Networking  for ATLAS Remote Farms

Tests GÉANT Core: UDP throughput

UDP Throughput, London-Amsterdam
Available BW to packet on wire, then 1/t
Wire rate 998 Mbit/s for packets > 1400 bytes

Packet Loss
None for large packets
Dips in BW linked to packet loss
SysKonnect NIC: interrupt per packet, CPU load important

[Figure: uk-nl_20tg4-hs-w100_01Oct03. Two panels, recv wire rate (Mbit/s) and % packet loss, both vs. spacing between frames (µs, 0 to 40), for packet sizes 50, 100, 200, 400, 600, 800, 1000, 1200, 1400 and 1472 bytes.]
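The "flat, then 1/t" shape of the throughput curve follows from simple arithmetic, sketched below. The sketch assumes a Gigabit Ethernet path and a per-frame overhead of 66 bytes (UDP 8 + IP 20 + Ethernet header 14 + CRC 4 + preamble 8 + inter-frame gap 12); the slides do not say how overhead was counted, and the function names are illustrative only.

```python
# Idealised model: the sender cannot beat the line rate, so the rate is flat
# until the requested spacing exceeds the on-wire time of one frame, and then
# falls as (bits per frame) / spacing.
LINE_RATE_BPS = 1e9                          # assumption: Gigabit Ethernet
OVERHEAD_BYTES = 8 + 20 + 14 + 4 + 8 + 12    # assumption: 66 bytes per frame


def wire_time_us(payload_bytes):
    """Time one frame occupies the wire, in microseconds."""
    return (payload_bytes + OVERHEAD_BYTES) * 8 / LINE_RATE_BPS * 1e6


def expected_wire_rate_mbps(payload_bytes, spacing_us):
    """Expected wire rate in Mbit/s for a requested inter-frame spacing."""
    effective_spacing_us = max(spacing_us, wire_time_us(payload_bytes))
    # bits per microsecond is numerically equal to Mbit/s
    return (payload_bytes + OVERHEAD_BYTES) * 8 / effective_spacing_us


for spacing in (0, 5, 10, 15, 20, 40):
    print(spacing, round(expected_wire_rate_mbps(1472, spacing), 1))
```

Under these assumptions a 1472-byte packet occupies the wire for about 12.3 µs, so the model stays at line rate for spacings below that and drops as 1/t above it. Real measurements also reflect NIC and CPU limits, which this model ignores.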

Page 4: Networking  for ATLAS Remote Farms

Tests GÉANT Core: Packet re-ordering

Effect of packet size, London-Amsterdam
Packets at 10 µs – line speed
10,000 sent
Packet loss ~ 0.1%

Re-order Distribution

[Figure: Packet re-order uk-nl, 10,000 BE sent, wait 10 us, 01 Oct 03. Panels: out of order % vs. packet size (bytes), 0 to 1500, with a zoom on 1400 to 1404 bytes; No. packets vs. length out-of-order (0 to 9) for 1400, 1401 and 1402 bytes.]
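As a rough illustration of how re-order figures like these could be derived from received sequence numbers, here is a minimal sketch. The definitions are assumptions, since the slides do not define them: a packet counts as out of order if it arrives after a packet with a higher sequence number, and its "length" is how far it lags the highest sequence number seen so far.

```python
from collections import Counter


def reorder_stats(received_seq):
    """Return (% out of order, {lag: count}) for a list of sequence numbers.

    Assumed definitions (not taken from the slides): a packet is out of order
    when its sequence number is below the highest one already received; its
    lag is the difference between that highest number and its own.
    """
    highest = -1
    lags = Counter()
    for seq in received_seq:
        if seq < highest:
            lags[highest - seq] += 1
        else:
            highest = seq
    n_out = sum(lags.values())
    percent = 100.0 * n_out / len(received_seq) if received_seq else 0.0
    return percent, dict(lags)


percent, lags = reorder_stats([0, 1, 3, 2, 4, 6, 7, 5])
print(percent, lags)
```

In the example sequence, packet 2 arrives one place late and packet 5 arrives two places late, so two of the eight packets are counted as out of order.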

Page 5: Networking  for ATLAS Remote Farms

Tests GÉANT Core: Packet re-ordering

Effect of LBE background, Amsterdam-London
BE Test flow
Packets at 10 µs – line speed
10,000 sent
Packet loss ~ 0.1%

Re-order Distributions:

[Figure: UDP 1472 bytes NL-UK-lbexxx_7nov03: % out of order vs. total offered rate (Gbit/s, 2 to 3.2); legend: hstcp, Standard TCP, line speed, 90% line speed.]

[Figure: Packet re-order uk-nl, 21 Oct 03, 10,000 sent, wait 10 us. Two panels, 1472 bytes and 1400 bytes: No. packets vs. length out-of-order (1 to 9) for LBE background of 0, 10, 20, 30, 40, 50, 60, 70 and 80 %.]

Page 6: Networking  for ATLAS Remote Farms

Tests GÉANT Core: Packet Jitter

Amsterdam-London
Packet spacing 80 µs
BE Test flow and IPPremium Test flow, each without background and with background: 60% BE 1.4 Gbit + 40% LBE 780 Mbit

[Figure: four histograms of frequency vs. packet jitter (µs, 0 to 140). Panels: Flow BE, background none; Flow IPP, background none; Flow BE, background 60% BE 1.4 Gbit + 40% LBE 780 Mbit; Flow IPP, background 60% BE 1.4 Gbit + 40% LBE 780 Mbit.]

Page 7: Networking  for ATLAS Remote Farms

Tests GÉANT Core: 1-way Delay

Amsterdam-London
Packet spacing 80 µs
IPPremium Test flow without background; BE and IPPremium Test flows with background: 60% BE 1.4 Gbit + 40% LBE 780 Mbit

[Figure: three plots of 1-way latency (µs) vs. packet no. (0 to 10000). Panels: Flow IPP, background none; Flow BE, background 60% BE 1.4 Gbit + 40% LBE 780 Mbit; Flow IPP, background 60% BE 1.4 Gbit + 40% LBE 780 Mbit.]

Page 8: Networking  for ATLAS Remote Farms

VLBI Project: Test Topology

SuperJANET4

Jodrell

Manchester

SURFnet

JIVE Dwingeloo

Adam Mathews, Steve O’Toole, Univ of Manchester

Page 9: Networking  for ATLAS Remote Farms

VLBI Project: Throughput

Manchester to Dwingeloo
2.0 GHz Xeon, 1.2 GHz PIII
Re-ordering vs Offered Load

[Figure: 1472-byte packets, all panels vs. spacing between frames (µs, 0 to 40). Panels: % packet loss and recv wire rate (Mbit/s) for Gnt5-DwMk5 (11Nov03) and DwMk5-Gnt5 (13Nov03); % kernel sender and % kernel receiver for Gnt5-DwMk5 (11Nov03).]

Page 10: Networking  for ATLAS Remote Farms

VLBI Project: Jitter & 1-way Delay

1472 byte packets, Manchester -> JIVE
FWHM 22 µs (B2B 3 µs)
1-way Delay: note the packet loss (points with zero 1-way delay)

[Figure: 1472 bytes w=50 jitter Gnt5-DwMk5 28Oct03: N(t) vs. jitter (µs, 0 to 140), on linear and log scales. 1472 bytes w12 Gnt5-DwMk5 21Oct03: 1-way delay (µs) vs. packet no., for packets 0 to 5000 and a zoom on packets 2000 to 3000.]
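A full width at half maximum (FWHM) such as the 22 µs quoted above can be read off a jitter histogram along the following lines. This is a sketch under assumptions the slides do not confirm: jitter is taken to be the gap between successive packet arrival times, bins are 1 µs wide, and the FWHM is the span of the outermost bins whose count is at least half the peak. The function names are illustrative.

```python
from collections import Counter


def jitter_histogram(arrival_times_us, bin_us=1.0):
    """Histogram of gaps between successive arrivals (assumed jitter measure)."""
    gaps = [b - a for a, b in zip(arrival_times_us, arrival_times_us[1:])]
    return Counter(int(gap // bin_us) for gap in gaps)


def fwhm_us(hist, bin_us=1.0):
    """Width spanned by the outermost bins at or above half the peak count."""
    half_peak = max(hist.values()) / 2
    bins_above = [b for b, count in hist.items() if count >= half_peak]
    return (max(bins_above) - min(bins_above) + 1) * bin_us


hist = jitter_histogram([0, 50, 100, 151, 200, 250, 299, 350])
print(sorted(hist.items()), fwhm_us(hist))
```

The estimate is only as fine as the bin width, and a multi-peaked histogram would need more care than this single-span definition gives.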

Page 11: Networking  for ATLAS Remote Farms

VLBI Project: Packet Loss – Long Range Effects?

Aggregated Variance Method
Divide time series of length N into blocks of size m
Calc mean of each block, Xm(k), k = 1 … N/m
Calc variance VXm of these Xm(k)
Vary m, the size of the blocks
Plot on log-log & fit slope β
Hurst parameter H: β = 2H - 2
Measure: β = -0.355, which gives H = 0.822
H = 0.5: no long range dependence; 0.5 < H < 1: long range dependence

[Figure: aggregate variance Log10( X(m) ) vs. sub-sample size Log10( m ), 0 to 3; fitted line y = -0.355x + 2.8826.]
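The aggregated variance steps can be sketched in a few lines of Python. This is an illustrative implementation, not the analysis code behind the slide: the function names, the choice of block sizes and the synthetic test series are all assumptions. It uses the relation β = 2H - 2, so H = 1 + β/2.

```python
import math
import random


def block_means(series, m):
    """Mean of each consecutive block of m samples (incomplete tail dropped)."""
    n_blocks = len(series) // m
    return [sum(series[k * m:(k + 1) * m]) / m for k in range(n_blocks)]


def variance(values):
    """Population variance of a list of numbers."""
    mu = sum(values) / len(values)
    return sum((v - mu) ** 2 for v in values) / len(values)


def aggregated_variance_slope(series, block_sizes):
    """Least-squares slope of log10(variance of block means) vs. log10(m)."""
    xs, ys = [], []
    for m in block_sizes:
        means = block_means(series, m)
        if len(means) < 2:
            continue
        v = variance(means)
        if v > 0:
            xs.append(math.log10(m))
            ys.append(math.log10(v))
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx


def hurst_from_slope(beta):
    """Invert beta = 2H - 2."""
    return 1.0 + beta / 2.0


# Synthetic check with uncorrelated data (hypothetical input, not slide data):
rng = random.Random(1)
uncorrelated = [rng.gauss(0.0, 1.0) for _ in range(20000)]
beta = aggregated_variance_slope(uncorrelated, [1, 2, 5, 10, 20, 50, 100])
print(round(beta, 2), round(hurst_from_slope(beta), 2))
```

For uncorrelated samples the variance of block means falls roughly as 1/m, so the fitted slope should be near -1 and H near 0.5. The measured slope of -0.355 on the slide corresponds to H of about 0.82.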

Page 12: Networking  for ATLAS Remote Farms

Traffic Flows: Manchester – NetNorthWest – SuperJANET Access links

Two 1 Gbit/s

Access links: SJ4 to GÉANT, GÉANT to SURFnet

Page 13: Networking  for ATLAS Remote Farms

High Performance TCP – DataTAG

Different TCP stacks tested on the DataTAG Network
128 ms round trip time
Drop 1 in 10^6

High-Speed: rapid recovery
Scalable: very fast recovery
Standard: recovery would take ~ 10 mins

Page 14: Networking  for ATLAS Remote Farms

High Performance TCP – MB-NG

Drop 1 in 25,000
RTT 6.2 ms
Recover in 1.6 s

Standard, HighSpeed, Scalable
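The recovery times quoted for standard TCP on the two networks (about 10 minutes at 128 ms round trip time, 1.6 s at 6.2 ms) are consistent with a simple back-of-envelope model, sketched here. The model assumes standard TCP halves its congestion window on a loss and then regains one segment per round trip; the 1 Gbit/s path rate and 1460-byte segment size are assumptions, not values stated on the slides.

```python
def standard_tcp_recovery_s(rate_bps, rtt_s, mss_bytes=1460):
    """Approximate time for standard TCP to regain full rate after one loss.

    Assumptions: window halves on loss, then grows by one segment per RTT.
    The full window is the bandwidth-delay product expressed in segments.
    """
    full_window_segments = rate_bps * rtt_s / (8 * mss_bytes)
    rtts_to_recover = full_window_segments / 2
    return rtts_to_recover * rtt_s


for name, rtt in (("DataTAG", 0.128), ("MB-NG", 0.0062)):
    seconds = standard_tcp_recovery_s(1e9, rtt)
    print(f"{name}: rtt {rtt * 1e3:.1f} ms -> about {seconds:.1f} s")
```

Because the window and the time per step both scale with the round trip time, recovery time grows with the square of the RTT, which is why the long DataTAG path takes minutes while the short MB-NG path takes seconds.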

Page 15: Networking  for ATLAS Remote Farms

Some Network Tests: TCP Request – Response

Message exchange:
Zero stats -> OK done
Request event -> Send event data (repeated ●●●)
Request-Response time (Histogram)
Get remote statistics -> Send statistics: CPU load & no. of interrupts, 1-way delay
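A request-response timing loop of this general kind can be sketched as below. This is not the test program used for the measurements: it is a stand-in that runs both ends in one process over `socket.socketpair()` (a local socket pair, not a real TCP path), and the 4-byte length-prefixed request format and function names are invented for illustration.

```python
import socket
import struct
import threading
import time


def recv_exact(sock, n):
    """Read exactly n bytes from a stream socket."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)


def responder(sock):
    """Answer each request (a 4-byte size) with that many bytes; 0 stops."""
    while True:
        (size,) = struct.unpack("!I", recv_exact(sock, 4))
        if size == 0:
            break
        sock.sendall(b"\0" * size)


def measure_request_response(n_requests, response_bytes, spacing_s=0.0):
    """Return a list of request-response times in microseconds."""
    requester, remote = socket.socketpair()
    worker = threading.Thread(target=responder, args=(remote,), daemon=True)
    worker.start()
    times_us = []
    try:
        for _ in range(n_requests):
            start = time.perf_counter()
            requester.sendall(struct.pack("!I", response_bytes))
            recv_exact(requester, response_bytes)
            times_us.append((time.perf_counter() - start) * 1e6)
            if spacing_s > 0:
                time.sleep(spacing_s)
        requester.sendall(struct.pack("!I", 0))   # tell the responder to stop
        worker.join()
    finally:
        requester.close()
        remote.close()
    return times_us


times = measure_request_response(n_requests=20, response_bytes=100_000)
print(f"min {min(times):.0f} us, max {max(times):.0f} us")
```

Histogramming the returned times gives the request-response distribution. Over a real network the two ends would use a connected TCP socket instead of the local pair, and the remote end would also report its CPU load and interrupt counts.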

Page 16: Networking  for ATLAS Remote Farms

Lab Test: TCP Request-Response Histograms

PC – router – PC
BE Test flow
Request spacing 0 µs and request spacing 10 ms

[Figure: man02-3_7may04. Histograms of N(t) vs. latency (µs) for responses of 0.5 Mbytes, 1.0 Mbytes and 2.0 Mbytes, each with request spacing 0 µs and with 10 ms wait. Latency axes span 4280 to 4460 µs, 8580 to 8760 µs and 17080 to 17260 µs respectively.]

Page 17: Networking  for ATLAS Remote Farms

Man-CERN: TCP Request-Response Latency

DataTAG PC – backup link
BE Tests
Request spacing 0 µs
Win size 2.5 Mbytes

Compare with UDP latency: large differences

RTT of 20 ms: delay*bw = 2.5 Mbytes

1 Mbyte data = 690 pkts: interesting bursts!

[Figure: w05gva-gnt5_7May04_TCP. Two panels of latency (µs) vs. message length (bytes, 0 to 160000): req-resp UDP latency compared with TCP ave time; TCP ave time, min time and max time.]
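The window size and packet count on this slide can be checked with a short calculation. The sketch assumes a 1 Gbit/s bottleneck and 1448 bytes of data per TCP segment (a 1500-byte MTU with the TCP timestamp option); neither value is stated on the slide.

```python
def bandwidth_delay_product_bytes(rate_bps, rtt_s):
    """Bytes in flight needed to keep a path of this rate and RTT full."""
    return rate_bps * rtt_s / 8


def segments_for(message_bytes, segment_payload_bytes=1448):
    """(full segments, leftover bytes) for a message; payload size is assumed."""
    return divmod(message_bytes, segment_payload_bytes)


bdp = bandwidth_delay_product_bytes(1e9, 0.020)   # assumed 1 Gbit/s, 20 ms RTT
full, leftover = segments_for(1_000_000)
print(f"delay*bw = {bdp / 1e6:.1f} Mbytes")
print(f"1 Mbyte = {full} full segments plus {leftover} bytes")
```

A 20 ms round trip at 1 Gbit/s gives 2.5 Mbytes, matching the window size used, and 1 Mbyte splits into 690 full segments under the assumed payload size.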

Page 18: Networking  for ATLAS Remote Farms

Man-CERN: UDP Throughput & Packet Loss

DataTAG PC – backup link
BE Tests

[Figure: w05gva-gnt5_7May04_UDP. Two panels, Throughput (recv wire rate, Mbit/s) and Packet loss (% packet loss), both vs. spacing between frames (µs, 0 to 40), for packet sizes 50, 100, 200, 400, 600, 800, 1000, 1200, 1400 and 1472 bytes.]

Page 19: Networking  for ATLAS Remote Farms

Traffic Flows Manchester – NetNorthWest - SuperJANET Access links

Link to PC in M/c Access links: 1 GE Man to NNW

Total Man to NNW

NNW to SuperJANET4

Page 20: Networking  for ATLAS Remote Farms

Page 21: Networking  for ATLAS Remote Farms

VLBI Project: Packet Loss – Is it Poisson?

Divide time series of packets into 1000 slices of 50 packets
Total lost packets 1410
Average number / slice = 1.4
Calc Poisson probability $P(n, \mu) = \mu^{n} e^{-\mu} / n!$
Curves close but not exact
Could be more than 1 process

[Figure: N(n) vs. n, number lost in sub-sample (0 to 15); legend: run12b, 1, 1.3, 1.4, 1.8.]
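The Poisson comparison can be reproduced in outline as follows. The sketch computes the expected number of slices containing n lost packets from the figures on the slide (1410 losses over 1000 slices); the function names are illustrative and the measured per-slice counts themselves are not available here.

```python
import math


def poisson_probability(n, mu):
    """P(n, mu) = mu^n * exp(-mu) / n!"""
    return mu ** n * math.exp(-mu) / math.factorial(n)


def expected_slice_counts(total_lost, n_slices, max_n):
    """Expected number of slices with n = 0..max_n losses if loss is Poisson."""
    mu = total_lost / n_slices
    return [n_slices * poisson_probability(n, mu) for n in range(max_n + 1)]


expected = expected_slice_counts(total_lost=1410, n_slices=1000, max_n=15)
for n, count in enumerate(expected[:6]):
    print(n, round(count, 1))
```

Plotting these expected counts against the measured histogram of losses per slice shows how closely a single Poisson process describes the data; a systematic mismatch would hint at more than one loss process.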

Page 22: Networking  for ATLAS Remote Farms

Traffic QoS Classes on GÉANT Backbone

Normal Traffic

Normal Traffic +

Less Than Best Effort 2.0 Gbit/s

Normal Traffic +

Radio Astronomy Data 500 Mbit/s

Normal Traffic +

Radio Astronomy Data +

Less Than Best Effort 2.0 Gbit/s

Max Throughput on 2.5 G PoS

Page 23: Networking  for ATLAS Remote Farms

Some Measurements made during ER2002

[Figure: two panels, "No LBE" and "With 1.8Gbit LBE", each plotting No. out of order (num_badorder) and No. lost (num_lost) vs. transfer number.]