CALICE, Mar 2007, R. Hughes-Jones, Manchester
Protocols
Working with 10 Gigabit Ethernet
Richard Hughes-Jones The University of Manchester
www.hep.man.ac.uk/~rich/ then “Talks”
Introduction to Measurements
10 GigE on SuperMicro X7DBE
10 GigE on SuperMicro X5DPE-G2
10 GigE and TCP
  Monitor with web100
  Disk writes
10 GigE and Constant Bit Rate program
UDP + memory access
Udpmon: Latency & Throughput Measurements
UDP/IP packets sent between back-to-back systems
  Similar processing to TCP/IP, but no flow control or congestion avoidance algorithms
Latency
  Round-trip times using Request-Response UDP frames
  Latency as a function of frame size
    Slope s given by: $s = \sum_{\text{data paths}} \frac{1}{db/dt}$
      Mem-mem copy(s) + pci + Gig Ethernet + pci + mem-mem copy(s)
    Intercept indicates processing times + HW latencies
  Histograms of 'singleton' measurements
  Tells us about:
    Behavior of the IP stack
    The way the HW operates
    Interrupt coalescence
UDP Throughput
  Send a controlled stream of UDP frames spaced at regular intervals
  Vary the frame size and the frame transmit spacing & measure:
    The time of first and last frames received
    The number of packets received, lost, & out of order
    Histogram of inter-packet spacing of received packets
    Packet loss pattern
    1-way delay
    CPU load
    Number of interrupts
  Tells us about:
    Behavior of the IP stack
    The way the HW operates
    Capacity & available throughput of the LAN / MAN / WAN
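The slope and intercept of latency against frame size can be extracted from request-response measurements with an ordinary least-squares line fit. The following is a minimal sketch, not udpmon code: the function name `fit_latency` is hypothetical, and the sample points are synthetic, generated from a made-up line purely to exercise the fit.

```python
def fit_latency(samples):
    """Least-squares fit of latency = slope * size + intercept.

    samples: iterable of (frame_size_bytes, latency_us) pairs.
    Returns (slope_us_per_byte, intercept_us).
    """
    points = list(samples)
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("need at least two distinct frame sizes")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic points on a hypothetical line (0.003 us/byte, 20 us intercept);
# these are NOT measurements from the talk.
synthetic = [(size, 0.003 * size + 20.0) for size in (64, 1000, 3000, 6000, 8900)]
slope, intercept = fit_latency(synthetic)
```

The slope then estimates the summed per-byte transfer time along the data path, and the intercept the fixed processing and hardware latency.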
Throughput Measurements
UDP Throughput with udpmon
Send a controlled stream of UDP frames spaced at regular intervals
[Figure: time-sequence diagram of the exchange between Sender and Receiver]
Zero stats → OK done
Send data frames at regular intervals (n bytes, number of packets, wait time)
  Sender: time to send
  Receiver: time to receive, inter-packet time (histogram)
Signal end of test → OK done
Get remote statistics → Send statistics:
  No. received
  No. lost + loss pattern
  No. out-of-order
  CPU load & no. int
  1-way delay
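The receiver-side statistics listed on this slide can be derived from a simple arrival log. The sketch below assumes each frame carries a sequence number and that the receiver records one timestamp per frame. The function `summarize_stream` and its record layout are illustrative assumptions, not udpmon's actual implementation, and the rate it returns is a user-data rate that ignores header overhead.

```python
def summarize_stream(sent_count, frame_bytes, arrivals):
    """Summarize one test from receiver records.

    arrivals: list of (sequence_number, receive_time_s) in arrival order.
    """
    received = len(arrivals)
    # Frames never seen at all count as lost (duplicates are not double counted).
    lost = sent_count - len({seq for seq, _ in arrivals})

    # A frame is out of order if it arrives after a higher sequence number.
    out_of_order = 0
    highest = -1
    for seq, _ in arrivals:
        if seq < highest:
            out_of_order += 1
        else:
            highest = seq

    times = [t for _, t in arrivals]
    gaps = [b - a for a, b in zip(times, times[1:])]  # inter-packet spacing
    elapsed = times[-1] - times[0] if received > 1 else 0.0
    # The first frame only starts the clock, so count received - 1 frames.
    rate = (received - 1) * frame_bytes * 8 / elapsed if elapsed > 0 else 0.0
    return {
        "received": received,
        "lost": lost,
        "out_of_order": out_of_order,
        "gaps_s": gaps,
        "user_rate_bps": rate,
    }
```

For example, with 5 frames of 1000 bytes sent and arrivals `[(0, 0.0), (1, 10e-6), (3, 20e-6), (2, 30e-6)]`, the summary reports 4 received, 1 lost, and 1 out of order. The `gaps_s` list is the raw input for the inter-packet histogram.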
High-end Server PCs
Boston/Supermicro X7DBE
Two Dual Core Intel Xeon Woodcrest 5130, 2 GHz
Independent 1.33 GHz FSBuses
530 MHz FD Memory (serial)
  Parallel access to 4 banks
Chipsets: Intel 5000P MCH (PCIe & memory), ESB2 (PCI-X, GE, etc.)
PCI: 3 × 8-lane PCIe buses, 3 × 133 MHz PCI-X
2 Gigabit Ethernet
SATA
10 GigE Back2Back: UDP Latency
Motherboard: Supermicro X7DBE
Chipset: Intel 5000P MCH
CPU: 2 Dual Intel Xeon 5130, 2 GHz with 4096k L2 cache
Mem bus: 2 independent 1.33 GHz
PCI-e 8 lane
Linux Kernel 2.6.20-web100_pktd-plus
Myricom NIC 10G-PCIE-8A-R Fibre
myri10ge v1.2.0 + firmware v1.4.10
  rx-usecs=0 Coalescence OFF
  MSI=1
  Checksums ON
  tx_boundary=4096
MTU 9000 bytes
Latency 22 µs & very well behaved
Latency slope 0.0028 µs/byte
B2B expect: 0.00268 µs/byte
  Mem 0.0004 + PCI-e 0.00054 + 10GigE 0.0008 + PCI-e 0.00054 + Mem 0.0004
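The expected back-to-back slope is the sum of the per-byte transfer times of each stage on the data path. The short sketch below reproduces that arithmetic from the slide's figures. The "sender" and "receiver" labels on the stages are an assumption based on the order in which the slide lists them, and the helper names are hypothetical.

```python
# Per-byte transfer times (us/byte) quoted on the slide, in path order.
# The sender/receiver labels are an assumption; the slide lists only
# "Mem, PCI-e, 10GigE, PCI-e, Mem".
STAGES_US_PER_BYTE = [
    ("memory copy (sender)", 0.0004),
    ("PCI-e (sender)", 0.00054),
    ("10 GigE link", 0.0008),
    ("PCI-e (receiver)", 0.00054),
    ("memory copy (receiver)", 0.0004),
]


def expected_slope(stages):
    """Sum of per-byte times along the path, in us/byte."""
    return sum(cost for _, cost in stages)


def implied_rate_gbit(us_per_byte):
    """Transfer rate implied by a per-byte time: 8 bits every us_per_byte us."""
    return 8 / (us_per_byte * 1e-6) / 1e9


total = expected_slope(STAGES_US_PER_BYTE)   # 0.00268 us/byte, as on the slide
link_rate = implied_rate_gbit(0.0008)        # 10 Gbit/s for the Ethernet stage
```

The 10GigE term is simply the line rate expressed per byte: 8 bits at 10 Gbit/s take 0.0008 µs.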
[Figure gig6-5_Myri10GE_rxcoal=0: latency (µs) vs. message length (bytes), 0 to 10000 bytes; linear fit y = 0.0028x + 21.937]
[Figures gig6-5: latency histograms N(t) vs. latency (µs) for 64, 3000 and 8900 byte messages]
Histogram FWHM ~1-2 µs
10 GigE Back2Back: UDP Throughput
Kernel 2.6.20-web100_pktd-plus
Myricom 10G-PCIE-8A-R Fibre
rx-usecs=25 Coalescence ON
MTU 9000 bytes
Max throughput 9.4 Gbit/s
Notice rate for 8972 byte packet
~0.002% packet loss in 10M packets, in receiving host
Sending host: 3 CPUs idle
  For <8 µs packets, 1 CPU is >90% in kernel mode, inc. ~10% soft int
Receiving host: 3 CPUs idle
  For <8 µs packets, 1 CPU is 70-80% in kernel mode, inc. ~15% soft int
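To put the "<8 µs" frame spacing in context, it helps to know how long one full-size frame occupies a 10 Gbit/s link. The sketch below works this out under stated assumptions: untagged Ethernet framing (preamble, MAC header, FCS, inter-frame gap) and IPv4 without options. How udpmon itself defines "wire rate" is not given in the slides, so the helper names and the accounting here are illustrative only.

```python
# Standard per-frame overheads in bytes (assumes no VLAN tag, IPv4 without options).
ETH_OVERHEAD = 8 + 14 + 4 + 12   # preamble+SFD, MAC header, FCS, inter-frame gap
IP_UDP_HEADERS = 20 + 8          # IPv4 header + UDP header


def wire_bytes(udp_payload):
    """Bytes of link time consumed by one UDP datagram."""
    return udp_payload + IP_UDP_HEADERS + ETH_OVERHEAD


def wire_time_us(udp_payload, link_bps=10e9):
    """Time one frame occupies the link, in microseconds."""
    return wire_bytes(udp_payload) * 8 / link_bps * 1e6


def max_user_rate_gbit(udp_payload, link_bps=10e9):
    """Upper bound on user-data throughput at line rate, in Gbit/s."""
    return link_bps * udp_payload / wire_bytes(udp_payload) / 1e9


print(wire_bytes(8972))                    # 9038
print(round(wire_time_us(8972), 2))        # 7.23
print(round(max_user_rate_gbit(8972), 2))  # 9.93
```

Under these assumptions an 8972-byte datagram (MTU 9000) needs about 7.2 µs of link time, so frames requested at a spacing much below 8 µs cannot leave the NIC any faster than the link allows.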
[Figures gig6-5_myri10GE, each plotted against spacing between frames (0 to 40 µs) for packet sizes 1000, 1472, 2000, 3000, 4000, 5000, 6000, 7000, 8000 and 8972 bytes: receive wire rate (Mbit/s); % CPU1 in kernel mode, sender; % CPU1 in kernel mode, receiver]
10 GigE Cisco 7600: UDP Latency
Motherboard: Supermicro X7DBE
PCI-e 8 lane
Linux Kernel 2.6.20 SMP
Myricom NIC 10G-PCIE-8A-R Fibre
myri10ge v1.2.0 + firmware v1.4.10
  rx-usecs=0 Coalescence OFF
  MSI=1
  Checksums ON
MTU 9000 bytes
Latency 36.6 µs & very well behaved
Switch latency 14.66 µs
Switch internal: 0.0011 µs/byte
  PCI-e 0.00054
  10GigE 0.0008
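The slide does not spell out how the switch figures were derived. One reading that reproduces them is sketched below: the fixed switch latency is the difference between the fitted intercepts with and without the switch, and the per-byte switch cost is the measured slope minus the expected back-to-back slope minus one additional 10GigE serialization for the second link. That decomposition is an assumption, as are the function names. The back-to-back inputs (21.937 µs intercept, 0.00268 µs/byte expected slope) come from the back-to-back latency slide.

```python
def switch_latency_us(intercept_with_switch, intercept_b2b):
    """Fixed latency added by the switch: difference of fitted intercepts."""
    return intercept_with_switch - intercept_b2b


def switch_internal_us_per_byte(slope_with_switch, slope_b2b_expected, link_us_per_byte):
    """Per-byte cost attributed to the switch itself.

    Assumes the path through the switch adds one extra link serialization
    on top of the back-to-back path; the remainder is charged to the switch.
    """
    return slope_with_switch - slope_b2b_expected - link_us_per_byte


fixed = switch_latency_us(36.6, 21.937)                          # about 14.66 us
per_byte = switch_internal_us_per_byte(0.0046, 0.00268, 0.0008)  # about 0.0011 us/byte
```

Both results agree with the values quoted on the slide to the precision given there.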
[Figure gig6-Cisco-5_Myri_rxcoal0: latency (µs) vs. message length (bytes), 0 to 10000 bytes; linear fit y = 0.0046x + 36.6]
The “SC05” Server PCs
Not ALL PCs work that well!!
Boston/Supermicro X7DBE
Two Intel Xeon Nocona, 3.2 GHz
  Cache 2048k
Shared 800 MHz FSBus
DDR2-400 Memory
Chipsets: Intel 7520 Lindenhurst
PCI: 2 × 8-lane PCIe buses, 1 × 4-lane PCIe bus, 3 × 133 MHz PCI-X
2 Gigabit Ethernet
10 GigE X7DBE→X6DHE: UDP Throughput
Kernel 2.6.20-web100_pktd-plus
Myricom 10G-PCIE-8A-R Fibre
myri10ge v1.2.0 + firmware v1.4.10
rx-usecs=25 Coalescence ON
MTU 9000 bytes
Max throughput 6.3 Gbit/s
Packet loss ~40-60% in receiving host
Sending host: 3 CPUs idle
  1 CPU is >90% in kernel mode
Receiving host: 3 CPUs idle
  For <8 µs packets, 1 CPU is 70-80% in kernel mode, inc. ~15% soft int
[Figures gig6-X6DHE_MSI_myri, each plotted against spacing between frames (0 to 40 µs) for packet sizes 1000, 1472, 2000, 3000, 4000, 5000, 6000, 7000, 8000 and 8972 bytes: receive wire rate (Mbit/s); % CPU1 in kernel mode, receiver; % packet loss]
So now we can run at 9.4 Gbit/s
Can we do any work?
10 GigE X7DBE→X7DBE: TCP iperf
No packet loss
MTU 9000
TCP buffer 256k, BDP = ~330k
Web100 plots of TCP parameters:
  Cwnd: SlowStart then slow growth; limited by sender!
  Duplicate ACKs: one event of 3 DupACKs
  Packets re-transmitted
  Throughput (Mbit/s): iperf throughput 7.77 Gbit/s. Not bad!
10 GigE X7DBE→X7DBE: TCP iperf
Packet loss 1:50,000 (recv-kernel patch)
MTU 9000
TCP buffer 256k, BDP = ~330k
Web100 plots of TCP parameters:
  Cwnd: SlowStart then slow growth; limited by sender!
  Duplicate ACKs: ~10 DupACKs for every lost packet
  Packets re-transmitted: one per lost packet
  Throughput (Mbit/s): iperf throughput 7.84 Gbit/s. Even better!!!
10 GigE X7DBE→X7DBE: CBR/TCP
Packet loss 1:50,000 (recv-kernel patch)
tcpdelay: message 8120 bytes, wait 7 µs
RTT 36 µs
TCP buffer 256k, BDP = ~330k
Web100 plots of TCP parameters:
  Cwnd: dips as expected
  Duplicate ACKs: ~15 DupACKs for every lost packet
  Packets re-transmitted: one per lost packet
  Throughput (Mbit/s): tcpdelay throughput 7.33 Gbit/s
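The bandwidth-delay product (BDP) quoted on the TCP slides is the amount of data that must be in flight to keep the path full: rate multiplied by round-trip time. The slides give the result (~330k) without units or the exact inputs, so the sketch below only illustrates the formula. The 10 Gbit/s rate is an assumed line rate and the 36 µs RTT is the value quoted on this slide; the function names are hypothetical.

```python
def bdp_bits(rate_bps, rtt_s):
    """Bandwidth-delay product in bits."""
    return rate_bps * rtt_s


def bdp_bytes(rate_bps, rtt_s):
    """Bandwidth-delay product in bytes."""
    return bdp_bits(rate_bps, rtt_s) / 8


# Assumed inputs: 10 Gbit/s line rate, 36 us round-trip time.
bits_in_flight = bdp_bits(10e9, 36e-6)     # about 360,000 bits
bytes_in_flight = bdp_bytes(10e9, 36e-6)   # about 45,000 bytes
```

A TCP socket buffer smaller than the BDP caps the congestion window and therefore the achievable throughput, which is why the buffer size and BDP are quoted side by side.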
B2B UDP with memory access
Send UDP traffic B2B with 10GE
On receiver run independent memory write task
  L2 cache 4096 kByte
  Write 8000 kByte blocks in loop
  100% user mode
Achievable UDP throughput
  mean 9.39 Gb/s, sigma 106
  mean 9.21 Gb/s, sigma 37
  mean 9.2 Gb/s, sigma 30
Packet loss
  mean 0.04%
  mean 1.4%
  mean 1.8%
CPU load:
  Cpu0 : 6.0% us, 74.7% sy, 0.0% ni, 0.3% id, 0.0% wa, 1.3% hi, 17.7% si, 0.0% st
  Cpu1 : 0.0% us, 0.0% sy, 0.0% ni, 100.0% id, 0.0% wa, 0.0% hi, 0.0% si, 0.0% st
  Cpu2 : 0.0% us, 0.0% sy, 0.0% ni, 100.0% id, 0.0% wa, 0.0% hi, 0.0% si, 0.0% st
  Cpu3 : 100.0% us, 0.0% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 0.0% si, 0.0% st
[Figures gig6-5_udpmon_membw, each plotted against trial number (0 to 70) for the series UDP, UDP+cpu1 and UDP+cpu3: receive wire rate (Mbit/s, axis 9000 to 9600); % packet loss (axis 0 to 3.5)]
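The competing memory write task described on this slide can be approximated as a loop that repeatedly overwrites a block larger than the L2 cache, optionally pinned to one core. This is a rough sketch only: the original task's source is not shown in the slides, "8000 kByte" is taken here as 8000 × 1024 bytes, and the function name `memory_write_task` is hypothetical. CPU pinning uses `os.sched_setaffinity`, which is available on Linux only.

```python
import os


def memory_write_task(block_bytes=8000 * 1024, iterations=10, cpu=None):
    """Repeatedly overwrite one large block; return total bytes written.

    block_bytes defaults to 8000 * 1024 (an assumed reading of "8000 kByte"),
    which exceeds the 4096 kByte L2 cache so writes spill to main memory.
    """
    if cpu is not None and hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, {cpu})  # Linux only: pin this process to one core

    block = bytearray(block_bytes)
    # Source pattern of exactly block_bytes, built once outside the loop.
    pattern = (bytes(range(256)) * (block_bytes // 256 + 1))[:block_bytes]

    written = 0
    for _ in range(iterations):
        block[:] = pattern  # bulk overwrite of the whole block
        written += block_bytes
    return written
```

Running it with `cpu=1` or `cpu=3` alongside a UDP receive test would mimic the UDP+cpu1 and UDP+cpu3 cases in spirit, though a Python loop is not equivalent to a native user-mode task and absolute numbers would differ.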
Backup Slides
10 Gigabit Ethernet: Neterion NIC Results
X5DPE-G2 Supermicro PCs B2B
Dual 2.2 GHz Xeon CPU
FSB 533 MHz
XFrame II NIC
PCI-X mmrbc 4096 bytes
Low UDP rates ~2.5 Gbit/s
Large packet loss
TCP
  One iperf TCP data stream: 4 Gbit/s
  Two bi-directional iperf TCP data streams: 3.8 & 2.2 Gbit/s
[Figures "s2io 9k 3d Feb 06", each plotted against spacing between frames (0 to 40 µs) for packet sizes 1472, 2000, 3000, 4000, 5000, 6000, 7000, 8000 and 8972 bytes: receive wire rate (Mbit/s, axis 0 to 4000); % packet loss]