GridPP meeting Feb 03 R. Hughes-Jones Manchester
WP7: Networking for Grids Grid Network monitoring
Provide information for Middleware & Applications – Network Cost Function
Understand the networks we use
Provide Information for capacity planning
Creation of schemas and publishing the monitoring data
Investigation of protocols, TCP and non-TCP: testing the work of CS groups / IETF, NOT inventing new ones
Close technical collaboration with NRNs, DANTE and the DataTAG project
The High Bandwidth, High Throughput Challenge: investigation of end-host networking and disk sub-systems
To show what can be achieved on production networks with:
Multiple streams of TCP packets
Tuned TCP parameters
Different TCP stacks
Applying the knowledge to the real Grid user community
NetworkCost Architecture
[Architecture diagram: PCP schedules the measurements; tools (PingER, IPerf, UDPmon, GridFTP) feed a Distributed Data Collector; raw results are collected, stored and processed into an archive; the NetworkCost is published via R-GMA and Globus MDS]
NetworkCost functionality
Network cost matrix between sites:

        CNAF    IN2P3   NIKHEF  RAL     CERN
CNAF     -      13.08    4.04    6.53    4.50
IN2P3    7.08    -       6.24   10.38    5.03
NIKHEF   2.66   11.86    -       3.25   11.13
RAL      4.35    7.12    2.44    -       7.46
CERN    35.44   44.87   77.78   46.75    -
cost[][] = getNetworkCost(SE[], SE[])
FileSize = 11 MB
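The getNetworkCost() call above can be sketched as follows. The matrix values are taken from the slide's cost table (for the 11 MB example file); the Python function name and dictionary layout are assumptions for illustration, not the actual WP7 middleware API.

```python
# Sketch of the WP7 network-cost lookup (hypothetical structure).
# cost[src][dst]: estimated cost of moving the 11 MB example file from
# Storage Element src to Storage Element dst (lower cost = better link).
cost = {
    "CNAF":   {"IN2P3": 13.08, "NIKHEF": 4.04, "RAL": 6.53, "CERN": 4.50},
    "IN2P3":  {"CNAF": 7.08, "NIKHEF": 6.24, "RAL": 10.38, "CERN": 5.03},
    "NIKHEF": {"CNAF": 2.66, "IN2P3": 11.86, "RAL": 3.25, "CERN": 11.13},
    "RAL":    {"CNAF": 4.35, "IN2P3": 7.12, "NIKHEF": 2.44, "CERN": 7.46},
    "CERN":   {"CNAF": 35.44, "IN2P3": 44.87, "NIKHEF": 77.78, "RAL": 46.75},
}

def get_network_cost(sources, destinations):
    """Return the cost sub-matrix for the given SE lists."""
    return {s: {d: cost[s][d] for d in destinations if d != s}
            for s in sources}

# Middleware uses the cost to pick the cheapest source for a replicated file:
costs = get_network_cost(["CNAF", "RAL"], ["CERN"])
best = min(costs, key=lambda s: costs[s]["CERN"])
print(best)  # CNAF (4.50 beats RAL's 7.46)
```

This is the use case the slide names: a replica manager comparing candidate Storage Elements by network cost rather than by static configuration.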
High throughput transfer challenges
Large amounts of data have to be transferred between Mass Storage Systems and CEs in Europe (and worldwide!)
EU demonstration sent HEP data from CERN to NIKHEF/SARA at high rates
It was to show what can be achieved with:
Multiple streams of TCP packets
Tuned TCP parameters:
Interface txqueuelen 2000
TCP buffer size to match the BW * rtt
Different TCP stacks:
Standard TCP
Fast TCP
Scalable TCP
Fair sharing between stacks
This highlights the results of close technical collaboration with NRNs, DANTE and other projects: DataTAG, MB-NG, UKLight, StarLight, NetherLight
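The "TCP buffer size to match the BW * rtt" rule above is the bandwidth-delay product: the amount of data that must be in flight to keep a long fat pipe full. A minimal sketch; the 1 Gbit/s and 20 ms figures are illustrative assumptions, not measured values from the demo.

```python
def tcp_buffer_bytes(bandwidth_bits_per_s, rtt_s):
    """Minimum socket buffer (bytes) to fill the pipe: BDP = BW * RTT."""
    return int(bandwidth_bits_per_s * rtt_s / 8)

# e.g. a 1 Gbit/s path with a 20 ms round-trip time (assumed figures):
buf = tcp_buffer_bytes(1e9, 0.020)
print(buf)  # 2500000 bytes, i.e. ~2.5 MB per TCP stream
```

With the default 64 kB buffers of the era, such a path would run at a small fraction of line rate, which is why both buffer tuning and multiple streams appear on the slide.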
Demo Setup for the EDG Review
[Topology diagram: CERN - GÉANT - SURFnet - NIKHEF]
Shows data transfers from Mass Storage system at CERN to Mass Storage system at NIKHEF/SARA
Disk sub-system I/O bandwidth of ~70 MB/s
All systems have Gigabit Ethernet connectivity
Use GridFTP and Measure disk to disk performance
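With the figures on this slide (disk sub-system I/O of ~70 MB/s, Gigabit Ethernet everywhere), the disk-to-disk rate is bounded by the slowest element in the chain, i.e. the disks rather than the network. A minimal sketch of that bottleneck arithmetic:

```python
def end_to_end_mbyte_per_s(disk_read, network, disk_write):
    """Throughput (MB/s) of a read -> network -> write pipeline:
    bounded by its slowest stage."""
    return min(disk_read, network, disk_write)

gig_e_mbyte = 1000 / 8  # 1 Gbit/s Ethernet = 125 MB/s payload ceiling
rate = end_to_end_mbyte_per_s(70, gig_e_mbyte, 70)
print(rate)  # 70 -> the RAID disk sub-systems, not the WAN, limit the demo
```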
Demo Consisted of:
[Diagram: RAID0 disk -> GridFTP -> data over TCP streams -> GridFTP -> RAID0 disk, with DANTE monitoring, node monitoring and site monitoring]
European Topology: NRNs, GÉANT, Sites
[Map: CERN, SARA & NIKHEF (SURFnet), SuperJANET4 and other NRNs interconnected via GÉANT]
Throughput on the day!
The view from GÉANT, with thanks to DANTE
Some Measurements of Throughput: CERN - SARA
[Plot: Standard TCP, txqueuelen 100, 25 Jan 03: interface rate and receive rate (Mbit/s) vs. time, Out and In traces]
[Plot: HighSpeed TCP, txqueuelen 2000, 26 Jan 03: interface rate and receive rate (Mbit/s) vs. time, Out and In traces]
Using the GÉANT Backup Link
1 GByte file transfers
Standard TCP
Average Throughput 167 Mbit/s
Users see 5 - 50 Mbit/s!
High-Speed TCP
Average Throughput 345 Mbit/s
Scalable TCP
Average Throughput 340 Mbit/s
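The averages above translate directly into transfer times for the demo's 1 GByte files. A small worked calculation (the 25 Mbit/s "typical user" figure is an assumed mid-point of the 5-50 Mbit/s range on this slide):

```python
def transfer_seconds(file_bytes, rate_mbit_per_s):
    """Seconds to move file_bytes at a sustained rate in Mbit/s."""
    return file_bytes * 8 / (rate_mbit_per_s * 1e6)

GBYTE = 1e9
for stack, mbit in [("Standard TCP", 167), ("High-Speed TCP", 345),
                    ("Scalable TCP", 340), ("typical user (assumed)", 25)]:
    print(f"{stack}: {transfer_seconds(GBYTE, mbit):.0f} s")
```

So the tuned stacks move a 1 GByte file in roughly 23-24 s against ~48 s for Standard TCP, while typical untuned users are an order of magnitude slower again.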
[Plot: Scalable TCP, txqueuelen 2000, 27 Jan 03: interface rate and receive rate (Mbit/s) vs. time, Out and In traces]
What the Users Really find:
CERN – RAL using production GÉANT
CMS Tests 8 streams
50 Mbit/s @ 15 MB buffer
Firewall 100 Mbit/s
NNW – SJ4 Access
1 Gbit link
[Plot: CERN - RAL, 12 Dec 02: throughput (Mbit/s) vs. time in 0.5 hr bins, Total Rate and Rate/Stream traces]
WP7 High Throughput Achievements
Close Collaboration with Dante
“Low” layer QoS testing over GEANT
LBE
IP Premium
iGrid 2002 and ER 2002: UDP with LBE
Network performance evaluation
EU Review 2003: application-level transfer with real data between EDG sites
Proof of concept
Conclusions
More research on TCP stacks and their implementations is needed
i.e. HEP-style applied research
Continue the collaboration with NRNs & Dante to:
Understand the behavior of National networks & GEANT backbone
Learn the benefits of QoS deployment
WP7 is taking the “Computer Science” research and knowledge of the TCP protocol & implementation and applying it to the network for real Grid users
Enabling Knowledge Transfer to sysadmins and end users
EDG release 1.4.x has configuration scripts for TCP parameters for SE and CE
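TCP tuning of the kind those configuration scripts apply on SE and CE nodes looks roughly like the fragment below. The exact values are assumptions for illustration, not the actual EDG 1.4.x settings; the sysctls shown are the standard Linux knobs of that era.

```shell
# Illustrative TCP tuning for an SE/CE node (assumed values, Linux 2.4-era).
# Raise the maximum socket buffer sizes so a stream can hold a full BW*RTT:
sysctl -w net.core.rmem_max=8388608
sysctl -w net.core.wmem_max=8388608
# min / default / max auto-tuned TCP buffer sizes, in bytes:
sysctl -w net.ipv4.tcp_rmem="4096 87380 8388608"
sysctl -w net.ipv4.tcp_wmem="4096 65536 8388608"

# Longer transmit queue on the Gigabit interface, as used in the demo:
ifconfig eth0 txqueuelen 2000
```

Settings like these are exactly what the tutorials and site-by-site work below are meant to transfer to sysadmins and end users.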
Network tutorials for end users
Work with users – focus on 1 or 2 sites to try to get improvements