CASTOR / GridFTP
Emil Knezo
PPARC-LCG-Fellow
CERN IT-ADC
GridPP 7th Collaboration Meeting, Oxford UK
July 1st 2003
1/7/2003 CASTOR & GridFTP / Emil
Knezo CERN

Outline of this talk
  Introduction to CASTOR HSM
  CASTOR/GridFTP approach
  GridFTP problems
  CASTOR/GridFTP test service
  Configuration issues
  Usage examples
  Plan for CASTOR/GridFTP service

CASTOR
  The CASTOR Mass Storage System evolved from SHIFT (the tape management system of the 1990s)
  CASTOR is an HSM (Hierarchical Storage Manager)
  Today at CERN: 2066.37 TB of data in 10.51 M files stored in CASTOR
  CASTOR provides to users:
    Name space
      File names are of the form:
        /castor/domain_name/experiment_name/…
          for example: /castor/cern.ch/cms/
        /castor/domain_name/user/…
          for example: /castor/cern.ch/user/k/knezo
    POSIX-compliant I/O: RFIO
      + 64-bit support, streaming mode
      - security
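
The two name-space forms above can be sketched as small path builders. This is only an illustration: the function names are hypothetical, and the rule that a user directory sits under the first letter of the user name is inferred from the single example /castor/cern.ch/user/k/knezo, not stated on the slide.

```python
def castor_experiment_path(domain: str, experiment: str) -> str:
    """Top-level directory of an experiment, e.g. /castor/cern.ch/cms/."""
    return f"/castor/{domain}/{experiment}/"


def castor_user_path(domain: str, username: str) -> str:
    """User directory; assumes the sub-directory is the first letter of the name."""
    if not username:
        raise ValueError("username must not be empty")
    return f"/castor/{domain}/user/{username[0]}/{username}"


print(castor_experiment_path("cern.ch", "cms"))
print(castor_user_path("cern.ch", "knezo"))
```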

CASTOR current layout
  [Diagram: CASTOR components and their connections: name server, stager, RFIOD (disk mover), TPDAEMON (PVR), MSGD, disk pool, RTCOPY client, VDQM server, RTCPD (tape mover), RFIO client, volume manager]

GridFTP for CASTOR
  Motivation for a GridFTP interface to CASTOR
    LCG
      Data-movement protocol to couple the different HSM systems of Tier-1 centers
      Used by the Replica Management System
    Experiments
      Offer experiments a secure alternative to rfio and FTP
      Support CMS world-wide production starting in July
        Mid-July 2003: 1 TB per day to CASTOR from 12 regional centers
        February 2004: several TB per day from/to CASTOR
  Approach for a GridFTP interface to CASTOR
    Modify an external GridFTP server to act as an rfio client to CASTOR
      Solution already proven for FTP servers
      Not enough manpower to develop and maintain our own server
      Development time restrictions
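
To put the CMS target in perspective, the daily volume can be converted into a sustained transfer rate. The sketch below assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and, for the per-center figure, an even split across the 12 regional centers; neither assumption comes from the slides.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_rate_mb_per_s(terabytes_per_day: float) -> float:
    """Average rate in MB/s needed to move the given volume in one day.

    Uses decimal units: 1 TB = 10**12 bytes, 1 MB = 10**6 bytes.
    """
    bytes_per_day = terabytes_per_day * 10**12
    return bytes_per_day / SECONDS_PER_DAY / 10**6


rate = sustained_rate_mb_per_s(1.0)
print(f"1 TB/day needs about {rate:.1f} MB/s sustained")
# Even split across 12 regional centers (an assumption, not from the slides)
print(f"per center: about {rate / 12:.2f} MB/s")
```

Running it prints roughly 11.6 MB/s in total and 0.96 MB/s per center, averaged over a full 24 hours.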

Selected GridFTP server
  Globus Toolkit GridFTP-1.5 server
    Based on wu-ftp 2.6.2
    Widely used, so good support is expected
  Supported GridFTP extensions:
    EBLOCK mode
    PARALLEL transfer
    REST STREAM
    DCAU
    ERET, ESTO
  Also supported:
    Third-party transfer
    PBSZ, PROT
    MDTM
  Unsupported GridFTP extensions:
    STRIPING, SPAS, SPOR
    ABUF, SBUF
  [Diagram: GridFTP control (port 2811) and data channels reach the GridFTP process on the GridFTP server; the server talks RFIO to the CASTOR stager, which is backed by tapes]

GridFTP problems
  Firewalls
    Bi-directional data transfer in EBLOCK mode
      Cannot open data connection: blocked by firewall
    Firewalls with NAT
      GSI mutual authentication errors
  HSM
    Data existing in the HSM name space are not always readily accessible (a file may be on tape only and must first be staged to disk):
      Possible disconnection of the idle control-channel socket by some firewalls
      Third-party transfer from an HSM suffers from the data-connection accept timeout at the data-receiving end
  Solution
    Firewall:
      Do not use firewalls with NAT
      Do not block data connections in the firewall
    HSM:
      Always pre-stage your data in the HSM before transfer
        Currently with the CASTOR “stagein” command; later, when available, through the SRM interface

External network connection
  GridFTP data connections to/from the CASTOR GridFTP server are routed via the 1 Gb/s High Throughput Access Route (HTAR)
  GridFTP control connections are routed via PIX (the TCP window size is fixed to 64 kB if a data connection goes via PIX)
  We share a 1 Gb/s link to GEANT and a 622 Mb/s connection to US institutes
  Only high-numbered port connections (data connections) to/from the CASTOR GridFTP server are routed via HTAR
    Port number interval currently applicable: <50k, 51k>
    Configuration issue
  [Diagram: the GridFTP server at CERN connects through routers, via either PIX or HTAR (1Gb/s), to the external links GEANT, US-link and DataTAG; link labels shown: 1Gb/s, 350Mb/s half-duplex, 622Mb/s, 2.5Gb/s]
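
Because only the high-numbered port interval is routed via HTAR, both ends have to keep data connections inside that interval. The sketch below reads <50k,51k> as ports 50000 to 51000 inclusive and assumes the range is handed to the Globus libraries through the GLOBUS_TCP_PORT_RANGE environment variable; the slides do not say how the range was actually configured, and the function names are hypothetical.

```python
import os

# Interval <50k,51k> from the slide, read here as 50000-51000 inclusive (assumption).
HTAR_PORT_MIN = 50000
HTAR_PORT_MAX = 51000


def routed_via_htar(port: int) -> bool:
    """True if a data-connection port falls in the interval routed via HTAR."""
    return HTAR_PORT_MIN <= port <= HTAR_PORT_MAX


def globus_port_range_env(env=None) -> dict:
    """Copy of the environment with the Globus TCP port range pinned.

    Assumes the process honours GLOBUS_TCP_PORT_RANGE ("min,max").
    """
    env = dict(os.environ if env is None else env)
    env["GLOBUS_TCP_PORT_RANGE"] = f"{HTAR_PORT_MIN},{HTAR_PORT_MAX}"
    return env


print(routed_via_htar(50500))   # a data port inside the interval
print(routed_via_htar(2811))    # the GridFTP control port, outside the interval
print(globus_port_range_env({})["GLOBUS_TCP_PORT_RANGE"])
```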

CASTOR/GridFTP test service
  Test service in operation since mid-January 2003
  Installation based on
    EDG Globus, rel. 24 (January to mid-June)
    VDT 1.1.8 (since mid-June)
  Supports
    All EDG GridFTP clients, globus-url-copy
  Still on the server-code To-Do list
    64-bit file support (currently no files > 2 GB)
    CWD and CDUP fail on the CASTOR name space (“..” problem)
      In the meantime, clients must use the full path for CASTOR files
    Internal “ls” to go fully rfio; at the moment CASTOR’s “nsls” client is used
    Test some GridFTP commands currently not used by the supported GridFTP clients (ESTO, ERET)
  [Diagram: the GridFTP server wacdr002d at CERN talks rfio to CASTOR and GridFTP to the outside over the 1 Gbit/s GEANT link (1 Gbit/s via HTAR since mid-May) and the 622 Mbit/s US link]

Evolution of CASTOR/GridFTP service
  Set of configurations extended by UID-to-stager mapping
  DNS load balancing (still to be verified)
  Stager-response logging
  Increased data-connection accept timeout (20 min)
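  [Diagram: GridFTP servers Serv_1, Serv_2, … Serv_n, each running griftpd, receive GridFTP traffic via HTAR behind DNS load balancing; each server applies the UID-to-stager mapping and uses rfio to reach the CASTOR hosts labelled stageatlas, cms001d and stagepublic]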
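
A UID-to-stager mapping of the kind listed above could look like the following sketch. The account name atlas001 and the stager names stageatlas and stagepublic appear elsewhere in these slides, but which account maps to which stager, the fallback to stagepublic, and the function names are all assumptions made for illustration. The sketch also assumes the rfio client side selects its stager from the CASTOR STAGE_HOST environment variable.

```python
# Illustrative pairing only: the slides do not state the real mapping.
UID_TO_STAGER = {
    "atlas001": "stageatlas",
}

# Assumed fallback for accounts without a dedicated stager.
DEFAULT_STAGER = "stagepublic"


def stager_for(account: str) -> str:
    """Pick the stager host for a local account, falling back to the default."""
    return UID_TO_STAGER.get(account, DEFAULT_STAGER)


def stager_environment(account: str) -> dict:
    """Environment fragment a server process could set before calling rfio."""
    return {"STAGE_HOST": stager_for(account)}


print(stager_environment("atlas001"))
print(stager_environment("lhcb001"))
```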

Performance and statistics
  Performance
    CERN internal transfer: was 5 MB/s in/out; now 7 MB/s in/out
    Transfer from NIKHEF: was 3 MB/s in/out; now: not available yet
    Standard CERN TCP configuration (64 kB TCP buffer size)
    Not via HTAR
    10 parallel streams
  Statistics
    Not properly kept
      Ftp-xferlog file: no file size for outbound traffic
      GridFTP-xferlog: the file record is repeated for every parallel stream of a transfer
    Example: two weeks of statistics, May 26 to June 9:
      Transferred 1480 files (1217 inbound, 263 outbound)
      627.425 GB stored to CASTOR via the GridFTP wacdr002d service
      Main user: ATLAS
      gppui04.gridpp.rl.ac.uk, aftpexp.bnl.gov, lscf.nbi.dk
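
The 64 kB TCP buffer matters because a TCP sender can keep at most one window of data in flight per round trip, so a single stream cannot exceed window / RTT; parallel streams multiply that ceiling. The sketch below works this bound out. The round-trip times are illustrative values, not measurements from the test service, and 64 kB is read as 64 × 1024 bytes.

```python
def window_limited_rate_mb_per_s(window_bytes: int, rtt_seconds: float,
                                 streams: int = 1) -> float:
    """Upper bound on throughput (MB/s, 1 MB = 10**6 bytes) set by the TCP window."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return streams * window_bytes / rtt_seconds / 10**6


WINDOW = 64 * 1024  # 64 kB TCP buffer, as on the slide

# Illustrative round-trip times (assumptions, not measured values).
for rtt_ms in (1, 10, 100):
    single = window_limited_rate_mb_per_s(WINDOW, rtt_ms / 1000)
    ten = window_limited_rate_mb_per_s(WINDOW, rtt_ms / 1000, streams=10)
    print(f"RTT {rtt_ms:>3} ms: 1 stream <= {single:.2f} MB/s, "
          f"10 streams <= {ten:.2f} MB/s")
```

The bound is only a ceiling: disk, stager and server CPU limits can keep the achieved rate well below it.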

DN-to-user mapping
  EDG mechanisms used
    grid-mapfile with mapping granularity at the VO level
      User-level mapping granularity is currently unmaintainable
    No dynamic pool accounts; edg-gridmap.conf:
      group ldap://grid-vo.nikhef.nl/ou=testbed1,o=alice,dc=eu-datagrid,dc=org alice001
      group ldap://grid-vo.nikhef.nl/ou=testbed1,o=atlas,dc=eu-datagrid,dc=org atlas001
      group ldap://grid-vo.nikhef.nl/ou=tb1users,o=cms,dc=eu-datagrid,dc=org cms001
      group ldap://grid-vo.nikhef.nl/ou=tb1users,o=lhcb,dc=eu-datagrid,dc=org lhcb001
      group ldap://grid-vo.nikhef.nl/ou=tb1users,o=biomedical,dc=eu-datagrid,dc=org biome001
      group ldap://grid-vo.nikhef.nl/ou=tb1users,o=earthob,dc=eu-datagrid,dc=org ob001
      group ldap://marianne.in2p3.fr/ou=ITeam,o=testbed,dc=eu-datagrid,dc=org iteam001
      group ldap://marianne.in2p3.fr/ou=wp6,o=testbed,dc=eu-datagrid,dc=org wpsix001
    Up to the VO admin to create subsets of users (new LDAP URLs) for other UIDs
  One DN, one user restriction
    Hard to sell to experiments
  VOMS should solve the problem
    VOMS provides <DN + role> based UID mapping
    VOMS to be tested with the CASTOR GridFTP server (configuration issue)
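
The VO-level granularity is visible in the configuration itself: each group line ties one LDAP group to one local account. The sketch below parses lines of that shape. It is not the EDG tooling; the function names are hypothetical, and skipping blank or "#" lines and ignoring other directives are assumptions.

```python
from urllib.parse import urlparse

SAMPLE = """\
group ldap://grid-vo.nikhef.nl/ou=testbed1,o=alice,dc=eu-datagrid,dc=org alice001
group ldap://grid-vo.nikhef.nl/ou=tb1users,o=cms,dc=eu-datagrid,dc=org cms001
"""


def parse_group_lines(text: str) -> dict:
    """Map each LDAP group URL to the single local account it is assigned."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 3 or fields[0] != "group":
            continue  # other directives are ignored in this sketch
        _, url, account = fields
        mapping[url] = account
    return mapping


def organisation_of(url: str):
    """Return the o=<name> component of the LDAP DN in the URL path, if any."""
    dn = urlparse(url).path.lstrip("/")
    for part in dn.split(","):
        key, _, value = part.partition("=")
        if key == "o":
            return value
    return None


for url, account in parse_group_lines(SAMPLE).items():
    print(organisation_of(url), "->", account)
```

Every member of a group ends up on the same account, which is why finer granularity needs either additional LDAP URLs or a role-aware mechanism such as VOMS.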

Umask and usage examples
  Umask 002 => “rw-rw-r--” permissions on CASTOR
    Per-server umask configuration
    CASTOR at the moment still requires world-readable files
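
The permission string follows from masking the requested creation mode with the umask. The sketch below assumes the usual POSIX request of 0666 for new files (whether the GridFTP server asks for exactly that mode is an assumption) and also checks the world-readable requirement.

```python
import stat


def file_mode_after_umask(umask: int, requested: int = 0o666) -> int:
    """Permission bits a new file receives: requested mode with umask bits cleared."""
    return requested & ~umask


def is_world_readable(mode: int) -> bool:
    """True if the 'other' read bit is set."""
    return bool(mode & stat.S_IROTH)


mode = file_mode_after_umask(0o002)
print(stat.filemode(stat.S_IFREG | mode))   # -rw-rw-r--
print(is_world_readable(mode))              # True

# A stricter umask such as 027 would drop the world-read bit CASTOR still needs.
print(is_world_readable(file_mode_after_umask(0o027)))  # False
```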
  Usage examples
    Pre-stage a file
      stagein [-h wacdr002d] -M /castor/cern.ch/atlas/subdirectory/file.name
      (will be replaced by an SRM call)
    Retrieve a file from CASTOR
      globus-url-copy [-p 10] gsiftp://wacdr002d.cern.ch/castor/cern.ch/atlas/subdirectory/file.name file:///home/knezo/file.name
    Third-party transfer from CASTOR
      globus-url-copy [-p 10] gsiftp://wacdr002d.cern.ch/castor/cern.ch/atlas/subdirectory/file.name gsiftp://spider.usatlas.bnl.gov/usatlas/workarea/knezo/file.name
    Directory listing
      edg-gridftp-ls --verbose gsiftp://wacdr002d.cern.ch/castor/cern.ch/atlas/
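
The pre-stage and retrieve steps can be scripted by assembling the same command lines. The sketch below only builds and prints the commands, using the options shown above (-h and -M for stagein, -p for globus-url-copy); the helper names are hypothetical, and nothing is executed.

```python
import shlex

GRIDFTP_HOST = "wacdr002d.cern.ch"


def stagein_command(castor_path, stager_host=None):
    """Command line that pre-stages a CASTOR file; -h is optional as on the slide."""
    cmd = ["stagein"]
    if stager_host:
        cmd += ["-h", stager_host]
    cmd += ["-M", castor_path]
    return cmd


def gsiftp_url(host, path):
    """gsiftp URL for an absolute path on a GridFTP server."""
    return f"gsiftp://{host}{path}"


def copy_command(src_url, dst_url, parallel_streams=None):
    """globus-url-copy command line, optionally with -p parallel streams."""
    cmd = ["globus-url-copy"]
    if parallel_streams:
        cmd += ["-p", str(parallel_streams)]
    cmd += [src_url, dst_url]
    return cmd


path = "/castor/cern.ch/atlas/subdirectory/file.name"
steps = [
    stagein_command(path, "wacdr002d"),
    copy_command(gsiftp_url(GRIDFTP_HOST, path),
                 "file:///home/knezo/file.name", parallel_streams=10),
]
for cmd in steps:
    print(shlex.join(cmd))
```

To run the steps for real, each list could be passed to subprocess.run with check=True, stage-in first, so that a failed stage-in stops the copy.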

Plan for CASTOR/GridFTP service
  One-year horizon
    Support for CMS world-wide production
      This is now a high-priority task
      Performance challenge for the server
        Requires TCP tuning, likely a dedicated stager, maybe NAPI
    DNS load-balanced cluster of GridFTP servers
      Sufficient for users with no strict throughput requirements for the coming year (ATLAS, LHCb, EDG)
    Service To-Do list
      Performance tuning
      DNS load-balancing configuration tests
      Prepare user and admin documentation, plus rpms
        Interest shown by external institutes: INFN, IFAE, IFIC
      Integrate with CERN monitoring, plus scripts to create server usage statistics
      Logging still to be improved
      Synchronisation on package upgrades with EDG
      VOMS to improve DN-to-user mapping
  Beyond one year
    Need to understand how the Globus GridFTP server will evolve

Conclusions
  A GridFTP interface to CASTOR already exists
  A ready-to-use service still requires solving:
    Configuration issues
    Performance issues
    Admin issues
  The CASTOR/GridFTP service has the potential to satisfy CASTOR users for a year