
CASTOR / GridFTP Emil Knezo PPARC-LCG-Fellow CERN IT-ADC GridPP 7 th Collaboration Meeting, Oxford UK July 1st 2003


Outline of this talk

  Introduction to CASTOR HSM
  CASTOR/GridFTP approach
  GridFTP problems
  CASTOR/GridFTP test service
  Configuration issues
  Usage examples
  Plan for CASTOR/GridFTP service

CASTOR

CASTOR Mass Storage System evolved from SHIFT (the tape management system of the 1990s)
CASTOR is a Hierarchical Storage Manager (HSM)
Today at CERN: 2066.37 TB of data in 10.51 million files stored in CASTOR
CASTOR provides to users:
  Name space
    File names are in the form:
      /castor/domain_name/experiment_name/…
        for example: /castor/cern.ch/cms/
      /castor/domain_name/user/…
        for example: /castor/cern.ch/user/k/knezo
  POSIX-compliant I/O: RFIO, plus 64-bit support, streaming mode and security
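
The naming scheme above can be sketched in a few lines of Python. This is an illustration only, not part of any CASTOR tooling: the helper names `experiment_path` and `user_home` are hypothetical, and the rule that user directories sit under the first letter of the login is an assumption inferred from the single example /castor/cern.ch/user/k/knezo.

```python
def experiment_path(domain: str, experiment: str) -> str:
    """Top-level directory of an experiment in the CASTOR name space."""
    return f"/castor/{domain}/{experiment}/"


def user_home(domain: str, login: str) -> str:
    """User directory; the first-letter subdirectory is an assumption
    based on the example /castor/cern.ch/user/k/knezo."""
    if not login:
        raise ValueError("login must not be empty")
    return f"/castor/{domain}/user/{login[0]}/{login}"


if __name__ == "__main__":
    print(experiment_path("cern.ch", "cms"))
    print(user_home("cern.ch", "knezo"))
```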

CASTOR current layout

[Diagram: CASTOR components: RFIO client, name server, stager, RFIOD (disk mover), disk pool, MSGD, RTCOPY client, VDQM server, volume manager, RTCPD (tape mover), TPDAEMON (PVR)]

GridFTP for CASTOR

Motivation for GridFTP interface to CASTOR
  LCG
    Data-movement protocol to couple different HSM systems of Tier-1 centers
    Used by Replica Management System
  Experiments
    Offer experiments a secure alternative to rfio and FTP
    Support CMS world-wide production starting in July
      Mid-July 2003: 1 TB per day to CASTOR from 12 regional centers
      February 2004: several TB per day from/to CASTOR

Approach for GridFTP interface to CASTOR
  Modification of external GridFTP server to act as rfio-client to CASTOR
    Solution already proven for FTP servers
    Not enough man-power to develop and maintain our own server
    Development time restriction
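
To put the CMS production targets in perspective, the daily volumes can be converted into a sustained average rate. The short calculation below is a sketch that assumes decimal units (1 TB = 10^6 MB) and a transfer spread evenly over 24 hours; the function name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400 seconds


def sustained_mb_per_s(tb_per_day: float) -> float:
    """Average rate in MB/s needed to move tb_per_day terabytes in one day.

    Assumes decimal units: 1 TB = 1_000_000 MB.
    """
    return tb_per_day * 1_000_000 / SECONDS_PER_DAY


if __name__ == "__main__":
    # Mid-July 2003 target from the slide: 1 TB per day into CASTOR.
    print(f"1 TB/day needs about {sustained_mb_per_s(1):.1f} MB/s sustained")
```

One TB per day therefore corresponds to roughly 11.6 MB/s around the clock, before allowing for idle periods or retries.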

Selected GridFTP server

Globus Toolkit GridFTP-1.5 server
  Based on wu-ftpd 2.6.2
  Widely used, so good support is expected
  Supported GridFTP extensions:
    EBLOCK mode
    PARALLEL transfer
    REST STREAM
    DCAU
    ERET, ESTO
  Also supported:
    Third-party transfer
    PBSZ, PROT
    MDTM
  Not supported GridFTP extensions:
    STRIPING, SPAS, SPOR
    ABUF, SBUF

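[Diagram: a GridFTP client talks to the GridFTP server process over a control connection (port 2811) and data connections; the GridFTP server uses RFIO to reach the CASTOR stager, which is backed by tapes]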

GridFTP problems

Firewalls
  Bi-directional data transfer in EBLOCK mode
    Cannot open data-connection: blocked by firewall
  Firewalls with NAT
    GSI mutual authentication errors

HSM
  Data existing in HSM name space are not always readily accessible:
    Possible disconnection of idle control channel socket by some firewalls
    Third-party transfer from HSM suffers from data-connection accept timeout at the data-receiving end.

Solution
  Firewall:
    Do not use firewalls with NAT
    Do not block data-connections in firewall
  HSM:
    Always pre-stage your data in HSM before transfer
      Currently with CASTOR “stagein” command; later, when available, with SRM interface.
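
The "pre-stage first, then transfer" rule can be expressed as an ordered plan of two commands. The sketch below only builds the argument lists and does not execute anything; the command forms (`stagein -h ... -M ...` and `globus-url-copy -p ...`) are the ones shown in the usage examples of this talk, while the Python function names are hypothetical.

```python
from typing import List, Optional


def prestage_command(castor_path: str, stager_host: Optional[str] = None) -> List[str]:
    """argv for staging a CASTOR file to disk before it is transferred."""
    cmd = ["stagein"]
    if stager_host:
        cmd += ["-h", stager_host]
    return cmd + ["-M", castor_path]


def transfer_command(src_url: str, dst_url: str, streams: Optional[int] = None) -> List[str]:
    """argv for a globus-url-copy transfer, optionally with parallel streams."""
    cmd = ["globus-url-copy"]
    if streams:
        cmd += ["-p", str(streams)]
    return cmd + [src_url, dst_url]


def staged_transfer_plan(gridftp_host: str, castor_path: str, dst_url: str,
                         stager_host: Optional[str] = None,
                         streams: Optional[int] = None) -> List[List[str]]:
    """Commands in the order they should run: stage in first, copy second."""
    src_url = f"gsiftp://{gridftp_host}{castor_path}"
    return [prestage_command(castor_path, stager_host),
            transfer_command(src_url, dst_url, streams)]


if __name__ == "__main__":
    plan = staged_transfer_plan(
        "wacdr002d.cern.ch",
        "/castor/cern.ch/atlas/subdirectory/file.name",
        "file:///home/knezo/file.name",
        stager_host="wacdr002d",
        streams=10,
    )
    for argv in plan:
        print(" ".join(argv))
```

Running the staging step to completion before opening the GridFTP session keeps the data connection from waiting on a tape recall, which is the situation that triggers the accept timeout described above.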

External network connection

GridFTP data-connections to/from the CASTOR GridFTP server are routed via the 1 Gb/s High Throughput Access Route (HTAR)
GridFTP control-connections are routed via PIX (the TCP window size is fixed to 64 kB if a data-connection goes via PIX).
We share the 1 Gb/s link to GEANT and the 622 Mb/s connection to US institutes.
Only high-numbered port connections (data-connections) to/from the CASTOR GridFTP server are routed via HTAR
  Port number interval currently applicable: <50k,51k>
  Configuration issue

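[Diagram: network path from the GridFTP server at CERN through two routers, either via PIX or via HTAR (1 Gb/s), to the external links GEANT, US-link (622 Mb/s) and DataTAG (2.5 Gb/s); internal links are labelled 1 Gb/s, and two links are labelled 350 Mb/s half-duplex]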
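
The routing rule described on this slide amounts to a simple port test. The sketch below assumes that "<50k,51k>" means ports 50000 to 51000 inclusive, which is an interpretation rather than a stated fact, and the function name is hypothetical.

```python
# Assumption: "<50k,51k>" on the slide is read as 50000..51000 inclusive.
HTAR_PORT_RANGE = range(50_000, 51_001)


def route_for_port(port: int) -> str:
    """Return which path a connection to/from the GridFTP server takes.

    High-numbered data ports go via HTAR; everything else, including the
    GridFTP control port 2811, goes via the PIX firewall.
    """
    if not 0 < port < 65536:
        raise ValueError(f"not a valid TCP port: {port}")
    return "HTAR" if port in HTAR_PORT_RANGE else "PIX"


if __name__ == "__main__":
    for port in (2811, 50_000, 50_500, 51_000, 51_001):
        print(port, route_for_port(port))
```

A data connection that falls outside the configured interval would go through PIX and inherit its fixed 64 kB TCP window, which is why the port interval is listed as a configuration issue.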

CASTOR/GridFTP test-service

Test service in operation from mid-January 2003

Installation based on
  EDG Globus, rel.24 (January to middle of June)
  VDT 1.1.8 (since middle of June)

Supports
  All EDG GridFTP clients, globus-url-copy

Still on server-code To-Do list
  64-bit file support (currently no files > 2 GB)
  CWD, CDUP fail on CASTOR name-space (“..” problem). In the meantime, full path is to be used by clients for CASTOR files
  Internal “ls” to go fully rfio; at the moment CASTOR’s “nsls” client is used
  Test some GridFTP commands currently not used by supported GridFTP clients (ESTO, ERET)

[Diagram: GridFTP clients reach the server wacdr002d over the 1 Gbit/s GEANT link (via HTAR since mid-May) and the 622 Mbit/s US link; wacdr002d talks rfio to CASTOR at CERN]

Evolution of CASTOR/GridFTP service

  Set of configurations extended by UID-to-stager mapping
  DNS load balancing (still to be verified)
  Stager-response logging
  Increased data-connection accept timeout (20 min)

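[Diagram: GridFTP traffic arrives via HTAR and is spread by DNS load-balancing over servers Serv_1, Serv_2, … Serv_n, each running a GridFTP daemon (labelled griftpd); each server applies the UID-to-stager mapping and talks rfio to a CASTOR stager (stageatlas, cms001d, stagepublic)]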
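
A UID-to-stager mapping of the kind named on this slide could look like a lookup table with a default. The sketch below is hypothetical: the stager names come from the diagram, but which account goes to which stager, and the use of `stagepublic` as the fallback, are assumptions made for illustration.

```python
# Stager names are taken from the slide; the pairing with local accounts
# is an assumption made for this example.
DEFAULT_STAGER = "stagepublic"
STAGER_BY_ACCOUNT = {
    "atlas001": "stageatlas",
    "cms001": "cms001d",
}


def stager_for(account: str) -> str:
    """Pick the CASTOR stager that serves requests of a local account."""
    return STAGER_BY_ACCOUNT.get(account, DEFAULT_STAGER)


if __name__ == "__main__":
    for account in ("atlas001", "cms001", "lhcb001"):
        print(account, "->", stager_for(account))
```

Keeping the table in per-server configuration means every server behind the DNS alias resolves the same account to the same stager.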

Performance and statistics

Performance
  CERN internal transfer was: 5 MB/s in/out; now: 7 MB/s in/out
  Transfer from NIKHEF was: 3 MB/s in/out; now: not available yet
    Standard CERN TCP configuration (64 kB TCP buffer size)
    Not via HTAR
    10 parallel streams

Statistics
  Not properly kept
    Ftp-xferlog file: no file size for outbound traffic
    GridFTP-xferlog: repeated file-record for every parallel stream of a transfer
  Example: 2 weeks statistics, May 26 to June 9:
    Transferred 1480 files (1217 inbound, 263 outbound)
    627.425 GB stored to CASTOR via GridFTP wacdr002d service
    Main user: ATLAS
      gppui04.gridpp.rl.ac.uk, aftpexp.bnl.gov, lscf.nbi.dk
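
The 64 kB TCP buffer matters because a single TCP stream cannot carry more than one window of data per round trip, so its rate is bounded by window / RTT. The sketch below applies that standard bound; the 20 ms round-trip time is an illustrative assumption, not a value measured for any of the links in this talk, and the function name is hypothetical.

```python
def window_limited_rate(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on one TCP stream's throughput in bytes/s: window / RTT."""
    if rtt_seconds <= 0:
        raise ValueError("round-trip time must be positive")
    return window_bytes / rtt_seconds


WINDOW_BYTES = 64 * 1024   # 64 kB TCP buffer from the slide (taken as 65536 bytes)
ASSUMED_RTT = 0.020        # 20 ms: assumption for illustration only
STREAMS = 10               # parallel streams used in the measurements

if __name__ == "__main__":
    per_stream = window_limited_rate(WINDOW_BYTES, ASSUMED_RTT)
    print(f"per stream: {per_stream / 1e6:.2f} MB/s")
    print(f"{STREAMS} streams: {STREAMS * per_stream / 1e6:.1f} MB/s (upper bound)")
```

The bound scales linearly with the number of streams, which is one reason parallel transfers help when the window cannot be enlarged; disk, stager and server limits can keep the real rate well below it.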

DN -- User mapping

EDG-mechanisms used
  grid-mapfile with mapping granularity on VO-level
    Currently un-maintainable to have user-level mapping granularity
  No dynamic pool accounts; edg-gridmap.conf:
    group ldap://grid-vo.nikhef.nl/ou=testbed1,o=alice,dc=eu-datagrid,dc=org alice001
    group ldap://grid-vo.nikhef.nl/ou=testbed1,o=atlas,dc=eu-datagrid,dc=org atlas001
    group ldap://grid-vo.nikhef.nl/ou=tb1users,o=cms,dc=eu-datagrid,dc=org cms001
    group ldap://grid-vo.nikhef.nl/ou=tb1users,o=lhcb,dc=eu-datagrid,dc=org lhcb001
    group ldap://grid-vo.nikhef.nl/ou=tb1users,o=biomedical,dc=eu-datagrid,dc=org biome001
    group ldap://grid-vo.nikhef.nl/ou=tb1users,o=earthob,dc=eu-datagrid,dc=org ob001
    group ldap://marianne.in2p3.fr/ou=ITeam,o=testbed,dc=eu-datagrid,dc=org iteam001
    group ldap://marianne.in2p3.fr/ou=wp6,o=testbed,dc=eu-datagrid,dc=org wpsix001
  Up to VO Admin to create subsets of users (new LDAP URLs) for other UIDs

One DN, one user restriction
  Hard to sell to experiments

VOMS should solve the problem
  VOMS provides <DN + role> based UID mapping
  VOMS to be tested with CASTOR GridFTP server (configuration issue)
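
The `group` lines of edg-gridmap.conf shown above each pair an LDAP URL with one local account, which is what gives the VO-level granularity. The sketch below parses such lines to make that structure explicit; it is not the EDG tool itself, the names `GroupEntry` and `parse_group_line` are hypothetical, and only the three-field `group <ldap-url> <account>` form seen on the slide is handled.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlsplit


class GroupEntry(NamedTuple):
    host: Optional[str]   # LDAP server holding the VO membership
    ou: Optional[str]     # first "ou=" component of the LDAP base
    o: Optional[str]      # first "o=" component of the LDAP base
    account: str          # local account all members are mapped to


def parse_group_line(line: str) -> Optional[GroupEntry]:
    """Parse 'group <ldap-url> <account>'; return None for any other line."""
    fields = line.split()
    if len(fields) != 3 or fields[0] != "group":
        return None
    url = urlsplit(fields[1])
    first = {}
    for part in url.path.lstrip("/").split(","):
        if "=" in part:
            key, value = part.split("=", 1)
            first.setdefault(key, value)  # keep the first of repeated keys (dc=...)
    return GroupEntry(url.hostname, first.get("ou"), first.get("o"), fields[2])


if __name__ == "__main__":
    sample = ("group ldap://grid-vo.nikhef.nl/"
              "ou=testbed1,o=alice,dc=eu-datagrid,dc=org alice001")
    print(parse_group_line(sample))
```

Every DN found under one LDAP URL ends up on the same account, so finer-grained mapping requires the VO administrator to publish additional LDAP URLs.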

Umask and usage examples

Umask 002 => “rw-rw-r--” permissions on CASTOR
  Per server umask configuration
  CASTOR at the moment still requires world-readable files
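
The effect of the umask can be checked with a short calculation. The sketch below assumes files are created with a requested mode of 666 (read and write for everyone), which is the usual default for regular files but is not stated on the slide; the function name is hypothetical.

```python
import stat


def mode_after_umask(requested: int, umask: int) -> int:
    """Permission bits left after the umask clears its bits."""
    return requested & ~umask


if __name__ == "__main__":
    # Assumption: the server asks for mode 0o666 when it creates a file.
    mode = mode_after_umask(0o666, 0o002)
    # stat.filemode() renders the bits the way "ls -l" does.
    print(oct(mode), stat.filemode(stat.S_IFREG | mode))
```

With umask 002 only the write bit for "others" is removed, so files stay world-readable as CASTOR currently requires.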

Usage examples

Prestage file
  stagein [-h wacdr002d] -M /castor/cern.ch/atlas/subdirectory/file.name
  (will be replaced by SRM call)

Retrieve file from CASTOR
  globus-url-copy [-p 10] gsiftp://wacdr002d.cern.ch/castor/cern.ch/atlas/subdirectory/file.name file:///home/knezo/file.name

Third party transfer from CASTOR
  globus-url-copy [-p 10] gsiftp://wacdr002d.cern.ch/castor/cern.ch/atlas/subdirectory/file.name gsiftp://spider.usatlas.bnl.gov/usatlas/workarea/knezo/file.name

Directory listing
  edg-gridftp-ls --verbose gsiftp://wacdr002d.cern.ch/castor/cern.ch/atlas/
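
All of the URLs above follow the same pattern: `gsiftp://`, the GridFTP host, then the full CASTOR path. Because the test service currently needs full paths (CWD and CDUP fail on the CASTOR name space), a client-side helper can refuse relative or ".." paths before any transfer is attempted. The sketch below is hypothetical and only builds the URL string.

```python
def castor_gsiftp_url(host: str, castor_path: str) -> str:
    """Build a gsiftp URL for a file or directory in the CASTOR name space.

    Only full paths starting with /castor/ are accepted, and '..'
    components are rejected, since the server cannot resolve them.
    """
    if not castor_path.startswith("/castor/"):
        raise ValueError("expected a full path starting with /castor/")
    if ".." in castor_path.split("/"):
        raise ValueError("'..' path components are not supported")
    return f"gsiftp://{host}{castor_path}"


if __name__ == "__main__":
    print(castor_gsiftp_url("wacdr002d.cern.ch",
                            "/castor/cern.ch/atlas/subdirectory/file.name"))
```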

Plan for CASTOR/GridFTP service

One year horizon
  Support for CMS world-wide production
    This is now a high-priority task
    Performance challenge for server
      Requires TCP-tuning, likely a dedicated stager, maybe NAPI
  DNS load-balanced cluster of GridFTP servers
    Sufficient for users with no strict throughput requirements for the coming year (ATLAS, LHCB, EDG)
  Service To-Do list
    Performance tuning
    DNS load balancing configuration tests
    Prepare user & admin documentation, plus rpms
      Interest shown by external institutes: INFN, IFAE, IFIC
    Integrate with CERN monitoring, plus scripts to create server usage statistics
      Still to improve logging
    Synchronisation on package upgrades with EDG
    VOMS to improve DN-to-user mapping

Beyond one year
  Need to understand what the Globus GridFTP server evolution will be.

Conclusions

GridFTP interface to CASTOR already exists

A ready-to-use service requires solving:
  Configuration issues
  Performance issues
  Admin issues

CASTOR/GridFTP service has the potential to satisfy CASTOR users for a year