Lecture 4: Grid Data Management

Jaime Frey
UW-Madison Condor Group
jfrey@cs.wisc.edu

Slides prepared in part by Scott Koranda, UW-Milwaukee & NCSA

Grid Summer Workshop, June 21-25, 2004

Motivation?

Why is the Grid community concerned with data/file management?

Why might you be concerned with data/file management?

Motivation: The Data Problem

Motivate our discussion with the large physics experiments (part of GriPhyN and Grid2003)

- Laser Interferometer Gravitational Wave Observatory (LIGO)
    - Detect spacetime ripples from black holes & other sources
    - Generates data at 10 MB per second, just under 1 TB per day
- Sloan Digital Sky Survey
    - Catalog more stars and galaxies than ever before
    - More than 15 TB of data catalogs
- Compact Muon Solenoid and ATLAS
    - Detect the Higgs Boson (a fundamental particle)
    - 100 MB per second, about 1 Petabyte per year (per detector)
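The quoted rates can be checked with a little arithmetic. The sketch below assumes decimal units (1 MB = 10^6 bytes) and continuous running; both are assumptions, not statements from the slide. At 10 MB/s it gives 0.864 TB per day, matching "just under 1 TB per day". At 100 MB/s, running all year would give about 3.15 PB, so the slide's 1 PB per year would correspond to recording for roughly a third of the year (an inference, not something the slide states).

```python
# Back-of-the-envelope check of the data rates quoted on the slide.
# Assumption: decimal units (1 MB = 10**6 bytes) and continuous running.
SECONDS_PER_DAY = 24 * 60 * 60
MB, TB, PB = 10**6, 10**12, 10**15


def bytes_per_day(rate_mb_per_s):
    """Bytes produced in one day of continuous running at the given rate."""
    return rate_mb_per_s * MB * SECONDS_PER_DAY


ligo_tb_per_day = bytes_per_day(10) / TB           # LIGO at 10 MB/s
lhc_pb_per_year = bytes_per_day(100) * 365 / PB    # CMS or ATLAS at 100 MB/s, all year

print(f"LIGO: {ligo_tb_per_day:.3f} TB/day")
print(f"CMS/ATLAS, continuous running: {lhc_pb_per_year:.2f} PB/year")
```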

Really Two Data Problems

- The amount of data
    - High-performance tools needed to manage the huge raw volume of data
        - Store it
        - Move it
    - Measure in terabytes, petabytes, and ???
- The number of data files
    - High-performance tools needed to manage the huge number of filenames
    - 10^12 filenames is expected soon
    - Collection of 10^12 of anything is a lot to handle efficiently

Three Data Questions on the Grid

Essentially three (3) questions you want Grid tools to address

1. What data/files exist?
2. What data/files are where?
3. How do I move data/files from A to B?

Three Data Questions on the Grid

Examine these questions last to first, because even if you don’t have TBs of data you will want to move files. So start with #3.

1. What data/files exist?
2. What data/files are where?
3. How do I move data/files from A to B?

How to move data/files?

Requirements

- Fast: as fast as networks and protocols allow
    - I2 sites should expect at least 10 MB/s sustained
- Secure
    - Server must only share files with strongly authenticated clients
    - No passwords in the clear or similar
- Robust
    - Fault tolerant, time-tested protocol

GridFTP

- Extension to well known File Transfer Protocol (FTP)
    - http://www.globus.org/datagrid/deliverables/C2WPdraft3.pdf
- Extensions include
    - Strong authentication, encryption via Globus GSI
    - Multiple, parallel data channels
    - Third-party transfers
    - Tunable network & I/O parameters
    - Server side processing, command pipelining

Necessary Semantics…

- GridFTP is the protocol
- A server or client that implements the GridFTP protocol is GridFTP-enabled or Grid-enabled
- Often hear “the GridFTP server…” or “the GridFTP client…”
    - Correct is “the GridFTP-enabled server from the Globus team” or the particular client being used
    - Let it slide…easier to use the slang…but
    - Distinction more important soon as groups outside of Globus release GridFTP-enabled clients & servers

GridFTP Server

- Built on top of wuftpd, our old friend
    - A brand new server from scratch in beta now…
- Most configuration details same as wuftpd
- Runs as an inetd (xinetd) service
    1. Connection is attempted on port 2811
    2. Xinetd looks up port in /etc/services and finds responsible service
    3. Xinetd starts service according to configuration, with data from the connection sent on stdin

GridFTP Server

From /etc/services

    [services]$ tail /etc/services
    gsiftp            2811/tcp   #Grid-FTP Server
    globus-gatekeeper 2119/tcp   #Globus Gatekeeper

From /etc/xinetd.d/

    [xinetd.d]$ cat gsiftp
    service gsiftp
    {
        socket_type     = stream
        protocol        = tcp
        env             = LD_LIBRARY_PATH=/opt/ldg-2.0/globus/lib
        wait            = no
        user            = root
        server          = /opt/ldg-2.0/globus/sbin/in.ftpd
        server_args     = -l -a -G /opt/ldg-2.0/globus
        log_on_success  += DURATION USERID
        log_on_failure  += USERID
        nice            = 10
        disable         = no
    }

GridFTP Server

Environment variables

- LD_LIBRARY_PATH
    - Point to $GLOBUS_LOCATION/lib
- GRIDMAP
    - Path to grid-mapfile for authentication
    - Generic GSI environment variable
- X509_CERT_DIR
    - Directory in which CA signing certificates are held
    - Generic GSI environment variable

GridFTP Server

- Logging to system log
    - On most Linux systems, /var/log/messages

    Jun 10 10:46:59 basil gridftpd[21857]: GSSAPI user /DC=org/DC=doegrids/OU=People/CN=Scott Koranda 43845 is authorized as skoranda
    Jun 10 10:46:59 basil gridftpd[21857]: FTP LOGIN FROM oregano.phys.uwm.edu [129.89.57.55], skoranda

- Uses host certificate for mutual authentication

    [root@basil root]# grid-cert-info -file /etc/grid-security/hostcert.pem -subject
    /DC=org/DC=doegrids/OU=Services/CN=basil.phys.uwm.edu

GridFTP Server

Third-party transfers

- Client directs transfers between two servers

[Figure: the GridFTP client on ygraine.aei.mpg.de sends “move file1 to ldas-cit.ligo.caltech.edu” to the GridFTP server on basil.phys.uwm.edu; file1 goes to the GridFTP server on ldas-cit.ligo.caltech.edu]

GridFTP clients

- globus-url-copy
    - GridFTP-compliant client from the Globus team
    - Copy files from one URL to another URL
        - One URL is usually a gsiftp:// URL
        - Another URL is usually a file:/ URL
    - To move a file from remote GridFTP-enabled server to local machine

    $ globus-url-copy gsiftp://dataserver.phys.uwm.edu/data/file1 file:/home/skoranda/file1

globus-url-copy

- Alternative forms for file:/ URLs

    $ globus-url-copy gsiftp://dataserver.phys.uwm.edu/data/file1 file://localhost/home/skoranda/file1
    $ globus-url-copy gsiftp://dataserver.phys.uwm.edu/data/file1 file://basil.phys.uwm.edu/home/skoranda/file1

- If GridFTP server runs on a non-standard port?

    $ globus-url-copy gsiftp://dataserver.phys.uwm.edu:15000/data/file1 file:/home/skoranda/file1
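The URL forms above can be taken apart with Python's standard `urllib.parse`. This is only an illustrative sketch: the `describe` helper is hypothetical, the default port 2811 is the one named on the GridFTP Server slide, and treating a `file:` URL with no host as `localhost` is an assumption made here, not documented globus-url-copy behavior.

```python
from urllib.parse import urlsplit

GRIDFTP_DEFAULT_PORT = 2811  # port named on the GridFTP Server slide


def describe(url):
    """Return (scheme, host, port, path) for a gsiftp:// or file: URL."""
    parts = urlsplit(url)
    if parts.scheme == "gsiftp":
        # No explicit port in the URL means the standard GridFTP port.
        return ("gsiftp", parts.hostname, parts.port or GRIDFTP_DEFAULT_PORT, parts.path)
    if parts.scheme == "file":
        # Assumption: a missing host means the local machine.
        return ("file", parts.hostname or "localhost", None, parts.path)
    raise ValueError(f"unsupported scheme in {url!r}")


for url in (
    "gsiftp://dataserver.phys.uwm.edu/data/file1",
    "gsiftp://dataserver.phys.uwm.edu:15000/data/file1",
    "file:/home/skoranda/file1",
    "file://localhost/home/skoranda/file1",
):
    print(describe(url))
```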

globus-url-copy

- To put a file onto the server, reverse the URLs

    $ globus-url-copy file:/home/skoranda/file1 gsiftp://dataserver.phys.uwm.edu/data/file1

- By default 1 data channel is used
    - average performance
    - monitor performance using the -vb flag

    $ globus-url-copy -vb gsiftp://ldas-cit.ligo.caltech.edu:15000/usr1/grid/smallfile file:/tmp/smallfile
    9437184 bytes 658.09 KB/sec avg 512.95 KB/sec inst

Going fast

- Multiple channels dramatically boost the transfer rate

    $ globus-url-copy -vb -p 4 gsiftp://ldas-cit.ligo.caltech.edu:15000/usr1/grid/largefile file:/tmp/largefile
    523960320 bytes 5814.25 KB/sec avg 5568.27 KB/sec inst

- Still faster by using large TCP windows

    $ globus-url-copy -vb -p 4 -tcp-bs 1048576 gsiftp://ldas-cit.ligo.caltech.edu:15000/usr1/grid/largefile file:/tmp/largefile
    514392064 bytes 6609.67 KB/sec avg 8639.71 KB/sec inst

- Still faster by using large memory buffers

    $ globus-url-copy -vb -p 4 -bs 1048576 -tcp-bs 1048576 gsiftp://ldas-cit.ligo.caltech.edu:15000/usr1/grid/largefile file:/tmp/largefile
    523304960 bytes 7300.56 KB/sec avg 9311.99 KB/sec inst
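A common rule of thumb for choosing a TCP window (the `-tcp-bs` value) is the bandwidth-delay product: the number of bytes that can be in flight on the path at once. The sketch below is illustrative only; the 100 Mbit/s bottleneck and 80 ms round-trip time are hypothetical numbers, not measurements from the slides, though they happen to give a value close to the 1048576 bytes used above.

```python
def bandwidth_delay_product_bytes(bandwidth_bits_per_s, rtt_ms):
    """Bytes in flight needed to keep a path full: bandwidth x round-trip time."""
    # Integer arithmetic: bits/s * ms / 1000 gives bits, / 8 gives bytes.
    return bandwidth_bits_per_s * rtt_ms // 8000


# Hypothetical path: 100 Mbit/s bottleneck, 80 ms round-trip time.
window = bandwidth_delay_product_bytes(100_000_000, 80)
print(f"suggested TCP buffer: {window} bytes")
```

With several parallel channels (`-p`), each connection carries only part of the traffic, which is one reason parallel streams can help when per-connection buffers are limited.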

Faster!

- Depending on network & weather you can go very fast!

    $ globus-url-copy -vb -p 8 -bs 1048576 -tcp-bs 1048576 gsiftp://ldas-cit.ligo.caltech.edu:15000/usr1/grid/largefile file:/tmp/largefile
    185270272 bytes 18092.57 KB/sec avg 25153.96 KB/sec inst

Third-party transfers

- Transfers from server to server directed by client
    - Use gsiftp:// URLs for both
    - requires both servers be configured to allow 3rd party transfers

    $ hostname
    basil.phys.uwm.edu
    $ globus-url-copy gsiftp://hydra.phys.uwm.edu/tmp/file1 gsiftp://contra.phys.uwm.edu/tmp/file1
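When transfers are scripted, it can help to assemble the command as an argument list. The sketch below is a hypothetical helper that only builds the list; it uses just the flags shown on these slides (`-vb`, `-p`, `-bs`, `-tcp-bs`) and does not run anything. The function and parameter names are invented for illustration.

```python
def build_copy_command(src, dst, verbose=False, parallel=None,
                       block_size=None, tcp_buffer=None):
    """Build a globus-url-copy argument list from the flags shown on the slides."""
    argv = ["globus-url-copy"]
    if verbose:
        argv.append("-vb")                      # report transfer performance
    if parallel is not None:
        argv += ["-p", str(parallel)]           # number of parallel data channels
    if block_size is not None:
        argv += ["-bs", str(block_size)]        # memory buffer size in bytes
    if tcp_buffer is not None:
        argv += ["-tcp-bs", str(tcp_buffer)]    # TCP buffer (window) size in bytes
    return argv + [src, dst]


def is_third_party(src, dst):
    """A transfer is third-party when both ends are gsiftp:// URLs."""
    return src.startswith("gsiftp://") and dst.startswith("gsiftp://")


src = "gsiftp://hydra.phys.uwm.edu/tmp/file1"
dst = "gsiftp://contra.phys.uwm.edu/tmp/file1"
print(build_copy_command(src, dst, verbose=True, parallel=4))
print("third-party:", is_third_party(src, dst))
```

The resulting list could be handed to `subprocess.run` on a machine where globus-url-copy and a valid GSI credential are available.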

Debugging

Use -dbg to see control channel communication

    $ globus-url-copy -dbg gsiftp://hydra.phys.uwm.edu/tmp/file1 file:/tmp/file1
    debug: starting to get gsiftp://hydra.phys.uwm.edu/tmp/file1
    debug: connecting to gsiftp://hydra.phys.uwm.edu/tmp/file1
    debug: response from gsiftp://hydra.phys.uwm.edu/tmp/file1:
    220 hydra.phys.uwm.edu GridFTP Server 1.12 GSSAPI type Globus/GSI wu-2.6.2 (gcc32dbg, 1069715860-42) ready.
    debug: authenticating with gsiftp://hydra.phys.uwm.edu/tmp/file1
    debug: response from gsiftp://hydra.phys.uwm.edu/tmp/file1:
    230 User skoranda logged in.
    debug: sending command:
    FEAT
    debug: response from gsiftp://hydra.phys.uwm.edu/tmp/file1:
    211-Extensions supported: REST STREAM ESTO ERET MDTM SIZE PARALLEL DCAU
    211 END
    <snip>

globus-url-copy

- Actually a general purpose URL copying tool
    - No GSI authentication used
    - Parallel channels and the like won’t work

    $ globus-url-copy http://www.yahoo.com file:/tmp/yahoo
    $ globus-url-copy ftp://ftp.globus.org/banner.msg file:/tmp/banner.msg

GridFTP clients

- UberFTP
    - developed and supported at the National Center for Supercomputing Applications (NCSA)
    - interactive, like our old (insecure) friend ‘ftp’
    - use -a GSI for GSI authentication
    - supports multiple channels using the -c flag

    $ uberftp -H hydra.phys.uwm.edu -a GSI
    220 hydra.phys.uwm.edu GridFTP Server 1.12 GSSAPI type Globus/GSI wu-2.6.2 (gcc32dbg, 1069715860-42) ready.
    230 User skoranda logged in.
    uberftp>

GridFTP clients

- “Roll your own”
    - Add functionality directly to your applications
        - Could your application find and download its own data?
        - Could your application deliver output data files when finished computing?
    - Globus Toolkit offers APIs to code against
        - C
        - Java
        - Python

GridFTP and Firewalls

- Nice document by Globus team at
    http://www.globus.org/security/firewalls/Globus Firewall Requirements-5.pdf
- Tip: when debugging GridFTP and firewalls, remember which way connections are established
    - 1 single data channel
        - data connection established from client to server
    - 2 or more data channels
        - data connection established in direction data will flow
    - control connection always from client to server
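The tip above can be written down as a small decision function, which is handy when working out which firewall must accept inbound connections. This is a sketch of the rule exactly as stated on the slide; the function name and the `"client"`/`"server"` labels are hypothetical, and "in direction data will flow" is read here as "the sending side opens the connection".

```python
def data_connection_initiator(channels, data_flows_to):
    """Which side opens the data connection(s), per the rule on the slide.

    channels      -- number of data channels (1 or more)
    data_flows_to -- "client" for a download, "server" for an upload
    """
    if channels < 1:
        raise ValueError("need at least one data channel")
    if data_flows_to not in ("client", "server"):
        raise ValueError("data_flows_to must be 'client' or 'server'")
    if channels == 1:
        return "client"          # single channel: always client to server
    # Two or more channels: the connection follows the data, so the sender connects.
    return "server" if data_flows_to == "client" else "client"


CONTROL_CONNECTION_INITIATOR = "client"  # control connection: always client to server

print(data_connection_initiator(1, "client"))  # client
print(data_connection_initiator(4, "client"))  # server
print(data_connection_initiator(4, "server"))  # client
```

So a parallel download is the case where the client's firewall has to admit connections coming from the server.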

Hints for Experts

To make GridFTP go really fast

- use fast disks/filesystems
    - filesystem should read/write > 30 MB/second
- configure TCP for performance
    - See TCP Tuning Guide at http://www-didc.lbl.gov/TCP-tuning/
- patch your Linux kernel with web100 patch
    - See http://www.web100.org
    - Important work-around for Linux TCP “feature”
- understand your network path

Three Data Questions on the Grid

1. What data/files exist?

2. What data/files are where?

3. How do I move data/files from A to B?

What data/files are where?

Requirements

- Catalog 10^8 files and their locations
    - What files are where (possibly at more than one place)
    - Across multiple sites within a Grid
    - Mappings from logical filenames (LFNs) to physical filenames (PFNs) or URLs
- No single point of failure
    - No central catalog/server to be single point of failure

Globus Replica Location Service

- Globus RLS
- Each RLS server usually runs two catalogs
    - LRC
        - Local replica catalog
        - Catalog of what files you have (LFNs) and mappings to URL(s) or PFNs
    - RLI
        - Replica location index
        - Catalog of which files (LFNs) other LRCs in your data grid know about

Globus RLS

- Network of RLS servers inform each other
- Each site has LRC with mappings of LFNs to PFNs
    - usually contains the “local” mappings, for files located at the site
    - Site at Milwaukee might have this mapping in its LRC
        H-R-792845521-16.gwf → gsiftp://dataserver.phys.uwm.edu/LIGO/H-R-792845521-16.gwf
- LRC catalog at each site tells remote RLIs what LFNs it has mappings for
    - Milwaukee tells Caltech it has a mapping for H-R-792845521-16.gwf
    - So Caltech RLI has mapping
        H-R-792845521-16.gwf → LRC at Milwaukee

Globus RLS

[Figure: two sites, each running an RLS server with an LRC and an RLI]

site A (rls://serverA:39281), holding file1 and file2
    LRC: file1 → gsiftp://serverA/file1
         file2 → gsiftp://serverA/file2
    RLI: file3 → rls://serverB/file3
         file4 → rls://serverB/file4

site B (rls://serverB:39281), holding file3 and file4
    LRC: file3 → gsiftp://serverB/file3
         file4 → gsiftp://serverB/file4
    RLI: file1 → rls://serverA/file1
         file2 → rls://serverA/file2

Globus RLS

Typical way to query RLS network and find files in your Grid

- Ask your local LRC “do you know about the file H-R-793274271.gwf?”
- If yes…
    - Ask your local LRC for the corresponding URL(s)
    - It answers “H-R-793274271.gwf is at URL gsiftp://basil.phys.uwm.edu/LIGO/H-R-793274271.gwf”
- If no…
    - Ask your local RLI “who does know about this file?”
    - It answers “The RLS server at MIT knows about this file”
    - Go ask the MIT RLS server “I am told you know about the file H-R-793274271.gwf…please tell me the URL for it?”
    - It answers “H-R-793274271.gwf is at URL gsiftp://ldas.mit.edu/LIGO/H-R-793274271.gwf”
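The query procedure above is a two-level lookup, which can be sketched with plain dictionaries. This is a toy in-memory model, not the RLS client API: the `RLSServer` class and `locate` function are hypothetical names, and the server URLs are reused from elsewhere in these slides purely as labels.

```python
class RLSServer:
    """Toy stand-in for one RLS server holding an LRC and an RLI."""

    def __init__(self, url):
        self.url = url
        self.lrc = {}   # LFN -> list of PFNs (local replica catalog)
        self.rli = {}   # LFN -> set of RLS server URLs whose LRC knows the LFN


def locate(lfn, local, servers):
    """Return the PFNs for an LFN: ask the local LRC first, then follow the RLI."""
    if lfn in local.lrc:
        return list(local.lrc[lfn])
    pfns = []
    for url in sorted(local.rli.get(lfn, ())):
        remote = servers.get(url)
        if remote is not None:
            # May be empty if the RLI answer was a false positive.
            pfns.extend(remote.lrc.get(lfn, []))
    return pfns


uwm = RLSServer("rls://dataserver.phys.uwm.edu")
mit = RLSServer("rls://ldas.mit.edu")
servers = {uwm.url: uwm, mit.url: mit}

lfn = "H-R-793274271.gwf"
mit.lrc[lfn] = ["gsiftp://ldas.mit.edu/LIGO/H-R-793274271.gwf"]
uwm.rli[lfn] = {mit.url}            # MIT's LRC has told the UWM RLI about this LFN

print(locate(lfn, uwm, servers))    # found via the RLI, then the MIT LRC
print(locate("unknown.gwf", uwm, servers))
```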

Globus RLS

Quick Review

- LFN → logical filename (think of as simple filename)
- PFN → physical filename (think of as a URL)
- LRC → your local catalog of maps from LFNs to PFNs
    H-R-792845521-16.gwf → gsiftp://dataserver.phys.uwm.edu/LIGO/H-R-792845521-16.gwf
- RLI → your local catalog of maps from LFNs to LRCs
    H-R-792845521-16.gwf → LRCs at MIT, PSU, Caltech, and UW-M
- LRCs inform RLIs about mappings known
- Find files on your Grid by querying RLI(s) to get LRC(s), then query LRC(s) to get URL(s)

Globus RLS: Server Perspective

1. Listens on port 39281 (default) for clients
2. Responds to client queries
    - what LFNs in local catalog, the LRC?
    - what other LRCs know about LFNs?
    - checks against access control list for each client
3. Accepts publishing of new LFNs into LRC
    - add files to local catalog
4. Sends updates of LRC to other servers
    - tell remote RLI catalogs what LFNs you have mappings for locally

Globus RLS: Server Perspective

- Listens on port 39281 (default) for clients
- Server address is URL
    - rls://dataserver.phys.uwm.edu
    - rls://dataserver.phys.uwm.edu:39281
    - rls://dataserver
    - rls://localhost
- Uses a host certificate to identify itself
    - must run as root if host cert is owned by root
    - often copy host cert/key to a non-root, limited-privilege account and configure the server to use that copy

Globus RLS: Server Perspective

- Mappings LFNs → PFNs kept in database
    - Uses generic ODBC interface to talk to any (good) RDBMS
    - MySQL, PostgreSQL, Oracle, DB2,...
- All RDBMS details hidden from administrator and user
    - well, not quite
    - RDBMS may need to be “tuned” for performance
    - but one can start off knowing very little about RDBMSs

Globus RLS: Server Perspective

- Mappings LFNs → LRCs stored in 1 of 2 ways
    - table in database
        - full, complete listing from LRCs that update your RLI
        - requires each LRC to send your RLI full, complete list
        - as number of LFNs in catalog grows, this becomes substantial
        - 10^8 filenames at 64 bytes per filename ~ 6 GB
    - in memory in a special hash called Bloom filter
        - 10^8 filenames stored in as little as 256 MB
        - easy for LRC to create Bloom filter and send over network to RLIs
        - can cause RLI to lie when asked if it knows about an LFN
            - only false-positives
            - tunable error rate
            - acceptable in many contexts
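The two sizes quoted above can be compared with the standard Bloom filter estimate p ≈ (1 − e^(−kn/m))^k for n items, m bits, and k hash functions. The sketch below is illustrative: it assumes 256 MB means 256 × 2^20 bytes and picks the textbook-optimal k, neither of which is stated on the slide, and it says nothing about the parameters the RLS server actually uses.

```python
import math


def lfn_list_bytes(num_lfns, bytes_per_name=64):
    """Size of a full LFN list at a fixed number of bytes per filename."""
    return num_lfns * bytes_per_name


def bloom_false_positive_rate(num_items, num_bits, num_hashes):
    """Standard approximation: (1 - exp(-k*n/m)) ** k."""
    return (1.0 - math.exp(-num_hashes * num_items / num_bits)) ** num_hashes


n = 10**8                                  # filenames, as on the slide
m = 256 * 2**20 * 8                        # 256 MB expressed in bits (binary MB assumed)
k = max(1, round(m / n * math.log(2)))     # textbook-optimal number of hash functions
p = bloom_false_positive_rate(n, m, k)

print(f"full list: {lfn_list_bytes(n) / 1e9:.1f} GB")
print(f"Bloom filter: {m / n:.1f} bits per LFN, k = {k}, false-positive rate ~ {p:.1e}")
```

Under these assumptions the filter is about 25 times smaller than the full list while wrongly answering "yes" for only a few LFNs in every hundred thousand it has never seen.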

Globus RLS: Configuring the Server

- Single configuration file
    - usually $GLOBUS_LOCATION/etc/globus-rls-server.conf
- Send server a HUP signal to refresh configuration
    kill -SIGHUP <pid>
- Access control
    - each “client” given one or more of
        - lrc_read : permission to query the LRC for mappings
        - lrc_update : permission to add new mappings in LRC
        - rli_read : permission to query RLI for mappings
        - rli_update : permission to inform RLI of remote LRC mappings
        - stats : permission to query server for statistics
        - admin : permission to change configuration on the fly

Globus RLS: Configuring the Server

- Access control
    - access given to certificate subject
        acl /DC=org/DC=doegrids/OU=People/CN=Scott Koranda: lrc_read
    - access given to UID mapped in grid-mapfile
        - which grid-mapfile is examined is controlled by the GRIDMAP environment variable
        acl skoranda: lrc_read
    - must give remote LRCs permission to update your RLI
        - remote RLS server uses host certificate to identify itself
        acl /DC=org/DC=doegrids/OU=Services/CN=ldas.mit.edu: rli_update
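To make the shape of these `acl` lines concrete, here is a small parser sketch. It is a hypothetical helper, not part of RLS: it assumes the form `acl <identity>: <permissions>` seen above, assumes multiple permissions would be separated by whitespace, and matches identities by exact string comparison, which may be simpler than what the real server does.

```python
KNOWN_PERMISSIONS = {"lrc_read", "lrc_update", "rli_read", "rli_update", "stats", "admin"}


def parse_acl_line(line):
    """Split 'acl <identity>: <perm> [<perm> ...]' into (identity, set of permissions)."""
    keyword, _, rest = line.strip().partition(" ")
    if keyword != "acl":
        raise ValueError(f"not an acl line: {line!r}")
    # The identity may contain spaces (certificate subjects), so split on the last colon.
    who, sep, perms = rest.rpartition(":")
    granted = set(perms.split())
    if not sep or not granted or granted - KNOWN_PERMISSIONS:
        raise ValueError(f"malformed acl line: {line!r}")
    return who.strip(), granted


def is_allowed(acl_lines, identity, permission):
    """True if any acl line grants the permission to exactly this identity."""
    return any(
        who == identity and permission in granted
        for who, granted in map(parse_acl_line, acl_lines)
    )


ACL = [
    "acl /DC=org/DC=doegrids/OU=People/CN=Scott Koranda: lrc_read",
    "acl skoranda: lrc_read",
    "acl /DC=org/DC=doegrids/OU=Services/CN=ldas.mit.edu: rli_update",
]
print(parse_acl_line(ACL[0]))
print(is_allowed(ACL, "/DC=org/DC=doegrids/OU=Services/CN=ldas.mit.edu", "rli_update"))
```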

Globus RLS: Configuring the Server

- globus-rls-admin tool for configuration
    - need GSI credential to talk to server
    - must have acl with admin privileges for your credential
    - manual page is available

    NAME
        globus-rls-admin - Replica Location Service Administration
    SYNOPSIS
        globus-rls-admin -A|-a|-C option value|-c option|-D|-d|-e|-p|-q|-r|-S|-s|-t timeout|-u|-v [ rli ] [ pattern ] [ server ]
    DESCRIPTION
        The program globus-rls-admin performs administrative operations on a RLS server (see globus-rls-server(8)).

- ping the server to see if alive

    $ globus-rls-admin -p rls://localhost
    ping rls://localhost: 0 seconds

Globus RLS: Configuring the Server

- Query server for statistics

    $ globus-rls-admin -S rls://localhost
    Version: 2.1.5
    Uptime: 02:46:19
    LRC stats
        update method: lfnlist
        update method: bloomfilter
        updates bloomfilter: rls://mini.astro.cf.ac.uk:39281 last 06/15/04 11:39:12
        updates bloomfilter: rls://ygraine.aei.mpg.de:39281 last 12/31/69 18:00:00
        updates bloomfilter: rls://ldas-cit.ligo.caltech.edu:39281 last 12/31/69 18:00:00
        lfnlist update interval: 86400
        bloomfilter update interval: 900
        numlfn: 4110878
        numpfn: 12328767
        nummap: 12328775
    RLI stats
        updated by: rls://mini.astro.cf.ac.uk:39281 last 06/15/04 11:47:56
        updated by: rls://ygraine.aei.mpg.de:39281 last 06/15/04 11:25:23
        updated by: rls://ldas-cit.ligo.caltech.edu:39281 last 06/15/04 11:43:31
        updated via bloomfilters
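Two details of this output are easy to check. The timestamp 12/31/69 18:00:00 is what Unix time 0 looks like in a UTC−6 time zone, so it most likely means "no update sent yet"; that reading, and the assumption that the server clock was on US Central Standard Time, are inferences rather than something the slide states. The counters also show roughly three mappings per LFN.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the server reported local time in US Central Standard Time (UTC-6).
CENTRAL_STANDARD = timezone(timedelta(hours=-6))

# Unix time 0 rendered in the same format as the statistics output.
never = datetime.fromtimestamp(0, tz=CENTRAL_STANDARD).strftime("%m/%d/%y %H:%M:%S")
print(never)  # 12/31/69 18:00:00

# Counters copied from the statistics output above.
numlfn, nummap = 4110878, 12328775
maps_per_lfn = nummap / numlfn
print(f"average mappings per LFN: {maps_per_lfn:.2f}")
```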

Globus RLS: Configuring the Server

- Tell LRC what remote RLIs to update
    - local LRC should update the RLI at MIT using Bloom filter

    $ globus-rls-admin -A rls://ldas.mit.edu rls://localhost

    - use -a if updating via lists rather than Bloom filter

Globus RLS: Client Perspective

Two ways for clients to interact with RLS Server

- globus-rls-cli, a simple command-line tool
    - query
    - create new mappings
- “roll your own” client by coding against API
    - Java
    - C
    - Python

globus-rls-cli

- Simple query to LRC to find a PFN for LFN
    - Note more than 1 PFN may be returned

    $ globus-rls-cli query lrc lfn H-R-714024224-16.gwf rls://dataserver:39281
    H-R-714024224-16.gwf: file://localhost/netdata/s001/S1/R/H/714023808-714029599/H-R-714024224-16.gwf
    H-R-714024224-16.gwf: file://medusa-slave001.medusa.phys.uwm.edu/data/S1/R/H/714023808-714029599/H-R-714024224-16.gwf
    H-R-714024224-16.gwf: gsiftp://dataserver.phys.uwm.edu:15000/data/gsiftp_root/cluster_storage/data/s001/S1/R/H/714023808-714029599/H-R-714024224-16.gwf

- Server and client sane if LFN not found

    $ globus-rls-cli query lrc lfn "foo" rls://dataserver
    LFN doesn't exist: foo
    $ echo $?
    1
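Because each result line has the form `LFN: target` and a failed lookup sets a non-zero exit status, the output is easy to consume from a script. The parser below is a hypothetical sketch based only on the output shown on these slides; it takes the text and exit status as arguments rather than running globus-rls-cli itself.

```python
def parse_query_output(text, exit_status=0):
    """Turn 'LFN: target' lines into {LFN: [targets]}; empty if the command failed."""
    if exit_status != 0:
        # e.g. "LFN doesn't exist: foo" arrives with exit status 1; do not parse it.
        return {}
    mappings = {}
    for line in text.splitlines():
        lfn, sep, target = line.strip().partition(": ")
        if sep:
            mappings.setdefault(lfn, []).append(target)
    return mappings


# Sample text shortened from the slide output (paths trimmed for brevity).
SAMPLE = """\
H-R-714024224-16.gwf: file://localhost/netdata/H-R-714024224-16.gwf
H-R-714024224-16.gwf: gsiftp://dataserver.phys.uwm.edu:15000/data/H-R-714024224-16.gwf
"""

result = parse_query_output(SAMPLE)
print(result["H-R-714024224-16.gwf"])
print(parse_query_output("LFN doesn't exist: foo", exit_status=1))
```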

globus-rls-cli

- Be sure to quote LFN if it has funny characters

    $ globus-rls-cli query lrc lfn file& rls://dataserver
    [1] 16346
    bash: rls://dataserver: No such file or directory
    [datarobot@dataserver datarobot]$ connect(file): Bad URL: globus_url_parse(file): Error code -3
    [1]+ Exit 1 globus-rls-cli query lrc lfn file
    [datarobot@dataserver datarobot]$ globus-rls-cli query lrc lfn "file&" rls://dataserver
    LFN doesn't exist: file&
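When command lines are generated by a script, the quoting can be left to Python's standard `shlex` module instead of being done by hand. This sketch only builds and prints the command line; nothing is executed.

```python
import shlex

# The LFN "file&" contains a character the shell treats specially.
argv = ["globus-rls-cli", "query", "lrc", "lfn", "file&", "rls://dataserver"]

# shlex.join quotes only the arguments that need it.
command_line = shlex.join(argv)
print(command_line)  # globus-rls-cli query lrc lfn 'file&' rls://dataserver

# Splitting the quoted line gives back the original arguments unchanged.
print(shlex.split(command_line) == argv)  # True
```

Passing the list form directly to `subprocess.run(argv)` avoids the shell, and with it the quoting problem, altogether.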

globus-rls-cli

- Wildcard searches of LRC supported
    - probably a good idea to quote LFN wildcard expression

    $ globus-rls-cli query wildcard lrc lfn "H-R-7140242*-16.gwf" rls://dataserver:39281
    H-R-714024208-16.gwf: gsiftp://dataserver.phys.uwm.edu:15000/data/gsiftp_root/cluster_storage/data/s001/S1/R/H/714023808-714029599/H-R-714024208-16.gwf
    H-R-714024224-16.gwf: gsiftp://dataserver.phys.uwm.edu:15000/data/gsiftp_root/cluster_storage/data/s001/S1/R/H/714023808-714029599/H-R-714024224-16.gwf

globus-rls-cli

- Bulk queries also supported
    - obtain PFNs for more than one LFN at a time

    $ globus-rls-cli bulk query lrc lfn H-R-714024224-16.gwf H-R-714024320-16.gwf rls://dataserver
    H-R-714024320-16.gwf: gsiftp://dataserver.phys.uwm.edu:15000/data/gsiftp_root/cluster_storage/data/s001/S1/R/H/714023808-714029599/H-R-714024320-16.gwf
    H-R-714024224-16.gwf: gsiftp://dataserver.phys.uwm.edu:15000/data/gsiftp_root/cluster_storage/data/s001/S1/R/H/714023808-714029599/H-R-714024224-16.gwf

globus-rls-cli

- Simple query to RLI to locate a LFN to LRC map
    - then query that LRC for the PFN

    $ globus-rls-cli query rli lfn H-R-714024224-16.gwf rls://dataserver
    H-R-714024224-16.gwf: rls://ldas-cit.ligo.caltech.edu:39281
    $ globus-rls-cli query lrc lfn H-R-714024224-16.gwf rls://ldas-cit.ligo.caltech.edu:39281
    H-R-714024224-16.gwf: gsiftp://ldas-cit.ligo.caltech.edu:15000/archive/S1/L0/LHO/H-R-7140/H-R-714024224-16.gwf

globus-rls-cli

- Bulk queries to RLI also supported

    $ globus-rls-cli bulk query rli lfn H-R-714024224-16.gwf H-R-714024320-16.gwf rls://dataserver
    H-R-714024320-16.gwf: rls://ldas-cit.ligo.caltech.edu:39281
    H-R-714024224-16.gwf: rls://ldas-cit.ligo.caltech.edu:39281

- Wildcard queries to RLI may not be supported!
    - no wildcards when using Bloom filter updates

    $ globus-rls-cli query wildcard rli lfn "H-R-7140242*-16.gwf" rls://dataserver
    Operation is unsupported: Wildcard searches with Bloom filters

globus-rls-cli

RLS with Bloom filter updates to RLI

- fast and efficient
- Bloom filter is hash of information in a LRC
- remote LRC creates Bloom filter and sends it to RLI
- RLI can test to see if a particular LFN is in the LRC’s Bloom filter
- can’t do a wildcard search
- will sometimes lie!
    - only false positives
- if you can’t have any false positives, use full list updates
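A minimal Bloom filter makes these properties visible: membership tests never miss an LFN that was added, can occasionally say "yes" for one that was not, and offer nothing to enumerate or pattern-match against. The sketch below is a generic textbook construction using SHA-256 from the standard library; the hash scheme and sizes are illustrative assumptions, not what the RLS server implements.

```python
import hashlib


class BloomFilter:
    """Textbook Bloom filter over strings: a bit array plus several hash functions."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive one bit position per hash function by salting SHA-256 with its index.
        data = item.encode("utf-8")
        for i in range(self.num_hashes):
            digest = hashlib.sha256(i.to_bytes(4, "big") + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True only if every bit for this item is set; other items may have set them.
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


# The LRC side builds the filter; the RLI side only needs the bit array to answer queries.
lrc_filter = BloomFilter(num_bits=8192, num_hashes=3)
for lfn in ("H-R-714024208-16.gwf", "H-R-714024224-16.gwf"):
    lrc_filter.add(lfn)

print("H-R-714024224-16.gwf" in lrc_filter)   # True: added items are always found
print("H-R-999999999-16.gwf" in lrc_filter)   # almost certainly False in a sparse filter
```

Since the filter stores only bits and not the names themselves, a wildcard search has nothing to match against, which fits the "Operation is unsupported" error shown earlier.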

globus-rls-cli

- Create new LFN → PFN mappings
    - use create to create 1st mapping for a LFN

    $ globus-rls-cli create file1 gsiftp://dataserver/file1 rls://dataserver

    - use add to add more mappings for a LFN

    $ globus-rls-cli add file1 file://dataserver/file1 rls://dataserver

    - use delete to remove a mapping for a LFN
        - when last mapping is deleted for a LFN the LFN is also deleted
        - cannot have LFN in LRC without a mapping

    $ globus-rls-cli delete file1 file://file1 rls://dataserver
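The create/add/delete rules above can be modelled in a few lines. This is a toy in-memory model, not the RLS API: the class and method names are hypothetical, and the choice to raise an error when `create` is used on an existing LFN or `add` on a missing one is an assumption about sensible behavior rather than documented server behavior.

```python
class LocalReplicaCatalog:
    """Toy LRC: an LFN exists only while it has at least one PFN mapping."""

    def __init__(self):
        self._maps = {}   # LFN -> list of PFNs

    def create(self, lfn, pfn):
        """Register the first mapping for a new LFN."""
        if lfn in self._maps:
            raise KeyError(f"LFN already exists: {lfn}")
        self._maps[lfn] = [pfn]

    def add(self, lfn, pfn):
        """Add a further mapping to an LFN that already exists."""
        if lfn not in self._maps:
            raise KeyError(f"LFN doesn't exist: {lfn}")
        if pfn not in self._maps[lfn]:
            self._maps[lfn].append(pfn)

    def delete(self, lfn, pfn):
        """Remove one mapping; the LFN disappears with its last mapping."""
        self._maps[lfn].remove(pfn)
        if not self._maps[lfn]:
            del self._maps[lfn]

    def query(self, lfn):
        return list(self._maps.get(lfn, []))


lrc = LocalReplicaCatalog()
lrc.create("file1", "gsiftp://dataserver/file1")
lrc.add("file1", "file://dataserver/file1")
lrc.delete("file1", "gsiftp://dataserver/file1")
print(lrc.query("file1"))    # one mapping left
lrc.delete("file1", "file://dataserver/file1")
print(lrc.query("file1"))    # LFN is gone along with its last mapping
```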


Globus-rls-cli

LRC can also store attributes about LFNs and PFNs
• size of an LFN in bytes?
• md5 checksum for an LFN?
• ranking for a PFN or URL?
• extensible... you choose which attributes to create and add
• can search the catalog on the attributes
• attributes limited to
    strings
    integers
    floating point (double)
    date/time


Globus-rls-cli

• Create the attribute first, then add values for LFNs

$ globus-rls-cli attribute define md5checksum lfn string rls://dataserver

$ globus-rls-cli attribute add file1 md5checksum lfn string 42947c86b8a08f067b178d56a77b2650 rls://dataserver

• Then query on the attribute

$ globus-rls-cli attribute query file1 md5checksum lfn rls://dataserver
md5checksum: string: 42947c86b8a08f067b178d56a77b2650
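One way to obtain the value stored in the md5checksum attribute is to hash the file before registering it. The sketch below uses Python's standard hashlib. The helper names `md5_hex` and `attribute_add_argv` are hypothetical, and the argument order simply mirrors the "attribute add" command shown above.

```python
import hashlib


def md5_hex(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def attribute_add_argv(lfn, checksum, server):
    """Build (but do not run) the 'attribute add' command line from the slide."""
    return ["globus-rls-cli", "attribute", "add", lfn,
            "md5checksum", "lfn", "string", checksum, server]
```

Reading in chunks keeps memory use flat even for large data files.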


Three Data Questions on the Grid

1. What data/files exist?

2. What data/files are where?

3. How do I move data/files from A to B?


Metadata Catalog

Metadata catalog
• store data about... data!
• help answer the question of what data exists

MCS from Globus
• still a research project
• one realization of a metadata catalog
• other projects offer solutions with different capabilities and limitations
• very active research on what type of service a metadata catalog should offer
• how should metadata information flow from site to site?
• is there a single solution for most uses on the Grid?


Metadata Catalog

One scenario useful in a Data Grid
• data generated/collected into files at some detector site
• location of data files published into RLS
    H-R-714024224-16.gwf → gsiftp://someserver/path/to/H-R-714024224-16.gwf
• existence of data files and important metadata published into the metadata catalog
    H-R-714024224-16.gwf →
        data from detector in Hanford, WA
        raw data file contains all data (no downsampling)
        data starts at GPS time 714024224
        file contains 16 seconds of data
        detector was in “science” mode with good noise properties
        a simulated pulsar signal was being injected at the time
        the operator on duty was D. Brown
        the calibration parameters are = 1.5643 and = 2.22984
        and so on...
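A few of these metadata items appear to be encoded in the LFN itself. The sketch below parses the name under the assumption that it follows a site-type-GPSstart-duration.gwf pattern, which is a reading of the example above rather than a documented rule. `FrameName` and `parse_lfn` are hypothetical names.

```python
import re
from typing import NamedTuple


class FrameName(NamedTuple):
    site: str        # e.g. "H" (the slide associates this file with Hanford, WA)
    frame_type: str  # e.g. "R" (the slide describes this file as raw data)
    gps_start: int   # GPS time of the first sample
    duration: int    # seconds of data in the file

    @property
    def gps_end(self):
        return self.gps_start + self.duration


# Assumed naming pattern: <site>-<type>-<gps start>-<duration>.gwf
_PATTERN = re.compile(r"^([A-Z]+)-([A-Za-z0-9_]+)-(\d+)-(\d+)\.gwf$")


def parse_lfn(lfn):
    match = _PATTERN.match(lfn)
    if match is None:
        raise ValueError(f"unrecognized LFN: {lfn}")
    site, frame_type, start, duration = match.groups()
    return FrameName(site, frame_type, int(start), int(duration))


print(parse_lfn("H-R-714024224-16.gwf").gps_end)  # 714024240
```

Items such as the detector mode, the operator on duty, or the calibration values cannot be recovered from the name. That is the part only a metadata catalog can answer.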


Metadata Catalog

To run an application that analyzes the data on the Grid

1. Query metadata catalog for LFNs that contain data of interest

Q: “Show me files where interferometer was locked and calibration had < 1.6 for GPS times from 714024240 to 714024340”

A:
H-R-714024224-16.gwf
H-R-714024240-16.gwf
H-R-714024256-16.gwf
H-R-714024272-16.gwf
H-R-714024288-16.gwf
H-R-714024304-16.gwf
H-R-714024320-16.gwf
H-R-714024336-16.gwf

2. Query RLI catalog to find out where those LFNs/files are known about

$ globus-rls-cli query rli lfn H-R-714024224-16.gwf rls://dataserver
H-R-714024224-16.gwf: rls://ldas-cit.ligo.caltech.edu:39281


Metadata Catalog

3. Query LRC catalog to get URLs for those files of interest

$ globus-rls-cli query lrc lfn H-R-714024224-16.gwf rls://ldas-cit.ligo.caltech.edu:39281
H-R-714024224-16.gwf: gsiftp://ldas-cit.ligo.caltech.edu:15000/archive/S1/L0/LHO/H-R-7140/H-R-714024224-16.gwf

4. Move files from storage to analysis site using GridFTP

$ globus-url-copy -p 4 gsiftp://ldas-cit.ligo.caltech.edu:15000/archive/S1/L0/LHO/H-R-7140/H-R-714024224-16.gwf gsiftp://hydra.phys.uwm.edu/skoranda/analysis1/H-R-714024224-16.gwf
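The four steps can be strung together as in the sketch below. None of it talks to a real service: the metadata catalog, RLI, and LRC are in-memory stand-ins, and the function names are hypothetical. The metadata query covers only the GPS time range (the "locked" and calibration conditions are left out), and it treats both the request and each file's span as closed intervals, an assumption that happens to reproduce the eight-file answer shown in step 1. The copy command is built but not executed.

```python
LRC_URL = "rls://ldas-cit.ligo.caltech.edu:39281"
BASE = "gsiftp://ldas-cit.ligo.caltech.edu:15000/archive/S1/L0/LHO/H-R-7140/"

# Step 1 stand-in: metadata rows of (LFN, GPS start, duration in seconds).
METADATA = [(f"H-R-{start}-16.gwf", start, 16)
            for start in range(714024224, 714024224 + 16 * 10, 16)]
# Step 2 stand-in: the RLI maps an LFN to the LRCs that know about it.
RLI = {lfn: [LRC_URL] for lfn, _, _ in METADATA}
# Step 3 stand-in: each LRC maps an LFN to its physical URLs.
LRCS = {LRC_URL: {lfn: [BASE + lfn] for lfn, _, _ in METADATA}}


def find_lfns(gps_from, gps_to):
    # Closed-interval overlap test between the request and each file's span.
    return [lfn for lfn, start, duration in METADATA
            if start <= gps_to and start + duration >= gps_from]


def resolve(lfn):
    # RLI lookup followed by an LRC lookup at every site that was returned.
    return [pfn for lrc in RLI.get(lfn, []) for pfn in LRCS[lrc].get(lfn, [])]


def copy_command(src_url, dest_dir, streams=4):
    # Step 4: argument list for globus-url-copy with parallel streams.
    filename = src_url.rsplit("/", 1)[1]
    return ["globus-url-copy", "-p", str(streams), src_url, dest_dir + filename]


def plan(gps_from, gps_to, dest_dir):
    commands = []
    for lfn in find_lfns(gps_from, gps_to):
        pfns = resolve(lfn)
        if pfns:  # skip LFNs with no known replica
            commands.append(copy_command(pfns[0], dest_dir))
    return commands


for argv in plan(714024240, 714024340,
                 "gsiftp://hydra.phys.uwm.edu/skoranda/analysis1/"):
    print(" ".join(argv))
```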


Summary

Metadata catalog, Globus RLS, and Globus GridFTP provide a powerful way to manage data on the Grid and do more science
• figure out what data/files are needed
• find it
• move it
• do science with it!


But… What about a higher-level tool?

We want something that will…
• Locate the data
• Send data to processing sites
• Share the results with other sites
• Allocate and de-allocate storage
• Clean up everything
• Do these reliably, efficiently, and without human supervision


Stork

• A scheduler for data placement activities in the Grid
• What Condor is for computational jobs, Stork is for data placement
• Stork comes with a new concept:

“Make data placement a first class citizen in the Grid.”


The Concept

• Stage-in
• Execute the Job
• Stage-out

[Figure: the three steps expanded into individual jobs: allocate space for input & output data, stage-in, execute the job, release input space, stage-out, release output space.]


The Concept

• Stage-in
• Execute the Job
• Stage-out

[Figure: the same expanded steps (allocate space for input & output data, stage-in, execute the job, release input space, stage-out, release output space), grouped into data placement jobs and computational jobs.]


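The Concept

[Figure: DAGMan reads the DAG specification (a graph of nodes A through F) and feeds two queues: the Condor job queue (shown holding C) and the Stork job queue (shown holding E).]

DAG specification:

DaP A A.submit
DaP B B.submit
Job C C.submit
…..
Parent A child B
Parent B child C
Parent C child D, E
…..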
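The routing idea can be sketched by parsing a specification like the one above, ordering the nodes by their Parent/child dependencies, and sending DaP nodes to Stork and Job nodes to Condor. This is a toy, not DAGMan: `parse_dag` and `dispatch_order` are hypothetical helpers. The declarations for D and E are assumptions, since the slide elides them (the figure does show E in the Stork queue). It uses Python's standard graphlib module (Python 3.9+).

```python
from graphlib import TopologicalSorter

DAG_TEXT = """\
DaP A A.submit
DaP B B.submit
Job C C.submit
DaP D D.submit
DaP E E.submit
Parent A child B
Parent B child C
Parent C child D, E
"""  # the D and E declarations are assumed; the slide elides them


def parse_dag(text):
    kinds = {}    # node -> "DaP" or "Job"
    parents = {}  # node -> set of nodes that must finish first
    for line in text.splitlines():
        words = line.replace(",", " ").split()
        if not words:
            continue
        if words[0] in ("DaP", "Job"):
            kinds[words[1]] = words[0]
            parents.setdefault(words[1], set())
        elif words[0] == "Parent":
            split = words.index("child")
            for child in words[split + 1:]:
                parents.setdefault(child, set()).update(words[1:split])
    return kinds, parents


def dispatch_order(text):
    kinds, parents = parse_dag(text)
    # static_order() yields every node after all of its parents.
    order = TopologicalSorter(parents).static_order()
    return [(node, "Stork" if kinds[node] == "DaP" else "Condor")
            for node in order]


for node, queue in dispatch_order(DAG_TEXT):
    print(f"{node} -> {queue} job queue")
```

A real workflow manager submits a node only once its parents have actually finished. The static order here shows just one valid sequence.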


Why Stork?

• Stork understands the characteristics and semantics of data placement jobs.
• Can make smart scheduling decisions, for reliable and efficient data placement.
• Integrates seamlessly with Condor-G


Failure Recovery and Efficient Resource Utilization

• Fault tolerance
    Just submit a bunch of data placement jobs, and then go away..
• Control number of concurrent transfers from/to any storage system
    Prevents overloading
• Space allocation and de-allocation
    Make sure space is available
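As a rough illustration of capping concurrent transfers per storage system, the sketch below gives every host a fixed number of slots and makes each transfer hold one slot on its source and one on its destination. This is not how Stork is implemented: `TransferThrottle`, the per-host limit of 2, the example hostnames, and the `fake_transfer` placeholder are all assumptions.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlsplit


class TransferThrottle:
    """Limit how many transfers touch any one host at the same time."""

    def __init__(self, max_per_host=2):
        self.max_per_host = max_per_host
        self._lock = threading.Lock()
        self._slots = {}  # host -> semaphore counting free slots

    def _slot(self, host):
        with self._lock:
            if host not in self._slots:
                self._slots[host] = threading.BoundedSemaphore(self.max_per_host)
            return self._slots[host]

    def run(self, src_url, dest_url, transfer):
        # Acquire slots in sorted host order so two transfers cannot deadlock.
        hosts = sorted({urlsplit(src_url).hostname or "",
                        urlsplit(dest_url).hostname or ""})
        slots = [self._slot(host) for host in hosts]
        for slot in slots:
            slot.acquire()
        try:
            return transfer(src_url, dest_url)
        finally:
            for slot in reversed(slots):
                slot.release()


def fake_transfer(src_url, dest_url):
    time.sleep(0.01)  # placeholder for real data movement
    return dest_url


throttle = TransferThrottle(max_per_host=2)
jobs = [(f"gsiftp://source.example/f{i}", f"gsiftp://dest.example/f{i}")
        for i in range(6)]
with ThreadPoolExecutor(max_workers=6) as pool:
    done = list(pool.map(lambda job: throttle.run(*job, fake_transfer), jobs))
print(len(done), "transfers finished")
```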


Support for Heterogeneity

Protocol translation using Stork memory buffer.


Support for Heterogeneity

Protocol translation using Stork Disk Cache.


Flexible Job Representation and Multilevel Policy Support

[
  Type = "Transfer";
  Src_Url = "srb://ghidorac.sdsc.edu/kosart.condor/x.dat";
  Dest_Url = "nest://turkey.cs.wisc.edu/kosart/x.dat";
  ……
  ……
  Max_Retry = 10;
  Restart_in = "2 hours";
]
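A retry policy like Max_Retry can be sketched as a small wrapper around the transfer call. The semantics here are an assumption: Max_Retry is read as the number of retries allowed after the first attempt, and Restart_in is not modelled at all. `run_with_retries` is a hypothetical helper, not Stork code.

```python
import time


def run_with_retries(transfer, max_retry=10, delay_seconds=0.0, sleep=time.sleep):
    """Call transfer() until it succeeds or max_retry retries are used up.

    Returns (result, attempts). Re-raises the last OSError once retries run out.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return transfer(), attempts
        except OSError:
            if attempts > max_retry:
                raise
            sleep(delay_seconds)  # back off before the next attempt


# Usage: a transfer that fails twice and then succeeds.
outcomes = iter([OSError("timeout"), OSError("refused"), "ok"])


def flaky_transfer():
    outcome = next(outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome


print(run_with_retries(flaky_transfer, max_retry=10))  # ('ok', 3)
```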


Run-time Adaptation

Dynamic protocol selection

[
  dap_type = "transfer";
  src_url = "drouter://slic04.sdsc.edu/tmp/test.dat";
  dest_url = "drouter://quest2.ncsa.uiuc.edu/tmp/test.dat";
  alt_protocols = "nest-nest, gsiftp-gsiftp";
]

[
  dap_type = "transfer";
  src_url = "any://slic04.sdsc.edu/tmp/test.dat";
  dest_url = "any://quest2.ncsa.uiuc.edu/tmp/test.dat";
]
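The first job description above suggests a simple fallback loop: try the protocol named in the URLs, then each source-destination pair listed in alt_protocols. The sketch below shows that loop under assumptions. `candidate_urls` and `run_job` are hypothetical helpers, the job is a plain dictionary rather than a ClassAd, failures are assumed to surface as OSError, and the "any://" form from the second job is not modelled.

```python
from urllib.parse import urlsplit


def candidate_urls(job):
    """Yield (src_url, dest_url) pairs: the primary protocol, then alternatives."""
    src = urlsplit(job["src_url"])
    dest = urlsplit(job["dest_url"])
    pairs = [(src.scheme, dest.scheme)]
    for item in job.get("alt_protocols", "").split(","):
        if item.strip():
            src_proto, dest_proto = item.strip().split("-")
            pairs.append((src_proto, dest_proto))
    for src_proto, dest_proto in pairs:
        yield (f"{src_proto}://{src.netloc}{src.path}",
               f"{dest_proto}://{dest.netloc}{dest.path}")


def run_job(job, transfer):
    """Try each candidate pair in order; return the pair that worked."""
    errors = []
    for src_url, dest_url in candidate_urls(job):
        try:
            transfer(src_url, dest_url)
            return src_url, dest_url
        except OSError as exc:
            errors.append(f"{src_url}: {exc}")
    raise OSError("all protocols failed: " + "; ".join(errors))


job = {
    "dap_type": "transfer",
    "src_url": "drouter://slic04.sdsc.edu/tmp/test.dat",
    "dest_url": "drouter://quest2.ncsa.uiuc.edu/tmp/test.dat",
    "alt_protocols": "nest-nest, gsiftp-gsiftp",
}
for pair in candidate_urls(job):
    print(pair)
```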


Run-time Adaptation

Run-time Protocol Auto-tuning

[
  link = "slic04.sdsc.edu – quest2.ncsa.uiuc.edu";
  protocol = "gsiftp";
  bs = 1024KB; //block size
  tcp_bs = 1024KB; //TCP buffer size
  p = 4;
]