SRM 2.2 Issues Well, er, and 2.3 too Jens Jensen (STFC RAL/GridNet2) On behalf of GSM-WG OGF22, Cambridge, MA

Page 1: SRM 2.2 Issues Well, er, and 2.3 too

SRM 2.2 Issues

Well, er, and 2.3 too

Jens Jensen (STFC RAL/GridNet2)

On behalf of GSM-WG

OGF22, Cambridge, MA

Page 2: SRM 2.2 Issues Well, er, and 2.3 too

This Talk

• Deviates from previous principles of being for beginners
  – Technical
  – Less polished…
  – May be useful for others…

• Expose standard and protocol process
  – Not many answers; kickstart (restart) the process

• Combines the two sessions
  – Input (mainly) from dCache, CASTOR, StoRM

Page 3: SRM 2.2 Issues Well, er, and 2.3 too

Aims

• Revisit specification
  – Implementations’ deviations from OGF specifications
  – Ensure another group can interoperate
  – If someone else were to start from scratch
  – E.g. SRB (ASGC work)

• Aim is not to start work on 2.3
  – I.e. the aim is not – not the aim is to not, not that aim is not to start
  – If that makes sense

Page 4: SRM 2.2 Issues Well, er, and 2.3 too

A Very Brief History

• Spec from 2006

• Then came implementations

• Then came WLCG

• …revisit spec

• Now getting experiences

• …revisit spec, highlight issues

• …think about next steps

Page 5: SRM 2.2 Issues Well, er, and 2.3 too

Philosophies

• Manage diverse storage systems (but nothing else)

• User interface (not admin)

• Open Standard
  – A standard is not a standard until it is a standard (next slide)

• Open participation (no fees, no closed societies)

• Protect storage from Grid?

• Encourage best practices?

• Encourage uniformity? Allow diversity?

• The File is the unit of currency (not datasets)

Page 6: SRM 2.2 Issues Well, er, and 2.3 too

Compare OASIS

• “Approved within an OASIS Committee,”

• “Submitted for public review,”

• “Implemented by at least three organizations,”

• “And finally ratified by the Consortium's membership at-large.”

• We would add that the three implementations “must interoperate”!

Page 7: SRM 2.2 Issues Well, er, and 2.3 too

WLCG

• Wide deployment

• “Now get experience” with WLCG

• MoU: Significant changes to spec…

• Do they make sense? Process.

• What about smaller customers?

• Tape1Disk1=ONLINE_AND_NEARLINE?
  – …No. In cache does not mean always in cache

Page 8: SRM 2.2 Issues Well, er, and 2.3 too

Space Tokens on Get

• srmPrepareToPut uses a space token (description)

• srmPrepareToGet doesn’t
  – Also for srmBringOnline

• Problem for many implementations
  – dCache, CASTOR
  – dCache: MSS doesn’t see space token
  – StoRM: not needed

Page 9: SRM 2.2 Issues Well, er, and 2.3 too

Other get issues

• Getting directories?
  – Not supported?
  – Or special permissions required?
  – Also to apply for large bulk requests?

Page 10: SRM 2.2 Issues Well, er, and 2.3 too

Finance Use Cases

• Ezio Corso (ICTP/E-Grid) (StoRM)
  – Compare EGEE industry liaison
  – “Complexity of financial instruments”
  – “more stringent risking and reporting requirements”
  – “Point solution” grids inefficient (silo)
  – Big computing makes data bottleneck
  – Access control by individuals

Page 11: SRM 2.2 Issues Well, er, and 2.3 too

Spaces

• Access Control on spaces
  – Also to be published in GLUE 1.3 schema as ACBR on VOInfo

• Reserving subspaces of spaces

• Summarising spaces for Owner

• Query space status?

Page 12: SRM 2.2 Issues Well, er, and 2.3 too

What is a Space Anyway?

• A collection of at least one physical storage component area?

• With a common baseline set of capabilities (access latency etc)?

• Not to even mention “free” space, “used” space, etc.
  – Tricky to define
  – Even more tricky to measure
  – Still more tricky to get agreement

Page 13: SRM 2.2 Issues Well, er, and 2.3 too

What is a Space anyway?

• Is everything a space?
  – Suggestion to have top-level static spaces

• Is disk a space? Or can space have disk?

• Spaces can be named by token descriptions
  – Always named by space token description?
  – Can be referenced by path? Non-uniquely?
  – Can be referenced (non-uniquely) by capabilities?

• Is a (static) space an SA?

Page 14: SRM 2.2 Issues Well, er, and 2.3 too

Space Behaviour

• What happens if a file is released?
  – Space given back to the Space?
  – Space does not re-grow?

• Permanent file in limited space?
  – Used to be: not permitted
  – Now, space is shrunk and released
  – Keep token around, or permit recycling?
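The two candidate answers to "what happens if a file is released?" can be contrasted with a toy model. This is only an illustrative sketch: the `Space` class, its fields, and the `regrow_on_release` switch are hypothetical names invented here, not anything defined by the SRM specification or by an implementation.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Space:
    """Toy reserved space; sizes are in arbitrary units (hypothetical model)."""
    total: int
    regrow_on_release: bool
    files: Dict[str, int] = field(default_factory=dict)

    def free(self) -> int:
        # Free space is whatever the reservation holds minus the files in it.
        return self.total - sum(self.files.values())

    def put(self, name: str, size: int) -> None:
        if size > self.free():
            raise ValueError("not enough free space for %s" % name)
        self.files[name] = size

    def release(self, name: str) -> None:
        size = self.files.pop(name)
        if not self.regrow_on_release:
            # Policy "space does not re-grow": the reservation shrinks.
            self.total -= size
        # Otherwise the bytes are simply given back to the space.


def free_after_release(regrow: bool) -> int:
    space = Space(total=100, regrow_on_release=regrow)
    space.put("file1", 40)
    space.release("file1")
    return space.free()


print(free_after_release(True))   # space given back
print(free_after_release(False))  # space shrunk
```

Under the first policy the reservation returns to its full size; under the second it ends up smaller by the size of the released file, which is why the choice is visible to clients.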

Page 15: SRM 2.2 Issues Well, er, and 2.3 too

Permissions

• Simple Unixy (POSIX) permissions

• Default permissions on directories
  – Inheritance from above?
  – Consistent with space permissions, if applicable?
  – Default (per VO?)

• Permit for roles and groups?

• Stage in permission (protect write cache)
  – Not the same as reading

Page 16: SRM 2.2 Issues Well, er, and 2.3 too

Permissions

• StoRM calls out to LFC
  – Access control API in SRM not adequate
  – Use LFC’s API

• Multiple StoRMs can share an LFC

• => Can synchronise between SE and LFC

Page 17: SRM 2.2 Issues Well, er, and 2.3 too

Return Codes

• SRM_REQUEST_QUEUED

• SRM_REQUEST_INPROGRESS

• srmCopy()
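A client typically meets the two status codes above while polling an asynchronous request. A minimal sketch of such a loop follows, assuming a hypothetical `get_status` callback in place of a real status call; `wait_for_request`, `max_polls`, and `delay` are names invented for this illustration, and `SRM_SUCCESS` stands in for any terminal code.

```python
import time
from typing import Callable

# The two pending codes come from the slide; SRM_SUCCESS is used here
# only as an example of a terminal status.
SRM_REQUEST_QUEUED = "SRM_REQUEST_QUEUED"
SRM_REQUEST_INPROGRESS = "SRM_REQUEST_INPROGRESS"
SRM_SUCCESS = "SRM_SUCCESS"

PENDING = {SRM_REQUEST_QUEUED, SRM_REQUEST_INPROGRESS}


def wait_for_request(get_status: Callable[[], str],
                     max_polls: int = 10,
                     delay: float = 0.0) -> str:
    """Poll a (hypothetical) status callback until the request is no longer pending."""
    for _ in range(max_polls):
        status = get_status()
        if status not in PENDING:
            return status
        time.sleep(delay)
    raise TimeoutError("request still pending after %d polls" % max_polls)


# Simulated server replies standing in for a real status call.
replies = iter([SRM_REQUEST_QUEUED, SRM_REQUEST_INPROGRESS, SRM_SUCCESS])
final_status = wait_for_request(lambda: next(replies))
print(final_status)
```

The sketch treats both codes the same way, as "keep waiting"; whether a client should behave differently for queued and in-progress requests is one of the questions the slide leaves open.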

Page 18: SRM 2.2 Issues Well, er, and 2.3 too

Use of GSI authentication

• Currently using SOAP over GSI sockets

• GSI needed for delegation

• Delegation needed for srmCopy() (only)

• Incompatible with SSL

• Proposal to use gLite delegation
  – SOAP API specifically for delegation
  – AstroGrid uses home-made REST-based

• Not using WS-Anything
  – Many are Java only, too complex, not mature

Page 19: SRM 2.2 Issues Well, er, and 2.3 too

FileStorageType

• Volatile, Durable, Permanent

• Should have been:

• ReleaseWhenExpired, WarnWhenExpired, NeverExpire
  – Avoid confusion with overloaded term from 1.1 (wrongly named in spec)

• What is done on Durable/WarnWE timeout? (“raise error condition”)
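The renaming proposed above can be written down as a small lookup. This is a sketch, assuming the three proposed names map onto the three existing types in the order listed; the `on_lifetime_expired` function and its return strings are hypothetical, and the Durable case only echoes the slide's quoted "raise error condition" without saying what that means in practice.

```python
from enum import Enum
from typing import Optional


class FileStorageType(Enum):
    """The three names used in the 2.2 spec, as listed on the slide."""
    VOLATILE = "Volatile"
    DURABLE = "Durable"
    PERMANENT = "Permanent"


# Names the slide says the types should have had (assumed same order).
PROPOSED_NAME = {
    FileStorageType.VOLATILE: "ReleaseWhenExpired",
    FileStorageType.DURABLE: "WarnWhenExpired",
    FileStorageType.PERMANENT: "NeverExpire",
}


def on_lifetime_expired(storage_type: FileStorageType) -> Optional[str]:
    """Return an illustrative action label for a file whose lifetime ran out."""
    if storage_type is FileStorageType.VOLATILE:
        return "release file"
    if storage_type is FileStorageType.DURABLE:
        return "raise error condition"
    return None  # Permanent files never expire, so there is nothing to do.


for storage_type in FileStorageType:
    print(storage_type.value, "->", PROPOSED_NAME[storage_type])
```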

Page 20: SRM 2.2 Issues Well, er, and 2.3 too

Access Latency

• OFFLINE not defined

• Not used by WLCG

• But does that mean it doesn’t exist?

• ONLINE_AND_NEARLINE mentioned

• LOST…

• UNAVAILABLE…

Page 21: SRM 2.2 Issues Well, er, and 2.3 too

Default

• Certain aspects of API optional
  – Standard default?
  – Or implementation-defined default?
  – E.g., “default” space

• Default filesize on put?
  – Is it 1?
  – Is it implementation dependent? Space dependent?
  – Is it returned?

Page 22: SRM 2.2 Issues Well, er, and 2.3 too

Implicit

• Implicit pinning

• Implicit reservations

• Implicit lifetimes

• Implicit changes on action

• Implicit changes on expiry

• Surprising for users?

• Complicates implementations?

• What if permission denied for implicit action?

• What is reasonable?

Page 23: SRM 2.2 Issues Well, er, and 2.3 too

Explicit but unknown

• Changing spaces (capabilities)
  – WLCG restricted D1T1 <-> D0T1 (more or less)

Page 24: SRM 2.2 Issues Well, er, and 2.3 too

Best Practices for Clients

• Propagate errors to user

• Clean up after yourself…
  – Even after unclean exit

• Should SRM use request timeout and keepalive?
  – Cancel at any point?
  – Or only when queueing
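The first two practices (propagate errors, clean up after yourself) can be sketched with a context manager. Everything here is hypothetical: `FakeClient`, `prepare_to_get`, and `abort_request` are stand-ins invented for the example rather than a real client library, and a `finally` block only covers exceptions, not a killed process.

```python
from contextlib import contextmanager


class FakeClient:
    """Stand-in for an SRM client; it only tracks which requests are open."""

    def __init__(self):
        self.open_requests = set()
        self._counter = 0

    def prepare_to_get(self, surl):
        self._counter += 1
        token = "req-%d" % self._counter
        self.open_requests.add(token)
        return token

    def abort_request(self, token):
        self.open_requests.discard(token)


@contextmanager
def get_request(client, surl):
    """Open a request and make sure it is closed again, even on error."""
    token = client.prepare_to_get(surl)
    try:
        yield token
    finally:
        # Runs on normal exit and when the body raises, so no request
        # is left behind on the (fake) server.
        client.abort_request(token)


client = FakeClient()
error_message = None
try:
    with get_request(client, "srm://example.org/data/file1") as token:
        raise RuntimeError("transfer failed")  # simulated failure
except RuntimeError as exc:
    # The error is not swallowed by the cleanup; it reaches the caller.
    error_message = str(exc)

print(error_message, client.open_requests)
```

A real client would probably release or complete the request on success rather than abort it; the sketch uses one cleanup call for both paths to stay short.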

Page 25: SRM 2.2 Issues Well, er, and 2.3 too

srmCopy

• Was always slightly tricky (also in 1.0, 1.1)

• Needs delegation (GSI problem)

• How and when does client check status?

• What if remote host is not an SRM2?

• Push modes and pull modes, and firewalls

• And then the GridFTP modes (push/pull)

• And the GridFTP streams

• Can’t always get good results if implementation uses defaults or tries to guess

• No way to set most parameters

Page 26: SRM 2.2 Issues Well, er, and 2.3 too

srmLs problem

• Classical problem with large directories

• Exercise: on a normal filesystem, run ls -R dir on large directories. While you wait, try to use the system.

• Large data volumes in SOAP
  – Attachment supported?

• Truncate, offset
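One way to read "truncate, offset" is that the server returns at most a fixed number of entries per call and the client asks for the next slice. A minimal sketch under that assumption follows; `iter_listing` and the `list_page(offset, count)` callback are hypothetical names, and the in-memory list stands in for a real directory.

```python
from typing import Callable, Iterator, List


def iter_listing(list_page: Callable[[int, int], List[str]],
                 page_size: int = 1000) -> Iterator[str]:
    """Yield directory entries by fetching one truncated page at a time."""
    offset = 0
    while True:
        page = list_page(offset, page_size)
        yield from page
        if len(page) < page_size:
            return  # a short (or empty) page means the listing is finished
        offset += len(page)


# Fake large directory and a fake paged listing call.
entries = ["file%04d" % i for i in range(2500)]


def fake_list_page(offset: int, count: int) -> List[str]:
    return entries[offset:offset + count]


listed = list(iter_listing(fake_list_page, page_size=1000))
print(len(listed))
```

Each call carries a bounded amount of data, which addresses the SOAP volume concern, at the price of extra round trips and of the directory possibly changing between pages.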

Page 27: SRM 2.2 Issues Well, er, and 2.3 too

Which bits are optional…?

• Many features

• Most parameters

• TExtraInfo

Page 28: SRM 2.2 Issues Well, er, and 2.3 too

Next Steps

• Continue this process

• Define terminology

• Assess “damage”

• 2.3
  – No, not yet
  – Too soon, not enough experience with 2.2
  – Adaptation difficult

Options

• Do nothing
  – Too late (WLCG)

• Document differences

• Retrofit things into 2.2

• Add to 2.2 (incremental)

• Postpone to “2.3”

• Postpone to 3.1

Page 29: SRM 2.2 Issues Well, er, and 2.3 too

Future Stuff

• WSRF
  – Rich Wellner (2004)
  – (WSRT?)

• Avoid duplication

• Compare OGSA-D-Arch
  – Proposes modular architecture for data

Page 30: SRM 2.2 Issues Well, er, and 2.3 too

More Capabilities

• Integrity checking
  – Act when integrity checking fails?

• Service description, agreement (dynamic)

• File content

• Data sets, chunks

• Dynamic resource allocation
  – Networks, additional storage, disk servers (now known as virtualisation)
  – Recovery