19
Scalla/xrootd Scalla/xrootd Introduction Introduction Andrew Hanushevsky, SLAC SLAC National Accelerator Laboratory Stanford University 6-April-09 ATLAS Western Tier 2 User’s Forum

Scalla/xrootd Introduction Andrew Hanushevsky, SLAC SLAC National Accelerator Laboratory Stanford University 6-April-09 ATLAS Western Tier 2 User’s Forum

Embed Size (px)

Citation preview

Page 1: Scalla/xrootd Introduction Andrew Hanushevsky, SLAC SLAC National Accelerator Laboratory Stanford University 6-April-09 ATLAS Western Tier 2 User’s Forum

Scalla/xrootd IntroductionScalla/xrootd Introduction

Andrew Hanushevsky, SLAC

SLAC National Accelerator LaboratoryStanford University

6-April-09

ATLAS Western Tier 2 User’s Forum

Page 2: Scalla/xrootd Introduction Andrew Hanushevsky, SLAC SLAC National Accelerator Laboratory Stanford University 6-April-09 ATLAS Western Tier 2 User’s Forum

ATLAS WT2 UF 6-Apr-09 2

Outline

File servers NFS & xrootd

How xrootd manages file data Multiple file servers (i.e., clustering) Considerations and pitfalls

Getting to xrootd hosted file data Available programs and interfaces

Page 3: Scalla/xrootd Introduction Andrew Hanushevsky, SLAC SLAC National Accelerator Laboratory Stanford University 6-April-09 ATLAS Western Tier 2 User’s Forum

ATLAS WT2 UF 6-Apr-09 3

File Server Types

ApplicationApplication

LinuxLinux

NFS ServerNFS Server

LinuxLinux

NFS ClientNFS Client

Client MachineClient Machine Server MachineServer Machine

AlternativelyAlternatively

xrootd is nothing more than an xrootd is nothing more than an application level file server & clientapplication level file server & client

using another protocolusing another protocol

DataDataFilesFiles

ApplicationApplication

LinuxLinuxLinuxLinuxClient MachineClient Machine Server MachineServer Machine

xroot Serverxroot Server DataDataFilesFiles

xroot Clientxroot Client

Page 4: Scalla/xrootd Introduction Andrew Hanushevsky, SLAC SLAC National Accelerator Laboratory Stanford University 6-April-09 ATLAS Western Tier 2 User’s Forum

ATLAS WT2 UF 6-Apr-09 4

Why Not Just Use NFS?

NFS V2 & V3 inadequate Scaling problems with large batch farms Unwieldy when more than one server needed

NFS V4? Relatively new Multiple server support still being vetted

Still has a single point of failure problems

Page 5: Scalla/xrootd Introduction Andrew Hanushevsky, SLAC SLAC National Accelerator Laboratory Stanford University 6-April-09 ATLAS Western Tier 2 User’s Forum

ATLAS WT2 UF 6-Apr-09 5

NFS & Multiple File Servers

Which Server?Which Server?

LinuxLinux

NFS ServerNFS Server

Server Machine AServer Machine A

DataDataFilesFilesApplicationApplication

LinuxLinux

NFS ClientNFS Client

Client MachineClient Machine

LinuxLinux

NFS ServerNFS Server

Server Machine BServer Machine B

DataDataFilesFiles

open(“/foo”);open(“/foo”);

NFS can’t naturally deal with this problem.NFS can’t naturally deal with this problem.Typical ad hoc solutions are cumbersome,Typical ad hoc solutions are cumbersome,

restrictive and error prone!restrictive and error prone!

cp /foo /tmp

Page 6: Scalla/xrootd Introduction Andrew Hanushevsky, SLAC SLAC National Accelerator Laboratory Stanford University 6-April-09 ATLAS Western Tier 2 User’s Forum

ATLAS WT2 UF 6-Apr-09 6

xrootd & Multiple File Servers I

DataDataFilesFiles

ApplicationApplication

LinuxLinuxClient MachineClient Machine

LinuxLinuxServer Machine BServer Machine B

DataDataFilesFiles

open(“/foo”);open(“/foo”);xroot Clientxroot Client LinuxLinux

Server Machine AServer Machine A

xroot Serverxroot Server

xroot Serverxroot Server

LinuxLinuxServer Machine RServer Machine R

xroot Serverxroot Server

/foo/foo

RedirectorRedirector11

Who has /foo?Who has /foo?

22I do!I do!

33 Try BTry B

44open(“/foo”);open(“/foo”);

xrdcp root://R//foo /tmp

The xroot client does all of theseThe xroot client does all of thesesteps automatically steps automatically

without application (user)without application (user)intervention!intervention!

Page 7: Scalla/xrootd Introduction Andrew Hanushevsky, SLAC SLAC National Accelerator Laboratory Stanford University 6-April-09 ATLAS Western Tier 2 User’s Forum

ATLAS WT2 UF 6-Apr-09 7

File Discovery Considerations I

The redirector does not have a catalog of files It always asks each server, and Caches the answers in memory for a “while”

So, it won’t ask again when asked about a past lookup

Allows real-time configuration changes Clients never see the disruption

Does have some side-effects The lookup takes less than a microsecond when files exist Much longer when a requested file does not exist!

Page 8: Scalla/xrootd Introduction Andrew Hanushevsky, SLAC SLAC National Accelerator Laboratory Stanford University 6-April-09 ATLAS Western Tier 2 User’s Forum

ATLAS WT2 UF 6-Apr-09 8

xrootd & Multiple File Servers II

DataDataFilesFiles

ApplicationApplication

LinuxLinuxClient MachineClient Machine

LinuxLinuxServer Machine BServer Machine B

DataDataFilesFiles

open(“/foo”);open(“/foo”);xroot Clientxroot Client LinuxLinux

Server Machine AServer Machine A

xroot Serverxroot Server

xroot Serverxroot Server

LinuxLinuxServer Machine RServer Machine R

xroot Serverxroot Server

/foo/foo

RedirectorRedirector11

Who has /foo?Who has /foo?22 Nope!Nope!

55

File deemed not to existFile deemed not to existif there is no responseif there is no response

after 5 seconds!after 5 seconds!

xrdcp root://R//foo /tmp

Page 9: Scalla/xrootd Introduction Andrew Hanushevsky, SLAC SLAC National Accelerator Laboratory Stanford University 6-April-09 ATLAS Western Tier 2 User’s Forum

ATLAS WT2 UF 6-Apr-09 9

File Discovery Considerations II

System optimized for “file exists” case! Penalty for going after missing files

Aren’t new files, by definition, missing? Yes, but that involves writing data!

The system is optimized for reading data So, creating a new file will suffer a 5 second delay

Can minimize the delay by using the xprep command Primes the redirector’s file memory cache ahead of time

Can files appear to be missing any other way?

Page 10: Scalla/xrootd Introduction Andrew Hanushevsky, SLAC SLAC National Accelerator Laboratory Stanford University 6-April-09 ATLAS Western Tier 2 User’s Forum

ATLAS WT2 UF 6-Apr-09 10

Missing File vs. Missing Server

In xroot files exist to the extent servers exist The redirector cushions this effect for 10 minutes

Afterwards, the redirector cannot tell the difference This allows partially dead server clusters to continue

Jobs hunting for “missing” files will eventually die But jobs cannot rely on files actually being missing

xroot cannot provide a definitive answer to “s: x”

This requires manual safety for file creation

Page 11: Scalla/xrootd Introduction Andrew Hanushevsky, SLAC SLAC National Accelerator Laboratory Stanford University 6-April-09 ATLAS Western Tier 2 User’s Forum

ATLAS WT2 UF 6-Apr-09 11

Safe File Creation

Avoiding the basic problem.... Today’s new file may be on yesterday’s dead server

Generally, do not re-use output file names

Otherwise, serialize file creation Use temporary file names when creating new files

E.g., path/....root.temp

Remove temporary to clean-up any previous failures E.g., -f xrdcp option or truncate option on open

Upon success, rename the temporary to its permanent name

Page 12: Scalla/xrootd Introduction Andrew Hanushevsky, SLAC SLAC National Accelerator Laboratory Stanford University 6-April-09 ATLAS Western Tier 2 User’s Forum

ATLAS WT2 UF 6-Apr-09 12

Getting to xrootd hosted data

Use the root framework Automatically, when files named root://.... Manually, use TXNetFile() object

Note: identical TFile() object will not work with xrootd!

xrdcp The copy command

xprep The redirector seeder command

Via fuse on atlint01.slac.stanford.eduPOSIX preload library

Page 13: Scalla/xrootd Introduction Andrew Hanushevsky, SLAC SLAC National Accelerator Laboratory Stanford University 6-April-09 ATLAS Western Tier 2 User’s Forum

ATLAS WT2 UF 6-Apr-09 13

Copying xrootd hosted data

xrdcp [options] source dest Copies data to/from xrootd servers Some handy options:

-f erase dest before copying source -s stealth mode (i.e., produce no status

messages) -S n use n parallel streams (use only

across WAN)

Page 14: Scalla/xrootd Introduction Andrew Hanushevsky, SLAC SLAC National Accelerator Laboratory Stanford University 6-April-09 ATLAS Western Tier 2 User’s Forum

ATLAS WT2 UF 6-Apr-09 14

Preparing xrootd hosted data

xprep [options] host[:port] [path [...]] Prepares xrootd access via redirector host:port

Minimizes wait time if you are creating many files Some handy options:

-w file will be created or written -f fn file fn holds a list of paths, one per

line

Page 15: Scalla/xrootd Introduction Andrew Hanushevsky, SLAC SLAC National Accelerator Laboratory Stanford University 6-April-09 ATLAS Western Tier 2 User’s Forum

ATLAS WT2 UF 6-Apr-09 15

Interactive xrootd hosted data

Atlas xroot redirector mounted as a file system “/xrootd” on atlint01.slac.stanford.edu Use this for typical operations

dq2-get dq2-put dq2-ls rm

Page 16: Scalla/xrootd Introduction Andrew Hanushevsky, SLAC SLAC National Accelerator Laboratory Stanford University 6-April-09 ATLAS Western Tier 2 User’s Forum

ATLAS WT2 UF 6-Apr-09 16

For Everything Else

POSIX preload library (libXrdPosixPreload.so)

Works with any POSIX I/O compliant program Provides direct access to xrootd hosted data Does not need any changes to the application

Just run the binary as is Talk to Wei or Andy if you want to use it

Page 17: Scalla/xrootd Introduction Andrew Hanushevsky, SLAC SLAC National Accelerator Laboratory Stanford University 6-April-09 ATLAS Western Tier 2 User’s Forum

ATLAS WT2 UF 6-Apr-09 17

Conclusion

We hope that this is an effective environment Production Analysis

But, we need your feedback What is unclear What is missing What is not working What can work even better

Page 18: Scalla/xrootd Introduction Andrew Hanushevsky, SLAC SLAC National Accelerator Laboratory Stanford University 6-April-09 ATLAS Western Tier 2 User’s Forum

ATLAS WT2 UF 6-Apr-09 18

Future Directions

More simplicity! Integrating the cnsdcnsd into cmsdcmsd

Reduces configuration issues Pre-linking the extended open file system (ofs)

Less configuration options

Tutorial-like guides! Apparent need as we deploy at smaller sites

Page 19: Scalla/xrootd Introduction Andrew Hanushevsky, SLAC SLAC National Accelerator Laboratory Stanford University 6-April-09 ATLAS Western Tier 2 User’s Forum

ATLAS WT2 UF 6-Apr-09 19

Acknowledgements

Software Contributors Alice: Derek Feichtinger CERN: Fabrizio Furano , Andreas Peters Fermi: Tony Johnson (Java)

Root: Gerri Ganis, Beterand Bellenet, Fons Rademakers STAR/BNL: Pavel Jackl SLAC: Jacek Becla, Tofigh Azemoon, Wilko Kroeger LBNL: Alex Sim, Junmin Gu, Vijaya Natarajan (BeStManBeStMan team)

Operational Collaborators BNL, FZK, IN2P3, RAL, UVIC, UTA

Partial Funding US Department of Energy

Contract DE-AC02-76SF00515 with Stanford University