xrootd Demonstrator Infrastructure
OSG All Hands Meeting, Harvard University, March 7-11, 2011
Andrew Hanushevsky, SLAC
http://xrootd.org
Goals
- Describe xrootd architecture configurations
  - Show how these can be used by the demos: Alice (in production), Atlas, and CMS
- Overview of the File Residency Manager
  - How it addresses file placement
- Cover recent and future developments
- Conclusion
The Motivation
- Can we access HEP data as a single repository?
  - Treat it like a Virtual Mass Storage System
- Is cache-driven grid data distribution feasible?
  - The last missing file issue (Alice production)
  - Adaptive file placement at Tier 3s (Atlas demo)
  - Analysis at storage-starved sites (CMS demo)
- Does xrootd provide the needed infrastructure?
A Simple xrootd Cluster

[Diagram: a client, a manager (a.k.a. redirector), and data servers A, B, and C, each node running an xrootd/cmsd pair; copies of /my/file reside on servers A and B.]

1: Client issues open("/my/file") to the manager.
2: Manager asks the data servers: who has "/my/file"?
3: Servers A and B respond: "I DO!"
4: Manager tells the client: try open() at A.
5: Client issues open("/my/file") directly to server A.
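From the client's point of view the cluster is a single host: it only ever names the redirector. As an illustration (the hostname is hypothetical), copying a file out of the cluster looks like this:

xrdcp root://redirector.example.org//my/file /tmp/my-file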
The Fundamentals
- An xrootd-cmsd pair is the building block
  - xrootd provides the client interface; it handles data and redirections
  - cmsd manages the xrootds (i.e., forms clusters); it monitors activity and handles file discovery
- The building block is uniformly stackable
  - Can build a wide variety of configurations, much like you would with Lego® blocks (see the sketch below)
- Extensive plug-ins provide adaptability
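As a minimal sketch of how roles are assigned, using the standard all.manager/all.role directives and the configuration file's if/else/fi host conditionals (the hostname is hypothetical), one file can serve every node in the cluster:

# Every node names the manager's cmsd
all.manager redirector.example.org:3121

# The same file assigns each host its role
if redirector.example.org
   all.role manager
else
   all.role server
fi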
Federating xrootd Clusters

[Diagram: three distributed clusters at ANL, SLAC, and UTA, each with a manager (a.k.a. local redirector) and data servers A, B, and C; a meta-manager (a.k.a. global redirector) federates the three sites. A copy of /my/file exists at each site.]

1: Client issues open("/my/file") to the meta-manager.
2: Meta-manager asks the local managers: who has "/my/file"?
3: Each local manager asks its data servers: who has "/my/file"?
4: Sites holding the file respond: "I DO!"
5: Meta-manager tells the client: try open() at ANL.
6: Client issues open("/my/file") to the ANL manager.
7: ANL manager tells the client: try open() at A.
8: Client issues open("/my/file") directly to server A.

Data is uniformly available from three distinct sites, and the lookup is an exponentially parallel search (i.e., O(2^n)). But what if a site is behind a firewall; can it still play?
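A hedged sketch of how such a federation is wired up, using the meta forms of the all.role and all.manager directives (the hostname is hypothetical): the global redirector declares itself, and each site's local redirector subscribes to it:

# On the meta-manager (global redirector)
all.role meta manager

# On each site's local manager (local redirector)
all.role manager
all.manager meta global-redirector.example.org:3121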
Firewalls & xrootd
- xrootd is a very versatile system
  - It can be a server, manager, or supervisor
  - The desired roles are all specified in a single configuration file
- The libXrdPss.so plug-in creates an xrootd chameleon
  - Allows xrootd to be a client to another xrootd
  - So all the basic roles can run as proxies, transparently getting around firewalls
  - Assuming you run the proxy role on a border machine (see the sketch below)
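A minimal proxy configuration sketch, assuming the libXrdPss plug-in directives ofs.osslib and pss.origin (the internal hostname is hypothetical):

# Run this xrootd as a proxy: load the proxy storage plug-in
ofs.osslib libXrdPss.so

# Forward all requests to the redirector behind the firewall
pss.origin internal-redirector.example.org:1094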
A Simple xrootd Proxy Cluster

[Diagram: a client outside the firewall; proxy servers X and Y and a proxy manager (a.k.a. proxy redirector) on border machines; a manager (a.k.a. redirector) and data servers A, B, and C inside the firewall. Copies of /my/file reside on servers A and B.]

1: Client issues open("/my/file") to the proxy manager.
2: Proxy manager tells the client: try open() at X.
3: Client issues open("/my/file") to proxy server X.
4: Proxy server X issues open("/my/file") to the internal manager, through the firewall.
5: Internal manager asks the data servers: who has "/my/file"?
6: Servers A and B respond: "I DO!"
7: Internal manager tells X: try open() at A.
8: Proxy server X issues open("/my/file") directly to server A.

Proxy managers can federate with a meta-manager. How does this help in a federated cluster?
Demonstrator Specific Features
- A uniform file access infrastructure
  - Usable even in the presence of firewalls
- Access to files across administrative domains
  - Each site can enforce its own rules
- Scalability grows with site participation
  - Essentially the BitTorrent social model
- Increased opportunities for HEP analysis
  - A foundation for novel approaches to efficiency
Alice & Atlas Approach
- Real-time placing of files at a site
  - Built on top of the File Residency Manager (FRM)
- The FRM is the xrootd service that controls file residency
  - Locally configured to handle events such as:
    - A requested file is missing
    - A file is created or an existing file is modified
    - Disk space is getting full
- Alice uses an "only when necessary" model
- Atlas will use a "when analysis demands" model
Using FRM For File Placement

[Diagram: a client, an xrootd data server, and the frm_xfrd transfer daemon; xrootd and frm_xfrd share a transfer queue and a configuration file, and the transfer agent copies files in from remote storage.]

The relevant configuration:

all.export /atlas/atlasproddisk stage
frm.xfr.copycmd in /opt/xrootd/bin/xrdcp \
    -f -np root://globalredirector/$SRC $DST

1: Client issues open(missing_file).
2: xrootd inserts a transfer request into the transfer queue.
3: xrootd tells the client to wait.
4: frm_xfrd reads the transfer request.
5: frm_xfrd launches the transfer agent.
6: The transfer agent copies in the file.
7: frm_xfrd notifies xrootd that the file is OK.
8: xrootd wakes up the client.

The transfer agent can be any copy tool: xrdcp, dq2get, globus-url-copy (gridFTP), scp, wget, etc.
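The copy command is pluggable. As a hedged sketch (the hostname is hypothetical, and this assumes $SRC and $DST are substituted the same way as in the xrdcp example above), a GridFTP client could serve as the inbound agent instead:

# Use globus-url-copy as the inbound transfer agent
frm.xfr.copycmd in /usr/bin/globus-url-copy \
    gsiftp://gridftp.example.org/$SRC file://$DST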
FRM Even Works With Firewalls

[Diagram: an xrootd data server with frm_xfrd and its transfer queue sits behind the firewall; a border machine runs xrdcp against the big bad Internet.]

1: xrootd writes a transfer request.
2: frm_xfrd reads the transfer request.
3: frm_xfrd launches the transfer agent on the border machine via ssh.
4: The agent copies in the file.
5: frm_xfrd notifies xrootd to run the client.

The FRM needs one or more border machines, and you need to set up ssh identity keys (see the sketch below); the server-side transfer agent simply launches the real agent across the border. How it's done:

frm.xfr.copycmd in noalloc ssh bordermachine /opt/xrootd/bin/xrdcp -f \
    root://globalredirector/$LFN root://mynode/$LFN?ofs.posc=1
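Setting up the ssh identity keys is the usual one-time exercise; a sketch (user, host, and key names are illustrative):

# Create a passphrase-less key so frm_xfrd can run the agent unattended
ssh-keygen -t rsa -N "" -f ~/.ssh/frm_xfr_key

# Install the public key on the border machine
ssh-copy-id -i ~/.ssh/frm_xfr_key.pub xrootd@bordermachine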
Storage-Starved Sites (CMS)
- Provide direct access to missing files
  - This is basically a freebie of the system
- However, latency issues exist
  - Naively, as much as a 3x increase in wall-clock time
  - Can be as low as 5%, depending on the job's CPU/IO ratio (see the worked example below)
  - The ROOT team is aggressively working to reduce it
- On the other hand...
  - May be better than not doing analysis at such sites; no analysis is essentially infinite latency
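As an illustrative calculation (numbers assumed, not measured): if remote access triples the cost of I/O, a job that spends 97.5 units of CPU per 2.5 units of I/O goes from 100 units of wall-clock time to 97.5 + 3 x 2.5 = 105, only a 5% penalty, while a purely I/O-bound job would see the full 3x.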
Security
- xrootd supports the needed security models
  - Most notably grid certificates (GSI); a configuration sketch follows below
- The human cost needs to be considered
  - Does read-only access require this level of security, considering that the data is unusable without a framework?
- Each deployment faces different issues
  - Alice uses lightweight internal security
  - Atlas will use server-to-server certificates
  - CMS will need to deploy the full grid infrastructure
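A minimal sketch of enabling GSI on a server, assuming the standard sec.protocol directive (the library path and certificate locations are illustrative):

# Enable grid-certificate (GSI) authentication
sec.protocol /usr/lib64 gsi \
    -cert:/etc/grid-security/xrd/xrdcert.pem \
    -key:/etc/grid-security/xrd/xrdkey.pem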
Recent Developments
- FS-independent extended attribute framework
  - Used to save file-specific information: migration time, residency requirements, checksums
- Shared-everything file system support
  - Optimizes file discovery in distributed file systems: dCache, DPM, GPFS, HDFS, Lustre, proxy xrootd (see the sketch below)
- Meta-manager throttling
  - Configurable per-site query limits
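A hedged sketch of the shared-everything support, assuming the cms.dfs directive is how a shared file system is declared to the cmsd:

# All data servers see one shared file system (e.g. GPFS or Lustre),
# so the manager answers lookups centrally instead of polling servers
cms.dfs lookup central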
Future Major Developments
- Integrated checksums
  - Inboard computation, storage, and reporting
  - Outboard computation already supported
- Specialized meta-manager
  - Allows many more subscriptions than today
- Internal DNS caching and full IPv6 support
- Automatic alerts
  - Part of the message and logging restructuring
Conclusion
- xrootd mates well with the demo requirements
  - Can federate almost any file system
  - Gives a uniform view of massive amounts of data
    - Assuming a per-experiment common logical namespace
  - Secure and firewall friendly
  - An ideal platform for adaptive caching systems
  - Completely open source under a BSD license
See more at http://xrootd.org/
Acknowledgements

Current software contributors:
- ATLAS: Doug Benjamin
- CERN: Fabrizio Furano, Lukasz Janyst, Andreas Peters, David Smith
- Fermi/GLAST: Tony Johnson
- FZK: Artem Trunov
- LBNL: Alex Sim, Junmin Gu, Vijaya Natarajan (BeStMan team)
- ROOT: Gerri Ganis, Bertrand Bellenet, Fons Rademakers
- OSG: Tim Cartwright, Tanya Levshina
- SLAC: Andrew Hanushevsky, Wilko Kroeger, Daniel Wang, Wei Yang
- UNL: Brian Bockelman
- UoC: Charles Waldman

Operational collaborators: ANL, BNL, CERN, FZK, IN2P3, SLAC, UTA, UoC, UNL, UVIC, UWisc
US Department of Energy Contract DE-AC02-76SF00515 with Stanford University