xrootd Demonstrator Infrastructure OSG All Hands Meeting Harvard University March 7-11, 2011 Andrew Hanushevsky, SLAC http://xrootd.org


Page 1: Xrootd Demonstrator Infrastructure OSG All Hands Meeting Harvard University March 7-11, 2011 Andrew Hanushevsky, SLAC

xrootd Demonstrator Infrastructure

OSG All Hands MeetingHarvard UniversityMarch 7-11, 2011

Andrew Hanushevsky, SLAC

http://xrootd.org

Page 2:

March 7-11, 2011, OSG All Hands Meeting

Goals

- Describe xrootd architecture configurations
- Show how these can be used by the demos: Alice (in production), Atlas, and CMS
- Overview of the File Residency Manager and how it addresses file placement
- Cover recent and future developments
- Conclusion

Page 3:

The Motivation

- Can we access HEP data as a single repository? Treat it like a Virtual Mass Storage System
- Is cache-driven grid data distribution feasible?
  - The last missing file issue (Alice production)
  - Adaptive file placement at Tier 3s (Atlas demo)
  - Analysis at storage-starved sites (CMS demo)
- Does xrootd provide the needed infrastructure?

Page 4:

A Simple xrootd Cluster

[Diagram: a client, a manager (a.k.a. redirector), and data servers A, B, and C; every node runs an xrootd/cmsd pair]

1. Client: open("/my/file") at the manager
2. Manager to its data servers: who has "/my/file"?
3. The servers holding the file answer: "I DO!"
4. Manager to client: try open() at A
5. Client: open("/my/file") directly at server A
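A cluster like this is driven by a single configuration file that all nodes can share, with the role selected per host. A minimal sketch (the hostname, port, and export path are assumptions, not from the slides):

```
# One file for every node; the role is chosen by hostname.
all.export /data
all.manager redirector.example.org:3121
if redirector.example.org
   all.role manager
else
   all.role server
fi
```

Both the xrootd and cmsd daemons on each node read the same file; here the manager's cmsd would listen for server subscriptions on port 3121.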

Page 5:

The Fundamentals

- An xrootd-cmsd pair is the building block
  - xrootd provides the client interface; it handles data and redirections
  - cmsd manages the xrootd servers (i.e., forms clusters); it monitors activity and handles file discovery
- The building block is uniformly stackable
  - Can build a wide variety of configurations, much like you would with Lego® blocks
- Extensive plug-ins provide adaptability

Page 6:

Federating xrootd Clusters

[Diagram: three distributed clusters, at ANL, SLAC, and UTA, each with its own manager (a.k.a. local redirector) and data servers A, B, and C, all subscribed to a meta-manager (a.k.a. global redirector); data is uniformly available from the three distinct sites]

1. Client: open("/my/file") at the meta-manager
2. Meta-manager to the local managers: who has "/my/file"?
3. Each local manager to its data servers: who has "/my/file"?
4. The sites holding the file answer: "I DO!"
5. Meta-manager to client: try open() at ANL
6. Client: open("/my/file") at the ANL manager
7. ANL manager to client: try open() at A
8. Client: open("/my/file") directly at server A

This is an exponentially parallel search (i.e., O(2^n)). But what about a site behind a firewall? Can it still play?
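Federating is mostly a matter of a few more directives in the same configuration files. A sketch (the hostname and port are assumptions, not from the slides):

```
# On the meta-manager (global redirector):
all.role meta manager

# On each site's local manager, subscribe to the meta-manager:
all.role manager
all.manager meta globalredirector.example.org:1213
```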

Page 7:

Firewalls & xrootd

- xrootd is a very versatile system
  - It can be a server, a manager, or a supervisor; the desired roles are all specified in a single configuration file
- The libXrdPss.so plug-in creates an xrootd chameleon
  - It allows an xrootd to act as a client to another xrootd
- So all the basic roles can run as proxies, transparently getting around firewalls
  - Assuming you run the proxy role on a border machine
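As a sketch, a proxy data server on a border machine might be configured roughly like this (the origin hostname and export path are assumptions, not from the slides):

```
# Run as an ordinary data server...
all.role server
all.export /data
# ...but let libXrdPss forward every request to the inside cluster.
ofs.osslib libXrdPss.so
pss.origin redirector.example.org:1094
```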

Page 8:

A Simple xrootd Proxy Cluster

[Diagram: proxy servers X and Y and a proxy manager (a.k.a. proxy redirector) run on border machines outside the firewall; the manager (a.k.a. redirector) and data servers A, B, and C sit behind it]

1. Client: open("/my/file") at the proxy manager
2. Proxy manager to client: try open() at X
3. Client: open("/my/file") at proxy server X
4. Proxy server X: open("/my/file") at the internal manager
5. Manager to its data servers: who has "/my/file"?
6. The servers holding the file answer: "I DO!"
7. Manager to proxy server X: try open() at A
8. Proxy server X: open("/my/file") at server A

Proxy managers can federate with a meta-manager. How does this help in a federated cluster?

Page 9:

Demonstrator Specific Features

- A uniform file access infrastructure, usable even in the presence of firewalls
- Access to files across administrative domains; each site can enforce its own rules
- Site participation proportional to scalability, essentially the BitTorrent social model
- Increased opportunities for HEP analysis; a foundation for novel approaches to efficiency

Page 10:

Alice & Atlas Approach

- Real-time placing of files at a site, built on top of the File Residency Manager (FRM)
- The FRM is the xrootd service that controls file residency
- It is locally configured to handle events such as:
  - A requested file is missing
  - A file is created or an existing file is modified
  - Disk space is getting full
- Alice uses an "only when necessary" model; Atlas will use a "when analysis demands" model

Page 11:

Using FRM For File Placement

[Diagram: a client, an xrootd data server, the frm_xfrd transfer daemon, a shared transfer queue, remote storage, and a common configuration file containing:]

    all.export /atlas/atlasproddisk stage
    frm.xfr.copycmd in /opt/xrootd/bin/xrdcp -f -np root://globalredirector/$SRC $DST

1. Client: open(missing_file) at the xrootd data server
2. xrootd inserts a transfer request into the transfer queue
3. xrootd tells the client to wait
4. frm_xfrd reads the transfer request
5. frm_xfrd launches a transfer agent
6. The transfer agent copies in the file from remote storage
7. The transfer agent notifies xrootd the copy is done
8. xrootd wakes up the client

The transfer agent can be almost anything: xrdcp, dq2get, globus-url-copy, gridFTP, scp, wget, etc.
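Since the copycmd is just a command template, a site can equally point it at its own wrapper script and swap in dq2get, globus-url-copy, scp, or anything else inside. A minimal sketch in shell (the wrapper structure and the globalredirector host are illustrative, not part of the FRM itself):

```shell
#!/bin/sh
# Sketch of a site-written transfer wrapper for frm.xfr.copycmd.
# The FRM substitutes $SRC and $DST before launching the command.

# build_fetch_cmd assembles the transfer command line the wrapper would run
build_fetch_cmd() {
    src="$1"; dst="$2"
    echo "/opt/xrootd/bin/xrdcp -f -np root://globalredirector/${src} ${dst}"
}

# Show what would run for a missing file
build_fetch_cmd "/atlas/atlasproddisk/f.root" "/data/atlas/atlasproddisk/f.root"
```

The configuration would then name the wrapper instead of xrdcp, e.g. frm.xfr.copycmd in /opt/site/frm-fetch.sh $SRC $DST (path hypothetical).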

Page 12:

FRM Even Works With Firewalls

[Diagram: the xrootd data server, frm_xfrd, and the transfer queue sit behind the firewall; a border machine bridges to the big bad internet]

1. xrootd writes a transfer request to the transfer queue
2. frm_xfrd reads the transfer request
3. frm_xfrd launches the real transfer agent, via ssh, on the border machine
4. The agent (xrdcp) copies in the file across the big bad internet
5. xrootd is notified to let the waiting client run

Notes:
- The FRM needs one or more border machines
- The server-side transfer agent simply launches the real agent across the border (you need to set up ssh identity keys)
- How it's done:

    frm.xfr.copycmd in noalloc ssh bordermachine /opt/xrootd/bin/xrdcp -f \
        root://globalredirector/$LFN root://mynode/$LFN?ofs.posc=1

Page 13:

Storage-Starved Sites (CMS)

- Provide direct access to missing files; this is basically a freebie of the system
- However, latency issues exist
  - Naively, as much as a 3x increase in wall-clock time
  - Can be as low as 5%, depending on the job's CPU/IO ratio
  - The ROOT team is aggressively working to reduce it
- On the other hand, this may be better than not doing analysis at such sites at all; no analysis is essentially infinite latency

Page 14:

Security

- xrootd supports the needed security models, most notably grid certificates (GSI)
- The human cost needs to be considered: does read-only access require this level of security, considering that the data is unusable without a framework?
- Each deployment faces different issues
  - Alice uses light-weight internal security
  - Atlas will use server-to-server certificates
  - CMS will need to deploy the full grid infrastructure
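For the GSI case, enabling certificate authentication is a few configuration lines on each server. A sketch (the certificate paths are assumptions, not from the slides):

```
# Load the security framework and enable the GSI protocol.
xrootd.seclib libXrdSec.so
sec.protocol gsi -certdir:/etc/grid-security/certificates \
                 -cert:/etc/grid-security/xrd/xrdcert.pem \
                 -key:/etc/grid-security/xrd/xrdkey.pem
```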

Page 15:

Recent Developments

- FS-independent extended attribute framework
  - Used to save file-specific information: migration time, residency requirements, checksums
- Shared-everything file system support
  - Optimizes file discovery in distributed file systems: dCache, DPM, GPFS, HDFS, Lustre, proxy xrootd
- Meta-manager throttling
  - Configurable per-site query limits

Page 16:

Future Major Developments

- Integrated checksums
  - Inboard computation, storage, and reporting (outboard computation is already supported)
- Specialized meta-manager
  - Allows many more subscriptions than today
- Internal DNS caching and full IPv6 support
- Automatic alerts
  - Part of the message and logging restructuring

Page 17:

Conclusion

- xrootd mates well with the demo requirements
  - Can federate almost any file system
  - Gives a uniform view of massive amounts of data, assuming a per-experiment common logical namespace
  - Secure and firewall friendly
  - An ideal platform for adaptive caching systems
- Completely open source under a BSD license

See more at http://xrootd.org/

Page 18:

Acknowledgements

Current software contributors:
- ATLAS: Doug Benjamin
- CERN: Fabrizio Furano, Lukasz Janyst, Andreas Peters, David Smith
- Fermi/GLAST: Tony Johnson
- FZK: Artem Trunov
- LBNL: Alex Sim, Junmin Gu, Vijaya Natarajan (BeStMan team)
- ROOT: Gerri Ganis, Bertrand Bellenot, Fons Rademakers
- OSG: Tim Cartwright, Tanya Levshina
- SLAC: Andrew Hanushevsky, Wilko Kroeger, Daniel Wang, Wei Yang
- UNL: Brian Bockelman
- UoC: Charles Waldman

Operational collaborators: ANL, BNL, CERN, FZK, IN2P3, SLAC, UTA, UoC, UNL, UVIC, UWisc

US Department of Energy Contract DE-AC02-76SF00515 with Stanford University