Can we use the XROOTD infrastructure in the PROOF context? The need for and functionality of a PROOF Master coordinator were discussed during the meeting on March 9th. Main question addressed here: Can we base the PROOF Master coordinator on the XRD framework? Also: Can we take advantage of the XRD load balancer system?


Can we use the XROOTD infrastructure in the PROOF context?

The need for and functionality of a PROOF Master coordinator were discussed during the meeting on March 9th.

Main question addressed here: Can we base the PROOF Master coordinator on the XRD framework?

Also: Can we take advantage of the XRD load balancer system?


Marek, 9-3-2005




How does XROOTD work

• Multi-component server based on a multi-thread architecture

• xrd component: provides networking, thread management, protocol scheduling

• Minimal set of threads:

  - Acceptor: opens the connection; matches the protocol; submits a job to the scheduler
  - Pollers: react to any activity on open links; submit a job to the scheduler
  - Scheduler: schedules work to be done (jobs)
  - Worker(s): wait for a job, then execute it
  - Buffer manager: dynamically optimizes use of memory buffers

• Workers are created and destroyed on demand

• Links are not attached to a specific worker: the first free worker takes the job

• Jobs ≡ data/information to be processed for a given link
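The thread roles above can be illustrated with a small job queue. This is only a sketch under stated assumptions, not the xrd implementation: the class name `JobQueue` and its methods are hypothetical, the pool size is fixed (xrd creates and destroys workers on demand), and the acceptor and pollers are reduced to whoever calls `Submit()`. What it does show is that jobs are not bound to a worker: whichever worker is free first takes the next job.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical names throughout; this is not the xrd scheduler.
class JobQueue {
public:
    explicit JobQueue(int nWorkers) {
        assert(nWorkers > 0);
        for (int i = 0; i < nWorkers; ++i)
            workers_.emplace_back([this] { WorkerLoop(); });
    }
    ~JobQueue() { Shutdown(); }

    // Called by the acceptor or a poller: append a job at the tail.
    void Submit(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(mtx_);
            jobs_.push(std::move(job));
        }
        cv_.notify_one();
    }

    // Let the workers drain the queue, then stop and join them.
    void Shutdown() {
        {
            std::lock_guard<std::mutex> lock(mtx_);
            if (stopping_) return;
            stopping_ = true;
        }
        cv_.notify_all();
        for (auto &w : workers_) w.join();
    }

    int Processed() const { return processed_.load(); }

private:
    void WorkerLoop() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mtx_);
                cv_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // stopping and nothing left
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // run outside the lock so other workers can proceed
            ++processed_;
        }
    }

    std::mutex mtx_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> jobs_;
    std::vector<std::thread> workers_;
    std::atomic<int> processed_{0};
    bool stopping_ = false;
};
```

A caller would create `JobQueue q(3);`, call `q.Submit(...)` once per unit of work arriving on a link, and call `q.Shutdown()` when done.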


How does XROOTD work (diagram)

[Figure: XROOTD architecture. Labels: accept, WN, scheduler, BM, XROOTD, jobs, poller, files, ProtObj, links]


Scheduler

One instance per main process

• new jobs are always added to the queue in last position
• can schedule jobs at a later time using a timer
  - presently used for internal optimization
• can handle forking of external processes
  - presently used to handle TNetFile requests by forking a rootd process
• keeps statistics about everything that is going on

Presently missing

• High-priority scheduling (not needed for file serving)
  - pollers use poll(): could be implemented using the POLLPRI flag (requires OOB?)
  - other solutions?
  - Andy is willing to implement a viable solution.
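The queueing behaviour listed on this slide (append at the tail, timer-deferred jobs, statistics) can be sketched as follows. All names are hypothetical and jobs are reduced to a string label; the timer is replaced by a logical clock passed to `Tick()` so that the example stays deterministic. High-priority scheduling is deliberately left out, as it is on the slide.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <string>
#include <vector>

// Hypothetical sketch; not the xrd scheduler class.
class SketchScheduler {
public:
    // New jobs always go to the last position of the ready queue.
    void Schedule(const std::string &job) { ready_.push_back(job); }

    // Defer a job until logical time `when` (stand-in for a real timer).
    void ScheduleAt(long when, const std::string &job) {
        deferred_.emplace(when, job);
    }

    // Move every deferred job whose time has come to the tail of the queue.
    void Tick(long now) {
        assert(now >= now_);  // logical time never goes backwards
        now_ = now;
        auto end = deferred_.upper_bound(now);
        for (auto it = deferred_.begin(); it != end; ++it)
            ready_.push_back(it->second);
        deferred_.erase(deferred_.begin(), end);
    }

    // Hand the oldest ready job to a worker; false if nothing is ready.
    bool Next(std::string &job) {
        if (ready_.empty()) return false;
        job = ready_.front();
        ready_.pop_front();
        ++dispatched_;
        return true;
    }

    long Dispatched() const { return dispatched_; }  // simple statistics

private:
    std::deque<std::string> ready_;
    std::multimap<long, std::string> deferred_;
    long now_ = 0;
    long dispatched_ = 0;
};

// Drain the ready queue and return the order in which jobs were dispatched.
std::vector<std::string> DrainAll(SketchScheduler &s) {
    std::vector<std::string> order;
    std::string job;
    while (s.Next(job)) order.push_back(job);
    return order;
}
```

A high-priority lane would need a second queue consulted before `ready_` in `Next()`; whether that matches what poll() with POLLPRI can deliver is exactly the open question raised above.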


XConnections

• Physical connections:
  - one per client session / host server
  - based on TXSocket (: public TSocket): active use of timeouts both in opening and in read/write operations
  - Reader thread (optional): fills a message queue from which the messages are picked up
  - Handler for unsolicited server messages (requires reader thread)
  - Multiple sockets foreseen

• Logical connection:
  - one per open entity (file, …)
  - can share the same physical connection with another logical connection

• Connection manager:
  - keeps list of existing (logical, physical) connections
  - provides Connect / Disconnect / ReadRaw / WriteRaw functionality
  - collects garbage

• Connection module (TXNetConn):
  - runs the xrootd protocol; handles redirections
  - this is what the client class TXNetFile uses
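The relation between logical and physical connections can be sketched with shared ownership. This is an illustration only: `PhysicalConn`, `LogicalConn` and `ConnManager` are hypothetical types, not the TXSocket or TXNetConn classes, and the sketch keys physical connections by a host string, whereas the slide speaks of one per client session / host server. No sockets are opened.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical types for illustration.
struct PhysicalConn {
    std::string hostPort;  // the server endpoint this connection goes to
};

struct LogicalConn {
    std::string entity;                  // e.g. a file path
    std::shared_ptr<PhysicalConn> phys;  // may be shared with other logical connections
};

class ConnManager {
public:
    // Open a logical connection, reusing the physical one if it is still alive.
    LogicalConn Connect(const std::string &hostPort, const std::string &entity) {
        assert(!hostPort.empty());
        std::shared_ptr<PhysicalConn> phys = table_[hostPort].lock();
        if (!phys) {
            phys = std::make_shared<PhysicalConn>(PhysicalConn{hostPort});
            table_[hostPort] = phys;
        }
        return LogicalConn{entity, phys};
    }

    // Garbage collection: forget physical connections that nobody uses any more.
    std::size_t CollectGarbage() {
        std::size_t removed = 0;
        for (auto it = table_.begin(); it != table_.end();) {
            if (it->second.expired()) {
                it = table_.erase(it);
                ++removed;
            } else {
                ++it;
            }
        }
        return removed;
    }

    std::size_t PhysicalCount() const { return table_.size(); }

private:
    std::map<std::string, std::weak_ptr<PhysicalConn>> table_;
};
```

Holding only a `std::weak_ptr` in the manager means the physical connection disappears once its last logical connection is dropped; a real manager would more likely keep it open for a while in case of reuse.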


Generic protocol interface

Defined by the following methods:

XrdProtocol *Match(XrdLink *lp)
    invoked when a new link is created to determine if this protocol can handle the open link

int Process(XrdLink *lp)
    invoked when a link has data to be processed

void Recycle(XrdLink *lp, int secs, char *reason)
    invoked when the instance is no longer needed

int Stats(char *buf, int blen, int do_sync)
    invoked when we need statistics about all instances of the protocol
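To make the four entry points concrete, here is a toy protocol written against stand-in types. `SketchLink`, `SketchProtocol` and `EchoProtocol` are hypothetical, as is the "ECHO\n" handshake; the real interface works on `XrdLink` and `XrdProtocol`, whose headers are not reproduced here. The sketch also uses `const char *reason` where the slide shows `char *reason`.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical stand-in for XrdLink: just an input and an output buffer.
struct SketchLink {
    std::string in;
    std::string out;
};

// The same four entry points as on the slide, on sketch types.
class SketchProtocol {
public:
    virtual ~SketchProtocol() {}
    virtual SketchProtocol *Match(SketchLink *lp) = 0;
    virtual int Process(SketchLink *lp) = 0;
    virtual void Recycle(SketchLink *lp, int secs, const char *reason) = 0;
    virtual int Stats(char *buf, int blen, int do_sync) = 0;
};

class EchoProtocol : public SketchProtocol {
public:
    // Claim the link only if the first bytes are our (made-up) handshake.
    SketchProtocol *Match(SketchLink *lp) override {
        assert(lp != nullptr);
        if (lp->in.compare(0, 5, "ECHO\n") != 0) return nullptr;
        lp->in.erase(0, 5);
        ++instances_;
        return new EchoProtocol();
    }

    // The link has data: copy it back to the sender.
    int Process(SketchLink *lp) override {
        lp->out += lp->in;
        lp->in.clear();
        return 0;
    }

    // The instance is no longer needed.
    void Recycle(SketchLink *, int, const char *) override { delete this; }

    // Statistics about all instances of this protocol.
    int Stats(char *buf, int blen, int) override {
        return std::snprintf(buf, blen, "echo instances=%d", instances_);
    }

private:
    static int instances_;
};

int EchoProtocol::instances_ = 0;
```

One object acts as the matcher: the server offers each new link to `Match()`, and a non-null return value is the protocol instance that will receive the subsequent `Process()` calls for that link.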


Existing protocol implementations

XrdXrootdProtocol
    implements the protocol for file serving and directory handling
    - login, authentication (validates a physical link)
    - open, close, read, write, …
    - putfile, getfile, rm, mv, stat
    - mkdir, rmdir, dirlist

XrdRootdProtocol
    implements the solution for TNetFile backward compatibility
    - Match() transfers the open connection by execv(rootd,…)


XPROOFD: master coordinator

• Cannot run masters in a multi-threaded environment:
  - the interpreter is not MT-safe
  - a crash of one master compromises all the others

• Mixed solution:
  - run in the MT environment the “administrative tasks” (connection handling, logging, collection of results, …), which are not subject to unexpected bugs in client code
  - process data in external “job agents” (one per protocol instance)

• The job agents open a link to the main application to communicate with the related protocol instance, subscribing to the poller(s).

• Caching and saving of the (temporary) results could be handled by the protocol instance inside the MT main application: this could help with the privacy of the results

• Job agents would be essentially the present proofserv, modified for xrd message handling
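The reason for moving the masters into external job agents can be demonstrated with plain POSIX calls. In this sketch, which assumes a POSIX system, the function `RunJobAgent` and the type `AgentResult` are hypothetical; a socket pair stands in for the link the agent opens to the main application, and `std::abort()` stands in for a bug in client code. The point is only that the coordinator observes the crash and keeps running.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Outcome of one job agent as seen by the coordinator (hypothetical type).
struct AgentResult {
    bool crashed;       // true if the agent died on a signal
    std::string reply;  // whatever the agent sent back before exiting
};

// Fork a job agent connected to the parent through a socket pair.
// `crash` simulates a bug in user code running inside the agent.
AgentResult RunJobAgent(const std::string &request, bool crash) {
    int sv[2];
    int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
    assert(rc == 0);
    pid_t pid = fork();
    assert(pid >= 0);

    if (pid == 0) {  // child: the job agent
        close(sv[0]);
        std::string req;
        char buf[256];
        ssize_t n;
        while ((n = read(sv[1], buf, sizeof(buf))) > 0)
            req.append(buf, static_cast<size_t>(n));
        if (crash) std::abort();  // only this process dies
        std::string reply = "done:" + req;
        if (write(sv[1], reply.data(), reply.size()) < 0) _exit(1);
        _exit(0);
    }

    // parent: the multi-threaded coordinator would do this per protocol instance
    close(sv[1]);
    AgentResult res{false, ""};
    if (write(sv[0], request.data(), request.size()) < 0) res.crashed = true;
    shutdown(sv[0], SHUT_WR);  // signal end of request to the agent
    char buf[256];
    ssize_t n;
    while ((n = read(sv[0], buf, sizeof(buf))) > 0)
        res.reply.append(buf, static_cast<size_t>(n));
    close(sv[0]);

    int status = 0;
    waitpid(pid, &status, 0);
    if (WIFSIGNALED(status)) res.crashed = true;
    return res;
}
```

In the proposed design the agent would be started via execv and would speak the xproofd protocol over its link; here both are collapsed into a single request/reply exchange.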


XrdProofdProtocol

• Two kinds of instances:
  - normal: the lifetime of the proofserv instance is uncorrelated with the logical connection that started it
  - query: gets information about the status of submitted jobs; lifetime is the same as the connection lifetime

• Normal:
  - setup (authenticate, …), create the job agent (the job agent would be started via execv by a dedicated thread)
  - transfer messages from the client to the job agent
  - transfer messages from the job agent or server to the client
  - save the results at the end of processing

• Query:
  - query status of jobs
  - retrieve IDs of running sessions
  - retrieve results of terminated sessions

• Specific XProof protocol for handling / structuring messages (analogous to XProtocol.hh)
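The split between normal and query instances implies some state that outlives individual connections. A minimal sketch of such a registry follows; `SessionRegistry` and its methods are hypothetical names, and results are reduced to a string. A normal instance would call `Start()` and `Finish()`, while a query instance would only read.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

enum class SessionState { Running, Terminated };

struct SessionRecord {
    SessionState state = SessionState::Running;
    std::string result;  // saved at the end of processing
};

// Hypothetical registry shared by the protocol instances of one coordinator.
class SessionRegistry {
public:
    // Normal instance: register a new session and return its ID.
    int Start() {
        int id = nextId_++;
        sessions_[id] = SessionRecord();
        return id;
    }

    // Normal instance: save the results when processing ends.
    void Finish(int id, const std::string &result) {
        auto it = sessions_.find(id);
        assert(it != sessions_.end());
        it->second.state = SessionState::Terminated;
        it->second.result = result;
    }

    // Query instance: IDs of running sessions.
    std::vector<int> RunningIds() const {
        std::vector<int> ids;
        for (const auto &kv : sessions_)
            if (kv.second.state == SessionState::Running) ids.push_back(kv.first);
        return ids;
    }

    // Query instance: result of a terminated session; false if not available.
    bool Retrieve(int id, std::string &out) const {
        auto it = sessions_.find(id);
        if (it == sessions_.end() || it->second.state != SessionState::Terminated)
            return false;
        out = it->second.result;
        return true;
    }

private:
    std::map<int, SessionRecord> sessions_;
    int nextId_ = 1;
};
```

Because the record lives in the registry and not in the connection, a client can disconnect after submitting and later open a query connection to fetch the result. Inside the multi-threaded main application, access to the registry would additionally need a lock.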


TXProofServ (Job Agent)

• Based on TProofServ
• At setup, open an XConnection to the MT parent
• HandleSocketInput, HandleUrgentData would be part of the UnsolicitedRequestHandler
• Other modifications may depend on where we will go with sandboxes

TXSlave

• As TSlave but using TXProofConn

TXProofConn

• Connection module based on the connection manager running the xproofd protocol

XrdProofdProtocol


How would XPROOFD master coordinator work

[Figure: XPROOFD master coordinator architecture. Labels: accept, WN, scheduler, BM, XROOTD, jobs, poller, ProtObj, links, job agent (×3), worker nodes, client]


XPROOFD: worker node

• Similar structure could be applied to worker nodes

• Ideally one could optimize the load on the existing job agents by making them interchangeable, i.e. not tied to a particular protocol instance. This would require loading the required library environment in each of the JA instances

[Figure: XPROOFD worker node architecture. Labels: accept, WN, scheduler, BM, XROOTD, jobs, poller, ProtObj, links, job agent (×3), master]


OLBD

• In xrootd: a control network that determines, among the servers having the file, the best one to which to direct the client

• In PROOF it should find the best subset of worker nodes, among those it knows about, on which to start the PROOF session, based on:
  - CPU, memory load
  - expected termination time of ongoing PROOF processing
  - location of files to be processed
  - …

• According to Andy, changing the policy is pretty easy

• Requires worker node registration to masters
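A selection policy of the kind described for PROOF can be sketched as a cost function plus a sort. Everything here is an assumption made for illustration: the `WorkerInfo` fields, the linear form of `Cost()` and its weights are made up, and nothing is taken from the OLBD code. The sketch only shows how the listed criteria could be combined to pick a subset of the registered worker nodes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// What a master might know about a registered worker node (hypothetical).
struct WorkerInfo {
    std::string name;
    double cpuLoad;     // 0..1, fraction of CPU in use
    double memLoad;     // 0..1, fraction of memory in use
    double secsToFree;  // expected termination time of ongoing processing
    int localFiles;     // files of this query already located on the node
};

// Lower is better. The weights are arbitrary illustration values.
double Cost(const WorkerInfo &w) {
    assert(w.cpuLoad >= 0.0 && w.memLoad >= 0.0);
    return 1.0 * w.cpuLoad + 0.5 * w.memLoad + 0.01 * w.secsToFree
           - 0.2 * w.localFiles;
}

// Return the names of the (at most) n cheapest workers, best first.
std::vector<std::string> SelectWorkers(std::vector<WorkerInfo> known,
                                       std::size_t n) {
    std::stable_sort(known.begin(), known.end(),
                     [](const WorkerInfo &a, const WorkerInfo &b) {
                         return Cost(a) < Cost(b);
                     });
    std::vector<std::string> names;
    for (std::size_t i = 0; i < known.size() && i < n; ++i)
        names.push_back(known[i].name);
    return names;
}
```

Changing the policy then amounts to replacing `Cost()`, which matches the remark that the policy is easy to change.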


To summarize

• Though designed for file serving, the xrd component of xrootd provides an infrastructure and an interface general enough to handle generic server tasks

• Even if the proofservs cannot be run in threads, the general framework seems to provide most of the coordinating functionality we put on the list of desiderata

• Andy will be at CERN in two weeks and has proposed a discussion about this, in particular about the additional levels of abstraction that could be needed to use the framework in other contexts.