16/9/2004 Features of the new CASTOR 1
Features of the new CASTOR
Alice offline week, 16/9/2004
Olof Bärring, CERN
Outline
• The new stager
  – architecture
  – hybrid solution for the ALICE online MDC
  – status and deployment
• Features for the ALICE online MDC
• Features for ALICE physics MDC
New CASTOR Stager Architecture
[Architecture diagram. Components shown: Application; RFIO/stage API; Request Handler; Request repository and file catalogue (Oracle or MySQL); Authentication; Resource Management Interface; 3rd party Policy Engine; LSF; Maui; file system load monitoring; Migrator; Recaller; Garbage Collector; mvr cntl; Disk cache; rfiod (disk mover); Tape mover (RTCOPY) client daemon; CASTOR tape archive components (VDQM, VMGR, RTCOPY).]
Not ready for ALICE MDCs
[Architecture diagram repeated from the previous slide. Garbage Collector, Migrator and Recaller are set apart as the components that are not ready.]
Hybrid solution (online MDC)
• Three components not yet implemented
  – Garbage collector
  – Migrator
  – Recaller
• Hybrid solution
  – A slimmed version of today’s stgdaemon is used as the framework
    • Interfaced with new request repository (but not the file catalogue)
    • Old recaller/migrator have been interfaced with expert system for file system selection policies and request repository for submitting the tape requests to the RTCOPY client daemon
ALICE MDC stager hybrid
[Architecture diagram of the hybrid. Components shown: Application; RFIO/stage API; Request Handler; Request repository (Oracle or MySQL); stgdaemon; stager_castor; Migrator; Recaller; Today’s GC script; Resource Management Interface; 3rd party Policy Engine; LSF; Maui; file system load monitoring; mvr cntl; Disk cache; rfiod (disk mover); Tape mover (RTCOPY) client daemon; CASTOR tape archive components (VDQM, VMGR, RTCOPY).]
Status and deployment
• The hybrid stager is ready for performance tests
  – a minor problem with retry of failing tape requests remains to be solved (fixed last night?)
• For this year’s ALICE MDC there are two parallel setups foreseen
  – development setup for testing out new features and fixing bugs on the fly
  – production setup where the MDC runs
• The parallel setups use independent Oracle servers for their request catalogues
Test setup
• lxs5010: rhserver
• lxs5011: rtcpclientd, dlfserver (logging)
• lxs5012: stgdaemon, rmmaster, expertd, moab (Maui)
• lxshare179d: ORACLE
• lxshare027d (disk server): rmnode, rfiod, rootd
• lxshare028d (disk server): rmnode, rfiod, rootd
• lxshare030d (disk server): rmnode, rfiod, rootd
• lxshare031d (disk server): rmnode, rfiod, rootd
• lxshare032d (disk server): rmnode, rfiod, rootd
• lxshare033d (disk server): rmnode, rfiod, rootd
• lxshare034d (disk server): rmnode, rfiod, rootd
Features for ALICE MDC
• Dynamic migration streams
  – New migration candidates can be added to an already running stream
  – In principle a stream could run forever...
• Migration candidate is decided just-in-time
  – The next file to migrate is selected when the tape mover is ready to receive the data
• Throttling
  – Using LSF or Maui allows for throttling when the system is completely loaded
• rootd (and later xrootd) is the disk mover serving the client
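The dynamic-stream and just-in-time points above can be illustrated with a small sketch. This is not CASTOR code: the class `MigrationStream`, its methods, the priority convention (lower number is served first) and the file paths are all hypothetical, chosen only to show that candidates can join a running stream and that the next file is chosen only when the tape mover asks for data.

```python
import heapq
import itertools


class MigrationStream:
    """Hypothetical model of a migration stream that accepts candidates while running."""

    def __init__(self):
        self._heap = []                # entries are (priority, sequence, path)
        self._seq = itertools.count()  # tie-breaker: equal priorities keep insertion order

    def add_candidate(self, path, priority=0):
        # Candidates may be added at any time, including after the stream has started.
        # Assumption for this sketch: a lower priority number is served first.
        heapq.heappush(self._heap, (priority, next(self._seq), path))

    def next_file(self):
        # Called only when the tape mover is ready to receive data, so the choice
        # takes into account every candidate added up to this moment.
        if not self._heap:
            return None                # stream is idle, but it has not ended
        return heapq.heappop(self._heap)[2]


def run_demo():
    stream = MigrationStream()
    stream.add_candidate("/castor/example/file_a", priority=5)
    stream.add_candidate("/castor/example/file_b", priority=5)
    migrated = [stream.next_file()]    # tape mover ready: file_a is selected
    # A more urgent candidate arrives while the stream is already running.
    stream.add_candidate("/castor/example/file_c", priority=1)
    migrated.append(stream.next_file())  # file_c overtakes file_b
    migrated.append(stream.next_file())  # file_b
    migrated.append(stream.next_file())  # nothing left for now
    return migrated
```

Because selection is deferred to `next_file()`, a late but urgent candidate can still be written before files that were queued earlier, and an empty queue simply leaves the stream waiting for more candidates.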
Features for offline MDCs
• Throttling
  – Using LSF or Maui allows for throttling when the system is completely loaded
• File catalogue in Oracle will remove the old stgdaemon limit of ~200-500k entries
• rootd (and later xrootd) is the disk mover serving the client
• Disk-to-disk replication of “hot” files
• Full-fledged policy system for GC, recalls and migrations
• Remaining issue: what to do with tape performance for small files?
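A rough model may help show why small files are a concern for tape performance: if every file written to tape pays a fixed per-file overhead on top of its transfer time, the effective rate drops as files get smaller. The sketch below is only that simple model. The function name and all numbers in the usage note are illustrative assumptions, not measurements from CASTOR or the MDC.

```python
def effective_rate_mb_s(file_size_mb, drive_rate_mb_s, per_file_overhead_s):
    """Effective throughput when each file pays a fixed overhead (hypothetical model)."""
    if file_size_mb <= 0 or drive_rate_mb_s <= 0 or per_file_overhead_s < 0:
        raise ValueError("sizes and rates must be positive, overhead non-negative")
    transfer_time_s = file_size_mb / drive_rate_mb_s
    # Total time per file = streaming time + fixed per-file cost.
    return file_size_mb / (transfer_time_s + per_file_overhead_s)
```

With an assumed 30 MB/s drive and an assumed 2 s per-file overhead, `effective_rate_mb_s(1000, 30, 2)` gives about 28.3 MB/s, while `effective_rate_mb_s(10, 30, 2)` gives about 4.3 MB/s.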