Features of the new CASTOR

Alice offline week, 16/9/2004

Olof Bärring, CERN



Outline

• The new stager
  – architecture
  – hybrid solution for the ALICE online MDC
  – status and deployment
• Features for the ALICE online MDC
• Features for ALICE physics MDC


New CASTOR Stager Architecture

[Architecture diagram. Labelled components: Application; RFIO/stage API; Request Handler; Authentication; Request repository and file catalogue (Oracle or MySQL); Resource Management Interface; 3rd party Policy Engine; LSF; Maui; file system load monitoring; Migrator; Recaller; Garbage Collector; mvr cntl; Disk cache; rfiod (disk mover); Tape mover (RTCOPY) client daemon; CASTOR tape archive components (VDQM, VMGR, RTCOPY)]


Not ready for ALICE MDCs

[Stager architecture diagram. Labelled components: Application; RFIO/stage API; Request Handler; Authentication; Request repository and file catalogue (Oracle or MySQL); Resource Management Interface; 3rd party Policy Engine; LSF; Maui; file system load monitoring; mvr cntl; Disk cache; rfiod (disk mover); Tape mover (RTCOPY) client daemon; CASTOR tape archive components (VDQM, VMGR, RTCOPY); Garbage Collector; Migrator; Recaller]


Hybrid solution (online MDC)

• Three components not yet implemented
  – Garbage collector
  – Migrator
  – Recaller
• Hybrid solution
  – A slimmed version of today’s stgdaemon is used as the framework
    • Interfaced with new request repository (but not the file catalogue)
    • Old recaller/migrator have been interfaced with expert system for file system selection policies and request repository for submitting the tape requests to the RTCOPY client daemon
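The slides do not say which file system selection policy the expert system applies. As a purely illustrative sketch, assuming a policy that prefers the least busy file system with enough free space, the decision could look like the following. The `FileSystem` fields, the function name, the server names and the policy itself are hypothetical and are not taken from CASTOR.

```python
from dataclasses import dataclass


@dataclass
class FileSystem:
    """Hypothetical snapshot of one disk server file system."""
    server: str
    mount: str
    free_bytes: int
    active_streams: int


def select_file_system(candidates, file_size):
    """Pick the least busy file system that can hold the file (assumed policy)."""
    usable = [fs for fs in candidates if fs.free_bytes >= file_size]
    if not usable:
        return None
    # Fewest active streams wins; more free space breaks ties.
    return min(usable, key=lambda fs: (fs.active_streams, -fs.free_bytes))


# Hypothetical monitoring data for three file systems.
pool = [
    FileSystem("diskserver1", "/data01", free_bytes=50 * 10**9, active_streams=4),
    FileSystem("diskserver2", "/data01", free_bytes=20 * 10**9, active_streams=1),
    FileSystem("diskserver3", "/data01", free_bytes=1 * 10**9, active_streams=0),
]
chosen = select_file_system(pool, file_size=2 * 10**9)
print(chosen.server)
```

In this toy data the idle file system is skipped because it is too full for a 2 GB file, so the lightly loaded one is chosen instead. A real policy would work from the live file system load monitoring data.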


ALICE MDC stager hybrid

[Hybrid stager diagram. Labelled components: Application; RFIO/stage API; Request Handler; Request repository (Oracle or MySQL); stgdaemon; stager_castor; Today’s GC script; Migrator; Recaller; Resource Management Interface; 3rd party Policy Engine; LSF; Maui; file system load monitoring; mvr cntl; Disk cache; rfiod (disk mover); Tape mover (RTCOPY) client daemon; CASTOR tape archive components (VDQM, VMGR, RTCOPY)]


Status and deployment

• The hybrid stager is ready for performance tests
  – minor problem with retry of failing tape requests remains to be solved (fixed last night?)
• For this year’s ALICE MDC there are two parallel setups foreseen
  – development setup for testing out new features and fixing bugs on the fly
  – production setup where the MDC runs
• The parallel setups use independent Oracle servers for their request catalogues


Test setup

• lxs5010: rhserver
• lxs5011: rtcpclientd, dlfserver (logging)
• lxs5012: stgdaemon, rmmaster, expertd, moab (Maui)
• lxshare179d: ORACLE
• lxshare027d (disk server): rmnode, rfiod, rootd
• lxshare028d (disk server): rmnode, rfiod, rootd
• lxshare030d (disk server): rmnode, rfiod, rootd
• lxshare031d (disk server): rmnode, rfiod, rootd
• lxshare032d (disk server): rmnode, rfiod, rootd
• lxshare033d (disk server): rmnode, rfiod, rootd
• lxshare034d (disk server): rmnode, rfiod, rootd


Features for ALICE MDC

• Dynamic migration streams
  – New migration candidates can be added to an already running stream
  – In principle a stream could run forever...
• Migration candidate is decided just-in-time
  – The next file to migrate is selected when the tape mover is ready to receive the data
• Throttling
  – Using LSF or Maui allows for throttling when the system is completely loaded
• rootd (and later xrootd) is the disk mover serving the client
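To make the first two items concrete, here is a minimal sketch of a migration stream that accepts new candidates while it is running and picks the next file only when the tape mover asks for one. The class, its methods, the priority scheme (lower number is migrated first) and the file paths are assumptions for illustration, not the CASTOR migrator interface.

```python
import heapq
import itertools


class MigrationStream:
    """Hypothetical model of one migration stream feeding a tape mover."""

    def __init__(self):
        self._candidates = []          # heap of (priority, sequence, path)
        self._seq = itertools.count()  # tie-breaker that keeps arrival order

    def add_candidate(self, path, priority=0):
        # Candidates may arrive while the stream is already running.
        heapq.heappush(self._candidates, (priority, next(self._seq), path))

    def next_file(self):
        # Called only when the tape mover is ready for data, so the choice
        # reflects every candidate known at that moment (just-in-time).
        if not self._candidates:
            return None
        return heapq.heappop(self._candidates)[2]


stream = MigrationStream()
stream.add_candidate("/castor/example/run1.raw", priority=5)
first = stream.next_file()    # mover ready: only run1 is known so far
stream.add_candidate("/castor/example/run2.raw", priority=5)
stream.add_candidate("/castor/example/run3.raw", priority=1)
second = stream.next_file()   # run3 arrived last but has the better priority
third = stream.next_file()
print(first, second, third)
```

Because selection is deferred until the mover is ready, a file added late can still overtake earlier candidates, and an empty stream simply waits for more work rather than ending.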


Features for offline MDCs

• Throttling
  – Using LSF or Maui allows for throttling when the system is completely loaded
• File catalogue in Oracle will remove the old stgdaemon limit of ~200-500k entries
• rootd (and later xrootd) is the disk mover serving the client
• Disk-to-disk replication of “hot” files
• Full-fledged policy system for GC, recalls and migrations
• Remaining issue: what to do with tape performance for small files?
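The throttling item above can be pictured with a small slot-based queue: when every slot is busy, new requests wait instead of adding load. This is only a toy stand-in for the role LSF or Maui plays. The class, the slot count and the request names are hypothetical, and none of it reflects the actual scheduler configuration.

```python
from collections import deque


class ThrottlingScheduler:
    """Hypothetical stand-in for the scheduling role of LSF or Maui."""

    def __init__(self, slots):
        self.free_slots = slots
        self.pending = deque()
        self.running = []

    def submit(self, request):
        # Every request is queued first; it starts only if a slot is free.
        self.pending.append(request)
        self._dispatch()

    def finish(self, request):
        # A finished transfer frees its slot for the next queued request.
        self.running.remove(request)
        self.free_slots += 1
        self._dispatch()

    def _dispatch(self):
        while self.free_slots > 0 and self.pending:
            self.running.append(self.pending.popleft())
            self.free_slots -= 1


sched = ThrottlingScheduler(slots=2)
for name in ["req-a", "req-b", "req-c"]:
    sched.submit(name)
running_when_full = list(sched.running)   # req-c is held back in the queue
sched.finish("req-a")
running_after_finish = list(sched.running)
print(running_when_full, running_after_finish)
```

The point of the sketch is that a loaded system delays excess requests rather than rejecting them; the held request starts as soon as capacity returns.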