Status GridKa & ALICE T2 in Germany

Page 1: Status  GridKa & ALICE T2 in Germany

Status GridKa & ALICE T2 in Germany

Kilian Schwarz

GSI Darmstadt

Page 2: Status  GridKa & ALICE T2 in Germany

ALICE T2

• Present status
• Plans and timelines
• Issues and problems

Page 3: Status  GridKa & ALICE T2 in Germany

Status GridKa

• Pledged: 600 kSI2k, delivered: 133%, 11% of ALICE jobs (last month)

[Figure: plot with labels FZK and CERN]
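The delivered capacity implied by the pledge figures on this slide can be worked out directly. The following is a minimal sketch; the variable names are illustrative and not part of the original slides.

```python
# Values from the slide: pledge in kSI2k and delivered fraction in percent.
pledged_ksi2k = 600
delivered_percent = 133

# Integer arithmetic avoids floating-point rounding noise.
delivered_ksi2k = pledged_ksi2k * delivered_percent // 100

print(f"Delivered: about {delivered_ksi2k} kSI2k")
```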

Page 4: Status  GridKa & ALICE T2 in Germany

GridKa – main issue

• Resources provided according to megatable
  – The share among Tier1s comes automatically when considering the Tier2s connecting to this Tier1 …
  – GridKa pledges 2008: tape 1.5 PB, disk 1 PB
  – Current megatable: tape 2.2 PB !!!

This is much more than pledged and more than all other experiments together. Most of the additional demand is due to the Russian T2 (0.8 PB).

The point is: the money is fixed. In principle, switching between tape, disk, and CPU should be possible, though not on short notice. For 2009, things can possibly still be changed.
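The size of the tape mismatch can be checked with a few lines of arithmetic. This is a rough sketch that assumes the 0.8 PB attributed to the Russian T2 refers to tape, as the context suggests; the variable names are illustrative.

```python
# Figures from the slide, converted to TB (1 PB = 1000 TB) to stay with integers.
pledged_tape_tb = 1500      # GridKa tape pledge for 2008
megatable_tape_tb = 2200    # tape listed in the current megatable
russian_t2_tape_tb = 800    # assumption: the 0.8 PB refers to tape demand

excess_tb = megatable_tape_tb - pledged_tape_tb
without_russian_t2_tb = megatable_tape_tb - russian_t2_tape_tb

print(f"Excess over pledge: {excess_tb} TB")
print(f"Megatable without Russian T2: {without_russian_t2_tb} TB")
print("Within pledge without Russian T2:", without_russian_t2_tb <= pledged_tape_tb)
```

Under that assumption, the Russian T2 demand alone is larger than the excess over the pledge.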

Page 5: Status  GridKa & ALICE T2 in Germany

ALICE T2 – present status

[Diagram of the GSI setup; recoverable labels:]

• vobox
• LCG RB/CE
• GSI Batchfarm (39 nodes/252 cores for ALICE) & GSIAF (14 nodes)
• Directly attached disk storage (55 TB)
• ALICE::GSI::SE_tactical::xrootd
• 30 TB + 120 TB
• ALICE::GSI::SE::xrootd
• PROOF/Batch
• Grid
• CERN, GridKa, GSI
• 150 Mbps

Page 6: Status  GridKa & ALICE T2 in Germany

Present Status

• ALICE::GSI::SE::xrootd
  – > 30 TB disk on fileservers (8 fileservers with 4 TB each)
  – + 120 TB disk on fileservers
    – 20 fileservers, 3U, 15*500 GB disks, RAID 5
    – 6 TB user space per server
• Batch Farm/GSIAF and ALICE::GSI::SE_tactical::xrootd nodes dedicated to ALICE:
  – 15 D-Grid funded boxes, each:
    – 2*2core 2.67 GHz Xeon, 8 GB RAM
    – 2.1 TB local disk space on 3 disks + system disk
  – Additionally 24 new boxes, each:
    – 2*4core 2.67 GHz Xeon, 16 GB RAM
    – 2.0 TB local disk space on 4 disks including system
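The hardware figures on this slide can be cross-checked against the "39 nodes/252 cores" and "120 TB" totals quoted elsewhere in the talk. A minimal sketch follows; the tuple layout and variable names are illustrative only.

```python
# (number of boxes, CPU sockets per box, cores per socket), taken from the slide.
node_groups = [
    (15, 2, 2),   # D-Grid funded boxes: 2*2core Xeon
    (24, 2, 4),   # new boxes: 2*4core Xeon
]

total_nodes = sum(count for count, _, _ in node_groups)
total_cores = sum(count * sockets * cores for count, sockets, cores in node_groups)

# Fileserver capacity: 8 older servers with 4 TB each, 20 new servers with 6 TB user space each.
old_fileserver_tb = 8 * 4
new_fileserver_tb = 20 * 6

# Raw capacity of one new fileserver before RAID 5 and filesystem overhead.
raw_gb_per_new_server = 15 * 500

print(f"Nodes: {total_nodes}, cores: {total_cores}")
print(f"Fileserver space: {old_fileserver_tb} TB + {new_fileserver_tb} TB")
print(f"Raw disk per new fileserver: {raw_gb_per_new_server} GB")
```

The node and core totals agree with the figures on the present-status diagram, and the 20 new fileservers account for the additional 120 TB.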

Page 7: Status  GridKa & ALICE T2 in Germany

ALICE T2 – short term plans

• Extend GSIAF to all 39 nodes
• Study coexistence of interactive and batch processes on the same machines. Develop possibility to increase/decrease the number of batch jobs on the fly to give advantage to analysis.
• Add newly bought fileservers (about 120 TB disk space) to ALICE::LCG::SE::xrootd

Page 8: Status  GridKa & ALICE T2 in Germany

ALICE T2 – medium term plans

• Add 25 additional nodes to GSI Batchfarm/GSIAF, to be financed via 3rd party project (D-Grid)
• Upgrade GSI network connection to 1 Gb/s, either as dedicated line to GridKa (direct T2 connection to T0 problematic) or as general internet connection

Page 9: Status  GridKa & ALICE T2 in Germany

ALICE T2 – ramp up plans

http://lcg.web.cern.ch/LCG/C-RRB/MoU/WLCGMoU.pdf

Page 10: Status  GridKa & ALICE T2 in Germany

Plans for the ALICE Tier 2&3 at GSI:

• Remarks:
  • 2/3 of that capacity is for the tier 2 (ALICE central, fixed via WLCG MoU)
  • 1/3 for the tier 3 (local usage, may be used via Grid)
  • according to the ALICE computing model no tape for tier2
  • tape for tier3 independent of MoU
  • heavy-ion run in October -> upgrade operational: 3Q each year

Year          2007      2008      2009      2010       2011
ramp-up       0.4       1.0       1.3       1.7        2.2
CPU (kSI2k)   400/260   1000/660  1300/860  1700/1100  2200
Disk (TB)     120/80    300/200   390/260   510/340    660
WAN (Mb/s)    100       1000      1000      1000       ...
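The CPU and disk rows appear to follow from the ramp-up factor and the 2/3 tier 2 share mentioned in the remarks. The sketch below reproduces them under two assumptions that are not stated on the slide: that the 2008 column (ramp-up 1.0) is the baseline, and that each "a/b" entry reads as total/tier 2 share. The function name `plan_for` is hypothetical.

```python
# Ramp-up factors from the slide, stored in tenths to keep integer arithmetic.
ramp_up_tenths = {2007: 4, 2008: 10, 2009: 13, 2010: 17, 2011: 22}

# Assumption: the 2008 column (ramp-up 1.0) is the baseline capacity.
BASE_CPU_KSI2K = 1000
BASE_DISK_TB = 300


def plan_for(year: int) -> dict:
    """Return total capacity and the two-thirds tier 2 share for one year."""
    tenths = ramp_up_tenths[year]
    cpu_total = BASE_CPU_KSI2K * tenths // 10
    disk_total = BASE_DISK_TB * tenths // 10
    return {
        "year": year,
        "cpu_total": cpu_total,
        "cpu_tier2": cpu_total * 2 / 3,    # unrounded; the slide lists rounded values
        "disk_total": disk_total,
        "disk_tier2": disk_total * 2 // 3,
    }


for year in sorted(ramp_up_tenths):
    row = plan_for(year)
    print(
        f"{row['year']}: CPU {row['cpu_total']} kSI2k (tier 2 ~{row['cpu_tier2']:.0f}), "
        f"disk {row['disk_total']} TB (tier 2 {row['disk_tier2']})"
    )
```

With these assumptions the totals match the table exactly, the disk shares match exactly, and the CPU shares on the slide are slightly lower, rounded versions of two thirds of the total.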

Page 11: Status  GridKa & ALICE T2 in Germany

ALICE T2/T3

• Remarks related to ALICE T2/3:
  – The physicists who know what they are doing are at the T2 centres
  – Analysis can be prototyped quickly with the experts close by
  – GSI requires flexibility for optimising the ratio of calibration/analysis & simulation at tier2/3

Language definition according to GSI interpretation:

ALICE T2: central use
ALICE T3: local use. Resources may be used via Grid, but they are not pledged resources.

Page 12: Status  GridKa & ALICE T2 in Germany

data transfers CERN → GSI

• Motivation: calibration model and algorithms need to be tested before October
• Test the functionality of current T0/T1 → T2 transfer methods.
• At GSI the CPU and storage resources are available, but how do we bring the data here?

Page 13: Status  GridKa & ALICE T2 in Germany

data transfer CERN → GSI

• The system is not yet ready for generic use. Therefore expert control by a "mirror master" @CERN is necessary.
• In principle, individual file transfer works fine now. Plan: next transfers with Pablo's new collection-based commands. Web page where transfer requests can be entered and transfer status can be followed up.
• So far about 700 ROOT files have been successfully transferred. This corresponds to about 1 TB of data.
• 30% of the newest request still pending.
• Maximum speed achieved so far: 15 MB/s (almost the complete bandwidth of GSI), but only for a relatively short time
• Since August 8 no relevant transfers anymore. Reasons:
  – August 8: pending xrootd update at Castor SE
  – August 14: GSI SE failure due to network problems
  – August 20: instability of central AliEn services. Production comes first
  – Up to recently: AliEn update
• GSI plans to analyse the transferred data ASAP and to continue with more transfers. Also PDC data need to be transferred for prototyping and testing of analysis code.
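The transfer figures above can be put in relation to the 150 Mbps link shown on the present-status diagram. This is a back-of-the-envelope sketch; it assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and that the 150 Mbps link is what the slide calls the "bandwidth of GSI".

```python
# Figures from the slides.
files_transferred = 700
data_bytes = 1 * 10**12               # about 1 TB, assuming decimal units
peak_rate_bytes_per_s = 15 * 10**6    # 15 MB/s peak
link_mbps = 150                       # assumption: this is the GSI bandwidth meant on the slide

# Average file size in GB.
avg_file_gb = data_bytes / files_transferred / 1e9

# Peak rate expressed in Mb/s and as a share of the link.
peak_rate_mbps = peak_rate_bytes_per_s * 8 // 10**6
link_utilisation_percent = peak_rate_mbps * 100 // link_mbps

# Time to move 1 TB if the peak rate could be sustained.
hours_at_peak = data_bytes / peak_rate_bytes_per_s / 3600

print(f"Average file size: {avg_file_gb:.2f} GB")
print(f"Peak rate: {peak_rate_mbps} Mb/s ({link_utilisation_percent}% of the link)")
print(f"1 TB at peak rate: {hours_at_peak:.1f} hours")
```

Even at the peak rate, moving 1 TB would take most of a day, which puts the planned upgrade to 1 Gb/s in context.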

Page 14: Status  GridKa & ALICE T2 in Germany

data transfer CERN → GSI