1 Status Report on KLOE Data Reconstruction June 2002 Offline farm Data taking in Y2000-1 Data Reconstruction Reprocessing in Y2002 DST production and analyses Monte Carlo Data taking in Y2002 Summary C.Bloise June,24 th


Status Report on KLOE Data Reconstruction

June 2002

Offline farm

Data taking in Y2000-1

Data Reconstruction

Reprocessing in Y2002

DST production and analyses

Monte Carlo

Data taking in Y2002

Summary

C. Bloise, June 24th


Offline Farm

CPUs
  10x4 E4500   User/MonteCarlo/DST - 1999
  23x4 H80     User/Reconstruction - 2001

Servers
  2 H80        TSM (Library) and NFS (Disk) servers - 2001
  Disks        3 TB - 1999-2001

Servers
  2 H70        AFS servers - 2000
  Disks        2 TB - 2000

Library
  Magstar      110 TB (5500 cassettes) - 1999
               220 TB - 2000
               +6 drives - 2002

Router
  Catalyst 6000   Gigabit/FE - 2002


Computing & Data Storage 2002

[Diagram: computing and data storage layout, with the following labels]

krunc: Run Control & DFC

Online farm: fibm04 .... fibm08

fibm01: DB2 server

Offline farm
  19 IBM B80, 76 CPUs, 2.5 kHz total farm power
  8 Enterprise 450, 32 CPUs: Monte Carlo, DSTs
  4 IBM + 2 Sun: users

Disk areas
  Reconstruction output: 560 GB
  Job control areas: 200 GB
  Data recalled from tape: 2200 GB

Magstar library: 5,500 cassettes, 12 tape units, 220 TB

2 IBM H80 TSM/NFS servers

2 IBM H70 AFS servers: 2 TByte

Disk: 1.4 TByte

Catalyst 6000


[Diagram: network layout, with the following labels]

Readout, FDDI

OnlineFarm1, OnlineFarm3, DBserver

CISCO switch: Fast Ethernet channel, Gigabit Ethernet channel

Server1, Server2

Tape library: 0.22 PBytes, 6 drives + 6 drives

Offline farm: 33 servers, 134 CPUs, 33 Fast Ethernet links

Analysis farm: servers and PCs on AFS, Fast Ethernet


Data Taking in 2000-2001

• Data taking ran concurrently with machine studies

• The 180 pb-1 were collected under machine backgrounds that differed in magnitude and topology

• The overall cost per pb-1, in CPU and storage space, is decreasing

[Plot: average luminosity (b-1 s-1) by period: April, Jun-Jul, Sep-Oct, Nov-Dec]


Data Taking in Y2001

180 pb-1 collected in different run periods, at different luminosities and background levels

Data taking   Max luminosity   Sample luminosity   Max reconstructable      Data volume
period        (cm-2 s-1)       (pb-1)              luminosity (pb-1/day)    (GBytes/pb-1)

Nov-Dec 01    5 x 10^31        51                  3.5                      400
Sep-Oct 01    2.8 x 10^31      51                  2.5                      480
Jun-Jul 01    2.0 x 10^31      58                  0.9                      600
00-Apr 01     1.0 x 10^31      22                  0.5                      1200

Y2000 and Y2001 data set: CPU power cost and data volumes stored for different data taking conditions

Different run conditions imply

• different CPU power needed per pb-1

• different data volumes stored per pb-1
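As a rough cross-check of the table above, the per-period figures can be combined into totals. The sketch below is illustrative only: the function name `totals` and the conversion 1 TByte = 1000 GBytes are assumptions, and the "farm-days" figure assumes each period is reconstructed at exactly its quoted maximum rate.

```python
# Per-period figures copied from the table:
# (period, sample luminosity in pb-1, max reconstructable pb-1/day, GBytes per pb-1)
PERIODS = [
    ("Nov-Dec 01", 51, 3.5, 400),
    ("Sep-Oct 01", 51, 2.5, 480),
    ("Jun-Jul 01", 58, 0.9, 600),
    ("00-Apr 01", 22, 0.5, 1200),
]


def totals(periods):
    """Return (total pb-1, total TBytes, total farm-days at the quoted maximum rates)."""
    lumi = sum(p[1] for p in periods)
    # Assumption: decimal units, 1 TByte = 1000 GBytes.
    tbytes = sum(p[1] * p[3] for p in periods) / 1000.0
    # Assumption: each period is processed at its own maximum reconstructable rate.
    days = sum(p[1] / p[2] for p in periods)
    return lumi, tbytes, days


lumi, tbytes, days = totals(PERIODS)
print(f"{lumi} pb-1, {tbytes:.0f} TBytes, {days:.0f} farm-days")
```

Under these assumptions the stored volume comes out close to the "about 105 Tbytes" quoted for the Y2000-Y2001 data later in the talk.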


Reprocessing of the 2001 data

• The inhomogeneity of the data taking also exposed the fragility of some criteria, generally those based on detector “occupancy”, in filtering machine backgrounds.

• At the end of 2000, more robust data-selection criteria were developed and tested

• The calorimeter calibration was modified in October to include a more precise determination of the TDC offsets

• A global corrective procedure was set up for the data taken earlier

• The rejection of the muons prescaled by the “cosmic” trigger was removed, to allow a more precise determination of the efficiencies for events

• New algorithms were added to isolate KS, KL, KL events.

• The criteria for selecting charged kaons were completely revised.


Reconstruction - Y2002

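[Diagram: reconstruction data flow, with the following labels]

Input: raw, L2 triggers, L3 filter, cosmic, MB, cosmic ÷100

Processing: EmC recon., DC recon., Evt. Class

Output streams: bha, kpm, ksl, rpi, rad, clb, flt, afl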


Reprocessing of the 2001 data

At the end of the year it was decided to reprocess the 2001 data:

• To end up with a homogeneous data set for analysis, that is

– entirely selected with the more robust criteria and

– more precisely calibrated

• To include the new selection criteria, especially those for charged kaons, which changed the most

• To improve the luminosity measurement

• The reprocessing started in mid-January and finished in May: 17 billion triggers were analyzed.
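The reprocessing figures can be turned into average rates. This is a minimal sketch: the 17 billion triggers come from this slide and the 180 pb-1 from the data-taking slides, while the 120-day duration is an assumed round figure for "mid-January to May" and the function name is hypothetical.

```python
SECONDS_PER_DAY = 86_400


def reprocessing_rates(triggers, luminosity_pb, days):
    """Return (average trigger rate in Hz, pb-1 per day, pb-1 per 100 Mtriggers)."""
    rate_hz = triggers / (days * SECONDS_PER_DAY)
    pb_per_day = luminosity_pb / days
    pb_per_100m = luminosity_pb / (triggers / 1e8)
    return rate_hz, pb_per_day, pb_per_100m


# 120 days is an assumption, not a number given in the talk.
rate_hz, pb_per_day, pb_per_100m = reprocessing_rates(17e9, 180, 120)
print(f"{rate_hz:.0f} Hz, {pb_per_day:.2f} pb-1/day, {pb_per_100m:.2f} pb-1/100Mtri")
```

These are campaign-wide averages that include any downtime, so they are expected to sit below the peak daily values.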


Day by day Data Reprocessing

[Plots, Feb 17th to May 7th: pb-1/day, kHz, pb-1/100 Mtriggers]


MB/s in a single “retrieve” from tape and in a single “archive”

Tapes mounted as a function of time


Activity on three filesystems defined on one server:

/datarec: output of the offline reconstruction processes

/recall1 and /recall2: “retrieve” areas

Server idle/wait states in %


DSTs for the analyses

In addition to the reconstructed data sample, DSTs are produced with a selected information content, reducing the data volume for the analyses by a factor of 10.

These files are created immediately after data reconstruction completes, to take advantage of the files still being available on disk.

Excluding the charged kaon DSTs, which have not been created so far, the DST volume totals 6 GB/pb-1.

The charged kaon DSTs will have both the biggest event size and the largest event sample: 10^6 events/pb-1 x 5 KBytes/event, i.e. 5 GBytes/pb-1.

1.4 TBytes of disk space are now devoted to DSTs on the new servers.

To allow efficient multi-user access to the entire DST sample (6 TBytes by the end of data taking), we need to increase the disk space to 4 TBytes by the end of the year.
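The charged kaon estimate is a product of event yield and event size. The sketch below reproduces it and adds the 6 GB/pb-1 quoted for the other DSTs. The function names and the decimal unit conversions (1 GByte = 10^6 KBytes, 1 TByte = 1000 GBytes) are assumptions made for illustration.

```python
def dst_volume_gb_per_pb(events_per_pb, kbytes_per_event):
    """DST volume in GBytes per pb-1, assuming 1 GByte = 1e6 KBytes."""
    return events_per_pb * kbytes_per_event / 1e6


def luminosity_fitting(disk_tbytes, gb_per_pb):
    """Integrated luminosity (pb-1) whose DSTs fit in the given disk space."""
    return disk_tbytes * 1000.0 / gb_per_pb


charged_kaon = dst_volume_gb_per_pb(1e6, 5)  # 10^6 events/pb-1 x 5 KBytes/event
other_dsts = 6.0                             # GBytes/pb-1, quoted on the slide
total_per_pb = charged_kaon + other_dsts

print(f"charged kaons: {charged_kaon:.0f} GB/pb-1, all DSTs: {total_per_pb:.0f} GB/pb-1")
print(f"6 TBytes hold the DSTs of about {luminosity_fitting(6, total_per_pb):.0f} pb-1")
```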


DSTs

Creating the DSTs may or may not involve a specialized reconstruction of the event.

A production rate of 20-25 pb-1 per day requires

• 5 dedicated CPUs for simple event filtering

• 40 CPUs if, for example, the event has to be reconstructed

In these cases the data throughput to the CPUs is 5-7 MB/s
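The CPU counts above imply a per-CPU production rate, which can be used to size other targets. This is only a sketch: it assumes production scales linearly with the number of CPUs, which the slide does not state, and the helper names are hypothetical.

```python
import math

SECONDS_PER_DAY = 86_400


def per_cpu_rate(pb_per_day, cpus):
    """pb-1 per CPU per day implied by a quoted production rate."""
    return pb_per_day / cpus


def cpus_needed(target_pb_per_day, pb_per_cpu_day):
    """CPUs needed for a target rate, assuming linear scaling (an assumption)."""
    return math.ceil(target_pb_per_day / pb_per_cpu_day)


def gbytes_per_day(mb_per_s):
    """Daily data volume for a sustained throughput, with 1 GByte = 1000 MBytes."""
    return mb_per_s * SECONDS_PER_DAY / 1000.0


filtering = per_cpu_rate(20, 5)        # simple filtering, lower end of 20-25 pb-1/day
reconstruction = per_cpu_rate(20, 40)  # specialized reconstruction

# Example target of 6 pb-1/day, chosen only for illustration.
print(cpus_needed(6, filtering), cpus_needed(6, reconstruction))
print(gbytes_per_day(5), gbytes_per_day(7))
```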


Analysis

In KLOE, the data flow to the users' PCs for the analyses goes through the creation of the DSTs, which we would like to keep mostly on disk, accessible to the specialized filters that create the data samples needed by the different analyses.

The filter runs on the central machines and keeps its output “temporarily” on AFS, currently 2 TBytes. The working groups may request additional space during the year (new work on charged kaons, ...).

These filters require 1-2 MBytes/s and can process 1-4 pb-1/CPU/hour

About 5 KBytes are retained for each filtered event.


Monte Carlo

• 8 Sun Enterprise 450 machines are dedicated to Monte Carlo generation and reconstruction

• Event production is driven by procedures that use the information stored in the DB (random seeds, input cards, job status, ...)

• The computing power corresponds to 2.5-4 x 10^6 events per day

• The most demanding task is the study of the background topologies

• Work in progress

– 12 x 10^6 decays (4 pb-1), ~70% available

– 49 x 10^6 KS decays (73 pb-1), ~70% available

– 25 x 10^4 KS decays

– 15 x 10^5 K decays (44 pb-1), available

– 5 x 10^5 decays, available

– 1 x 10^6 e+e- e+e- events, available


Monte Carlo

• DSTs will be created for the Monte Carlo as well

• The sample will presumably amount to a few hundred million events, at 10-15 KBytes/event

• 1.2 TBytes per 10^8 events is the space required to make these samples available to the analyses

• Each process will require 2 MB/s in read throughput

• About 200,000 events can be processed in one hour.
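The storage figure follows from the event size: 1.2 TBytes per 10^8 events corresponds to 12 KBytes/event, inside the quoted 10-15 KBytes range. The sketch below makes that arithmetic explicit. The function names, the decimal conversion (1 TByte = 10^9 KBytes) and the reading of "200,000 events per hour" as a per-process rate are assumptions.

```python
def mc_dst_space_tbytes(n_events, kbytes_per_event):
    """Disk space in TBytes for a Monte Carlo DST sample (1 TByte = 1e9 KBytes)."""
    return n_events * kbytes_per_event / 1e9


def hours_to_process(n_events, events_per_hour=200_000):
    """Hours one process needs to read through the sample at the quoted rate."""
    return n_events / events_per_hour


for size_kb in (10, 12, 15):
    print(f"{size_kb} KBytes/event -> {mc_dst_space_tbytes(1e8, size_kb):.1f} TBytes per 1e8 events")

print(f"{hours_to_process(1e8):.0f} hours to process 1e8 events in a single process")
```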


Y2002 Data Reconstruction

The offline farm can process 3,000 triggers/s (>6 pb-1/day).

Y2002 data, May 3rd to June 4th: trigger rate 1.4 kHz (Level 3)

• Luminosity: 0.03 nb-1 s-1

• 270 Bhabha s-1

• 1130 Bck (background) s-1

50% of the farm is available for users and DST production.

Enough power is left to handle further improvements in the luminosity.
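The quoted rates can be checked against each other: the Bhabha and background rates add up to the Level 3 trigger rate, and that rate is roughly half of the farm capacity. A minimal sketch follows; the function names are hypothetical, and the daily luminosity assumes a full 24 hours at the quoted 0.03 nb-1 s-1.

```python
SECONDS_PER_DAY = 86_400


def daily_luminosity_pb(lumi_nb_per_s):
    """Integrated luminosity per full day in pb-1 (1 pb-1 = 1000 nb-1)."""
    return lumi_nb_per_s * SECONDS_PER_DAY / 1000.0


def farm_load(trigger_rate_hz, farm_capacity_hz):
    """Fraction of the farm capacity used by the incoming trigger rate."""
    return trigger_rate_hz / farm_capacity_hz


bhabha_hz, background_hz = 270, 1130
trigger_rate_hz = bhabha_hz + background_hz  # matches the quoted 1.4 kHz

load = farm_load(trigger_rate_hz, 3000)
print(f"trigger rate: {trigger_rate_hz} Hz, farm load: {load:.0%}")
print(f"luminosity per full day: {daily_luminosity_pb(0.03):.2f} pb-1")
```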


Day by day 2002 Data Reconstruction

[Plots, May 3rd to June 4th: pb-1/day, pb-1/100 Mtriggers, kHz]


Data taking conditions May 2002

Dafne max luminosity (cm-2 s-1):                 4 x 10^31
Maximum reconstructable luminosity (pb-1/day):   6
Data volume (GBytes/pb-1):                       300
  Raw volume (GB/pb-1):                          130
  Filtered volume (GB/pb-1):                     60
  Reconstructed volume (GB/pb-1):                45
  Bhabhas volume (GB/pb-1):                      65

From 56 pb-1 of integrated luminosity: 1.9 pb-1/100 Mtriggers (17 TBytes on tape)

The total capacity of the library, including the Y2001 data set and assuming the present Dafne running conditions, is 500 pb-1, corresponding to 120 full days of data taking. If we scratch the filtered sample, it is 650 pb-1, corresponding to 180 full days of data taking.

Y2000-Y2001 data fill about 105 TBytes, i.e. 2650 cassettes, corresponding to 54% of the tape library storage space.
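The volume and capacity figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below is hedged: it assumes that the capacities "inclusive of the Y2001 data set" include the 180 pb-1 already stored, and that a full day of data taking delivers 0.03 nb-1 s-1 x 86,400 s, using the luminosity quoted for the Y2002 data. All names are hypothetical.

```python
# Volume components in GBytes per pb-1, as quoted for May 2002.
COMPONENTS = {"raw": 130, "filtered": 60, "reconstructed": 45, "bhabhas": 65}

total_gb_per_pb = sum(COMPONENTS.values())
without_filtered = total_gb_per_pb - COMPONENTS["filtered"]

# 17 TBytes on tape for 56 pb-1, with 1 TByte = 1000 GBytes assumed.
measured_gb_per_pb = 17_000 / 56

# Assumption: one full day at 0.03 nb-1 s-1, converted to pb-1.
PB_PER_FULL_DAY = 0.03 * 86_400 / 1000.0


def days_of_data_taking(capacity_pb, already_stored_pb, pb_per_day=PB_PER_FULL_DAY):
    """Full days needed to fill the remaining library capacity."""
    return (capacity_pb - already_stored_pb) / pb_per_day


print(total_gb_per_pb, without_filtered, round(measured_gb_per_pb))
print(f"{days_of_data_taking(500, 180):.0f} days, {days_of_data_taking(650, 180):.0f} days")
```

Under these assumptions the day counts come out close to the 120 and 180 full days quoted on the slide.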


SUMMARY

Disk space:

Multi-user access to the DSTs requires additional disk space by the end of the year: 6 TBytes of DSTs plus the Monte Carlo DSTs (a few TBytes) are expected by then. 1.2 TBytes are available now.

Library:

It will be full by the end of the year. As soon as new tape drives are available on the market (Summer 02), and depending on the data taking status, we expect to upgrade the tape drives to increase the total capacity by 50%. New storage solutions are under study for the year 2003 data taking.

Computing power:

Less than 50% of the approved total computing power has been acquired so far. It is enough for the online reconstruction, DST production, Monte Carlo generation and the ongoing analyses. It is marginal for any reprocessing campaign, as the 4 full months needed to reconstruct 180 pb-1 demonstrate.