19
Offline/MC status report M. Moulson 6th KLOE Physics Workshop Sabaudia, 10-13 May 2006

Offline/MC status report M. Moulson 6th KLOE Physics Workshop Sabaudia, 10-13 May 2006

Embed Size (px)

Citation preview

Offline/MC status report

M. Moulson

6th KLOE Physics Workshop

Sabaudia, 10-13 May 2006

Offline/MC tasks for 2006

Define definitive KLOE data set:

Close holes in data reconstruction and DST coverageDST file size problemMake sure all reconstructed runs are complete

Reprocessing of various data setsMost critical is 2004 data with bad wire maps

Data qualityCompleteness of HepDB entries for MC productionFine survey for analysis purposes

MC production for 2004-2005

Re-reconstruction of 2001/2002 MC sample?Requested by K group - problem of bad wire maps

L dt (pb–1)

tag 100

2001 169

2002 292

2004 691

2005 1242

total 2394

Reconstruction of 2004-2005 data

0 500 1000 1500 2000

drn/drc

d3p

dk0

dkc

recon

tag 100

raw

1930

1740

1970

990

1480

1520

1480

pb–1

Runs 28700 (9 May 04) to 41902 (5 Dec 05)Updated to reflect DB work 21 Mar 06

DST size problem

Stream DST MCDST

dkc/mkc 210 200

dk0/mk0 290 290

d3p/m3p 2700 780

drn/mrn 6800 14600

drc/mrc 730 2800

32-bit I/O pointers in Fortran: Maximum file size = 2 GBThere may be a workaround (esp. for reading w/ KID) but not easy!

What does 2 GB limit mean? Big headache!

Max sizes difficult to calculate:Fluctuating background components

• 10% for most streams• 33% for dkc/mkc

MC numbers assume:• Full cross section for stream• Size refers to scaled luminosity

Need to split DSTs into pieces100 nb-1 chunksMust modify scripts

• Working on DST standalone script to plug holes

Max size (nb-1)

Reconstructed KLOE data

DBV 2001 2002 2004 2005 Scan Comment

12-15 163 34Bad cut in FILFONo bias sample for FILFONo bias sample for rad

19-21 645bad

wires!

217Various significant ECL modsMinor changes to reconNo bias sample for rad

22 222 395Stable ECLOld bias sample for rad

23-24 12 502 61Stable ECL for s = m

ufo as bias sample

25 16 205Stable ECL for off-peakufo as bias sample

analyzed 163 269 645 1114 266

total 169 292 691 1242 284

trgmon luminosity in pb-1 by year and DBV

645 pb-1 to be reprocessed for sure, 278 pb-1 (617 pb-1) as time permits

Notes on reprocessingReprocessing most critical for 2004 data with bad wire maps

Reprocess 2004 data firstOther reprocessing priorities can be decided upon afterwards

When data are reprocessed, old reconstructed files will be deleted:If data have already been used for analysis (e.g. 2001-2002 data):

– files deleted, database record and old DSTs will be kept.If data have not been used for analysis (e.g. 2004-2005 data):

– database record and DSTs deleted, as if run had never been reconstructed in the first place.

Basically same treatment planned for incomplete runs in data set

Off-peak data reconstructed w/ DBV-24 need reprocessing for use

61 pb-1: all scan data, small amount of s = 1000 MeV dataKeep current files until ready to start reprocessing

Plan: Start 2004 reprocessing and 2005 MC production in parallelIf there are problems with DH-induced latency, hold reprocessing until additional disk space for the I/O cache arrives (it is expected in June).

MC development itemsNew IR geometry CB

Revision of constants in GEANFI CB

Generators: ee,ee0, KSee CG RV ADS

Nuclear interactions/regeneration CB

Adjustments to EmC energy response PG

Adjustments to EmC time resolution CG

Simulation of EmC cluster efficiency MP TS

SQZ (compression) fix MM CB

Fix CELE/CSPS/cluster banks/structures MM MP

Data-quality parameters (√s, etc) GV

Trigger params.: quality, DC thresh, DISH maps MP BS

lsb background insertion MM MP

Correlated noise for charged kaons EDL PDS

dE/dx simulation VP

Check ECL code: ee0 SG

Split large (> 2GB) DSTs (MCDSTs) MM

MC sign-off: Data quality & DB issuesData-quality parameters & DB loading

Not to be confused with fine survey for analysis (S. Fiore)• 2004 run parameters (s, p, etc.) already loaded into HepDB

• 2005 run parameters obtained, need to be loaded• Trigger parameters in DB2 for 2004 & 2005• Cluster efficiency, time resolution parameters updated• Dead/hot wire maps, DC efficiencies: need to unify tables

HepDB-to-DB2 migration?

F. Sborzacchi has developed:• DB2 tables to contain data currently in HepDB:

All detector calibration dataMuch run-condition information (s, p, etc.)

• Code to fill and maintain new tables• Interface code (drop-in replacements for HepDB calls)

Ready to go (but wait until MC started?)

MC sign-off: LSB backgroundWork on LSB background to deal with:• Inconsistent cluster/KINE matching• Timescale alignment for MC and data• Treatment of “noise” hits (missing tA or tB)

Diagnose performance:t0

rec t0MC distribution

KS , KL crash events

KS , KL crasht0

true from

t0rec t0

true (ns)

linear scaleKS 00 from MC, t0

rec from T0_FIND

t0 fromMC track

t0 fromLSB cluster

mixed t0

t0rec t0

MC (ns)

– data• MC

MC sign-off: LSB background

Presenter

Accidental rateReconstr. LSB files

7 MeV in window

E 3.6% 3.1%

B 1.8% 1.7%

W 2.8% 2.7%

LSB files look OK!• Overall prob. for LSB

cluster to set t0 = 0.82(4)%

• Roughly half will steal t0 in event

Frac. events w/ stolen t0 in: Data MC OR MC AND

From tail in KS , KL crash events 0.5% 0.7% 1.1%

From MC truth (t0 cluster acc. or mixed) 0.9% 1.4%

• Harder filtering of noise hits increases stolen t0 probAre we dropping LSB events w/ no clusters?

• Before starting production, must check DC timescale alignment!

Compare to data and reconstructed MC (all_phys test runs):

New A/C module (SIMKBCK) adds background hits correlated to K tracks

Sample distribution of K-correlated background hits in:– layer, distance (in wires) from reconstructed track, time

Private reconstruction of ~20 pb-1 of 2002 MC events to study how new background parameterization affects MC tracking efficiency

+ (MC)+ (data)

trk+vtx as a function of t*(K)

Note: Large correction (5) for absolute probability of background hits:Background measured using reconstructed tracksDon’t account for tracks not reconstructed because of background

MC sign-off: K background

t*(K) (ns) t*(K) (ns)

before after

MC sign-off: dE/dx simulationCalibration of A/C module (DIGIDCADC) to simulate dE/dx response of DC

• E distribution rescaled and smeared to match data• Different s-t relations for data/MC effective integration gates different

General note:• Difficulties in calibrating for space-charge effects undermine resolution• dE/dx resolution adequate for K ID, but not e.g. Ke2/K2 separation

dE/dx (count/cm)

K2 sample

K2 sample

dE/dx (count/cm)

dE/dx distribution Truncated mean distribution

K2 sample

K2 sample

+ data– MC

Monte Carlo tests in 2006

Dates L (pb-1) Runs Comments

20/2-13/3 10539000-39600

• Test new cluster efficiency parameters• Test new EmC time resolution parameters• Additional EmC studies on energy scale calibration• Test lsb cluster background• Complications from bugs in MC truth variables for EmC Fortran structures

• DH problems complicate CPU/runtime analysis

3/5-4/5 1939601-39750

• Reconstructed with DBV-26• Correct EmC structures• Test SIMKBCK module

4/5-5/5 2939751-40000

• Fix EmC timescale for noise hits

7/5-8/5 4240001-40250

• Attempt to optimize EmC background

All tests based on all_phys card, LSF = 0.2

Monte Carlo production plans

2001-2002

450 pb-1

2004-2005

2000 pb-1

1.85G evts

8800 B80 days

8.25G evts

39000 B80 days

Averaged over entire MC sample: 0.21M evts/B80 day = 2.4 Hz0.41 s/evt (simulation + reconstruction + DST)

all, scale = 0.2KSKL, scale = 1

KK, scale = 1 radiative, scale = 5Other (1M evts/pb-1)

Total: 3.1M evts/pb-1

(about same as number of decays in data)

2001-2002 MC production Estimated time for 2004-2005 MC

all, KSKL, KK campaigns to be combined: all_phys, LSF = 1.2

Start with 2005 data: Reprocessing not necessary for comparison

Offline resources: CPU

Nodes Type CPUsms/ev

MC

ms/ev

Rec

B80

“MC”

B80

“CW”

fibm14-15,17-34 P3 375 MHz 80 211 192 80 80

fibm35-44 P4+ 1.45 GHz 40 98 85 88 96

fibm45-47 P5 1.5 GHz SMT “96” 105 202 126 184

All 294 360

• 1 “B80 CPU” = 1 P3 CPU installed in B80 server - KLOE standard unit

• Accuracy of estimates depends on B80/P4+ and B80/P5 ratios

• “MC” results based on MC tests (all_phys, ‘05 data) 3-8 May 06:87 CPU (132 B80 “MC”) configuration - no competing offline workNo serious DH problems observed

• Conventional wisdom (“CW”) based on past experience:1 P4+ = 2.4 B80 (compare 2.2 above)1 P5 = 0.8 P4+ = 1.9 B80 (compare 1.3 above)

Offline CPU needs

2006 offline projects B80 days Days/214 B80

Reprocessing of 2004 data 16000 75

MC production for 2004-2005 data 39000 180

All reprocessing for ’02-’04-’05 data (900 pb-1)

22000 100

Minimum total 55000 260

Maximum total 77000 360

9-12 months if there are no serious DH problems

Assuming: 294 B80 total (MC test, not CW)80 B80 left to users for analysis

214 B80for offline

Offline resources: Disk space

DSTs cached on nfs-mounted disks for fast analysis access

Current Final

35 TB data

10 TB MC

45 TB total

42 TB data

44 TB MC

86 TB total

DST volume*

Current DST cache capacity: 12 TB

New purchases: 21 TB FC disk + new controller

Gara closed yesterday

Delivery by end of June?

~30 TB available for disk cache

*Updated to account for scan data

Offline resources: Tape space

Type Current (TB)

Temporary allocation (TB)

Final allocation (TB)

raw 248.7 250 250

rec 181.2 250 250

DST 34.7 45 45

MC 61.2 150 350

MC DST 10.0 25 60

Total 535.8 720 955

1. Allocations include currently occupied space

2. MC DSTs probably appear as datarec files to archiver

3. Current library system capacity ~720 GBNew cassettes will have to be ordered in future

4. Temporary allocation based on 720 GB libraryAssumes MC production slow

5. Final allocation assumes completion of KLOE offline program

Outlook and summary

• Starting MC production a very high priority, but we need to make a few last checks

• We are planning for a near simultaneous start of– MC production, all_phys, LSF = 1.2, 2005 data– Reprocessing 2004 data with bad wire maps

• We have the CPU power needed to generate a definitive MC sample and reprocess as necessary on a time scale compatible with beginning of 2007

• We will probably want to expand DST disk cache

• We will need to order new cassettes towards the end of the year