Jan. 17, 2002 DØRAM Proposal DØRACE Meeting, Jae Yu

Proposal for a DØ Remote Analysis Model (DØRAM)

• Introduction
• Remote Analysis Station Architecture
• Requirement for Regional Analysis Centers
• Suggested Storage Equipment Design
• What Do I Think We Should Do?
• Conclusions

DØRACE Workshop
Feb. 12, 2002

Jae Yu

Why do we need a DØRAM?

• Total Run IIa data sizes are
  – 350 TB for RAW
  – 200-400 TB for Reco + root
  – 1.4×10^9 events total
• At the fully optimized 10 sec/event reco: 1.4×10^10 seconds for one-time reprocessing
• Takes one full year w/ 500 machines
  – Takes ~8 mos to transfer raw data over a dedicated gigabit network
• Centralized system will do a lot of good but is not sufficient (DØ analysis model proposal should be complemented with DØRAM)
• Need to allow remote locations to work on analysis efficiently
• Sociological benefits within the institutes
• Regional Analysis Centers should be established
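
The reprocessing estimate above can be checked with a few lines of arithmetic. This is only a rough sketch: it assumes perfect parallel scaling across the 500 machines, decimal terabytes, and 30-day months, none of which the slide states. The function names are illustrative. The second function does not predict the transfer time; it only reports what average throughput the quoted ~8 months would correspond to, since the slide does not say what sustained rate was assumed.

```python
# Back-of-the-envelope check of the Run IIa numbers quoted on the slide.
SECONDS_PER_YEAR = 365 * 24 * 3600

events = 1.4e9            # total Run IIa events (from the slide)
reco_sec_per_event = 10   # fully optimized reconstruction time (from the slide)
machines = 500            # farm size (from the slide)
raw_bytes = 350e12        # 350 TB of RAW data (from the slide); decimal TB assumed


def reprocessing_years(n_events, sec_per_event, n_machines):
    """Wall-clock years for one pass, assuming perfect parallel scaling."""
    return n_events * sec_per_event / n_machines / SECONDS_PER_YEAR


def implied_throughput_mbit(n_bytes, months):
    """Average rate in Mbit/s needed to move n_bytes in `months` 30-day months."""
    seconds = months * 30 * 24 * 3600
    return n_bytes * 8 / seconds / 1e6


total_cpu_seconds = events * reco_sec_per_event
years = reprocessing_years(events, reco_sec_per_event, machines)
rate = implied_throughput_mbit(raw_bytes, 8)

print(f"CPU time for one pass: {total_cpu_seconds:.2e} s")
print(f"Wall-clock time on {machines} machines: {years:.2f} years")
print(f"Average rate implied by an 8-month transfer: {rate:.0f} Mbit/s")
```

Under these assumptions one pass comes out a little under a year, consistent with the "one full year" quoted above.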

DØRACE Strategy

• Categorized remote analysis system set up by the functionality
  – Desk top only
  – A modest analysis server
  – Linux installation
  – UPS/UPD installation and deployment
  – External package installation via UPS/UPD
    • CERNLIB
    • Kai-lib
    • Root
  – Download and install a DØ release
    • Tar-ball for ease of initial set up?
    • Use of existing utilities for latest release download
  – Installation of cvs
  – Code development
  – KAI C++ compiler
  – SAM station setup

Setup phases:
• Phase 0: Preparation
• Phase I: Rootuple Analysis
• Phase II: Executables
• Phase III: Code Dev.
• Phase IV: Data Delivery

DØRACE Status by Setup Phases

[Figure: bar chart of the number of institutions (y-axis, 0 to 45) in each setup phase (x-axis: No Interest, Phase 0, Phase I, Phase II, Phase III, Phase IV). Series: Nov. Survey, Before 2/11, After 2/11, Progressive.]

Proposed DØRAM Architecture

[Diagram: tiered hierarchy, top to bottom]
• Central Analysis Center (CAC)
• Regional Analysis Centers (RAC, RAC, ...): store & process 10~20% of all data
• Institutional Analysis Centers (IAC, IAC, ...)
• Desktop Analysis Stations (DAS, DAS, ...)

Legend: normal interaction communication path; occasional interaction communication path

Regional Analysis Centers

• A few geographically selected sites that satisfy requirements
• Provide almost the same level of service as FNAL to a few institutional analysis centers
• Analyses carried out within the regional center
  – Store 10~20% of statistically random data permanently
  – Most of the analyses performed on these samples with the regional network
  – Refine the analyses using the smaller but unbiased data set
  – When the entire data set is needed, the underlying Grid architecture provides access to the remaining data set
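
One way to picture a "statistically random" 10~20% subset is a selection that depends only on the event identity, so every site can reproduce the same answer without coordination. The sketch below is an illustration only and not the DØ or SAM selection mechanism: the function `in_regional_sample`, the use of a SHA-256 hash of the run and event numbers, and the 15% default are all assumptions made for this example.

```python
import hashlib


def in_regional_sample(run: int, event: int, fraction: float = 0.15) -> bool:
    """Return True if (run, event) falls in the pseudo-random subset.

    Hypothetical helper: hashes the identifiers and keeps the event when the
    hash, mapped to [0, 1), is below `fraction`.
    """
    digest = hashlib.sha256(f"{run}:{event}".encode()).digest()
    # The first 4 bytes give an integer in [0, 2**32), so u is in [0, 1).
    u = int.from_bytes(digest[:4], "big") / 2**32
    return u < fraction


# Count how many of 100,000 made-up event numbers in one run are kept.
kept = sum(in_regional_sample(1, e) for e in range(100_000))
print(f"kept {kept} of 100000 events ({kept / 1000:.1f}%)")
```

Because the decision is a pure function of the identifiers, the subset is independent of physics content, which is the property the "smaller but unbiased data set" bullet relies on.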

Regional Analysis Center Requirements

• Become a Mini-CAC
• Sufficient computing infrastructure
  – Large bandwidth (gigabit or better)
  – Sufficient storage space to hold 10~20% of data permanently, expandable to accommodate data increase
    • >30 TB just for Run IIa RAW data
  – Sufficient CPU resources to serve regional or institutional analysis requests and reprocessing
• Geographically located to avoid unnecessary network traffic overlap
• Software distribution and support
  – Mirror copy of CVS database for synchronized update between RACs and CAC
  – Keep the relevant copies of databases
  – Act as SAM service station

Regional Storage Cache

[Diagram: a disk server and several IDE-RAID arrays connected through a Gbit switch]

• IDE hard drives are $1.5~$2/GB
• Each IDE RAID array gives up to ~1 TByte, hot swappable
• Can be configured to have up to 10 TB in a rack
• Modest server can manage the entire system
• Gbit network switch provides high-throughput transfer to outside world
• Flexible and scalable system
• Need an efficient monitoring and error recovery system
• Communication to resource management
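
The figures on this slide and the RAC requirements can be combined into a rough sizing of a regional cache for the Run IIa RAW data alone. This is a sketch under stated assumptions: decimal units (1 TB = 1000 GB), arrays and racks filled to their quoted maximum, and disk cost only (no RAID redundancy overhead, servers, or switches). The function `size_cache` is a name invented for this example.

```python
import math

RAW_TB = 350              # Run IIa RAW data size (from the slides)
PERCENT_HELD = (10, 20)   # share of data held permanently at an RAC (from the slides)
TB_PER_ARRAY = 1          # "up to ~1 TByte" per IDE-RAID array (from the slides)
TB_PER_RACK = 10          # "up to 10 TB in a rack" (from the slides)
USD_PER_GB = (1.5, 2.0)   # IDE disk price range (from the slides)


def size_cache(data_tb):
    """Return (arrays, racks, (low_cost, high_cost)) for data_tb of disk."""
    arrays = math.ceil(data_tb / TB_PER_ARRAY)
    racks = math.ceil(data_tb / TB_PER_RACK)
    # Assumes 1 TB = 1000 GB and counts raw disk cost only.
    cost = tuple(data_tb * 1000 * price for price in USD_PER_GB)
    return arrays, racks, cost


for pct in PERCENT_HELD:
    data_tb = RAW_TB * pct / 100
    arrays, racks, (low, high) = size_cache(data_tb)
    print(f"{pct}% of RAW = {data_tb:.0f} TB: {arrays} arrays, "
          f"{racks} racks, ${low:,.0f} to ${high:,.0f} in disks")
```

The 10% case gives 35 TB, in line with the ">30 TB just for Run IIa RAW data" requirement.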

What Do I Think We Should Do?

• Most of the students and postdocs are at FNAL, thus it is important to provide them sufficient computing and cache resources for their analysis. The currently suggested backend analysis clusters should be built!!
• In the meantime, we should select a few sites as RACs and prepare sufficient hardware and infrastructure
  – My rough map scan gives FNAL + 3 RACs in the US, and a few in Europe
• Software effort for Grid should proceed as fast as we can to supplement the hardware
  – We cannot afford to spend time on test beds
  – Our set ups should be both the test bed and the actual Grid
• A working group should determine the number of RAC sites and their requirements, and select RACs within the next couple of months.

Suggestions and Comments from The Working Group

• Data characteristics
  – Specialized data set, in addition to service data set for reprocessing
  – Some level of replication should be allowed
  – Consistency of data must be ensured
  – Centralized organization of reprocessing
  – Book keeping of reprocessing
• Two-stage approach:
  – Before full gridification: all data kept in the CAC
  – After full gridification
    • Fully distributed within the network
    • Data sets are mutually exclusive

• In Europe, some institutions are already working toward becoming RACs
  – Karlsruhe (Germany)
  – NIKHEF (Netherlands)
  – IN2P3, Lyon (France)
• We want more US participation
• Agreed to form a group to formulate the RAC more systematically, and to write up a document within 1-2 months covering
  – Functions
  – Services
  – Requirements
  – Etc.

Conclusions

• DØ must prepare for large data set era
• Need to expedite analyses in a timely fashion
• Need to distribute data set throughout the collaboration
• Establishing regional analysis centers will be the first step toward DØ Grid
• Will write up a proposal for your perusal