1. MONARC: results and open issues (Laura Perini, Milano)
2. Layout of the talk
- Most material from Irwin Gaines' talk at CHEP2000
- The basic goals and structure of the project
- Some results from the simulations
- The need for more realistic, implementation-oriented Models: Phase-3
- Status of the project: Phase-3 LOI presented in January, Phase-2 Final Report to be published next week; milestones and basic goals met
3. MONARC
- A joint project (LHC experiments and CERN/IT) to understand issues associated with distributed data access and analysis for the LHC
- Examine distributed data plans of current and near-future experiments
- Determine characteristics and requirements for LHC regional centers
- Understand details of the analysis process and data access needs for LHC data
- Measure critical parameters characterizing distributed architectures, especially database and network issues
- Create modeling and simulation tools
- Simulate a variety of models to understand constraints on architectures
4. MONARC
- Models Of Networked Analysis At Regional Centers
- Caltech, CERN, FNAL, Heidelberg, INFN, Helsinki, KEK, Lyon, Marseilles, Munich, Orsay, Oxford, RAL, Tufts, ...
- Specify the main parameters characterizing the Models' performance: throughputs, latencies
- Determine classes of Computing Models feasible for LHC (matched to network capacity and data handling resources)
- Develop Baseline Models in the feasible category
- Verify resource requirement baselines (computing, data handling, networks)
- Define the Analysis Process
- Define Regional Center Architectures
- Provide Guidelines for the final Models
[Diagram: CERN (n*10**7 MIPS, m PByte robot), FNAL (4*10**7 MIPS, 110 TByte robot) and universities (n*10**6 MIPS, m TByte robot) linked at 622 Mbit/s (N x 622 Mbit/s into CERN); optional air freight; desktops at each site]
5. Working Groups
- Baseline architecture for regional centres; technology tracking; survey of the computing models of current HENP experiments
- Evaluation of the LHC data analysis model and use cases
- Development of a simulation tool set for performance evaluation of the computing models
- Evaluation of the performance of ODBMS and networks in the distributed environment
6. General need for distributed data access and analysis
- Potential problems of a single centralized computing center include:
- scale of the LHC experiments: difficulty of accumulating and managing all resources at one location
- geographic spread of the LHC experiments: providing equivalent, location-independent access to data for all physicists
- support: help desk and consulting in the user's own time zone
- cost of the LHC experiments: optimizing the use of resources located worldwide
7. Motivations for Regional Centers
- A distributed computing architecture based on regional centers offers:
- a way of utilizing the expertise and resources residing in computing centers all over the world
- local consulting and support
- a way to maximize the intellectual contribution of physicists all over the world without requiring their physical presence at CERN
- an acknowledgement of the possible limitations of network bandwidth
- the freedom to choose how to analyze data based on the availability or proximity of resources such as CPU, data, or network bandwidth
8. Future Experiment Survey
- From the previous survey, we saw many sites contributing to Monte Carlo generation
- New experiments are trying to use the Regional Center concept:
- BaBar has Regional Centers at IN2P3 and RAL, and a smaller one in Rome
- STAR has a Regional Center at LBL/NERSC
- CDF and D0 offsite institutions are paying more attention as the run gets closer
9. Future Experiment Survey
- Other observations/requirements
- In the last survey, we pointed out the following requirements for RCs:
- a software development team
- good, clear documentation of all s/w and s/w tools
- The following are requirements for the central site (i.e. CERN):
- a central code repository that is easy to use and easily accessible from remote sites
- sensitivity to remote sites in database handling, raw data handling and machine flavors
- good, clear documentation of all s/w and s/w tools
- The experiments in this survey achieving the most in distributed computing are following these guidelines
10. [Diagram: Regional Centres tier hierarchy; recoverable labels: Tier 1: National Regional Center; Tier 3: Institute Workgroup Server]
11. CMS Offline Farm at CERN circa 2006 (lmr, for the MONARC study, April 1999)
[Diagram: 1400 processor boxes in 160 clusters and 40 sub-farms; 5400 disks in 340 arrays; 100 tape drives; processors, disks and tapes joined by a farm network and a storage network via LAN-SAN and LAN-WAN routers; link speeds range from 0.8 Gbps (DAQ) to 250 Gbps aggregate, with 480 Gbps* on the storage network. *Assumes all disk & tape traffic stays on the storage network; double these numbers if all disk & tape traffic goes through the LAN-SAN routers]
12. [Diagram only; no text recovered]
13. Processor cluster
- basic box: four 100 SI95 processors, standard network connection (~2 Gbps)
- 15% of systems configured as I/O servers (disk server, disk-tape mover, Objy AMS, ...) with an additional connection to the storage network
- cluster: 9 basic boxes with a network switch (a consistency check against the slide-11 totals follows below)
14.-15. [Diagrams: Regional Centre overview; recoverable labels: AOD-->DPD, scheduled physics groups; individual analysis, AOD-->DPD and plots, chaotic physicists; desktops; Tier 2; local institutes; CERN; tapes; support services]
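The per-box figures on slide 13 can be cross-checked against the slide-11 totals. The sketch below (Python, not part of the original talk) is a minimal consistency check: the box, cluster and sub-farm counts come from the slides, while the derived ratios and the comparison with the ~500 kSI95 CERN figure on slide 22 are our own arithmetic.

```python
# Back-of-envelope check of the CERN farm sizing quoted on slides 11 and 13.
# All input figures come from the slides; the derived ratios are inferred.

SI95_PER_PROCESSOR = 100   # slide 13: four 100 SI95 processors per box
PROCESSORS_PER_BOX = 4
BOXES_PER_CLUSTER = 9      # slide 13
BOXES = 1400               # slide 11
CLUSTERS = 160             # slide 11
SUBFARMS = 40              # slide 11

box_power = SI95_PER_PROCESSOR * PROCESSORS_PER_BOX   # 400 SI95 per box
cluster_power = box_power * BOXES_PER_CLUSTER         # 3600 SI95 per cluster
total_power = box_power * BOXES                       # 560,000 SI95 overall

print(f"boxes per cluster (from totals): {BOXES / CLUSTERS:.1f}")    # ~8.8, consistent with 9
print(f"clusters per sub-farm:           {CLUSTERS / SUBFARMS:.0f}") # 4
print(f"total farm power: {total_power / 1e3:.0f} kSI95")
# ~560 kSI95, consistent with the ~500 kSI95 CERN Tier 0 figure on slide 22.
```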
16. [Diagram: Regional Centre functional layout. Data Import, Data Export; Mass Storage & Disk Servers; Database Servers; Tapes; network from CERN; network from Tier 2 and simulation centers; Physics Software Development; R&D Systems and Testbeds; Info servers, Code servers, Web Servers, Telepresence Servers; Training, Consulting, Help Desk; Production Reconstruction Raw/Sim-->ESD (scheduled, predictable, experiment/physics groups); Production Analysis ESD-->AOD, AOD-->DPD (scheduled, physics groups); Individual Analysis AOD-->DPD and plots (chaotic, physicists); Desktops; Tier 2; local institutes; CERN; Tapes]
- Data input rate from CERN: Raw Data (5%) 50 TB/yr; ESD Data (50%) 50 TB/yr; AOD Data (all) 10 TB/yr; Revised ESD 20 TB/yr
- Data input from Tier 2: Revised ESD and AOD 10 TB/yr
- Data input from simulation centers: Raw Data 100 TB/yr
- Data output rate to CERN: AOD Data 8 TB/yr; Recalculated ESD 10 TB/yr; Simulation ESD data 10 TB/yr
- Data output to Tier 2: Revised ESD and AOD 15 TB/yr
- Data output to local institutes: ESD, AOD, DPD data 20 TB/yr
- Total storage: robotic mass storage 300 TB. Raw Data: 50 TB = 5*10**7 events (5% of 1 year). Raw (simulated) Data: 100 TB = 10**8 events. ESD (reconstructed data): 100 TB = 10**9 events (50% of 2 years). AOD (physics object data): 20 TB = 2*10**9 events (100% of 2 years). Tag Data: 2 TB (all). Calibration/conditions database: 10 TB (only the latest version of most data types kept here). Central disk cache: 100 TB (per user demand)
- CPU required for AMS database servers: ??*10**3 SI95 power
(The flows above are tallied in the sketch below.)
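As a quick sanity check on the slide-16 numbers, the sketch below (our addition, in Python) sums the quoted import/export rates and converts them to the sustained WAN bandwidth they imply; spreading the transfers evenly over a ~3.15*10**7-second year is our assumption, not stated on the slide.

```python
# Tally of the Tier 1 regional centre data flows quoted on slide 16 (TB/yr).
inputs_tb_per_yr = {
    "raw from CERN (5%)": 50,
    "ESD from CERN (50%)": 50,
    "AOD from CERN (all)": 10,
    "revised ESD from CERN": 20,
    "revised ESD+AOD from Tier 2": 10,
    "simulated raw from simulation centers": 100,
}
outputs_tb_per_yr = {
    "AOD to CERN": 8,
    "recalculated ESD to CERN": 10,
    "simulation ESD to CERN": 10,
    "revised ESD+AOD to Tier 2": 15,
    "ESD/AOD/DPD to local institutes": 20,
}

total_in = sum(inputs_tb_per_yr.values())    # 240 TB/yr
total_out = sum(outputs_tb_per_yr.values())  # 63 TB/yr

# Sustained rates, assuming transfers are spread evenly over the year.
SECONDS_PER_YEAR = 3.15e7
print(f"total import: {total_in} TB/yr ~ {total_in * 1e6 / SECONDS_PER_YEAR:.1f} MB/s")
print(f"total export: {total_out} TB/yr ~ {total_out * 1e6 / SECONDS_PER_YEAR:.1f} MB/s")
```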
17. [Diagram: same Regional Centre functional layout as slide 16, annotated with the farm figures below]
- Farms of low-cost commodity computers, limited I/O rate, modest local disk cache
- Reconstruction jobs: reprocessing of raw data, 10**8 events/year (10%); initial processing of simulated data, 10**8 events/year; 1000 SI95-sec/event ==> 10**4 SI95 capacity: 100 processing nodes of 100 SI95 power
- Event selection jobs: 10 physics groups * 10**8 events (10% samples) * 3 times/yr, based on ESD and latest AOD data; 50 SI95/evt ==> 5000 SI95 power
- Physics object creation jobs: 10 physics groups * 10**7 events (1% samples) * 8 times/yr, based on selected event-sample ESD data; 200 SI95/event ==> 5000 SI95 power
- Derived physics data creation jobs: 10 physics groups * 10**7 events * 20 times/yr, based on selected AOD samples, generating canonical derived physics data; 50 SI95/evt ==> 3000 SI95 power. Total: 110 nodes of 100 SI95 power
- Derived physics data creation jobs: 200 physicists * 10**7 events * 20 times/yr, based on selected AOD and DPD samples; 20 SI95/evt ==> 30,000 SI95 power. Total: 300 nodes of 100 SI95 power (desktops)
(The sizing arithmetic is reproduced in the sketch below.)
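The quoted SI95 figures follow from events/year multiplied by SI95-sec/event, averaged over a year. A minimal sketch of the arithmetic (ours, with a 3.15*10**7-second year assumed):

```python
# Reproduces the slide-17 sizing: events/year * SI95-sec/event, averaged
# over a year, gives the sustained SI95 power needed.
SECONDS_PER_YEAR = 3.15e7

workloads = [
    # (name, events per year, SI95-sec per event) -- all from slide 17
    ("reconstruction (raw + simulated)", 2e8, 1000),
    ("event selection", 10 * 1e8 * 3, 50),
    ("physics object creation", 10 * 1e7 * 8, 200),
    ("derived physics data (10 groups)", 10 * 1e7 * 20, 50),
    ("derived physics data (200 physicists)", 200 * 1e7 * 20, 20),
]

for name, events, si95_sec in workloads:
    power = events * si95_sec / SECONDS_PER_YEAR  # sustained SI95
    nodes = power / 100                           # nodes of 100 SI95
    print(f"{name:40s} {power:8.0f} SI95  ~{nodes:5.0f} nodes")
# Results track the slide's 5000, 5000, 3000 and ~30,000 SI95 figures;
# reconstruction comes out ~6.3 kSI95, which the slide rounds up to 10**4
# (equivalently, it assumes a shorter effective year).
```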
18. MONARC Analysis Process Example [diagram]
19. Model and simulation parameters
- A new set of parameters common to all simulating groups.
- More realistic values, but still to be discussed/agreed on the basis of information from the experiments.
- All times in SI95-sec/event; previous values in parentheses:
- Proc_Time_Create_RAW: 5000 (35); Proc_Time_Create_ESD: 1000 (1); Proc_Time_Create_AOD: 25 (1)
- Proc_Time_RAW: 1000 (350); Proc_Time_ESD: 25 (2.5); Proc_Time_AOD: 5 (0.5)
- Analyze_Time_RAW: 600 (350); Analyze_Time_ESD: 15 (3); Analyze_Time_AOD: 3; Analyze_Time_TAG: 3
- Memory of jobs: 100 MB
20. Base Model used
- Reconstruction of 10**7 events (RAW --> ESD --> AOD --> TAG) at CERN. This is the production running while the data come from the DAQ (100 days of running, collecting a billion events per year).
- Analysis by 5 working groups, each of 25 analyzers, on TAG only (no requests to higher-level data samples). Every analyzer submits 4 sequential jobs on 10**6 events. Each analyzer's start time is a flat random choice in a range of 3000 seconds. Each analyzer's data sample of 10**6 events is a random choice from the complete TAG database of 10**7 events.
- Transfer (FTP) of the 10**7-event ESD, AOD and TAG from CERN to an RC.
- CERN activities: reconstruction, 5 WG analysis, FTP transfer.
- RC activities: 5 (uncorrelated) WG analyses, receiving the FTP transfer.
- Single analysis job: 1.67 CPU hours at CERN = 6000 sec at CERN (same at an RC).
- Reconstruction at CERN for 1/500 RAW to ESD: 3.89 CPU hours = 14000 sec.
- Reconstruction at CERN for 1/500 ESD to AOD: 0.03 CPU hours = 100 sec.
(An illustrative workload generator is sketched below.)
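The base-model analysis workload can be reproduced in a few lines. The sketch below is illustrative only, not the actual MONARC simulation tool set (a purpose-built Java discrete-event simulator); the 500 SI95 effective CPU share per job is our assumption, chosen so that 10**6 events at Analyze_Time_TAG = 3 SI95-sec/event reproduces the quoted 6000-second job.

```python
# Illustrative generator for the slide-20 analysis workload.
import random

ANALYZE_TIME_TAG = 3        # SI95-sec/event (slide 19)
EVENTS_PER_JOB = 1_000_000  # 10**6-event sample per job
JOBS_PER_ANALYZER = 4       # sequential jobs
ANALYZERS = 5 * 25          # 5 working groups of 25 analyzers
START_WINDOW = 3000         # s, flat random start-time range
CPU_SHARE = 500             # SI95 effective share per job (assumed)

job_wall_clock = EVENTS_PER_JOB * ANALYZE_TIME_TAG / CPU_SHARE  # 6000 s

jobs = []  # (analyzer id, job start time in seconds)
for analyzer in range(ANALYZERS):
    t = random.uniform(0, START_WINDOW)   # flat random start
    for _ in range(JOBS_PER_ANALYZER):    # each job starts when the previous ends
        jobs.append((analyzer, t))
        t += job_wall_clock

print(f"{len(jobs)} jobs, each ~{job_wall_clock:.0f} s")  # 500 jobs of ~6000 s
print(f"last job ends ~{max(t for _, t in jobs) + job_wall_clock:.0f} s")
```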
21. Resources: LAN speeds?!
- In our Models the DB servers are uncorrelated, so each activity uses a single server. The bottlenecks are the read and write speeds to and from the server. To use the CPU power at a reasonable percentage we need a read speed of at least 300 MB/s and a write speed of 100 MB/s (a milestone already met today). A back-of-envelope check follows below.
- We use 100 MB/s in the current simulations (10 Gbit/s switched LANs may be possible in 2005).
- Processing node link speed is negligible in our simulations.
- Of course the real implementation of the farms can be different, but the results of the simulation do not depend on the real implementation: they are based on usable resources. See the following slides.
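The required server read rate can be motivated with a one-line formula: to keep P SI95 of CPU busy on an activity costing t SI95-sec/event with events of s MB, the server must deliver P/t * s MB/s. The sketch below is our reconstruction: processing times are from slide 19, event sizes are inferred from the slide-16 storage figures (e.g. ESD: 100 TB / 10**9 events = 100 kB/event), and the per-server CPU allocations are illustrative assumptions.

```python
# Back-of-envelope check of the server I/O rates needed to keep CPUs busy.
def required_mb_per_s(cpu_si95: float, si95_sec_per_event: float,
                      event_size_mb: float) -> float:
    """Read rate (MB/s) needed so the given CPU power never starves."""
    events_per_s = cpu_si95 / si95_sec_per_event
    return events_per_s * event_size_mb

cases = [
    # (activity, CPU served [SI95], SI95-sec/event, event size [MB])
    ("RAW reconstruction", 100_000, 1000, 1.0),   # 1 MB/event (50 TB / 5*10**7)
    ("ESD analysis",        20_000,   15, 0.1),   # 100 kB/event
    ("AOD analysis",        10_000,    3, 0.01),  # 10 kB/event
]
for name, cpu, t, size in cases:
    print(f"{name:20s} ~{required_mb_per_s(cpu, t, size):6.0f} MB/s")
# RAW reconstruction at 100 kSI95 already needs ~100 MB/s; a mix of
# activities on one server plausibly reaches the ~300 MB/s quoted above.
```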
22. More realistic values for CERN and RC
- Data link speeds at 100 MB/sec (all values) except:
- Node_Link_Speed at 10 MB/sec
- WAN link speeds at 40 MB/sec
- CERN: 1000 processing nodes, each of 500 SI95
- RC: 200 processing nodes, each of 500 SI95
- 1000 processing nodes * 500 SI95 = 500 kSI95, about the CPU power of the CERN Tier 0; disk space scaled to the number of DBs
- 200 processing nodes * 500 SI95 = 100 kSI95 processing power = 20% of CERN; disk space scaled to the number of DBs
23. Overall Conclusions
- MONARC simulation tools are:
- sophisticated enough to allow modeling of complex distributed analysis scenarios
- simple enough to be used by non-experts
- Initial modeling runs are already showing interesting results
- Future work will help identify bottlenecks and understand constraints on architectures
24. MONARC Phase 3
- More realistic Computing Model development
- Confrontation of the Models with realistic prototypes
- At every stage: assess use cases based on actual simulation, reconstruction and physics analyses
- Participate in the setup of the prototypes
- Further validate and develop the MONARC simulation system using the results of these use cases (positive feedback)
- Continue to review key inputs to the Model:
- CPU times at the various phases
- tape storage: speed and I/O
- Employ MONARC simulation and testbeds to study Computing Model variations, and suggest strategy improvements
25. MONARC Phase 3
- Reclustering, restructuring; transport operations
- Caching, migration (HMSM), etc.
- QoS mechanisms: identify which are important
- Distributed system resource management and query estimators
- (Queue management and load balancing)
- Development of MONARC simulation visualization tools for interactive Computing Model analysis
26. Relation to GRID
- The GRID project is great!
- Development of the s/w tools needed for implementing realistic LHC Computing Models
- farm management, WAN resource and data management, etc.
- Help in getting funds for real-life testbed systems (RC prototypes)
- Complementarity of GRID and the MONARC hierarchical RC Model
- A hierarchy of RCs is a safe option. If GRID brings big advancements, less hierarchical models should also become possible
- MONARC Phase-3 to last ~1 year: a bridge to the GRID project starting early in 2001
- Afterwards, common work by the LHC experiments on developing the computing models will surely still be needed; in which project framework, and for how long, we will see then...