MONARC: results and open issues

1. MONARC: results and open issues - Laura Perini, Milano

2. Layout of the talk

  • Most material is from Irwin Gaines' talk at CHEP2000
    • The basic goals and structure of the project
    • The Regional Centers
      • Motivation
      • Characteristics
      • Functions
    • Some results from the simulations
    • The need for more realistic implementation oriented Models: Phase-3
    • Relations with GRID
  • Status of the project: Phase-3 LOI presented in January; Phase-2 Final Report to be published next week; milestones and basic goals met

3. MONARC

    • A joint project (LHC experiments and CERN/IT) to understand issues associated with distributed data access and analysis for the LHC
    • Examine distributed data plans of current and near future experiments
    • Determine characteristics and requirements for LHC regional centers
    • Understand details of analysis process and data access needs for LHC data
    • Measure critical parameters characterizing distributed architectures, especially database and network issues
    • Create modeling and simulation tools
    • Simulate a variety of models to understand constraints on architectures

4. MONARC

  • Models Of Networked Analysis At Regional Centers
  • Caltech, CERN, FNAL, Heidelberg, INFN,
  • Helsinki, KEK, Lyon, Marseilles, Munich, Orsay, Oxford, RAL, Tufts, ...
  • GOALS
  • Specify the main parameters characterizing the Models' performance: throughputs, latencies
  • Determine classes of Computing Models feasible for LHC (matched to network capacity and data handling resources)
  • Develop Baseline Models in the feasible category
  • Verify resource requirement baselines (computing, data handling, networks)
  • COROLLARIES:
  • Define the Analysis Process
  • Define Regional Center Architectures
  • Provide Guidelines for the final Models

[Diagram: example distributed topology - CERN (n x 10^7 MIPS, m PB robotic store), FNAL (4 x 10^7 MIPS, 110 TB robotic store) and a university centre (n x 10^6 MIPS, m TB robotic store), linked at 622 Mbit/s (N x 622 Mbit/s where needed, with optional air freight) and serving desktops.]

5. Working Groups

  • Architecture WG
    • Baseline architecture for regional centres, technology tracking, survey of computing models of current HENP experiments
  • Analysis Model WG
    • Evaluation of LHC data analysis model and use cases
  • Simulation WG
    • Develop a simulation tool set for performance evaluation of the computing models
  • Testbed WG
    • Evaluate the performance of ODBMS and networks in the distributed environment

6. General Need for distributed data access and analysis:

  • Potential problems of a single centralized computing center include:
  • - scale of LHC experiments: difficulty of accumulating and managing all resources at one location
  • - geographic spread of LHC experiments: providing equivalent location independent access to data for physicists
  • - need for help desk, support and consulting in the same time zone
  • - cost of LHC experiments: optimizing the use of resources located worldwide

7. Motivations for Regional Centers

  • A distributed computing architecture based on regional centers offers:
    • A way of utilizing the expertise and resources residing in computing centers all over the world
    • Local consulting and support
    • A way to maximize the intellectual contribution of physicists all over the world without requiring their physical presence at CERN
    • Acknowledgement of possible limitations of network bandwidth
    • Freedom for people to choose how they analyze data based on the availability or proximity of resources such as CPU, data, or network bandwidth.

8. Future Experiment Survey

  • Analysis/Results
    • From the previous survey, we saw many sites contributed to Monte Carlo generation
      • This is now the norm
    • New experiments trying to use the Regional Center concept
      • BaBar has Regional Centers at IN2P3 and RAL, and a smaller one in Rome
      • STAR has a Regional Center at LBL/NERSC
      • CDF and D0 offsite institutions are paying more attention as the run gets closer.

9. Future Experiment Survey

  • Other observations/ requirements
    • In the last survey, we pointed out the following requirements for RCs:
      • 24X7 support
      • software development team
      • diverse body of users
      • good, clear documentation of all s/w and s/w tools
    • The following are requirements for the central site (i.e. CERN):
      • Central code repository easy to use and easily accessible for remote sites
      • be sensitive to remote sites in database handling, raw data handling and machine flavors
      • provide good, clear documentation of all s/w and s/w tools
    • The experiments in this survey achieving the most in distributed computing are following these guidelines

10.

  • Tier 0: CERN
  • Tier 1: National Regional Center
  • Tier 2: Regional Center
  • Tier 3: Institute Workgroup Server
  • Tier 4: Individual Desktop
  • Total 5 Levels
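As an illustration only, the five-tier hierarchy can be written down as a small table in code. The indicative CPU figures for Tier 0 and Tier 1 are taken from the "more realistic values" slide later in this talk (about 500 kSI95 at CERN and about 20% of that at a Regional Centre); the lower tiers are left unspecified because this talk gives no figures for them.

```python
# Minimal sketch of the MONARC five-tier hierarchy (illustrative only).
# Tier 0/1 CPU figures follow slide 22 and its notes; lower tiers have
# no capacity figures in this talk, so they are left as None.
TIERS = [
    {"tier": 0, "role": "CERN (central site)",        "cpu_kSI95": 500},
    {"tier": 1, "role": "National Regional Center",   "cpu_kSI95": 100},  # ~20% of CERN
    {"tier": 2, "role": "Regional Center",            "cpu_kSI95": None},
    {"tier": 3, "role": "Institute Workgroup Server", "cpu_kSI95": None},
    {"tier": 4, "role": "Individual Desktop",         "cpu_kSI95": None},
]

for t in TIERS:
    cpu = f"{t['cpu_kSI95']} kSI95" if t["cpu_kSI95"] else "not specified here"
    print(f"Tier {t['tier']}: {t['role']:26s} {cpu}")
```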

11. [Diagram: CMS Offline Farm at CERN circa 2006 (lmr, for the MONARC study, April 1999): about 1400 boxes in 160 clusters and 40 sub-farms, 5400 disks in 340 arrays and 100 tape drives, connected by farm, storage and LAN-SAN/LAN-WAN router networks with link speeds from 0.8 Gbps (DAQ) up to 480 Gbps. Starred figures assume all disk and tape traffic stays on the storage network; double them if it all goes through the LAN-SAN router.]

12. [Diagram-only slide]

13. Processor cluster

  • Basic box: four 100 SI95 processors with a standard network connection (~2 Gbps)
  • 15% of systems configured as I/O servers (disk server, disk-tape mover, Objy AMS, ...) with an additional connection to the storage network
  • Cluster: 9 basic boxes with a network switch

14.-15. [Diagrams: Regional Centre architecture and services - data import/export, mass storage and disk servers, database servers, tapes, networks from CERN and from Tier 2 and simulation centres; physics software development, R&D systems and testbeds, info/code/web/telepresence servers, training, consulting and help desk; production reconstruction (Raw/Sim --> ESD) and production analysis (ESD --> AOD, AOD --> DPD) scheduled by experiment/physics groups, plus individual "chaotic" analysis (AOD --> DPD and plots) by physicists on desktops, with links to CERN, Tier 2 centres and local institutes.]

16. Regional Centre data flows and storage (same architecture as above)

  • Data input rate from CERN: Raw data (5%) 50 TB/yr; ESD data (50%) 50 TB/yr; AOD data (all) 10 TB/yr; revised ESD 20 TB/yr
  • Data input from Tier 2: revised ESD and AOD 10 TB/yr
  • Data input from simulation centres: raw data 100 TB/yr
  • Data output rate to CERN: AOD data 8 TB/yr; recalculated ESD 10 TB/yr; simulation ESD data 10 TB/yr
  • Data output to Tier 2: revised ESD and AOD 15 TB/yr
  • Data output to local institutes: ESD, AOD, DPD data 20 TB/yr
  • Total storage: robotic mass storage 300 TB
    • Raw data: 50 TB, 5*10^7 events (5% of 1 year)
    • Raw (simulated) data: 100 TB, 10^8 events
    • ESD (reconstructed data): 100 TB, 10^9 events (50% of 2 years)
    • AOD (physics object) data: 20 TB, 2*10^9 events (100% of 2 years)
    • Tag data: 2 TB (all)
    • Calibration/conditions database: 10 TB (only the latest version of most data types kept here)
    • Central disk cache: 100 TB (per user demand)
  • CPU required for AMS database servers: ??*10^3 SI95 power

17. Regional Centre CPU requirements (same architecture as above; farms of low-cost commodity computers, limited I/O rate, modest local disk cache)

  • Reconstruction jobs: reprocessing of raw data, 10^8 events/year (10%); initial processing of simulated data, 10^8/year; 1000 SI95-sec/event ==> 10^4 SI95 capacity: 100 processing nodes of 100 SI95 power
  • Event selection jobs: 10 physics groups * 10^8 events (10% samples) * 3 times/yr, based on ESD and latest AOD data; 50 SI95/evt ==> 5000 SI95 power
  • Physics object creation jobs: 10 physics groups * 10^7 events (1% samples) * 8 times/yr, based on selected event sample ESD data; 200 SI95/event ==> 5000 SI95 power
  • Derived physics data creation jobs: 10 physics groups * 10^7 events * 20 times/yr, based on selected AOD samples, generating canonical derived physics data; 50 SI95/evt ==> 3000 SI95 power
  • Total: 110 nodes of 100 SI95 power
  • Desktops - derived physics data creation jobs: 200 physicists * 10^7 events * 20 times/yr, based on selected AOD and DPD samples; 20 SI95/evt ==> 30,000 SI95 power; total 300 nodes of 100 SI95 power

18. MONARC Analysis Process Example [diagram]

19. Model and Simulation parameters

  • Have a new set of parameters common to all simulating groups.
  • More realistic values, but still to be discussed/agreed on the basis of information from the Experiments.

Model parameters (in SI95 sec/event unless noted; previous values in parentheses):

  • Proc_Time_RAW: 1000 (350)
  • Proc_Time_ESD: 25 (2.5)
  • Proc_Time_AOD: 5 (0.5)
  • Analyze_Time_TAG: 3
  • Analyze_Time_AOD: 3
  • Analyze_Time_ESD: 15 (3)
  • Analyze_Time_RAW: 600 (350)
  • Memory of Jobs: 100 MB
  • Proc_Time_Create_RAW: 5000 (35)
  • Proc_Time_Create_ESD: 1000 (1)
  • Proc_Time_Create_AOD: 25 (1)

20. Base Model used

  • Basic Jobs
    • Reconstruction of 10^7 events: RAW --> ESD --> AOD --> TAG at CERN. This is the production running while data come in from the DAQ (100 days of running, collecting a billion events per year)
    • Analysis by 5 Working Groups, each of 25 analyzers, on TAG only (no requests to higher-level data samples). Every analyzer submits 4 sequential jobs on 10^6 events. Each analyzer's start time is a flat random choice within a 3000-second window, and each analyzer's sample of 10^6 events is a random choice from the complete TAG database of 10^7 events
    • Transfer (FTP) of 10^7 events of ESD, AOD and TAG from CERN to the RC
  • CERN Activities : Reconstruction, 5 WG Analysis, FTP transfer
  • RC Activities : 5 (uncorrelated) WG Analysis, receive FTP transfer
  • Jobs paper estimate:
    • Single Analysis Job : 1.67 CPU hours at CERN = 6000 sec at CERN (same at RC)
    • Reconstruction at CERN for 1/500 RAW to ESD : 3.89 CPU hours = 14000 sec
    • Reconstruction at CERN for 1/500 ESD to AOD : 0.03 CPU hours = 100 sec
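As a cross-check, these paper estimates follow from the simple relation CPU time = (number of events x SI95 sec per event) / node power. The sketch below, a Python illustration rather than MONARC code, reproduces them assuming the 500 SI95 processing nodes quoted on the "more realistic values" slide; note that the two reconstruction figures match the parenthesized (earlier) parameter values of 350 and 2.5 SI95 sec/event rather than the new ones.

```python
# Sketch only: reproduce the "jobs paper estimate" figures above.
# Assumes 500 SI95 per processing node (see slide 22); the reconstruction
# lines use the parenthesized (earlier) parameter values.

NODE_SI95 = 500.0  # SI95 per processing node

def cpu_seconds(n_events, si95_sec_per_event, node_si95=NODE_SI95):
    """CPU time on one node = events * cost per event / node power."""
    return n_events * si95_sec_per_event / node_si95

# Single analysis job: 10^6 TAG events at Analyze_Time_TAG = 3 SI95 sec/event
print(cpu_seconds(1e6, 3))          # 6000 s  ~ 1.67 CPU hours

# Reconstruction of 1/500 of 10^7 events, RAW --> ESD (earlier value: 350)
print(cpu_seconds(1e7 / 500, 350))  # 14000 s ~ 3.89 CPU hours

# Reconstruction of 1/500 of 10^7 events, ESD --> AOD (earlier value: 2.5)
print(cpu_seconds(1e7 / 500, 2.5))  # 100 s   ~ 0.03 CPU hours
```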

21. Resources: LAN speeds ?!

  • In our Models the DB Servers are uncorrelated, so one activity uses a single Server. The bottlenecks are the read and write speeds to and from the Server. In order to use the CPU power at a reasonable percentage we need a read speed of at least 300 MB/s and a write speed of 100 MB/s (a milestone already met today)
  • We use 100 MB/s in the current simulations (10 Gbit/s switched LANs in 2005 may be possible).
  • Processing node link speed is negligible in our simulations.
  • Of course the real implementation of the Farms can be different, but the results of the simulation do not depend on the real implementation: they are based on usable resources.
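For intuition only, here is a back-of-the-envelope sketch of where a few hundred MB/s comes from: the read rate one server must sustain is roughly (nodes served) x (node SI95 / SI95 sec per event) x (bytes per event). The ESD event size and the number of nodes fed by one server below are illustrative assumptions, not inputs quoted in this talk.

```python
# Illustrative sketch only: read rate one DB server must sustain so the
# attached CPUs stay busy during ESD analysis. The event size and the
# number of nodes per server are assumptions, not MONARC inputs.

NODE_SI95 = 500.0        # SI95 per processing node (see slide 22)
ANALYZE_TIME_ESD = 15.0  # SI95 sec/event (parameter table above)
ESD_EVENT_MB = 0.1       # assumed ~100 kB per ESD event (illustration)
NODES_PER_SERVER = 100   # assumed number of nodes fed by one server

events_per_sec_per_node = NODE_SI95 / ANALYZE_TIME_ESD          # ~33 ev/s
read_mb_per_sec = NODES_PER_SERVER * events_per_sec_per_node * ESD_EVENT_MB

print(f"{read_mb_per_sec:.0f} MB/s")  # ~333 MB/s, same order as the 300 MB/s quoted
```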

See following slides

22. More realistic values for CERN and RC

  • Data Link speeds at 100 MB/sec (all values) except:
    • Node_Link_Speed at 10 MB/sec
    • WAN Link speeds at 40 MB/sec
  • CERN
    • 1000 Processing nodes each of 500 SI95
  • RC
    • 200 Processing nodes each of 500 SI95
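As a rough consistency check between the Regional Centre import volumes listed on slide 16 and the 40 MB/sec WAN links assumed here, the sketch below converts the yearly figures into an average sustained rate. Averaging over a full calendar year is an assumption made for illustration; real transfers (e.g. the FTP replication jobs in the Base Model) would be far burstier.

```python
# Illustrative sketch: average WAN rate implied by one Regional Centre's
# yearly data imports (slide 16), compared with the 40 MB/sec WAN link
# assumed above. Averaging over a calendar year is an assumption;
# real traffic would be much burstier.

SECONDS_PER_YEAR = 365 * 24 * 3600

imports_tb_per_year = {
    "raw from CERN (5%)":           50,
    "ESD from CERN (50%)":          50,
    "AOD from CERN (all)":          10,
    "revised ESD from CERN":        20,
    "revised ESD/AOD from Tier 2":  10,
    "raw from simulation centres": 100,
}

total_tb = sum(imports_tb_per_year.values())        # 240 TB/yr
avg_mb_per_sec = total_tb * 1e6 / SECONDS_PER_YEAR  # 1 TB = 10^6 MB

print(f"{total_tb} TB/yr ~ {avg_mb_per_sec:.1f} MB/s sustained average "
      f"(vs the 40 MB/sec WAN link)")
```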

1000 processing nodes x 500 SI95 = 500 kSI95, about the CPU power of CERN Tier 0, with disk space according to the number of DBs. The RC: 100 kSI95 of processing power = 20% of CERN, with disk space according to the number of DBs.

23. Overall Conclusions

  • MONARC simulation tools are:
    • sophisticated enough to allow modeling of complex distributed analysis scenarios
    • simple enough to be used by non experts
  • Initial modeling runs are already showing interesting results
  • Future work will help identify bottlenecks and understand constraints on architectures

24. MONARC Phase 3

  • More Realistic Computing Model Development
  • Confrontation of Models with Realistic Prototypes;
  • At Every Stage: Assess Use Cases Based on Actual Simulation, Reconstruction and Physics Analyses
    • Participate in the setup of the prototypes
    • We will further validate and develop the MONARC simulation system using the results of these use cases (positive feedback)
    • Continue to Review Key Inputs to the Model
      • CPU Times at Various Phases
      • Data Rate to Storage
      • Tape Storage: Speed and I/O
  • Employ MONARC simulation and testbeds to study CM variations, and suggest strategy improvements

25. MONARC Phase 3

  • Technology Studies
    • Data Model
      • Data structures
      • Reclustering, Restructuring; transport operations
      • Replication
      • Caching, migration (HMSM), etc.
    • Network
      • QoS Mechanisms: Identify Which are important
    • Distributed System Resource Management and Query Estimators
      • (Queue management and Load Balancing)
  • Development of MONARC Simulation Visualization Tools for interactive Computing Model analysis

26. Relation to GRID

  • The GRID project is great!
    • Development of s/w tools needed for implementing realistic LHC Computing Models
      • farm management, WAN resource and data management, etc.
    • Help in getting funds for real life testbed systems (RC prototypes)
  • Complementarity between GRID and the MONARC hierarchical RC Model
    • A hierarchy of RCs is a safe option. If GRID brings big advances, less hierarchical models should also become possible
  • Timings well matched
    • MONARC Phase-3 to last ~1 year: a bridge to the GRID project starting early in 2001
    • Afterwards, common work by the LHC experiments to develop the computing models will surely still be needed; in which project framework and for how long remains to be seen...