1. MONARC: results and open issues (Laura Perini, Milano)
2. Layout of the talk
- Most material from Irwin Gaines' talk at CHEP2000
- The basic goals and structure of the project
- Some results from the simulations
- The need for more realistic, implementation-oriented Models: Phase-3
- Status of the project: Phase-3 LOI presented in January, Phase-2 Final Report to be published next week; milestones and basic goals met
3. MONARC
- A joint project (LHC experiments and CERN/IT) to understand issues associated with distributed data access and analysis for the LHC
- Examine distributed data plans of current and near-future experiments
- Determine characteristics and requirements for LHC regional centers
- Understand details of the analysis process and data access needs for LHC data
- Measure critical parameters characterizing distributed architectures, especially database and network issues
- Create modeling and simulation tools
- Simulate a variety of models to understand constraints on architectures
4. MONARC
- Models Of Networked Analysis At Regional Centers
- Caltech, CERN, FNAL, Heidelberg, INFN, Helsinki, KEK, Lyon, Marseilles, Munich, Orsay, Oxford, RAL, Tufts, ...
- Specify the main parameters characterizing the Models' performance: throughputs, latencies
- Determine classes of Computing Models feasible for LHC (matched to network capacity and data handling resources)
- Develop Baseline Models in the feasible category
- Verify resource requirement baselines (computing, data handling, networks)
- Define the Analysis Process
- Define Regional Center Architectures
- Provide Guidelines for the final Models
[Diagram: CERN (n*10**7 MIPS, m PByte robot), FNAL (4*10**7 MIPS, 110 TByte robot) and universities (n*10**6 MIPS, m TByte robot) linked at 622 Mbit/s (N x 622 Mbit/s into CERN); optional air freight; desktops at each site]
5. Working Groups
- Baseline architecture for regional centres; technology tracking; survey of the computing models of current HENP experiments
- Evaluation of the LHC data analysis model and use cases
- Development of a simulation tool set for performance evaluation of the computing models
- Evaluation of the performance of ODBMS and networks in the distributed environment
6. General need for distributed data access and analysis
- Potential problems of a single centralized computing center include:
- scale of the LHC experiments: difficulty of accumulating and managing all resources at one location
- geographic spread of the LHC experiments: providing equivalent, location-independent access to data for all physicists
- support: help desk and consulting in the user's own time zone
- cost of the LHC experiments: optimizing the use of resources located worldwide
7. Motivations for Regional Centers
- A distributed computing architecture based on regional centers offers:
- a way of utilizing the expertise and resources residing in computing centers all over the world
- local consulting and support
- a way to maximize the intellectual contribution of physicists all over the world without requiring their physical presence at CERN
- an acknowledgement of the possible limitations of network bandwidth
- the freedom to choose how to analyze data based on the availability or proximity of resources such as CPU, data, or network bandwidth
8. Future Experiment Survey
- From the previous survey, we saw many sites contributing to Monte Carlo generation
- New experiments are trying to use the Regional Center concept:
- BaBar has Regional Centers at IN2P3 and RAL, and a smaller one in Rome
- STAR has a Regional Center at LBL/NERSC
- CDF and D0 offsite institutions are paying more attention as the run gets closer
9. Future Experiment Survey
- Other observations/requirements
- In the last survey, we pointed out the following requirements for RCs:
- a software development team
- good, clear documentation of all s/w and s/w tools
- The following are requirements for the central site (i.e. CERN):
- a central code repository that is easy to use and easily accessible from remote sites
- sensitivity to remote sites in database handling, raw data handling and machine flavors
- good, clear documentation of all s/w and s/w tools
- The experiments in this survey achieving the most in distributed computing are following these guidelines
10. [Diagram: Regional Centres tier hierarchy; recoverable labels: Tier 1: National Regional Center; Tier 3: Institute Workgroup Server]
11. CMS Offline Farm at CERN circa 2006 (lmr, for the MONARC study, April 1999)
[Diagram: 1400 processor boxes in 160 clusters and 40 sub-farms; 5400 disks in 340 arrays; 100 tape drives; processors, disks and tapes joined by a farm network and a storage network via LAN-SAN and LAN-WAN routers; link speeds range from 0.8 Gbps (DAQ) to 250 Gbps aggregate, with 480 Gbps* on the storage network. *Assumes all disk & tape traffic stays on the storage network; double these numbers if all disk & tape traffic goes through the LAN-SAN routers]
12. [Diagram only; no text recovered]
13. Processor cluster
- basic box: four 100 SI95 processors, standard network connection (~2 Gbps)
- 15% of systems configured as I/O servers (disk server, disk-tape mover, Objy AMS, ...) with an additional connection to the storage network
- cluster: 9 basic boxes with a network switch (a consistency check against the slide-11 totals follows below)
14.-15. [Diagrams: Regional Centre overview; recoverable labels: AOD-->DPD, scheduled physics groups; individual analysis, AOD-->DPD and plots, chaotic physicists; desktops; Tier 2; local institutes; CERN; tapes; support services]
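The per-box figures on slide 13 can be cross-checked against the slide-11 totals. The sketch below (Python, not part of the original talk) is a minimal consistency check: the box, cluster and sub-farm counts come from the slides, while the derived ratios and the comparison with the ~500 kSI95 CERN figure on slide 22 are our own arithmetic.

```python
# Back-of-envelope check of the CERN farm sizing quoted on slides 11 and 13.
# All input figures come from the slides; the derived ratios are inferred.

SI95_PER_PROCESSOR = 100   # slide 13: four 100 SI95 processors per box
PROCESSORS_PER_BOX = 4
BOXES_PER_CLUSTER = 9      # slide 13
BOXES = 1400               # slide 11
CLUSTERS = 160             # slide 11
SUBFARMS = 40              # slide 11

box_power = SI95_PER_PROCESSOR * PROCESSORS_PER_BOX   # 400 SI95 per box
cluster_power = box_power * BOXES_PER_CLUSTER         # 3600 SI95 per cluster
total_power = box_power * BOXES                       # 560,000 SI95 overall

print(f"boxes per cluster (from totals): {BOXES / CLUSTERS:.1f}")    # ~8.8, consistent with 9
print(f"clusters per sub-farm:           {CLUSTERS / SUBFARMS:.0f}") # 4
print(f"total farm power: {total_power / 1e3:.0f} kSI95")
# ~560 kSI95, consistent with the ~500 kSI95 CERN Tier 0 figure on slide 22.
```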
16. [Diagram: Regional Centre functional layout. Data Import, Data Export; Mass Storage & Disk Servers; Database Servers; Tapes; network from CERN; network from Tier 2 and simulation centers; Physics Software Development; R&D Systems and Testbeds; Info servers, Code servers, Web Servers, Telepresence Servers; Training, Consulting, Help Desk; Production Reconstruction Raw/Sim-->ESD (scheduled, predictable, experiment/physics groups); Production Analysis ESD-->AOD, AOD-->DPD (scheduled, physics groups); Individual Analysis AOD-->DPD and plots (chaotic, physicists); Desktops; Tier 2; local institutes; CERN; Tapes]
- Data input rate from CERN: Raw Data (5%) 50 TB/yr; ESD Data (50%) 50 TB/yr; AOD Data (all) 10 TB/yr; Revised ESD 20 TB/yr
- Data input from Tier 2: Revised ESD and AOD 10 TB/yr
- Data input from simulation centers: Raw Data 100 TB/yr
- Data output rate to CERN: AOD Data 8 TB/yr; Recalculated ESD 10 TB/yr; Simulation ESD data 10 TB/yr
- Data output to Tier 2: Revised ESD and AOD 15 TB/yr
- Data output to local institutes: ESD, AOD, DPD data 20 TB/yr
- Total storage: robotic mass storage 300 TB. Raw Data: 50 TB = 5*10**7 events (5% of 1 year). Raw (simulated) Data: 100 TB = 10**8 events. ESD (reconstructed data): 100 TB = 10**9 events (50% of 2 years). AOD (physics object data): 20 TB = 2*10**9 events (100% of 2 years). Tag Data: 2 TB (all). Calibration/conditions database: 10 TB (only the latest version of most data types kept here). Central disk cache: 100 TB (per user demand)
- CPU required for AMS database servers: ??*10**3 SI95 power
(The flows above are tallied in the sketch below.)
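As a quick sanity check on the slide-16 numbers, the sketch below (our addition, in Python) sums the quoted import/export rates and converts them to the sustained WAN bandwidth they imply; spreading the transfers evenly over a ~3.15*10**7-second year is our assumption, not stated on the slide.

```python
# Tally of the Tier 1 regional centre data flows quoted on slide 16 (TB/yr).
inputs_tb_per_yr = {
    "raw from CERN (5%)": 50,
    "ESD from CERN (50%)": 50,
    "AOD from CERN (all)": 10,
    "revised ESD from CERN": 20,
    "revised ESD+AOD from Tier 2": 10,
    "simulated raw from simulation centers": 100,
}
outputs_tb_per_yr = {
    "AOD to CERN": 8,
    "recalculated ESD to CERN": 10,
    "simulation ESD to CERN": 10,
    "revised ESD+AOD to Tier 2": 15,
    "ESD/AOD/DPD to local institutes": 20,
}

total_in = sum(inputs_tb_per_yr.values())    # 240 TB/yr
total_out = sum(outputs_tb_per_yr.values())  # 63 TB/yr

# Sustained rates, assuming transfers are spread evenly over the year.
SECONDS_PER_YEAR = 3.15e7
print(f"total import: {total_in} TB/yr ~ {total_in * 1e6 / SECONDS_PER_YEAR:.1f} MB/s")
print(f"total export: {total_out} TB/yr ~ {total_out * 1e6 / SECONDS_PER_YEAR:.1f} MB/s")
```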
17. [Diagram: same Regional Centre functional layout as slide 16, annotated with the farm figures below]
- Farms of low-cost commodity computers, limited I/O rate, modest local disk cache
- Reconstruction jobs: reprocessing of raw data, 10**8 events/year (10%); initial processing of simulated data, 10**8 events/year; 1000 SI95-sec/event ==> 10**4 SI95 capacity: 100 processing nodes of 100 SI95 power
- Event selection jobs: 10 physics groups * 10**8 events (10% samples) * 3 times/yr, based on ESD and latest AOD data; 50 SI95/evt ==> 5000 SI95 power
- Physics object creation jobs: 10 physics groups * 10**7 events (1% samples) * 8 times/yr, based on selected event-sample ESD data; 200 SI95/event ==> 5000 SI95 power
- Derived physics data creation jobs: 10 physics groups * 10**7 events * 20 times/yr, based on selected AOD samples, generating canonical derived physics data; 50 SI95/evt ==> 3000 SI95 power. Total: 110 nodes of 100 SI95 power
- Derived physics data creation jobs: 200 physicists * 10**7 events * 20 times/yr, based on selected AOD and DPD samples; 20 SI95/evt ==> 30,000 SI95 power. Total: 300 nodes of 100 SI95 power (desktops)
(The sizing arithmetic is reproduced in the sketch below.)
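The quoted SI95 figures follow from events/year multiplied by SI95-sec/event, averaged over a year. A minimal sketch of the arithmetic (ours, with a 3.15*10**7-second year assumed):

```python
# Reproduces the slide-17 sizing: events/year * SI95-sec/event, averaged
# over a year, gives the sustained SI95 power needed.
SECONDS_PER_YEAR = 3.15e7

workloads = [
    # (name, events per year, SI95-sec per event) -- all from slide 17
    ("reconstruction (raw + simulated)", 2e8, 1000),
    ("event selection", 10 * 1e8 * 3, 50),
    ("physics object creation", 10 * 1e7 * 8, 200),
    ("derived physics data (10 groups)", 10 * 1e7 * 20, 50),
    ("derived physics data (200 physicists)", 200 * 1e7 * 20, 20),
]

for name, events, si95_sec in workloads:
    power = events * si95_sec / SECONDS_PER_YEAR  # sustained SI95
    nodes = power / 100                           # nodes of 100 SI95
    print(f"{name:40s} {power:8.0f} SI95  ~{nodes:5.0f} nodes")
# Results track the slide's 5000, 5000, 3000 and ~30,000 SI95 figures;
# reconstruction comes out ~6.3 kSI95, which the slide rounds up to 10**4
# (equivalently, it assumes a shorter effective year).
```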
18. MONARC Analysis Process Example [diagram]
19. Model and simulation parameters
- A new set of parameters common to all simulating groups.
- More realistic values, but still to be discussed/agreed on the basis of information from the experiments.
- All times in SI95-sec/event; previous values in parentheses:
- Proc_Time_Create_RAW: 5000 (35); Proc_Time_Create_ESD: 1000 (1); Proc_Time_Create_AOD: 25 (1)
- Proc_Time_RAW: 1000 (350); Proc_Time_ESD: 25 (2.5); Proc_Time_AOD: 5 (0.5)
- Analyze_Time_RAW: 600 (350); Analyze_Time_ESD: 15 (3); Analyze_Time_AOD: 3; Analyze_Time_TAG: 3
- Memory of jobs: 100 MB
20. Base Model used
- Reconstruction of 10**7 events (RAW --> ESD --> AOD --> TAG) at CERN. This is the production running while the data come from the DAQ (100 days of running, collecting a billion events per year).
- Analysis by 5 working groups, each of 25 analyzers, on TAG only (no requests to higher-level data samples). Every analyzer submits 4 sequential jobs on 10**6 events. Each analyzer's start time is a flat random choice in a range of 3000 seconds. Each analyzer's data sample of 10**6 events is a random choice from the complete TAG database of 10**7 events.
- Transfer (FTP) of the 10**7-event ESD, AOD and TAG from CERN to an RC.
- CERN activities: reconstruction, 5 WG analysis, FTP transfer.
- RC activities: 5 (uncorrelated) WG analyses, receiving the FTP transfer.
- Single analysis job: 1.67 CPU hours at CERN = 6000 sec at CERN (same at an RC).
- Reconstruction at CERN for 1/500 RAW to ESD: 3.89 CPU hours = 14000 sec.
- Reconstruction at CERN for 1/500 ESD to AOD: 0.03 CPU hours = 100 sec.
(An illustrative workload generator is sketched below.)
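The base-model analysis workload can be reproduced in a few lines. The sketch below is illustrative only, not the actual MONARC simulation tool set (a purpose-built Java discrete-event simulator); the 500 SI95 effective CPU share per job is our assumption, chosen so that 10**6 events at Analyze_Time_TAG = 3 SI95-sec/event reproduces the quoted 6000-second job.

```python
# Illustrative generator for the slide-20 analysis workload.
import random

ANALYZE_TIME_TAG = 3        # SI95-sec/event (slide 19)
EVENTS_PER_JOB = 1_000_000  # 10**6-event sample per job
JOBS_PER_ANALYZER = 4       # sequential jobs
ANALYZERS = 5 * 25          # 5 working groups of 25 analyzers
START_WINDOW = 3000         # s, flat random start-time range
CPU_SHARE = 500             # SI95 effective share per job (assumed)

job_wall_clock = EVENTS_PER_JOB * ANALYZE_TIME_TAG / CPU_SHARE  # 6000 s

jobs = []  # (analyzer id, job start time in seconds)
for analyzer in range(ANALYZERS):
    t = random.uniform(0, START_WINDOW)   # flat random start
    for _ in range(JOBS_PER_ANALYZER):    # each job starts when the previous ends
        jobs.append((analyzer, t))
        t += job_wall_clock

print(f"{len(jobs)} jobs, each ~{job_wall_clock:.0f} s")  # 500 jobs of ~6000 s
print(f"last job ends ~{max(t for _, t in jobs) + job_wall_clock:.0f} s")
```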
21. Resources: LAN speeds?!
- In our Models the DB servers are uncorrelated, so each activity uses a single server. The bottlenecks are the read and write speeds to and from the server. To use the CPU power at a reasonable percentage we need a read speed of at least 300 MB/s and a write speed of 100 MB/s (a milestone already met today). A back-of-envelope check follows below.
- We use 100 MB/s in the current simulations (10 Gbit/s switched LANs may be possible in 2005).
- Processing node link speed is negligible in our simulations.
- Of course the real implementation of the farms can be different, but the results of the simulation do not depend on the real implementation: they are based on usable resources. See the following slides.
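The required server read rate can be motivated with a one-line formula: to keep P SI95 of CPU busy on an activity costing t SI95-sec/event with events of s MB, the server must deliver P/t * s MB/s. The sketch below is our reconstruction: processing times are from slide 19, event sizes are inferred from the slide-16 storage figures (e.g. ESD: 100 TB / 10**9 events = 100 kB/event), and the per-server CPU allocations are illustrative assumptions.

```python
# Back-of-envelope check of the server I/O rates needed to keep CPUs busy.
def required_mb_per_s(cpu_si95: float, si95_sec_per_event: float,
                      event_size_mb: float) -> float:
    """Read rate (MB/s) needed so the given CPU power never starves."""
    events_per_s = cpu_si95 / si95_sec_per_event
    return events_per_s * event_size_mb

cases = [
    # (activity, CPU served [SI95], SI95-sec/event, event size [MB])
    ("RAW reconstruction", 100_000, 1000, 1.0),   # 1 MB/event (50 TB / 5*10**7)
    ("ESD analysis",        20_000,   15, 0.1),   # 100 kB/event
    ("AOD analysis",        10_000,    3, 0.01),  # 10 kB/event
]
for name, cpu, t, size in cases:
    print(f"{name:20s} ~{required_mb_per_s(cpu, t, size):6.0f} MB/s")
# RAW reconstruction at 100 kSI95 already needs ~100 MB/s; a mix of
# activities on one server plausibly reaches the ~300 MB/s quoted above.
```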
22. More realistic values for CERN and RC
- Data link speeds at 100 MB/sec (all values) except:
- Node_Link_Speed at 10 MB/sec
- WAN link speeds at 40 MB/sec
- CERN: 1000 processing nodes, each of 500 SI95
- RC: 200 processing nodes, each of 500 SI95
- 1000 processing nodes * 500 SI95 = 500 kSI95, about the CPU power of the CERN Tier 0; disk space scaled to the number of DBs
- 200 processing nodes * 500 SI95 = 100 kSI95 processing power = 20% of CERN; disk space scaled to the number of DBs
23. Overall Conclusions
- MONARC simulation tools are:
- sophisticated enough to allow modeling of complex distributed analysis scenarios
- simple enough to be used by non-experts
- Initial modeling runs are already showing interesting results
- Future work will help identify bottlenecks and understand constraints on architectures
24. MONARC Phase 3
- More realistic Computing Model development
- Confrontation of the Models with realistic prototypes
- At every stage: assess use cases based on actual simulation, reconstruction and physics analyses
- Participate in the setup of the prototypes
- Further validate and develop the MONARC simulation system using the results of these use cases (positive feedback)
- Continue to review key inputs to the Model:
- CPU times at the various phases
- tape storage: speed and I/O
- Employ MONARC simulation and testbeds to study Computing Model variations, and suggest strategy improvements
25. MONARC Phase 3
- Reclustering, restructuring; transport operations
- Caching, migration (HMSM), etc.
- QoS mechanisms: identify which are important
- Distributed system resource management and query estimators
- (Queue management and load balancing)
- Development of MONARC simulation visualization tools for interactive Computing Model analysis
26. Relation to GRID
- The GRID project is great!
- Development of the s/w tools needed for implementing realistic LHC Computing Models
- farm management, WAN resource and data management, etc.
- Help in getting funds for real-life testbed systems (RC prototypes)
- Complementarity of GRID and the MONARC hierarchical RC Model
- A hierarchy of RCs is a safe option. If GRID brings big advancements, less hierarchical models should also become possible
- MONARC Phase-3 to last ~1 year: a bridge to the GRID project starting early in 2001
- Afterwards, common work by the LHC experiments on developing the computing models will surely still be needed; in which project framework, and for how long, we will see then...