Beyond High Performance Computing:
What Matters to CERN

Pierre Vande Vyvre, for the ALICE Collaboration
ALICE Data Acquisition Project Leader
CERN, Geneva, Switzerland
P. Vande Vyvre - CERN
CERN
• CERN is the world's largest particle physics centre, founded in 1954 by 20 European member states
• Particle physics is about:
  – elementary particles of which all matter in the universe is made
  – fundamental forces which hold matter together
• Particle physics requires:
  – special tools to create new particles: the accelerators
  – special instruments to study new particles: the experiments
What's Next?
Physics of the 21st century will explore unresolved questions:
• What is the origin of the mass of particles?
  ⇒ Search for the Higgs boson
• Can all the forces be unified?
  ⇒ Grand Unification Theory (GUT)
• "Dark matter and dark energy" (96% of the universe) are not visible. What are they made of?
  ⇒ Search for new particles forming the dark matter
• Where did the antimatter go?
  ⇒ Explore the asymmetry between matter and antimatter
Explore matter deeper and deeper using a new accelerator: the LHC
Large Hadron Collider (LHC)
[Aerial view: the LHC ring near the Geneva airport]
LHC: 27 km circumference, 100 meters underground
Inside the LHC
The Experiments: Underground "Cathedrals"
[Photos of the experiment caverns and of the Computing Centre]
ALICE Experiment at the LHC
Detector:
  18 technologies
  Size: 16 x 26 meters
  Weight: 10,000 tons
Collaboration:
  > 1000 members
  > 100 institutes
  > 30 countries

A brief history of ALICE:
  1990-1996: Design
  1992-2006: R&D
  2000-2010: Construction
  2002-2007: Installation
  2008-2010: Commissioning
  2009-2014: Operation
  2008-2015: Upgrade
Data Deluge
• 100 million measurement channels
• 40 million potential collisions/second
Data Selection
Multi-level trigger system (40 MHz → a few kHz)
Reject background
Select most interesting collisions
Reduce total data volume
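The cascade above can be sketched as a toy multi-level trigger. The rejection factors below are purely illustrative, not the real ALICE trigger configuration; only the 40 MHz input and the "few kHz" output come from the slide:

```python
# Toy multi-level trigger: each level inspects the events that survived
# the previous one and keeps only a fraction of them.
def trigger_cascade(input_rate_hz, accept_fractions):
    """Return the output rate after each trigger level."""
    rates = []
    rate = input_rate_hz
    for frac in accept_fractions:
        rate *= frac          # this level rejects (1 - frac) of its input
        rates.append(rate)
    return rates

# 40 MHz of potential collisions, three hypothetical levels of selection
levels = trigger_cascade(40e6, [0.005, 0.05, 0.5])
for i, rate in enumerate(levels):
    print(f"After level {i}: {rate:.0f} Hz")
# The final rate is 5000 Hz, i.e. a few kHz, as on the slide.
```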
Data Acquisition
Data Acquisition:
Acquire data from a matrix made of millions of channels, all pertaining to the same physics event (particle collision)
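Gathering fragments that belong to the same physics event is the core of event building. A minimal sketch, assuming a hypothetical fragment interface (source names and payloads are illustrative, not the ALICE data format):

```python
from collections import defaultdict

class EventBuilder:
    """Assemble fragments that belong to the same physics event.
    An event is complete once all expected sources have reported."""

    def __init__(self, expected_sources):
        self.expected = set(expected_sources)
        self.pending = defaultdict(dict)  # event_id -> {source: payload}

    def add_fragment(self, event_id, source, payload):
        """Store one fragment; return the full event once complete."""
        self.pending[event_id][source] = payload
        if set(self.pending[event_id]) == self.expected:
            return self.pending.pop(event_id)
        return None

# Two readout sources contributing fragments to event 7
eb = EventBuilder(["LDC-A", "LDC-B"])
assert eb.add_fragment(7, "LDC-A", b"\x01\x02") is None  # still incomplete
event = eb.add_fragment(7, "LDC-B", b"\x03")             # now complete
print(sorted(event))  # ['LDC-A', 'LDC-B']
```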
Data Quality Monitoring
Selection: no undo possible
→ Data Quality Monitoring: fast verification of the event selection
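Because a rejected event is gone forever, the quality of the selection must be checked online. One common approach, sketched here as an assumption (the sampling fraction and interface are illustrative, not the ALICE DQM design), is to route a random sample of the accepted events to monitoring:

```python
import random

def dqm_sample(events, fraction=0.01):
    """Forward a random fraction of accepted events to quality monitoring,
    so the selection is verified online without slowing down the DAQ.
    The 1% sampling fraction is illustrative."""
    return [e for e in events if random.random() < fraction]

random.seed(0)  # fixed seed so the example is reproducible
monitored = dqm_sample(range(100000), fraction=0.01)
print(len(monitored))  # roughly 1000 events go to the quality checks
```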
Trigger - Data Acquisition - Offline
Drawing by S. Chapeland
Online: Trigger (decision) + Data Acquisition (action)
Offline: Processing
The Mission of Trigger-Data Acquisition in a Physics Experiment
[Data flow: Collision → Detectors → Event Fragments → Complete Event → Data Storage → Reconstruction & Analysis]
Particles → Data → Information → Knowledge
Inspect data resulting from all collisions
Select the most interesting ones
Store them and make them available for off-line analysis
Data is our most precious resource !
Data Produced at CERN
Experiment   Mode    Recording (Events/s)   Mass Storage (MB/s)   Data Archived Total/Yr (PB)
ALICE        Pb-Pb   200                    1250                  2.3
ATLAS        pp      100                    100                   6.0
CMS          pp      100                    100                   3.0
LHCb         pp      200                    40                    1.0
Data Volume from Experiments
• For each of the 4 experiments:
– 100 - 200 collisions of interest per second
– 1 - 50 MB of digitised information for each
collision
– 160 – 180 days of data taking per year
– Recording rate of 0.1 – 1 GB/sec
– 1 - 6 Petabyte/year/experiment
• For the whole of CERN:
~ 15 PB/year during 15 years
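The per-experiment yearly volume quoted above follows directly from the rates; a back-of-the-envelope check (the 0.45 GB/s mid-range rate is a hypothetical value inside the slide's 0.1 - 1 GB/s band):

```python
def yearly_volume_pb(recording_rate_gb_s, days_of_data_taking):
    """Petabytes archived per year at a sustained recording rate."""
    seconds = days_of_data_taking * 24 * 3600
    return recording_rate_gb_s * seconds / 1e6  # GB -> PB

# 0.1 GB/s over 160 days up to a mid-range 0.4 GB/s over 170 days
low = yearly_volume_pb(0.1, 160)
high = yearly_volume_pb(0.4, 170)
print(f"{low:.1f} - {high:.1f} PB/year")  # prints "1.4 - 5.9 PB/year"
# consistent with the 1 - 6 PB/year/experiment quoted on the slide
```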
Challenge 1: Data Throughput
The overall movement of LHC data corresponds to 1 DVD per second during the whole year
Challenge 2: Data Volume
The LHC data corresponds to a pile of CDs as high as Mont Blanc (4800 m), every year for each experiment
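The comparison can be checked with simple arithmetic, assuming a standard 700 MB CD about 1.2 mm thick:

```python
CD_CAPACITY_GB = 0.7     # ~700 MB per CD
CD_THICKNESS_M = 0.0012  # ~1.2 mm per disc

def pile_height_m(data_pb):
    """Height in metres of a pile of CDs holding the given data volume."""
    n_cds = data_pb * 1e6 / CD_CAPACITY_GB  # PB -> GB, then number of CDs
    return n_cds * CD_THICKNESS_M

# A few PB per experiment per year stacks up to kilometres of CDs
print(f"{pile_height_m(3):.0f} m")  # prints "5143 m": of the order of Mont Blanc
```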
Challenge 3: Data Access
– 15 PB/yr used by 8000 physicists from 580 institutes in 85 countries
– The LHC Computing Grid (LCG): share the data and pool the resources
– The world's largest international scientific Grid
ALICE DAQ Architecture
[Architecture diagram: detector front-end electronics (FERO) send event fragments over 360 Detector Data Links (DDL) to 125 Detector LDCs equipped with D-RORC receiver cards (430 D-RORCs in total); the Central Trigger Processor (CTP) distributes the L0, L1a and L2 decisions via LTU/TTC, with BUSY feedback and a Rare/All flag; an HLT farm (FEP nodes with H-RORC cards, 10 DDLs, 10 D-RORCs, 10 HLT LDCs, 120 DDLs out) processes part of the stream; sub-events are built into complete events over the Event Building Network by 75 GDCs with an Event Destination Manager (EDM) and load-balanced LDCs; the Storage Network connects 75 TDS storage arrays and 30 TDSM nodes, which write event files to the Permanent Data Storage (PDS), archived on tape in the Computing Centre (Meyrin); 60 DA/DQM nodes, 18 DSS nodes and 30 ACR nodes complete the system.]
Storage System: Key Design Factors
• Large project with a long duration: long term support, evolution and
interoperability are key design factors
First design of the Data Acquisition System in 1995
… Together with the budget request!
• R&D and tests:
- SNW conference: technology in action, contacts with vendors
- Validation in our lab of the Fibre Channel technology (FC2G)
- Selection of hw components. Loan of components is essential!
• Architecture coherence
– Selection of the cluster file system
• Staged Deployment and commissioning
- Adoption of FC4G
- Deployment in several stages (10, 20, 40, 100 % of the
performance). The SAN has nicely followed this evolution
- Be ready for the unforeseen
- Data quality monitoring: more processing power than anticipated, together with higher-throughput data access
- Use of IP-based clients of the cluster file system
ALICE Data Storage Architecture
• 180 ports FC4G
• 75 Transient Data Storage (TDS) storage arrays
each divided into 3 volumes
• Nodes accessing data over FC4G:
• 75 data formatting and writing-only nodes (GDC)
• 30 reading-only nodes exporting data files
to Permanent Data Storage (PDS)
• Nodes accessing data over IP
• 90 nodes for data quality monitoring
• Unique logical view by cluster file sharing!
[Diagram: 75 GDCs and 30 TDSM nodes share the 75 TDS arrays over the FC4G Storage Network; the Event Building Network is Gbit Ethernet; data are archived on tape (PDS) in the Computing Centre; 60 DA/DQM and 30 ACR nodes access the data over IP.]
Data Recording and Formatting
Performance:
  Max Event Building Bw: 110 MB/s
  Max Data Recording Bw: 390 MB/s
• GDC
  – 75 PCs based on 4-core Nehalem
  – Linux RH SLC4
Characteristics:
  CPU: 2 x E5520 2.26 GHz
  Memory: 12 GB DDR3 1066
  PCI-X: 1 x (64 bit, 133 MHz)
  FC HBA: 1 QLOGIC QLA 2460, 4 Gbps
  Format: 1U
SAN Evolution over Time
[Diagram: stackable FC4G 16-port switches, each connecting servers and storage arrays]
2003-2008 (hw deployment, 10-40% of performance):
- 1-4 FC switches (16 FC4G, 4 FC10G)
- Up to 30 nodes
- Up to 30 storage arrays
SAN Evolution over Time
[Diagram: two enterprise switches (FC10G 4 ports + FC4G 32 ports) interconnecting the stackable FC4G 16-port switches, servers, storage arrays and an IP relay]
2009 (hw deployment, 100% of performance):
- 2 enterprise FC switches (8 slots, 16 FC4G, 4 FC10G)
- 100+ nodes over FC4G, 90 nodes over IP
- 75 storage arrays
Storage Network
• Storage network: Fibre Channel FC4G switches
  – 2 enterprise FC4G switches
  – 4 stackable FC4G switches
Transient Data Storage System
• Logical model: distributed file-system
• Hardware
– Transient Data Storage (TDS)
• Located at the experimental area
• Capacity: a day of autonomous data taking
• Before archiving to permanent data storage
– Permanent Data Storage
• Located in the computing centre (3 km away)
• Infinite capacity, very low cost
• TDS Software implementation
– TDS sharing between all nodes and applications
Storage Arrays
• Transient Data Storage: 185 TB of RAID-6 disk buffer at the experimental area
  – Storage arrays with 2 FC4G ports and 16 SATA II HDs of 250 GB each
Use of a Cluster File System
• Requirements
  – Maximize performance: absolute aggregate bandwidth (ALICE needs up to 3 GByte/s for write and read traffic)
  – Minimize the hardware footprint: the ALICE DAQ counting room is only 60 m2
  – Scalability in the number of clients (100+ user nodes)
  – Hardware neutral
• 2 products tested in depth
  – Both solid products: neither had any severe problem during these tests.
  – Performance was the key selection criterion. Target: 1/5 of the required global performance, i.e. 300 MB/s writing combined with 300 MB/s reading.
  – The support given by both companies during the installation and tuning of the products was very good.
• Hardware virtualization
• Ensure high performance while adding flexibility
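A figure like "300 MB/s writing combined with 300 MB/s reading" is the kind of target a simple throughput benchmark would verify. A crude sequential sketch, purely hypothetical and not the actual test suite used for the product evaluation:

```python
import os
import tempfile
import time

def throughput_mb_s(path, size_mb=64, block_mb=8):
    """Write then read a file sequentially, returning (write, read) MB/s.
    A crude benchmark sketch, not the production evaluation procedure."""
    block = os.urandom(block_mb * 1024 * 1024)
    fname = os.path.join(path, "bench.dat")
    t0 = time.perf_counter()
    with open(fname, "wb") as f:
        for _ in range(size_mb // block_mb):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())          # make sure data really hit the disk
    t_write = time.perf_counter() - t0
    t0 = time.perf_counter()
    with open(fname, "rb") as f:
        while f.read(block_mb * 1024 * 1024):
            pass                      # stream the file back until EOF
    t_read = time.perf_counter() - t0
    os.remove(fname)
    return size_mb / t_write, size_mb / t_read

w, r = throughput_mb_s(tempfile.gettempdir())
print(f"write {w:.0f} MB/s, read {r:.0f} MB/s")
```

In a real evaluation the read traffic would run concurrently with the writes from other nodes, since the 300 + 300 MB/s target is a combined load.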
Cluster File System Tests (2)
[Diagram: 24 containers C1-C24 distributed over racks R1-R3, with "SN + TG" and "SN" node labels]
Affinities
• Make a file system across all racks "R": a global file space
• Add "affinities": certain directories are bound to certain containers
  /filespace/global_file.raw         (striped over the global file space)
  /filespace/R3/C22/                 (directory bound to container C22 in rack R3)
  /filespace/R3/C22/local_file.raw
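The affinity idea can be sketched as a mapping from directory prefixes to storage containers. The resolver below is illustrative, not the file system's actual implementation; only the paths and the C1-C24/R1-R3 layout come from the slide:

```python
# Toy affinity resolver: files under a bound directory land on one
# specific container; everything else uses the global file space.
AFFINITIES = {
    "/filespace/R3/C22/": ["C22"],  # directory bound to one container
}
ALL_CONTAINERS = [f"C{i}" for i in range(1, 25)]  # C1..C24 across racks R1-R3

def containers_for(path):
    """Return the containers eligible to store the given file."""
    for prefix, containers in AFFINITIES.items():
        if path.startswith(prefix):
            return containers
    return ALL_CONTAINERS  # global file space: stripe over any container

print(containers_for("/filespace/R3/C22/local_file.raw"))  # ['C22']
print(len(containers_for("/filespace/global_file.raw")))   # 24
```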
Cluster File System:
• Sharing of hundreds of disks by tens of PCs
• Global file system (single system management)
• Minimal performance penalty
• Each stripe group has an affinity (assignment of data to a specific storage class)
First pp Collision: 23-Nov-09
First Physics Paper: 01-Dec-09
• Science is also moving faster and faster!
• Limited time to celebrate after the first collision
• A dramatic race between the 4 experiments for the first paper
• ALICE paper published only 8 days after data taking!
Being Prepared for the Heavy-Ion Run
in November
[Chart: data writing, mover and data archiving rates during the run]
LHC Upgrade
• The LHC accelerator will be upgraded in 2020…
  … and even more data will be produced
• A better selection mechanism and more accurate quality monitoring have to be put in place
• Data storage will also require higher performance and faster access
• New technological progress will be required for the next steps of physics!
Conclusion
• The performance promised by future technologies was essential in the '90s to design the LHC accelerator and the experiments
  – 4 GB/s of sustained data throughput
  – 15 PB/year
• Successful integration of computing and storage equipment from several vendors
• The cluster file system provides the overall architectural coherency
• A new quantum leap is now needed for the LHC upgrade in a decade