Big Data Management at CERN: The CMS Example J.A. Coarasa CERN, Geneva, Switzerland DBTA Workshop on Big Data, Cloud Data Management and NoSQL October 10th 2012, Bern, Switzerland


Page 1: Big Data Management at CERN: The CMS Example

Big Data Management at CERN: The CMS Example

J.A. Coarasa, CERN, Geneva, Switzerland

DBTA Workshop on Big Data, Cloud Data Management and NoSQL

October 10th 2012, Bern, Switzerland

Page 2: Big Data Management at CERN: The CMS Example

Outline

•  Introduction
   •  The Large Hadron Collider (LHC)
   •  The 4 big experiments

•  Data Along the Data Path:
   •  Origin of the DATA
   •  DB for the DATA
   •  Other Data
   •  The Big (experimental) DATA

Page 3: Big Data Management at CERN: The CMS Example

•  Introduction
   •  The Large Hadron Collider (LHC)
   •  The 4 experiments

Page 4: Big Data Management at CERN: The CMS Example

The Large Hadron Collider (LHC)

Page 5: Big Data Management at CERN: The CMS Example

The Large Hadron Collider (LHC)

Page 6: Big Data Management at CERN: The CMS Example

LHC Multipurpose Experiments

ATLAS: A Toroidal LHC ApparatuS

CMS: Compact Muon Solenoid

Page 7: Big Data Management at CERN: The CMS Example

LHC Specific Experiments

ALICE: A Large Ion Collider Experiment

The ALICE Collaboration built a dedicated heavy-ion detector to study the physics of strongly interacting matter at extreme energy densities, where the formation of a new phase of matter, the quark-gluon plasma, is expected.

LHCb: Study of CP violation in B-meson decays at the LHC collider

Page 8: Big Data Management at CERN: The CMS Example

LHC Trigger and DAQ Summary

Experiment  No. Levels  Trigger (Level-0,1,2)  Event Size  Readout Bandw.  HLT Out
                        Rate (Hz)              (Byte)      (GB/s)          MB/s (Event/s)
ATLAS       3           LV-1   10^5            1.5x10^6    4.5             300 (2x10^2)
                        LV-2   3x10^3
CMS         2           LV-1   10^5            10^6        100             O(1000) (10^2)
LHCb        2           LV-0   10^6            3x10^4      30              40 (2x10^2)
ALICE       4           Pb-Pb  500             5x10^7      25              1250 (10^2)
                        p-p    10^3            2x10^6                      200 (10^2)

Page 9: Big Data Management at CERN: The CMS Example

Trigger and DAQ trends in HEP

Page 10: Big Data Management at CERN: The CMS Example

•  Data Along the Data Path:
   •  Origin of the DATA
   •  DB for the DATA
   •  Other Data
   •  The Big DATA

Page 11: Big Data Management at CERN: The CMS Example

The CMS Experiment

•  The collaboration has around 4300 active members
   – 179 institutes
   – 41 countries

•  The Detector
   – Weight: 12,500 t
   – Diameter: 15 m
   – Length: 21.6 m
   – Magnetic field: 3.8 T
   – Channels: ~70,000,000

Page 12: Big Data Management at CERN: The CMS Example

CMS: The Data Path

[Figure: the CMS data path. Link labels: Raw Data: 100 Gbit/s; Events: 20 Gbit/s; Controls: 1 Gbit/s (shown twice); To Tier-1 centers]

Page 13: Big Data Management at CERN: The CMS Example

CMS: Collisions Overview

Collision rate

Event Rates: ~10^9 Hz
Event Selection: ~1/10^13
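As a rough worked example of the two figures above: assuming "Event Selection: ~1/10^13" means that about one collision in 10^13 is of the kind ultimately being searched for (an interpretation, not something the slide states), the implied time between such events follows directly. The function and constant names are illustrative only.

```python
# Orders of magnitude quoted on the slide.
COLLISION_RATE_HZ = 10**9     # ~10^9 collisions per second
SELECTION_ONE_IN = 10**13     # ~1 selected event per 10^13 collisions (assumed meaning)


def seconds_between_selected_events(rate_hz: int, one_in: int) -> float:
    """Average time between selected events, in seconds."""
    return one_in / rate_hz


if __name__ == "__main__":
    seconds = seconds_between_selected_events(COLLISION_RATE_HZ, SELECTION_ONE_IN)
    print(f"one selected event every {seconds:.0f} s (~{seconds / 3600:.1f} h)")
```

Under that reading, the sought-after events arrive roughly once every 10^4 seconds, which is why so much of the data path is devoted to filtering.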

Page 14: Big Data Management at CERN: The CMS Example

CMS: Data Origin, The DAQ

Large Data Volumes (~100 Gbytes/s data flow, 20 TB/day)
–  After the Level 1 Trigger, ~100 Gbytes/s (rate ~O(100) kHz) reach the event building (2 stages, ~2000 computers).
–  The HLT filter cluster selects 1 out of 1000. Max. rate to tape: ~O(100) Hz

⇒  The storage manager (stores and forwards) can sustain 2 GB/s of traffic.

⇒  Up to ~300 Mbytes/s sustained forwarded to the CERN T0 (>20 TB/day).

[Figure: DAQ architecture. Labels: Detector Front-end, Level 1 Trigger, Readout Systems, Event Manager, Builder Networks, Builder and Filter Systems, Computing Services, Run Control. Rates along the chain: 40 MHz, 100 kHz, 100 Hz]
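The rates and bandwidths quoted on this slide can be cross-checked with a few lines of arithmetic. The sketch below only uses the approximate figures given above; the function names are illustrative, and 1 TB is taken as 10^12 bytes (an assumption about the units).

```python
# Figures quoted on the slide (approximate values).
BUNCH_CROSSING_HZ = 40_000_000             # 40 MHz at the detector front-end
LEVEL1_OUT_HZ = 100_000                    # ~100 kHz after the Level 1 trigger
HLT_OUT_HZ = 100                           # ~100 Hz maximum rate to tape
EVENT_BUILDER_BYTES_PER_S = 100 * 10**9    # ~100 Gbytes/s into event building
T0_BYTES_PER_S = 300 * 10**6               # ~300 Mbytes/s forwarded to the CERN T0
SECONDS_PER_DAY = 86_400


def rejection_factor(rate_in_hz: int, rate_out_hz: int) -> int:
    """How many input events there are per accepted event."""
    return rate_in_hz // rate_out_hz


def implied_event_size_bytes() -> int:
    """Average event size implied by the event-building bandwidth and rate."""
    return EVENT_BUILDER_BYTES_PER_S // LEVEL1_OUT_HZ


def daily_volume_tb(bytes_per_s: int) -> float:
    """Data volume in TB (10^12 bytes) accumulated over 24 hours."""
    return bytes_per_s * SECONDS_PER_DAY / 10**12


if __name__ == "__main__":
    print("Level 1 rejection:", rejection_factor(BUNCH_CROSSING_HZ, LEVEL1_OUT_HZ))
    print("HLT rejection:", rejection_factor(LEVEL1_OUT_HZ, HLT_OUT_HZ))
    print("implied event size (bytes):", implied_event_size_bytes())
    print("daily volume to T0 (TB):", daily_volume_tb(T0_BYTES_PER_S))
```

With these inputs the Level 1 trigger keeps one crossing in 400, the HLT keeps one event in 1000 (matching "1 out of 1000"), the implied event size is about 1 MB, and a continuous 300 Mbytes/s corresponds to about 25.9 TB/day, consistent with the ">20 TB/day" quoted.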


Page 16: Big Data Management at CERN: The CMS Example

CMS: Online Computing

More than 3000 computers, mostly under Scientific Linux CERN 5:
–  640 (2-core) as 1st stage building, equipped with 2 Myrinet and 3 independent 1 Gbit Ethernet lines for data networking (1280 cores);
–  1264 (720 (8-core) + 288 (12-core) + 256 (16-core)) as high level trigger computers with 2 Gbit Ethernet lines (13312 cores);
–  16 (2-core) with access to 300 TBytes of FC storage, 4 Gbit Ethernet lines for data networking and 2 additional ones for networking to Tier 0;
–  More than 400 used by the subdetectors;
–  90 running Windows for Detector Control Systems;
–  12 computers as an ORACLE RAC;
–  12 computers as CMS control computers;
–  50 computers as desktop computers in the control rooms;
–  200 computers for commissioning, integration and testing;
–  15 computers as infrastructure and access servers;
–  250 active spare computers;

⇒ Many different Roles!

[Side label on the slide: High bandwidth networking]
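The machine and core counts quoted in the list above for the event builder and the high level trigger farm can be checked with a short tally. This is only a sketch of the arithmetic; the names are illustrative.

```python
from typing import List, Tuple

# (number of machines, cores per machine), as listed on the slide.
EVENT_BUILDER: List[Tuple[int, int]] = [(640, 2)]
HLT_FARM: List[Tuple[int, int]] = [(720, 8), (288, 12), (256, 16)]


def machines(groups: List[Tuple[int, int]]) -> int:
    """Total number of machines in a set of (count, cores) groups."""
    return sum(count for count, _ in groups)


def cores(groups: List[Tuple[int, int]]) -> int:
    """Total number of cores in a set of (count, cores) groups."""
    return sum(count * per_machine for count, per_machine in groups)


if __name__ == "__main__":
    print("event builder:", machines(EVENT_BUILDER), "machines,", cores(EVENT_BUILDER), "cores")
    print("HLT farm:", machines(HLT_FARM), "machines,", cores(HLT_FARM), "cores")
```

The totals reproduce the figures on the slide: 640 machines with 1280 cores for the first stage, and 1264 machines with 13312 cores for the HLT.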

Page 17: Big Data Management at CERN: The CMS Example

CMS: Online Networking

CMS Networks:
–  Private Networks:
   •  Service Network (~3000 1 Gbit ports);
   •  Data Network (~4000 1 Gbit ports)
      –  Source routing on computers
      –  VLANs on switches
   •  Central Data Recording (CDR). Network to Tier 0.
   •  Private networks for Oracle RAC
   •  Private networks for subdetectors
–  Public CERN Network

[Figure: CMS Networks. Labels: CMS Sites, Computer gateways, Readout, HLT, Control…, Storage Manager, Service Network, Data Networks, CDR Network, CERN Network, Firewall, Internet]

Page 18: Big Data Management at CERN: The CMS Example

CMS: “official” Databases

•  Configuration information
   –  Detectors, DAQ, L1 trigger, HLT…
•  Run, Beam and Luminosity information
   –  Info on which files are written and sent to Tier-0, eLog…
•  Offline DB also hosting computing applications
   –  Tier-0 workflow processing, Data distribution service (PhEDEx), Data Bookkeeping Service,…
•  Conditions data for offline reconstruction and analysis
   –  Critical data, exposed to a large community

Page 19: Big Data Management at CERN: The CMS Example

CMS Databases: Challenge

•  Over 75 million channels in various detectors
•  Detector information in each channel
   –  Conditions: Temperature, HV, LV, status…
   –  Calibration: pedestals, charge/count…
   –  Changes with time (temperature and radiation)
•  Necessary for performance monitoring
•  Subset used by offline reconstruction and physics analysis
   –  Conditions data
   –  Need to distribute to all Tier-N centres worldwide

Page 20: Big Data Management at CERN: The CMS Example

CMS DB Clients: Frontier

•  Offline (or HLT) reconstruction jobs could create a large load on the DBs
   –  Tens of thousands of jobs, a few hundred queries each
•  Frontier squid caches minimize the direct access to Oracle servers
   –  Additional latency as set by the cache refresh policy
   –  Frontier service for Online
      •  Used to distribute configuration and conditions to HLT
   –  Frontier service for Offline (Tier-N)
      •  Reading from “Snapshot” from Offline DB
      •  Heavily used for reprocessing
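The trade-off described above (fewer direct database accesses, at the price of data that can be as stale as the refresh policy allows) can be illustrated with a minimal read-through cache. This is a generic sketch, not the Frontier or squid implementation: `TTLCache`, `fake_database` and the 60-second refresh interval are all hypothetical names and values chosen for the example.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Read-through cache: serve a stored answer until it is older than ttl_seconds."""

    def __init__(self, fetch: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch            # function that queries the real backend
        self._ttl = ttl_seconds        # refresh policy: maximum age of an entry
        self._clock = clock            # injectable clock, useful for testing
        self._store: Dict[str, Tuple[float, str]] = {}
        self.backend_calls = 0         # how often the backend was actually hit

    def get(self, query: str) -> str:
        now = self._clock()
        cached = self._store.get(query)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]           # fresh enough: no backend access
        self.backend_calls += 1        # missing or expired: go to the backend
        value = self._fetch(query)
        self._store[query] = (now, value)
        return value


def fake_database(query: str) -> str:
    """Hypothetical stand-in for the database server."""
    return f"result of {query!r}"


if __name__ == "__main__":
    cache = TTLCache(fake_database, ttl_seconds=60.0)
    for _job in range(10_000):         # many jobs asking the same question
        cache.get("conditions for run 1")
    print("backend calls:", cache.backend_calls)
```

In this toy setup ten thousand identical requests reach the backend only when the cached entry is missing or older than the refresh interval; a change in the backend becomes visible to clients only after the entry expires, which is the additional latency mentioned on the slide.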

Page 21: Big Data Management at CERN: The CMS Example

CMS Databases till end of 2011

[Diagram: Oracle 10 database layout across CMS (P5), the CERN CC and an Off-Site location, with a firewall. Databases shown: CMSONR (omds, orcon, other), CMSONR inactive standby (omds, orcon, other), CMINTR, CMSR (omds, orcoff, Comput.), CMSR inactive standby (omds, orcoff, Comput.), Orcoff Snap., other, CMSARC, INT2R, INT9R. Row labels: main, test. Link labels: Oracle Streams, Oracle Data Guard]

Page 22: Big Data Management at CERN: The CMS Example

CMS DB space usage

•  DB growth about 1.5 Tbyte/year
   – Both on/off-line
•  Condition data is only a small fraction
   – ~300 Gbyte now
   – Growth: +20 Gbyte/year
   – ~50 Global Tags/month

[Chart: DB size in TB (axis 0 to 6) for CMSONR and CMSR, Dec 09 to Dec 11]

Page 23: Big Data Management at CERN: The CMS Example

CMS DB operations 2011

•  Smooth running
   – CMSONR availability: 99.88%
      •  10.5 hours downtime
   – CMSR availability: 99.64%
      •  30.7 hours downtime
   – SQL query time stable (few msec)

[Chart label: 10 ms]

Big Thanks to CERN DBAs!!
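The availability percentages and downtime hours on this slide are related by simple arithmetic. The sketch below assumes the availability is measured over a full calendar year of 8760 hours; the slide does not state the accounting period, so that is an assumption, and the function name is illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 h; assumed accounting period


def downtime_hours(availability_percent: float,
                   period_hours: float = HOURS_PER_YEAR) -> float:
    """Downtime implied by an availability percentage over a given period."""
    return (100.0 - availability_percent) / 100.0 * period_hours


if __name__ == "__main__":
    for name, availability in (("CMSONR", 99.88), ("CMSR", 99.64)):
        print(f"{name}: {downtime_hours(availability):.1f} h downtime")
```

Under this assumption 99.88% corresponds to about 10.5 hours, as quoted for CMSONR. For CMSR the same calculation gives about 31.5 hours rather than the 30.7 hours on the slide, a difference that could come from rounding of the percentage or from a slightly different accounting period.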

Page 24: Big Data Management at CERN: The CMS Example

CMS Databases in 2012

[Diagram: Oracle 11g database layout across CMS (P5), the CERN CC and an Off-Site location, with a firewall. Databases shown: CMSONR (omds, orcon, other), CMSONR inactive standby (omds, orcon, other), CMSONR active standby (omds, orcon, other), CMINTR, CMSR (omds, orcoff, Comput.), CMSR inactive standby (omds, orcoff, Comput.), Orcoff Snap., other, CMSARC, INT2R, INT9R. Row labels: main, test. Link labels: Oracle Streams, Oracle Data Guard, Active Data Guard]

Page 25: Big Data Management at CERN: The CMS Example

Other CMS Documents

x 4000 people … for many decades

Page 26: Big Data Management at CERN: The CMS Example

Other CMS Documents: Size

A printed pile of all CMS documents that are already in a managed system = 1.0 x (Empire State building)

Plus we have almost the same amount spread all over the place (PCs, afs, dfs, various websites …)

Page 27: Big Data Management at CERN: The CMS Example

Other CMS Documents: Size

No. of CMS Documents, May 2012

Page 28: Big Data Management at CERN: The CMS Example

CMS website: cms.web.cern.ch

•  Migrated to a new Drupal infrastructure:
   – Offers a coherent view of all CMS information

Public site focus is fresh news
– Text, images, links, etc.
– Keywords, groups/topics
– Author / editor workflow

Email notifications, RSS, Twitter, FB, G+ …

Promote to home page or features slider

Push to selected CMS groups

Page 29: Big Data Management at CERN: The CMS Example

LHC Offline Computing. The GRID

Tier-0 (CERN): recording – reconstruction – distribution
Tier-1 (~10 centres): storage – reprocessing – analysis
Tier-2 (~140 centres): simulation – end-user analysis

The GRID. A distributed computing infrastructure (~150 kCores), uniting resources of HEP institutes around the world to provide seamless access to CPU and storage for the LHC experiments. A common solution for an unprecedented demand (in HEP) of computing power for physics analysis.

Page 30: Big Data Management at CERN: The CMS Example

Scale-free Networks

On/Off-line TDAQ (and GRID) systems are, by construction, scale-free systems; they are capable of operating efficiently, taking advantage of any additional resources that become available, as they change in size or in the volume of data handled. Other complex systems, e.g. the World Wide Web, show the same behavior. This is the result of the simple mechanism that allows networks to expand by the addition of new vertices which are attached to existing well-connected vertices.

[Figure: Performance versus Size for On-line (TDAQ) and Off-line (GRID) systems, next to a picture of the scale-free internet (2002 snapshot)]
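The growth mechanism described above, where new vertices attach to existing well-connected ones, is often called preferential attachment. The slide does not name a specific model, so the following is only a minimal sketch in that spirit: each new vertex links to one existing vertex chosen with probability proportional to its current degree. The function names and the fixed seed are illustrative choices.

```python
import random
from collections import Counter
from typing import List, Tuple


def grow_network(n_vertices: int, seed: int = 0) -> List[Tuple[int, int]]:
    """Grow a network one vertex at a time using degree-proportional attachment."""
    if n_vertices < 2:
        raise ValueError("need at least two vertices")
    rng = random.Random(seed)
    edges = [(0, 1)]       # start from a single link between vertices 0 and 1
    endpoints = [0, 1]     # every vertex appears here once per link it has, so a
                           # uniform draw from this list is degree-proportional
    for new_vertex in range(2, n_vertices):
        target = rng.choice(endpoints)
        edges.append((new_vertex, target))
        endpoints.extend((new_vertex, target))
    return edges


def degrees(edges: List[Tuple[int, int]]) -> Counter:
    """Number of links per vertex."""
    counts: Counter = Counter()
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return counts


if __name__ == "__main__":
    deg = degrees(grow_network(10_000))
    print("largest degrees:", [d for _, d in deg.most_common(5)])
    print("vertices with a single link:", sum(1 for d in deg.values() if d == 1))
```

Running the sketch typically shows a few heavily connected hubs alongside a large majority of vertices with a single link, the qualitative signature of a scale-free network.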

Page 31: Big Data Management at CERN: The CMS Example

LHC Storage: Sizes

[Chart: Number of managed files (April), by experiment (ALICE, ATLAS, CMS, LHCB). Values shown: 152 M, 280 M, 18 M, 5 M]

[Chart: CRSG Recommendations, DISK and TAPE (axis 0 to 400 PB). 2012: 179 PB and 169 PB, total 348 PB; 2013: 194 PB and 188 PB, total 382 PB]

Comparable amount of Disk and Tape Storage

... the storage is aggregated and virtualized by experiment frameworks

These are only managed files; there are more user files

Page 32: Big Data Management at CERN: The CMS Example

LHC Storage: Big Instances

[Charts comparing large storage instances in four panels:
 Volume (axis 0 to 13 PB): FNAL dCache, CERN EOSATLAS, FZK dCache, BNL; values shown: 13, 10, 5, 10
 Objects (axis 0 to 70 Mio): FNAL dCache, CERN EOSATLAS, FZK dCache, BNL; values shown: 10, 38, 10, 65
 Devices (axis 0 to 5000): FNAL dCache, CERN EOSATLAS, FZK dCache; values shown: 1200, 5000, 2000
 Server Nodes (axis 0 to 300): FNAL dCache, CERN EOSATLAS, FZK dCache; values shown: 300, 230, 40]

Page 33: Big Data Management at CERN: The CMS Example

Big Storages: Sizes

Page 34: Big Data Management at CERN: The CMS Example

Big Storages: Number of Instances

Page 35: Big Data Management at CERN: The CMS Example

Coincidence

1997 CERN. An LHC event builder prototype

1997 Stanford. A Web search engine prototype

2008 The CMS HLT centre at Cessy and hundreds of off-line GRID computing centres: 10^5 cores

2008 One of Google's data centers: 10^6 cores

Page 36: Big Data Management at CERN: The CMS Example

Thank you. Questions?