ALICE Grid Status
David Evans, The University of Birmingham
GridPP 14th Collaboration Meeting, Birmingham, 6-7 Sept 2005

Page 2

The ALICE Experiment

ALICE is one of the four main LHC experiments at CERN.
Only one dedicated to heavy-ion physics.
– Study of QCD under extreme conditions
~ 1000 collaborators
~ 100 institutions
Birmingham is the only UK institute involved

Page 3

UK ALICE

Birmingham is the only UK institute in ALICE
Small group but plays a vital role
– Responsible for the design, construction, building and maintenance of the Central Trigger Processor and Local Trigger Units.
– Getting involved in physics exploitation.
– No surplus UK manpower available for Grid work (ALICE Grid experts are at CERN).
  » i.e. no UK ALICE Grid experts.

Page 4

ALICE Requirements

Data taking (each year)
– 1 month of Pb-Pb data ~ 1 PByte
– Also p-p for rest of the year ~ 1 PByte
Large scale simulation effort
– 1 Pb-Pb event: ~ 24 hrs (1 GHz)
Data reconstruction
Data analysis
Smaller collaboration than ATLAS or CMS but similar computing requirements.

Page 5

Computing Requirements

Data processing (pp)
– Calibration & alignment – (quasi) online
– First reconstruction pass during data taking
  » Establish overall properties quickly
– Followed by tuning pass
– Followed by second reconstruction pass
Data processing (Pb-Pb)
– Calibration & alignment during data taking
– First reconstruction pass ~ 4 months
– Second reconstruction pass ~ 6 months

Page 6

Computing Requirements

Monte Carlo simulations
– pp data: generate similar amount of MC data, ~10^9 events
– Pb-Pb data: generate ~10^7 events (factor 10 less than real data)
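To give a feel for the scale, the two figures quoted on these slides (about 24 hours per simulated Pb-Pb event on a 1 GHz CPU, and about 10^7 Pb-Pb Monte Carlo events) can be combined in a rough back-of-the-envelope estimate. The sketch below is only that: the function names and the farm size of 10,000 CPUs are illustrative assumptions, not numbers from the talk.

```python
# Rough CPU estimate for Pb-Pb Monte Carlo production.
# Inputs marked "slide" come from the talk; everything else is an assumption.

HOURS_PER_YEAR = 24 * 365

PBPB_MC_EVENTS = 10**7        # slide: ~10^7 simulated Pb-Pb events
CPU_HOURS_PER_EVENT = 24      # slide: ~24 hrs per event on a 1 GHz CPU


def cpu_years(events, hours_per_event):
    """Total single-CPU time, expressed in years."""
    return events * hours_per_event / HOURS_PER_YEAR


def wall_clock_days(events, hours_per_event, n_cpus):
    """Elapsed days if the work is spread evenly over n_cpus identical CPUs."""
    return events * hours_per_event / n_cpus / 24


if __name__ == "__main__":
    total = cpu_years(PBPB_MC_EVENTS, CPU_HOURS_PER_EVENT)
    print(f"Single-CPU time: about {total:,.0f} CPU-years at 1 GHz")

    hypothetical_farm = 10_000  # assumption for illustration only
    days = wall_clock_days(PBPB_MC_EVENTS, CPU_HOURS_PER_EVENT, hypothetical_farm)
    print(f"On {hypothetical_farm:,} such CPUs: about {days:,.0f} days")
```

Even under these idealised assumptions (perfect load balancing, no failures), the total runs to tens of thousands of CPU-years, which is why a distributed solution is needed.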

Page 7

Profile of CPU requirements

[Figure: CPU requirement profile over time (time marks Jan 07, Sept 08, Nov 09; scale mark 35 MSI2K), with curves for Total, CERN T0, CERN T1, Ext Tier 1 and Ext Tier 2.]

Page 8

Tier Hierarchy

MONARC Model

‘Cloud Model’ (Tier free) used in ALICE data challenges (native AliEn sites – for LCG sites we comply with Tier model)

Tier 0: RAW data master copy; data reconstruction (1st pass); prompt analysis
Tier 1: Copy of RAW; reconstruction; scheduled analysis
Tier 2: MC production; partial copy of ESD; data analysis
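The tier roles listed above can be written down as a small lookup table, which is handy when asking which tiers are expected to perform a given task. This is only an illustrative sketch: the dictionary and the `tiers_for` helper are made-up names, and the role strings are simply the ones on the slide.

```python
# Roles per tier as listed on the slide (MONARC-style hierarchy).
# The names TIER_ROLES and tiers_for are hypothetical, for illustration only.
TIER_ROLES = {
    "Tier 0": ["RAW data master copy", "Data reconstruction (1st pass)", "Prompt analysis"],
    "Tier 1": ["Copy of RAW", "Reconstruction", "Scheduled analysis"],
    "Tier 2": ["MC production", "Partial copy of ESD", "Data analysis"],
}


def tiers_for(keyword):
    """Return the tiers having at least one role that contains keyword (case-insensitive)."""
    kw = keyword.lower()
    return [tier for tier, roles in TIER_ROLES.items()
            if any(kw in role.lower() for role in roles)]


if __name__ == "__main__":
    for task in ("reconstruction", "MC production", "analysis"):
        print(task, "->", tiers_for(task))
```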

Page 9

ALICE Grid - AliEn

AliEn (ALICE Environment) – Grid framework developed by ALICE – used in production for > 4 years.
Based on Web services and standard protocols.
Built around open source code
– Less than 5% is native AliEn code (mainly Perl).
To date, > 500,000 ALICE jobs have been run under AliEn control worldwide.

Page 10

First implementation of ALICE World Computing Model

AliEn@GRID

Page 11

Old AliEn Framework

[Diagram: AliEn services, 100% perl5, communicating via SOAP. Component labels: RDBMS, DBDriver, DBProxy, User Interface, User Application (C/C++), Web Portal, LDAP, ClusterMonitor, ProcessMonitor, ComputingElement (CE), StorageElement (SE), FTD, IS, Logger, Authen, CPU Server and RB, each box marked BaseClient and/or Server. Further labels: Alice, Atlas. Groupings: local site elements, central services, user.]

Page 12

AliEn ‘Pull’ Protocol

One of the major differences between the AliEn and LCG grids is that AliEn uses the ‘pull’ rather than the ‘push’ protocol.

[Diagram: the EDG/Globus model and the AliEn model side by side, each showing a user, a server and a Resource Broker, with arrows labelled "job" and "list".]
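The ‘pull’ idea can be illustrated with a toy task queue: instead of a central broker pushing a job to a site it has chosen, each site asks the queue for work when it has free capacity and receives a job it can actually run. The sketch below is a minimal illustration of that idea only. It is not AliEn code, and the `Job` and `TaskQueue` names and the single `min_free_cpus` matching criterion are assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Job:
    job_id: int
    min_free_cpus: int  # hypothetical requirement used for matching


class TaskQueue:
    """Central queue; sites pull work from it when they have free resources."""

    def __init__(self, jobs):
        self._jobs = deque(jobs)

    def pull(self, free_cpus):
        """Called by a site: return the first queued job that fits, or None."""
        for job in list(self._jobs):  # iterate over a copy so removal is safe
            if job.min_free_cpus <= free_cpus:
                self._jobs.remove(job)
                return job
        return None


if __name__ == "__main__":
    queue = TaskQueue([Job(1, min_free_cpus=8), Job(2, min_free_cpus=2)])
    # A small site advertises 4 free CPUs and is handed the job that fits it.
    print(queue.pull(free_cpus=4))
    # Nothing else fits that site, so it gets nothing and simply asks again later.
    print(queue.pull(free_cpus=4))
    # A larger site picks up the remaining job.
    print(queue.pull(free_cpus=16))
```

In this arrangement the central service never needs an up-to-date picture of every site; a site that is busy or offline simply does not ask for work.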

Page 13

Resource Broker

“Pull” instead of traditional “Push” architecture

[Diagram labels: Tier 0 with TASK QUEUE, CPU Server, Broker, Authen, Logger, IS, TransferBroker, TransferOptimiser and ACCT; a REMOTE SITE with Remote Queue, ClusterMonitor and one ProcessMonitor per job (Job 1, Job 2, ... Job n); a second "REMOTE SITE or ANOTHER GRID" with Remote Queue, ClusterMonitor, ACCT, AliEn Server and EDG/Globus.]

Page 14

EGEE / gLite

ALICE is committed to using as many common grid applications as possible.
In the framework of the EGEE project (EU-funded grid project), middleware (gLite) is being developed.
– ALICE was playing a full role in this project – not so much now!
Changes have been made to make AliEn work with LCG
– E.g. changes to File Catalogue (FC) → LFC (Local File Catalogue or LCG File Catalogue)
– VO Box at Tier 1
– Globus/GSI compatible authentication

Page 15

Analysis

Core of ALICE computing model is AliRoot
– Uses ROOT framework
Couple AliEn with ROOT for Grid-based analysis.
– Use PROOF – Parallel ROOT Facility
– To the user it’s like using ROOT
4-tier architecture:
– ROOT client session, API server (AliEn + PROOF), site PROOF master servers, PROOF slave servers.

Page 16

PROOF

Each node has a PROOF slave.
Each site has a PROOF master server.
Uses ‘pull’ protocol, i.e. the slaves ask the master for work packets. Slower slaves get smaller work packets etc.

[Diagram labels: Client API, API Server, AliEn FC ..., list of sites with data.]
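The remark that slower slaves get smaller work packets can be illustrated with a toy master that sizes each packet from the rate a slave last reported. This is a sketch of the general idea under simple assumptions, not PROOF's actual packetizer: the `PacketMaster` class, the 10-second target and the default packet of 100 events are all invented for the example.

```python
class PacketMaster:
    """Toy master: hands out work packets sized to each slave's measured speed."""

    def __init__(self, total_events, target_seconds=10.0, default_size=100):
        self.remaining = total_events
        self.target_seconds = target_seconds  # aim: each packet takes about this long
        self.default_size = default_size      # used until a slave has reported a rate
        self.rates = {}                       # slave name -> events per second

    def report(self, slave, events, seconds):
        """A slave reports how long its last packet took."""
        self.rates[slave] = events / seconds

    def next_packet(self, slave):
        """A slave pulls its next packet; returns the number of events assigned."""
        rate = self.rates.get(slave)
        if rate is None:
            size = self.default_size
        else:
            size = max(1, int(rate * self.target_seconds))
        size = min(size, self.remaining)
        self.remaining -= size
        return size


if __name__ == "__main__":
    master = PacketMaster(total_events=10_000)
    first_fast = master.next_packet("fast")
    first_slow = master.next_packet("slow")
    master.report("fast", first_fast, seconds=1.0)   # 100 events/s
    master.report("slow", first_slow, seconds=20.0)  # 5 events/s
    print("fast slave next packet:", master.next_packet("fast"))
    print("slow slave next packet:", master.next_packet("slow"))
```

Because every slave asks for work only when it is ready, all slaves tend to finish at about the same time, whatever their individual speed.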

Page 17

Authentication - SASL

SASL is the Simple Authentication and Security Layer, a method for adding authentication support to connection-based protocols.
AliEn now has a Perl module with a GSSAPI implementation.
This allows us to
– Use all SASL authentication schemes
– Use old AliEn authentication (token, AFS password, SSH)
– Use X509 certificates and Globus/GSI (AliEn distribution now includes necessary Globus/GSI software)
– Develop secure peer-to-peer file transfers based on machine/protocol/user certificates and LDAP-based configuration management

Page 18

Authentication

[Diagram: Client, Proxy Server, Server, Database and LDAP. Steps shown: request methods; list of methods; SASL authentication; checking if user exists; data. Methods listed: X509 (AliEn/Globus), PKI/RSA (ssh), Token (AliEn), AFS password.]
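The sequence of steps in the diagram (request methods, receive the list, authenticate, check that the user exists) can be sketched as a simple negotiation. The code below only mirrors that order of steps. It does not implement SASL or any AliEn interface, and the method identifiers, the `KNOWN_USERS` set standing in for the LDAP lookup, and the `verify` callback standing in for the real credential check are all assumptions for illustration.

```python
# Methods the server offers, in order of preference (names follow the slide).
SERVER_METHODS = ["X509", "PKI/RSA", "TOKEN", "AFS"]

# Hypothetical stand-in for the LDAP user directory.
KNOWN_USERS = {"alice_user"}


def list_methods():
    """Server side: answer a 'request methods' call."""
    return list(SERVER_METHODS)


def choose_method(client_methods, server_methods):
    """Pick the first server-preferred method the client also supports."""
    for method in server_methods:
        if method in client_methods:
            return method
    return None


def authenticate(user, client_methods, verify):
    """Run the steps in the diagram's order; verify(method, user) is a placeholder."""
    method = choose_method(client_methods, list_methods())
    if method is None:
        return False                 # no authentication method in common
    if not verify(method, user):
        return False                 # credential check failed
    return user in KNOWN_USERS       # finally, check that the user exists


if __name__ == "__main__":
    def accept_token(method, user):
        # Placeholder check: accept only the token method.
        return method == "TOKEN"

    print(authenticate("alice_user", {"TOKEN", "AFS"}, accept_token))
    print(authenticate("stranger", {"TOKEN"}, accept_token))
```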

Page 19

Summary

AliEn is a Grid framework developed by ALICE using 95% open source code (e.g. SOAP) and 5% AliEn-specific (Perl) code.
– Successfully used over past 3 years
ALICE wishes to use as many common grid solutions as possible
AliEn is evolving to take into account the EGEE/gLite framework and to work with LCG.
– New user interfaces being developed
– PROOF for analysis being developed
– Better authentication/authorisation being developed
– Etc.