
CMS and LHC Software and Computing The Caltech Tier2 and Grid Enabled Analysis


Page 1: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

DoE Review
Julian Bunn
California Institute of Technology
July 22, 2004

CMS and LHC Software and Computing: The Caltech Tier2 and Grid Enabled Analysis

Tier2 · MonALISA · GAE

Page 2: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

Overview

CMS Computing & Software: Data Grid / Challenges
Grid Projects
Grid Analysis Environment (GAE)
Tier2
Distributed Physics Production
Calorimetry/Muon Software
Conclusion

Page 3: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

CMS Computing and Software:
Data Grid Model / Organization
Scope / Requirements
Data Challenges

Page 4: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

LHC Data Grid Hierarchy: Developed at Caltech

Experiment / Online System: ~PByte/sec
Online System → Tier 0+1 (CERN Center: PBs of disk; tape robot): ~100-1500 MBytes/sec
Tier 0+1 → Tier 1 centres (FNAL, IN2P3, INFN, RAL): ~10 Gbps
Tier 1 → Tier2 centres: ~2.5-10 Gbps
Tier2 → Tier 3 (Institutes; physics data cache): 0.1 to 10 Gbps
Tier 3 → Tier 4 (Workstations)

Tens of Petabytes by 2007-8; an Exabyte by ~2015
CERN/Outside Resource Ratio ~1:2; Tier0/(Σ Tier1)/(Σ Tier2) ~1:1:1

Emerging Vision: A Richly Structured, Global Dynamic System
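As a rough reading aid for the hierarchy above, the small Python sketch below encodes the tiers and the nominal link bandwidths quoted on this slide. It is purely illustrative: the names and figures come from the diagram, while the data structure and print-out are a hypothetical teaching aid, not CMS or LCG software.

```python
# Illustrative only: tier names and nominal bandwidths copied from the
# diagram above; the structure itself is a hypothetical teaching aid.
DATA_GRID_HIERARCHY = [
    ("Experiment -> Online System",                 "~PByte/sec"),
    ("Online System -> Tier 0+1 (CERN Center)",     "~100-1500 MBytes/sec"),
    ("Tier 0+1 -> Tier 1 (FNAL, IN2P3, INFN, RAL)", "~10 Gbps"),
    ("Tier 1 -> Tier2 centres",                     "~2.5-10 Gbps"),
    ("Tier2 -> Tier 3 (Institutes)",                "0.1 to 10 Gbps"),
    ("Tier 3 -> Tier 4 (Workstations)",             "(not quoted on the slide)"),
]

def show_hierarchy() -> None:
    """Print each level of the hierarchy with its nominal bandwidth."""
    for link, bandwidth in DATA_GRID_HIERARCHY:
        print(f"{link:48s} {bandwidth}")

if __name__ == "__main__":
    show_hierarchy()
```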

Page 5: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

CPT Project

CCS (Core Computing & Software): Computing Centres, CMS Computing Services, Architecture / Frameworks / Toolkits, Production Processing & Data Management, Software Users and Developers Environment
PRS (Physics Reconstruction and Selection): Tracker / b-tau, E-gamma / ECAL, HCAL / Jets / MEt, Muons
TriDAS (Online Software): Online Farms, Online (DAQ) Software Framework
Physics Groups: Higgs, SUSY & Beyond SM, Standard Model, Heavy Ions

Legend: leading/significant Caltech activity; US CMS leadership
SCB Chair 1996-2001; CCS/CPT Management Board 2002-2003

Page 6: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

DC04 Data Challenge

T0 at CERN in DC04: 25 Hz input event rate (peak); reconstruct quasi-realtime; events filtered into streams; record raw data and DST; distribute raw data and DST to the T1's

T1 centres in DC04 (PIC Barcelona, FZK Karlsruhe, CNAF Bologna, RAL Oxford, IN2P3 Lyon, FNAL Chicago): pull data from T0 to T1 and store; make data available to PRS; demonstrate quasi-realtime "fake" analysis of DSTs

T2 centres in DC04 (Tier2 @ Caltech, UFlorida, UCSD): pre-challenge production at > 30 sites; modest tests of DST analysis

Concentrated on the "Organized, Collaboration-Managed" aspects of data flow and access

30 Million T0 events processed

Page 7: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

CMS Computing and Core Software (CCS) Progress

DC04 (5% complexity): Challenge is complete but post-mortem write-ups are still in progress

Demonstrated that the system can work for well controlled data flow and analysis, and a few expert users

Next challenge is to make this useable by average physicists and demonstrate that the performance scales acceptably

CCS Technical Design Report (TDR): Aligned with LCG TDR submission (July 2005)

DC05 (10%): Challenge Autumn 2005 to avoid "destructive interference" with Physics TDR analysis

Page 8: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

CMS Computing and Software:
Grid Projects

Page 9: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

Grid Projects

PPDG (PI) / SciDAC: Particle Physics Data Grid. Funded by DOE in 1999; new funding in 2004-6. Deployment of Grid computing in existing HEP experiments. Mainly physicists.

GriPhyN/iVDGL (Co-PI): Grid Physics Network, international Virtual Data Grid Laboratory. Funded by NSF in 1999. Grid middleware (VDT, "Virtual Data"), Tier2 deployment and Grid Operations. HENP, Astronomy, Gravity Wave physics.

Trillium: Coordinates PPDG, GriPhyN and iVDGL. DoE and NSF working together: DOE (labs), NSF (universities). Strengthening outreach efforts.

TeraGrid: Initially Caltech, Argonne, NCSA, SDSC, now expanded. Massive Grid resources.

CHEPREO: Extending Grids to South America: FIU, Caltech CMS, Brazil.

CAIGEE/GAE; UltraLight: Next Generation Grid and Hybrid Optical Network for Data Intensive Research.

Open Science Grid: Caltech/FNAL/Brookhaven ... Combine computing resources at several DOE labs and at dozens of universities to effectively become a single national computing infrastructure for science, the Open Science Grid.

Others: EU DataGrid, CrossGrid, LHC Computing Grid (LCG), etc.

Page 10: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

UltraLight Collaboration: http://ultralight.caltech.edu
Caltech, UF, FIU, UMich, SLAC, FNAL, MIT/Haystack, CERN, UERJ (Rio), NLR, CENIC, UCAID, TransLight, UKLight, NetherLight, UvA, UCLondon, KEK, Taiwan; Cisco, Level(3)

Integrated hybrid (packet-switched + dynamic optical paths) experimental network, leveraging transatlantic R&D network partnerships; 10 GbE across the US and the Atlantic: NLR, DataTAG, TransLight, NetherLight, UKLight, etc.; extensions to Japan, Taiwan, Brazil
End-to-end monitoring; realtime tracking and optimization; dynamic bandwidth provisioning
Agent-based services spanning all layers of the system, from the optical cross-connects to the applications

Page 11: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

The Grid Analysis Environment (GAE):
"Where the Physics Gets Done"

Page 12: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

Grid Analysis Environment

The "Acid Test" for Grids; crucial for the LHC experiments
Large, diverse, distributed community of users
Support for 100s to 1000s of analysis tasks, shared among dozens of sites
Widely varying task requirements and priorities
Need for priority schemes, robust authentication and security
Operates in a resource-limited and policy-constrained global system
Dominated by collaboration policy and strategy (resource usage and priorities)
Requires real-time monitoring; task and workflow tracking; decisions often based on a global system view

Page 13: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

GAE Architecture

Analysis Clients talk standard protocols to the "Grid Services Web Server", a.k.a. the Clarens data/services portal.
A simple Web service API allows Analysis Clients (simple or complex) to operate in this architecture. Typical clients: ROOT, Web Browser, IGUANA, COJAC.
The Clarens portal hides the complexity of the Grid.
Key features: Global Scheduler, Catalogs, Monitoring, and Grid-wide Execution service.

Architecture diagram: clients (ROOT-Clarens, Cojac, IGUANA) connect over HTTP, SOAP and XML-RPC to the Clarens Grid Services Web Server, behind which sit the Scheduler, Catalogs (Metadata, Virtual Data, Replica), Fully-Abstract / Partially-Abstract / Fully-Concrete Planners, Execution Priority Manager, Grid-Wide Execution Service, Data Management, Applications Monitoring (MonALISA), and back-end tools such as BOSS, RefDB/MOPDB, POOL, ORCA, ROOT, FAMOS, a VDT server, Chimera, MCRunjob and Sphinx. Together these support the physics analysis and computing model.
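Since the architecture above has Analysis Clients speaking standard protocols (HTTP, SOAP, XML-RPC) to the Clarens portal, a minimal client interaction can be sketched with Python's standard XML-RPC support alone. The portal URL below is a placeholder and the only call made is generic XML-RPC introspection; the real Clarens service API and its authentication are not reproduced here.

```python
# Minimal sketch of an Analysis Client talking XML-RPC to a Clarens-style
# "Grid Services Web Server".  The URL is a hypothetical placeholder; a real
# Clarens portal defines its own service API and requires authentication.
import xmlrpc.client

PORTAL_URL = "http://tier2.example.edu:8080/clarens"   # placeholder endpoint

def list_portal_methods(url: str) -> list[str]:
    """Ask the portal which methods it exposes, using standard
    XML-RPC introspection (system.listMethods)."""
    portal = xmlrpc.client.ServerProxy(url)
    return portal.system.listMethods()

if __name__ == "__main__":
    for name in list_portal_methods(PORTAL_URL):
        print(name)
```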

Page 14: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

Structured Peer-to-Peer GAE Architecture

The GAE, based on Clarens and Web services, easily allows a “Peer-to-Peer” configuration to be built, with associated robustness and scalability features.

The P2P configuration allows easy creation, use and management of complex VO structures.

Page 15: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

Grid-Enabled Analysis Prototypes

Collaboration Analysis Desktop
COJAC (via Web Services)
ROOT (via Clarens)
JASOnPDA (via Clarens)

Page 16: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

GAE Integration with CMS and LHC Software

Clarens Servers: Python and Java versions available
RefDB (stores Job/Task parameters or "cards"): replica of DC04 production details available on the Tier2
POOL (persistency framework for the LHC): a 60 GB POOL file catalog has been created on the Tier2, based on DC04 files
MCRunjob/MOP (CMS Job/Task workflow description for batch): integration into the Clarens framework, by FNAL/Caltech
BOSS (CMS Job/Task book-keeping system): INFN is working on the development of a web service interface to BOSS
SPHINX: distributed scheduler developed at UFL
Clarens/MonALISA Integration: facilitating user-level Job/Task monitoring (Caltech MURF Summer Student Paul Acciavatti)
CAVES: analysis code-sharing environment developed at UFL
Core System: Service Auto Discovery, Proxy, Authentication, ...

R&D of middleware grid services for a distributed data analysis system: Clarens integrates across CMS/LCG/EGEE and US Grid software
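Most of the integration items above follow one pattern: wrap an existing CMS tool behind a web-service endpoint so Clarens clients can call it remotely. A rough sketch of that pattern using Python's standard XML-RPC server follows; the service name, method and return values are invented stand-ins, not the actual BOSS, RefDB or Clarens interfaces.

```python
# General pattern for exposing an existing tool (e.g. a job book-keeping
# system) as a web service, as in the integrations listed above.  The
# method and return values are invented stand-ins, not the real interfaces.
from xmlrpc.server import SimpleXMLRPCServer

def job_status(job_id: str) -> dict:
    """Pretend lookup of a job's state; a real wrapper would call into
    the underlying book-keeping tool."""
    return {"job_id": job_id, "state": "Running", "site": "caltech-tier2"}

def main() -> None:
    server = SimpleXMLRPCServer(("localhost", 8008), allow_none=True)
    server.register_function(job_status, "job.status")   # expose as job.status
    server.register_introspection_functions()            # system.listMethods etc.
    print("toy job-status service listening on http://localhost:8008/")
    server.serve_forever()

if __name__ == "__main__":
    main()
```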

Page 17: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

GAE Deployment

20 known Clarens deployments: Caltech, Florida (9 machines), Fermilab (3), CERN (3), Pakistan (2+2), INFN (1)
Installation of CMS (ORCA, COBRA, IGUANA, ...) and LCG (POOL, SEAL, ...) software on the Caltech GAE testbed for integration studies
Work with CERN to include the GAE components in the CMS software stack
GAE components being integrated in the US-CMS DPE distribution
Demonstrated distributed multi-user GAE prototype at SC03 and elsewhere
Ultimate goal: GAE backbone (Clarens) deployed at all Tier-N facilities; a rich variety of Clarens web servers offering GAE services interfaced with CMS and LCG software
CMS core software and grid middleware expertise

Page 18: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

GAE Summary

Clarens services fabric and "Portal to the Grid" maturing; numerous servers and clients deployed in CMS
Integration of GAE with MonALISA progressing: a scalable multi-user system
Joint GAE collaborations with UFlorida, FNAL and PPDG "CS11" very productive
Production work with FNAL
Mentoring PostDocs, CS students, undergraduates
Rationalising the new EGEE ARDA/gLite work with GAE
GAE project description and detailed information: http://ultralight.caltech.edu/gaeweb

Page 19: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

Tier2:
History and Current Status

Page 20: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

Caltech Tier2 Background

The Tier2 concept originated at Caltech in Spring 1999
The first Tier2 prototype was proposed by Caltech together with UCSD in 2000
It was designed, commissioned and brought into production in 2001
It quickly became a focal point supporting a variety of physics studies and R&D activities
The proof of concept of the Tier2 and the LHC Distributed Computing Model

Service to US CMS and CMS for Analysis + Grid R&D
Production: CMS data challenges, annually from Fall 2000; H→γγ and calibration studies with CACR, NCSA, etc.
Grid testbeds: Development → Integration → Production
Cluster hardware & network development

Page 21: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

Tier2 – Example Use in 2001

1. Using the local Tag event database, the user plots event parameters of interest
2. The user selects a subset of events to be fetched for further analysis
3. Lists of matching events are sent to Caltech and San Diego
4. Tier2 servers begin sorting through databases, extracting the required events
5. For each required event, a new large virtual object is materialized in the server-side cache, containing all tracks in the event
6. The database files containing the new objects are sent to the client using Globus FTP; the client adds them to its local cache of large objects
7. The user can now plot event parameters not available in the Tag
8. Future requests take advantage of previously cached large objects in the client
(a schematic code sketch of this flow follows the demo description below)

Full event databases of ~100,000 and ~40,000 large objects on the two Tier2 servers; "Tag" database of ~140,000 small objects on the client; transfers via parallel, tuned GSI FTP

Demo: "Bandwidth Greedy Grid-enabled Object Collection Analysis for Particle Physics"
Julian Bunn, Ian Fisk, Koen Holtman, Harvey Newman, James Patton

The object of this demo is to show grid-supported interactive physics analysis on a set of 144,000 physics events.

Initially we start out with 144,000 small Tag objects, one for each event, on the Denver client machine. We also have 144,000 LARGE objects, containing full event data, divided over the two tier2 servers.
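The tag-then-fetch flow above can be condensed into a short schematic, assuming invented stand-ins for every component: the real 2001 demo used an Objectivity Tag database, server-side object materialization at the two Tier2 sites, and parallel tuned GSI FTP, none of which appear literally below.

```python
# Schematic of the 2001 "bandwidth-greedy" object-collection analysis.
# All names and data are illustrative stand-ins for the real machinery.
from dataclasses import dataclass

@dataclass
class TagObject:            # small per-event summary held locally
    event_id: int
    pt_max: float

TIER2_SERVERS = ["caltech-tier2.example.edu", "ucsd-tier2.example.edu"]  # placeholders

def request_extraction(server: str, event_ids: list[int]) -> list[dict]:
    """Stand-in for the Tier2 server materialising one large virtual
    object (all tracks) per requested event in its server-side cache."""
    return [{"event_id": i, "tracks": ["..."], "from": server} for i in event_ids]

def transfer(files: list[dict]) -> list[dict]:
    """Stand-in for the parallel, tuned GSI FTP transfer back to the client."""
    return files

def analyse(tag_db: list[TagObject], pt_cut: float) -> list[dict]:
    # 1-2. Plot/select on the local Tag objects (here: a simple pt cut).
    selected = [t.event_id for t in tag_db if t.pt_max > pt_cut]
    # 3-4. Split the matching-event list between the two Tier2 servers.
    half = len(selected) // 2
    requests = [(TIER2_SERVERS[0], selected[:half]),
                (TIER2_SERVERS[1], selected[half:])]
    # 5-6. Materialise large objects server-side, ship them back, cache them.
    cache = []
    for server, ids in requests:
        cache.extend(transfer(request_extraction(server, ids)))
    # 7-8. Later plots read the cached large objects directly.
    return cache

if __name__ == "__main__":
    tags = [TagObject(i, pt_max=float(i % 50)) for i in range(1000)]
    print(len(analyse(tags, pt_cut=40.0)), "large objects now cached locally")
```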

Page 22: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

Current Tier2

Hardware (diagram): Force10 E600 switch; Cisco 7606 switch/router; Dell 5224 switch; Caltech-PG, Caltech-DGT, Caltech-Tier2 and Caltech-Grid3 clusters; tier2c; Winchester RAID5 storage; network & MonALISA servers; quad Itanium2 Windows 2003 server; Newisys quad Opteron network server; network management server; APC UPS

Page 23: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

2003 Bandwidth Challenge: "Data Intensive Distributed Analysis"

SuperComputing 2003 (Phoenix)
Multiple files of ~800k simulated CMS events, stored on Clarens servers at the CENIC POP in LA and a TeraGrid node at Caltech
Transferred > 200 files at rates up to 400 MB/s to 2 disk servers at Phoenix
Converted files into ROOT format, then published via Clarens
Data analysis results displayed by the Clarens ROOT client
Sustained rate: 26.2 Gbps (10.0 + 8.2 + 8.0)

Note: This Bandwidth Challenge subsequently prompted the gift to us of a 10G wave to the Caltech campus (and thence the Tier2).

Page 24: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

Caltech Tier2 Users Today

Institute                        Users
Caltech                          34
CERN                             8
UC Davis                         6
Kyungpook Natl. Univ., Korea     5
Univ. Politechnica, Bucharest    4
Univ. Florida                    3
FNAL                             2
UC San Diego                     2
Univ. Wisconsin                  2
SLAC                             2
UC Riverside                     1
UCLA                             1
TOTAL                            70

The Tier2 is supporting scientists in California and at other remote institutes:
Physics studies (Caltech, Davis, Riverside, UCLA)
Physics productions (Caltech, CERN, FNAL, UFlorida)
Network developments and measurements (Caltech, CERN, Korea, SLAC)
MonALISA (Caltech, Romania)
CAIGEE collaboration (Caltech, Riverside, UCSD, Davis)
GAE work (Caltech, Florida, CERN)

Page 25: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

US-CMS ASCB “Tier2 Retreat”

Two-day “Tier2 Retreat” hosted by Caltech to review progress at the 3 proto-Tier2s, and to update the requirements and the concept
The start of the procedure by which the production US-CMS Tier2 centres will be chosen

Outcomes:
Fruitful discussions on Tier2 operations, scaling, and role in the LHC Distributed Computing Model
Existing prototypes will become production Tier2 centres, recognising their excellent progress and successes
Call for Proposal document being prepared by FNAL: Iowa, Wisconsin, MIT expected to bid

http://pcbunn.cacr.caltech.edu/Tier2/Tier2-Meeting.html

Page 26: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

Physics Channel Studies: Monte Carlo Production
& CMS Calorimetry and Muon Endcap Software

Page 27: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

Themes: H→γγ and Gravitons, Calibration; Events for DC04 and the Physics TDR

Resources: Tier2 & Grid3
Caltech: we have the largest NRAC award on TeraGrid
NCSA clusters: IA64 TeraGrid; Platinum IA32; Xeon
SDSC: IA64 TeraGrid cluster
NERSC: AIX supercomputer
UNM: Los Lobos cluster
Madison: Condor flock
PYTHIA, CMSIM, CERNLIB, GEANT3 – ported to IA64

CMS Events Produced by Caltech for Higgs and Other Analysis

                   2000    2001    2002    2003       2004 [6 Mo.]
Generated          3 M     4.2 B   5.2 B   16.7 B     37 B
Simulated          3 M     3.3 M   2.8 M   7.2 M      11.2 M
Reconstructed      3 M     3.3 M   2.8 M   7.2 M      (***)
Analyzed at CERN           2.3 M   2.3 M   10 M (**)  (***)

(**) Some datasets analyzed more than once
(***) To be done in the second half of 2004

Total 2.6 M Node Hours of CPU Time

Page 28: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

Calorimetry Core Software (R. Wilkinson, V. Litvin)

Major "skeleton transplant" underway: adopting the same "Common Detector" core framework as the Tracker & Muon packages
Use of the common detector framework facilitates tracking across subdetectors
The Common Detector Framework also allows:
Realistic grouping of DAQ readout
Cell-by-cell simulation on demand
Taking account of misalignments

Developers:
R. Wilkinson (Caltech): lead design + HCAL
V. Litvin (Caltech): preshower
A. Holzner (Zurich-ETH): geometry + ECAL barrel
H. Neal (Yale): ECAL endcap

Page 29: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

Endcap Muon Slice Test DAQ Software

Software developed by R. Wilkinson in 2003-2004, now adopted by the chamber community:
Handles VME configuration and control for five kinds of boards
Builds events from multiple data & trigger readout PCs
Supports real-time data monitoring
Unpacks data
Packs simulated data into raw format

Ran successfully at the May/June 2004 Test Beam
Now supported by Rice U. (Padley, Tumanov, Geurts)
Challenges ahead for the Sept. EMU/RPC/HCAL tests

Builds on expertise in EMU software, reconstruction and trigger, and a large body of work in 2001-3

Page 30: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

Summary & Conclusions (1)

Caltech originated & led CMS and US CMS Software & Computing in the early stages (1996-2001)
This carries over into diverse lead-development roles today
Leverages strong Grid-related support from DOE/MICS and NSF/CISE

The tiered "Data Grid Hierarchy" originated at Caltech is the basis of LHC computing
Tier 0/1/2 roles, interactions and task/data movements are increasingly well-understood, through data challenges on increasing scales

Caltech's Tier2 is now a well-developed, multi-faceted production and R&D facility for US CMS physicists and computer scientists collaborating on LHC physics analysis, Grid projects, and networking advances
Grid projects (PPDG, GriPhyN, iVDGL, Grid3, UltraLight etc.) are successfully providing infrastructure, expertise, and collaborative effort

CMS Core Computing and Software (CCS) and Physics Reconstruction and Selection (PRS) groups: excellent progress
Major confidence boost with DC04 – managed productions peaking at rates > 25 Hz; the Caltech Tier2 and MonALISA took part

Page 31: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

Summary & Conclusions (2)

Caltech is leading the Grid Analysis Environment work for CMS
Clarens Web-services fabric and "Portal to the Grid" maturing
Joint work with UFlorida, CERN, FNAL, PPDG/CS11 on system architecture, tools, and productions
Numerous demonstrators and hardened GAE tools and clients implemented, serving US CMS and CMS needs
Integration with MonALISA to flesh out the Distributed System Services Architecture needed for a managed Grid on a global scale

Groundbreaking work on core CMS software and simulation studies
HCAL/ECAL/Preshower common software foundation, and EMU DAQ software for the CMS Slice Test: developments led by Caltech
Higgs channel background simulations at very large scale, leveraging global resources opportunistically
Enables the US CMS lead role in the H→γγ analysis: Caltech + UCSD

Page 32: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

Extras

Page 33: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

GAE Recent Prototypes

Clarens BOSS Interface
Clarens Remote File Access
Clarens POOL Interface
Clarens Virtual Organization Management

Page 34: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

Event Generation, Simulation and Reconstruction for CMS – Caltech Tasks

Node hours by cluster:

Cluster            2000    2001    2002    2003     2004 [6 Mo.]
UW-Madison         204 k   124 k   --      262 k    319 k
Platinum NCSA      n/a     0.6 k   --      515 k    434 k
Los Lobos          --      33 k    3 k     --       --
Tungsten NCSA      n/a     n/a     n/a     n/a      279 k
Tier2 CIT          n/a     50 k    200 k   300 k    25 k
TG Total           n/a     n/a     n/a     n/a      199 k
Total Node Hours   204 k   208 k   203 k   1077 k   969 k

(**) Some datasets were reanalyzed more than once
(***) To be done in the second half of 2004

Page 35: CMS and LHC Software and Computing  The Caltech Tier2 and Grid Enabled Analysis

Grid Enabled Analysis: User View of a Collaborative Desktop

Physics analysis requires varying levels of interactivity, from "instantaneous response" to "background" to "batch mode"
Requires adapting the classical Grid "batch-oriented" view to a services-oriented view, with tasks monitored and tracked
Use Web Services, leveraging the wide availability of commodity tools and protocols: adaptable to a variety of platforms
Implement the Clarens Web Services layer as the mediator between authenticated clients and services, as part of the CAIGEE architecture
Clarens presents a consistent analysis environment to users, based on WSDL/SOAP or XML-RPCs, with PKI-based authentication for security

Diagram: clients (PDA, ROOT, Browser, Iguana, MonALISA) connect through Clarens servers to services including VO Management, Authentication, Authorization, Logging, Key Escrow, File Access, Shell, Storage Resource Broker, CMS ORCA/COBRA, Cluster Schedulers, ATLAS DIAL, GriPhyN VDT, MonALISA Monitoring, and other external services.
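The slide above emphasises PKI-based authentication for the WSDL/SOAP or XML-RPC exchanges. As a rough, standard-library-only illustration of a certificate-authenticated call: the endpoint, certificate paths and method name below are placeholders, and the actual Clarens session and proxy-certificate handshake is not modelled.

```python
# Rough illustration of a PKI-authenticated web-service call, in the spirit
# of the Clarens client-server exchange sketched above.  URL, certificate
# paths and the final method name are hypothetical placeholders.
import ssl
import xmlrpc.client

PORTAL_URL = "https://tier2.example.edu:8443/clarens"   # placeholder
USER_CERT  = "/home/user/.globus/usercert.pem"          # placeholder
USER_KEY   = "/home/user/.globus/userkey.pem"           # placeholder

def make_authenticated_proxy(url: str) -> xmlrpc.client.ServerProxy:
    """Build an XML-RPC proxy that presents the user's certificate
    when the server requests client authentication."""
    ctx = ssl.create_default_context()
    ctx.load_cert_chain(certfile=USER_CERT, keyfile=USER_KEY)
    return xmlrpc.client.ServerProxy(url, context=ctx)

if __name__ == "__main__":
    portal = make_authenticated_proxy(PORTAL_URL)
    print(portal.echo("hello tier2"))   # hypothetical method name
```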