
The ATLAS Computing Model: Status, Plans and Future Possibilities
Shawn McKee, University of Michigan
CCP 2006, Gyeongju, Korea, August 29th, 2006



Overview

The ATLAS collaboration has only a year before it must manage large amounts of "real" data for its globally distributed collaboration. ATLAS physicists need the software and physical infrastructure required to:
- Calibrate and align detector subsystems to produce well-understood data
- Realistically simulate the ATLAS detector and its underlying physics
- Provide access to ATLAS data globally
- Define, manage, search and analyze datasets of interest

I will cover the current status, plans and some of the relevant research in this area, and indicate how it might benefit ATLAS in augmenting and extending its infrastructure.

The ATLAS Computing Model

The Computing Model is fairly well evolved and is documented in the Computing TDR:
http://doc.cern.ch//archive/electronic/cern/preprints/lhcc/public/lhcc-2005-022.pdf

There are many areas with significant questions/issues to be resolved:
- The calibration and alignment strategy is still evolving
- Physics data access patterns may be exercised (SC04: since June), but we are unlikely to know the real patterns until 2007/2008!
- There are still uncertainties on the event sizes and reconstruction time
- How best to integrate ongoing "infrastructure" improvements from research efforts into our operating model?

Lesson from the previous round of experiments at CERN (LEP, 1989-2000): reviews in 1988 underestimated the computing requirements by an order of magnitude!

ATLAS Computing Model Overview

We have a hierarchical model (EF-T0-T1-T2) with specific roles and responsibilities:
- Data will be processed in stages: RAW -> ESD -> AOD -> TAG
- Data "production" is well-defined and scheduled
- Roles and responsibilities are assigned within the hierarchy

Users will send jobs to the data and extract relevant data, typically NTuples or similar.

The goal is a production and analysis system with seamless access to all ATLAS grid resources.

All resources need to be managed effectively to ensure ATLAS goals are met and resource providers' policies are enforced. Grid middleware must provide this.

ATLAS Facilities and Roles

Event Filter Farm at CERN
- Assembles data (at CERN) into a stream to the Tier-0 center

Tier-0 Center at CERN
- Data archiving: raw data to mass storage at CERN and to Tier-1 centers
- Production: fast production of Event Summary Data (ESD) and Analysis Object Data (AOD)
- Distribution: ESD and AOD to Tier-1 centers and mass storage at CERN

Tier-1 Centers distributed worldwide (10 centers)
- Data stewardship: re-reconstruction of the raw data they archive, producing new ESD and AOD
- Coordinated access to full ESD and AOD (all AOD, 20-100% of ESD depending upon site)

Tier-2 Centers distributed worldwide (approximately 30 centers)
- Monte Carlo simulation, producing ESD and AOD; ESD and AOD sent to Tier-1 centers
- On-demand user physics analysis of shared datasets

Tier-3 Centers distributed worldwide
- Physics analysis

A CERN Analysis Facility
- Analysis
- Enhanced access to ESD and RAW/calibration data on demand

Computing Model: event data flow from EF

Events are written in "ByteStream" format by the Event Filter farm in 2 GB files:
- ~1000 events/file (nominal size is 1.6 MB/event)
- 200 Hz trigger rate (independent of luminosity)

Currently 4+ streams are foreseen:
- Express stream with the "most interesting" events
- Calibration events (including some physics streams, such as inclusive leptons)
- "Trouble maker" events (for debugging)
- Full (undivided) event stream

One 2 GB file every 5 seconds will be available from the Event Filter. Data will be transferred to the Tier-0 input buffer at 320 MB/s (average).

The Tier-0 input buffer will have to hold raw data waiting for processing, and also cope with possible backlogs; ~125 TB will be sufficient to hold 5 days of raw data on disk.
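A quick back-of-the-envelope check of these numbers (a sketch, not from the slides themselves; all inputs are the nominal figures quoted above):

```python
# Back-of-the-envelope check of the Event Filter -> Tier-0 numbers quoted above.
event_size_mb   = 1.6    # nominal RAW event size (MB)
trigger_rate_hz = 200    # EF output rate, independent of luminosity (Hz)
events_per_file = 1000   # nominal events per 2 GB ByteStream file

throughput_mb_s = event_size_mb * trigger_rate_hz       # -> 320 MB/s into the Tier-0 buffer
file_period_s   = events_per_file / trigger_rate_hz     # -> one file every 5 s
raw_per_day_tb  = throughput_mb_s * 86400 / 1.0e6       # -> ~27.6 TB of RAW per day
buffer_5days_tb = 5 * raw_per_day_tb                    # -> ~138 TB, same order as the ~125 TB quoted

print(throughput_mb_s, file_period_s, raw_per_day_tb, buffer_5days_tb)
```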

ATLAS Data Processing

Tier-0:
- Prompt first-pass processing on the express/calibration and physics streams
- Within 24-48 hours, process the full physics streams with reasonable calibrations
- Implies large data movement from T0 -> T1s, and some T0 <-> T2 (calibration)

Tier-1:
- Reprocess 1-2 months after arrival with better calibrations
- Reprocess all local RAW at year end with improved calibration and software
- Implies large data movement T1 <-> T1 and T1 -> T2

ATLAS partial & "average" T1 Data Flow (2008)

[Diagram, slide from D. Barberis: average data flows for one Tier-1 in 2008, among Tier-0, the Tier-1 disk buffer, CPU farm, tape and disk storage, the other Tier-1s, and each associated Tier-2. Representative per-format figures from the diagram:
- RAW: 1.6 GB/file, 0.02 Hz, 1.7K files/day, 32 MB/s, 2.7 TB/day (also archived to tape)
- ESD2: 0.5 GB/file, 0.02 Hz, 1.7K files/day, 10 MB/s, 0.8 TB/day
- AOD2: 10 MB/file, 0.2 Hz, 17K files/day, 2 MB/s, 0.16 TB/day
- AODm2: 500 MB/file, 0.004 Hz, 0.34K files/day, 2 MB/s, 0.16 TB/day
- ESD1: 0.5 GB/file, 0.02 Hz, 1.7K files/day, 10 MB/s, 0.8 TB/day
- AODm1: 500 MB/file, 0.04 Hz, 3.4K files/day, 20 MB/s, 1.6 TB/day
- ESD2/AODm2 exchanged with the other Tier-1s: 0.02-0.036 Hz, 10-18 MB/s, 0.8-1.44 TB/day
- Combined RAW + ESD2 + AODm2 into the Tier-1 buffer: 0.044 Hz, 3.74K files/day, 44 MB/s, 3.66 TB/day
- Tier-0 output of RAW, ESD (2x) and AODm (10x): 1 Hz, 85K files/day, 720 MB/s]

Plus simulation and analysis data flow.

There are a significant number of flows to be managed and optimized.
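A minimal sketch (not ATLAS software) of how such a flow table can be tallied; the entries are the representative numbers read off the figure above:

```python
# Tally the per-format flows quoted in the Tier-1 data-flow figure to see the
# aggregate rate an "average" Tier-1 must sustain from Tier-0.

# (format, MB per file, files per day)
inbound_from_tier0 = [
    ("RAW",   1600, 1700),   # 1.6 GB/file, ~1.7K files/day
    ("ESD2",   500, 1700),   # 0.5 GB/file, ~1.7K files/day
    ("AODm2",  500,  340),   # 500 MB/file, ~0.34K files/day
]

total_mb_per_day = sum(size_mb * files for _, size_mb, files in inbound_from_tier0)
print(f"inbound volume ~ {total_mb_per_day / 1e6:.2f} TB/day")   # ~3.7 TB/day
print(f"average rate   ~ {total_mb_per_day / 86400:.0f} MB/s")   # ~44 MB/s, as on the slide
```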

ATLAS Event Data Model

RAW:
- "ByteStream" format, ~1.6 MB/event

ESD (Event Summary Data):
- Full output of reconstruction in object (POOL/ROOT) format: tracks (+ their hits), calorimeter clusters, calorimeter cells, combined reconstruction objects, etc.
- Nominal size 500 kB/event, currently 2.5 times larger; contents and technology under revision

AOD (Analysis Object Data):
- Summary of event reconstruction with "physics" (POOL/ROOT) objects: electrons, muons, jets, etc.
- Nominal size 100 kB/event, currently 70% of that; contents and technology under revision

TAG:
- Database used to quickly select events in AOD and/or ESD files
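A rough translation of these per-event sizes into annual volumes (a sketch, not an official ATLAS estimate; the ~1e7 live seconds per year is an assumption, not from the slides):

```python
# Rough annual volumes per data format from the nominal sizes above.
rate_hz = 200                # EF output rate
live_seconds = 1.0e7         # assumed live data-taking time per year
events_per_year = rate_hz * live_seconds          # ~2e9 events/year

sizes_mb = {"RAW": 1.6, "ESD": 0.5, "AOD": 0.1}   # nominal sizes in MB/event
for fmt, size_mb in sizes_mb.items():
    volume_pb = events_per_year * size_mb / 1.0e9
    print(f"{fmt}: ~{volume_pb:.1f} PB/year")      # RAW ~3.2, ESD ~1.0, AOD ~0.2 PB/year
```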

ATLAS Data Streaming

The ATLAS Computing TDR had 4 streams from the event filter: primary physics, calibration, express, and problem events. The calibration stream has split at least once since!

Discussions are focused upon optimisation of data access.

At the AOD level, ~10 streams are envisaged.

TAGs are useful for event selection and dataset definition.

We are now planning ESD and RAW streaming:
- Straw-man streaming schemes (trigger based) are being agreed
- We will explore the access improvements in large-scale exercises
- We are also looking at overlaps, bookkeeping, etc.

HEP Data Analysis

- Raw data: hits, pulse heights
- Reconstructed data (ESD): tracks, clusters, ...
- Analysis Objects (AOD): physics objects, summarized, organized by physics topic
- Ntuples, histograms, statistical data

Production Data Processing

[Diagram: the production data-processing chain. On the real-data side, the data acquisition and level-3 trigger produce raw data and trigger tags, which reconstruction turns into Event Summary Data (ESD) and event tags. On the simulation side, physics models and detector simulation produce Monte Carlo truth data and MC raw data, which are reconstructed into MC Event Summary Data and MC event tags. Calibration data, run conditions and the trigger system feed both chains. Coordination is required at the collaboration and group levels.]

Physics Analysis

[Diagram: the analysis chain. Event selection uses event tags; analysis processing of raw data and ESD (with calibration data) produces analysis objects, which physics analysis turns into physics objects and statistical objects. The stages map onto the tiers: collaboration-wide processing at Tier 0/1, analysis-group work at Tier 2, and individual physicists at Tier 3/4.]

ATLAS Resource Requirements for 2008

Recent (July 2006) updates have reduced the expected contributions (cf. the Computing TDR):

                   CPU (MSI2k)   Tape (PB)   Disk (PB)
  Tier-0                3.7          2.1         0.2
  CERN AF               2.1          0.3         1.0
  Sum of Tier-1s       16.7          6.0         7.6
  Sum of Tier-2s       18.9          0.0         6.1
  Total                41.4          8.4        14.9
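A trivial check (not from the slides) that the per-tier rows are consistent with the quoted totals:

```python
# Consistency check of the 2008 resource table above: the rows sum to the Total line.
rows = {                 # site: (CPU MSI2k, Tape PB, Disk PB)
    "Tier-0":         ( 3.7, 2.1, 0.2),
    "CERN AF":        ( 2.1, 0.3, 1.0),
    "Sum of Tier-1s": (16.7, 6.0, 7.6),
    "Sum of Tier-2s": (18.9, 0.0, 6.1),
}
totals = tuple(round(sum(col), 1) for col in zip(*rows.values()))
print(totals)            # (41.4, 8.4, 14.9), matching the Total row
```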

ATLAS Grid Infrastructure

ATLAS plans to use grid technology:
- To meet its resource needs
- To manage those resources

Three grids: LCG, NorduGrid, OSG.

Significant resources, but different middleware; teams working on solutions are typically associated with one grid and its middleware.

In principle all ATLAS resources are available to all ATLAS users:
- This works out to O(1) CPU per user
- There is interest by ATLAS users in using their local systems with priority
- Not only a central system: flexibility concerning middleware

Plan "A" is "the Grid"... there is no plan "B".

ATLAS Virtual Organization

Until recently the Grid has been a "free for all":
- No CPU or storage accounting (new, in a prototyping/testing phase)
- No or limited priorities (roles mapped to a small number of accounts: atlas01-04)
- No storage space reservation

Last year ATLAS saw competition for resources between the "official" Rome productions and "unofficial", but organized, productions (B-physics, flavour tagging, ...).

The latest release of the VOMS (Virtual Organisation Management Service) middleware package allows the definition of user groups and roles within the ATLAS Virtual Organisation, and is used by all ATLAS grid flavours!

Relative priorities are easy to enforce IF all jobs go through the same system.

For a distributed submission system, it is up to the resource providers to:
- Agree the policies of each site with ATLAS
- Publish and enforce the agreed policies

Calibrating and Aligning ATLAS

Calibrating and aligning the detector subsystems is a critical process: without well-understood detectors we will have no meaningful physics data.

The default option for offline prompt calibration is processing at the Tier-0 or at the CERN Analysis Facility; however, the TDR states that:
- "Tier-2 centres will provide analysis facilities, and some will provide the capacity to produce calibrations based on processing raw data."
- "Tier-2 facilities may take a range of significant roles in ATLAS such as providing calibration constants, simulation and analysis."
- "Some Tier-2s may take significant role in calibration following the local detector interests and involvements."

ATLAS will have some subsystems utilizing Tier-2 centers as calibration and alignment sites:
- We must ensure we can support the data flow without disrupting other planned flows
- The real-time aspect is critical: the system must account for "deadlines"

Proposed ATLAS Muon Calibration System

[Diagram: the proposed muon calibration data path (quoted bandwidths are for a 10 kHz muon rate). L2PU threads push calibration data through a memory queue/dequeue to local servers over TCP/IP, UDP, etc. at ~500 kB/s per link; a Gatherer collects the local-server outputs into a calibration server at ~10 MB/s, feeding the calibration farm and its disk. The fan-out factors shown in the figure are x 25 and x ~20, with the components coordinated over a control network (steps 1-6 in the figure).]
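A quick aggregation check of the bandwidths in the figure; note that reading the "x ~20" as the number of local servers feeding the Gatherer is an assumption about the figure, not a statement from it:

```python
# Rough fan-in check of the muon-calibration bandwidths quoted in the figure.
# Assumes "x ~20" in the figure is the number of local servers feeding the Gatherer.
per_local_server_kb_s = 500      # ~500 kB/s out of each local server
n_local_servers = 20             # "x ~20" in the figure (assumed meaning)

gatherer_mb_s = per_local_server_kb_s * n_local_servers / 1000.0
print(f"aggregate into Gatherer ~ {gatherer_mb_s:.0f} MB/s")   # ~10 MB/s, as quoted
```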

ATLAS Simulations

Within ATLAS, the Tier-2 centers will be responsible for the bulk of the simulation effort.

Current planning assumes ATLAS will simulate approximately 20% of the real data volume. This number is dictated by resources; ATLAS may need to find a way to increase this fraction.

The event generator framework interfaces multiple packages, including the Genser distribution provided by LCG-AA.

Simulation has used Geant4 since early 2004:
- Automatic geometry built from GeoModel
- >25M events fully simulated since mid-2004, with only a handful of crashes!

Digitization has been tested and tuned with test beam data.

ATLAS Analysis Computing Model

The ATLAS analysis model is broken into two components:

Scheduled central production of augmented AOD, tuples and TAG collections from ESD
- Derived files are moved to the other Tier-1s and to Tier-2s

Chaotic user analysis of augmented AOD streams, tuples, new selections, etc., and individual user simulation and CPU-bound tasks matching the official MC production
- Modest to large(?) job traffic between Tier-2s (and Tier-1s, Tier-3s)

Distributed Analysis

At this point the emphasis is on a batch model to implement the ATLAS Computing Model; interactive solutions are difficult to realize on top of the current middleware layer.

We expect ATLAS users to send large batches of short jobs to optimize their turnaround: scalability, data access.

Analysis in parallel to production: job priorities.

Distributed analysis effectiveness depends strongly upon the hardware and software infrastructure.

Analysis is divided into "group" and "on demand" types.

ATLAS Group Analysis

Group analysis is characterised by access to full ESD and perhaps RAW data:
- This is resource intensive and must be a scheduled activity
- Can back-navigate from AOD to ESD at the same site
- Can harvest small samples of ESD (and some RAW) to be sent to Tier-2s
- Must be agreed by the physics and detector groups

Group analysis will produce:
- Deep copies of subsets
- Dataset definitions
- TAG selections

Big Trains:
- Access is most efficient if analyses are blocked into a "big train"
- The idea has been around for a while and is already used in e.g. heavy ions
- Each wagon (group) has a wagon master (production manager), who must ensure it will not derail the train
- The train must run often enough (every ~2 weeks?)

ATLAS On-demand Analysis

Restricted Tier-2s and the CAF:
- Some Tier-2s could be specialized for some groups, but ALL Tier-2s are for ATLAS-wide usage

Role- and group-based quotas are essential:
- Quotas to be determined per group, not per user

Data selection:
- Over small samples with the Tier-2 file-based TAG and the AMI dataset selector
- TAG queries over larger samples by batch job to the database TAG at Tier-1s/large Tier-2s

What data?
- Group-derived EventViews
- ROOT trees
- Subsets of ESD and RAW, pre-selected or selected via a Big Train run by the working group

Each user needs 14.5 kSI2k (about 12 current boxes).

2.1 TB is "associated" with each user on average.
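What these per-user figures imply for facility sizing, as a sketch only; the number of concurrently active users below is a hypothetical illustration, not an ATLAS figure:

```python
# Rough on-demand analysis sizing from the per-user figures above.
cpu_per_user_ksi2k = 14.5
disk_per_user_tb   = 2.1
boxes_per_user     = 12                                    # "about 12 current boxes"
ksi2k_per_box      = cpu_per_user_ksi2k / boxes_per_user   # ~1.2 kSI2k per box

active_users = 100                                         # hypothetical user count (assumption)
print(f"~{ksi2k_per_box:.1f} kSI2k per box")
print(f"{active_users} users -> {active_users * cpu_per_user_ksi2k / 1000:.2f} MSI2k CPU, "
      f"{active_users * disk_per_user_tb:.0f} TB disk")
```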

ATLAS Data Management

Based on datasets.

The PoolFileCatalog API is used to hide grid differences:
- On LCG, the LFC acts as the local replica catalog
- Aims to provide uniform access to data on all grids

FTS is used to transfer data between the sites:
- To date FTS has tried to manage data flow by restricting the allowed endpoints ("channel" definitions)
- Interesting possibilities exist to incorporate network-related research advances to improve performance, efficiency and reliability

Data management is a central aspect of Distributed Analysis:
- PANDA is closely integrated with DDM and operational
- The LCG instance was closely coupled with SC3; right now we run a smaller instance for test purposes
- The final production version will be based on new middleware for SC4 (FPS)

Distributed Data Management

Accessing distributed data on the Grid is not a simple task (see below!).

Several databases are needed centrally to hold dataset information.

"Local" catalogues hold information on local data storage.

The new DDM system (shown in the slide's diagram) is under test this summer. It will be used for all ATLAS data from October on (LCG Service Challenge 3).

ATLAS plans for using FTS

[Diagram: FTS deployment. An FTS server at Tier-0 and one at each Tier-1, VO boxes at Tier-0 and the Tier-1s, LFC catalogues local within each Tier-1 "cloud", and all storage elements fronted by SRM.]

Tier-0 FTS server:
- Channels from Tier-0 to all Tier-1s: used to move "Tier-0" data (raw and first-pass reconstruction data)
- Channels from the Tier-1s to Tier-0/CAF: to move e.g. AOD (the CAF also acts as a "Tier-2" for analysis)

Tier-1 FTS server:
- Channels from all other Tier-1s to this Tier-1 (pulling data): used for DQ2 dataset subscriptions (e.g. reprocessing, or massive "organized" movement when doing Distributed Production)
- Channels to and from this Tier-1 and all its associated Tier-2s; the association is defined by ATLAS management (along with LCG)
- A "star" channel for all remaining traffic [new: low-traffic]
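As an illustration only, the channel topology described above can be captured as a small data structure; the site names are hypothetical placeholders, not real ATLAS endpoint configuration:

```python
# Illustrative sketch of the FTS channel topology described above (not ATLAS config).
TIER1S = ["T1-A", "T1-B", "T1-C"]                 # hypothetical Tier-1 names
TIER2S = {"T1-A": ["T2-A1", "T2-A2"],             # hypothetical Tier-2 "clouds"
          "T1-B": ["T2-B1"],
          "T1-C": ["T2-C1"]}

def tier0_channels():
    """Channels managed by the Tier-0 FTS server."""
    channels = [("T0", t1) for t1 in TIER1S]          # raw + first-pass data, T0 -> each T1
    channels += [(t1, "T0/CAF") for t1 in TIER1S]     # e.g. AOD back to the CAF
    return channels

def tier1_channels(t1):
    """Channels managed by one Tier-1 FTS server."""
    channels = [(other, t1) for other in TIER1S if other != t1]   # pull from other T1s (DQ2 subscriptions)
    channels += [(t1, t2) for t2 in TIER2S[t1]]                   # to its cloud Tier-2s
    channels += [(t2, t1) for t2 in TIER2S[t1]]                   # and back from them
    channels.append(("*", t1))                                    # "star" channel for remaining traffic
    return channels

print(tier0_channels())
print(tier1_channels("T1-A"))
```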

ATLAS and Related Research

Up to now I have focused on the ATLAS computing model.

Implicit in this model, and central to its success, are:
- High-performance, ubiquitous and robust networks
- Grid middleware to securely find, prioritize and manage resources

Without either of these capabilities the model risks melting down or failing to deliver the required capabilities.

Efforts to date have (necessarily) focused on building the most basic capabilities and demonstrating that they can work.

Being truly effective will require updating and extending this model to include the best results of ongoing networking and resource-management research projects.

A quick overview of some selected (US) projects follows...

The UltraLight Project

UltraLight is:
- A four-year $2M NSF ITR funded by MPS (2005-8)
- Application-driven network R&D
- A collaboration of BNL, Buffalo, Caltech, CERN, Florida, FIU, FNAL, Internet2, Michigan, MIT, SLAC and Vanderbilt
- With significant international participation: Brazil, Japan, Korea amongst many others

Goal: enable the network as a managed resource.

Meta-goal: enable physics analysis and discoveries which could not otherwise be achieved.

ATLAS and UltraLight Disk-to-Disk Research

The ATLAS MDT subsystems need very fast calibration turnaround (< 24 hours).

Initial estimates plan for as much as 0.5 TB/day of high-pT muon data for calibration.

UltraLight could enable us to quickly transport (~1/4 hour) the needed events to Tier-2 sites for calibration.

Michigan is an ATLAS muon alignment and calibration center, a Tier-2 and an UltraLight site.

Muon calibration work has presented an opportunity to couple research efforts into production.
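What "0.5 TB in ~1/4 hour" implies for the network, as simple arithmetic (not a measurement):

```python
# Network throughput implied by moving the daily muon calibration sample quickly.
volume_tb = 0.5          # estimated high-pT muon calibration data per day
transfer_minutes = 15    # the ~1/4 hour target quoted above

rate_gbps = volume_tb * 8e12 / (transfer_minutes * 60) / 1e9
print(f"required throughput ~ {rate_gbps:.1f} Gbps")        # ~4.4 Gbps sustained

# The same volume spread over the full 24-hour calibration window instead:
rate_24h_mbps = volume_tb * 8e12 / 86400 / 1e6
print(f"same volume over 24 h ~ {rate_24h_mbps:.0f} Mbps")  # ~46 Mbps
```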

Networking at KNU (Korea)

KNU uses the 10 Gbps GLORIAD link from Korea to the US, which is called BIG-GLORIAD and is also part of UltraLight.

They are trying to saturate this BIG-GLORIAD link with servers and cluster storage connected at 10 Gbps.

Korea is planning to be a Tier-1 site for the LHC experiments.

[Map: Korea <-> U.S. via BIG-GLORIAD]

VINCI: Virtual Intelligent Networks for Computing Infrastructures

A network global scheduler implemented as a set of collaborating agents running on distributed MonALISA services.

Each agent uses policy-based priority queues and negotiates for an end-to-end connection using a set of cost functions.

A lease mechanism is implemented for each offer an agent makes to its peers.

Periodic lease renewal is used for all agents; this gives a flexible response to task completion, as well as to application failure or network errors.

If network errors are detected, supervising agents cause all segments along a path to be released. An alternative path may then be set up rapidly enough to avoid a TCP timeout, allowing the transfer to continue uninterrupted.
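A minimal sketch of the lease idea described above, purely illustrative (this is not MonALISA/VINCI code): an offer stays valid only while it keeps being renewed, so a failed agent or broken segment simply stops renewing and its resources are released.

```python
import time

class SegmentLease:
    """One leased network segment along a negotiated end-to-end path."""
    def __init__(self, segment, duration_s=30.0):
        self.segment = segment
        self.duration_s = duration_s
        self.expires_at = time.time() + duration_s

    def renew(self):
        # Called periodically by the owning agent while the segment is healthy.
        self.expires_at = time.time() + self.duration_s

    def expired(self):
        return time.time() > self.expires_at

def release_expired(path_leases):
    """Supervisor step: drop segments whose agents stopped renewing (failure or error)."""
    return [lease for lease in path_leases if not lease.expired()]
```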

Lambda Station

A network path-forwarding service to interface production facilities with advanced research networks:
- The goal is selective forwarding on a per-flow basis
- Alternate network paths for high-impact data movement
- Dynamic path modification, with graceful cutover and fallback

The current implementation is based on policy-based routing and DSCP marking.

Lambda Station interacts with:
- Host applications and systems
- LAN infrastructure
- Site border infrastructure
- Advanced-technology WANs
- Remote Lambda Stations

D. Petravick, P. DeMar

TeraPaths (LAN QoS Integration)

[Diagram: TeraPaths architecture. At each site (Site A, Site B), QoS requests arrive via a web page, APIs or the command line and are handled by site web services built from a user manager, scheduler, site monitor and router manager on top of hardware drivers; the sites negotiate with each other and with WAN web services and WAN monitoring across the wide-area network.]

The TeraPaths project investigates the integration and use of LAN QoS and MPLS/GMPLS-based differentiated network services in the ATLAS data-intensive distributed computing environment, in order to manage the network as a critical resource.

TeraPaths includes:
- BNL
- Michigan
- ESnet (OSCARS)
- FNAL (Lambda Station)
- SLAC (DWMI)

Integrating Research into Production

As you can see, there are many efforts, even just within the US, to help integrate a managed network into our infrastructure.

There are also many similar efforts in computing, storage, grid middleware and applications (EGEE, OSG, LCG, ...).

The challenge will be to harvest these efforts and integrate them into a robust system for LHC physicists.

I will close with an "example" vision of what could result from such integration...

An Example: UltraLight/ATLAS Application (2008)

Node1> fts -vvv -in mercury.ultralight.org:/data01/big/zmumu05687.root -out venus.ultralight.org:/mstore/events/data -prio 3 -deadline +2:50 -xsum
FTS: Initiating file transfer setup…
FTS: Remote host responds ready
FTS: Contacting path discovery service
PDS: Path discovery in progress…
PDS: Path RTT 128.4 ms, best effort path bottleneck is 10 GE
PDS: Path options found:
PDS:   Lightpath option exists end-to-end
PDS:   Virtual pipe option exists (partial)
PDS:   High-performance protocol capable end-systems exist
FTS: Requested transfer: 1.2 TB file transfer within 2 hours 50 minutes, priority 3
FTS: Remote host confirms available space for DN=[email protected]
FTS: End-host agent contacted…parameters transferred
EHA: Priority 3 request allowed for [email protected]
EHA: request scheduling details
EHA: Lightpath prior scheduling (higher/same priority) precludes use
EHA: Virtual pipe sizeable to 3 Gbps available for 1 hour starting in 52.4 minutes
EHA: request monitoring prediction along path
EHA: FAST-UL transfer expected to deliver 1.2 Gbps (+0.8/-0.4) averaged over next 2 hours 50 minutes

ATLAS FTS 2008 Example (cont.)

EHA: Virtual pipe (partial) expected to deliver 3 Gbps (+0/-0.3) during reservation; variance from unprotected section < 0.3 Gbps 95% CL
EHA: Recommendation: begin transfer using FAST-UL with network identifier #5A-3C1. Connection will migrate to MPLS/QoS tunnel in 52.3 minutes. Estimated completion in 1 hour 22.78 minutes.
FTS: Initiating transfer between mercury.ultralight.org and venus.ultralight.org using #5A-3C1
EHA: Transfer initiated…tracking at URL: fts://localhost/FTS/AE13FF132-FAFE39A-44-5A-3C1
EHA: Reservation placed for MPLS/QoS connection along partial path: 3 Gbps beginning in 52.2 minutes; duration 60 minutes
EHA: Reservation confirmed, rescode #9FA-39AF2E, note: unprotected network section included
<…lots of status messages…>
FTS: Transfer proceeding, average 1.1 Gbps, 431.3 GB transferred
EHA: Connecting to reservation: tunnel complete, traffic marking initiated
EHA: Virtual pipe active: current rate 2.98 Gbps, estimated completion in 34.35 minutes
FTS: Transfer complete, signaling EHA on #5A-3C1
EHA: Transfer complete received…hold for xsum confirmation
FTS: Remote checksum processing initiated…
FTS: Checksum verified; closing connection
EHA: Connection #5A-3C1 completed…closing virtual pipe with 12.3 minutes remaining on reservation
EHA: Resources freed. Transfer details uploading to monitoring node
EHA: Request successfully completed: transferred 1.2 TB in 1 hour 41.3 minutes (transfer 1 hour 34.4 minutes)

Conclusions

ATLAS is quickly approaching "real" data, and our computing model has been successfully validated (as far as we have been able to take it).

Some major uncertainties exist, especially around "user analysis" and the resource implications it may have.

There are lots of R&D programs active in many areas of special importance to ATLAS (and the LHC) which could significantly strengthen the core model.

The challenge will be to select, integrate, prototype and test the R&D developments in time to have a meaningful impact upon the ATLAS (or LHC) program.

Questions?