
DataTAG Project

IST Copenhagen, 4/6 Nov 2002

Slides collected by: Cristina Vistoli, INFN-CNAF

Nov 2002 The DataTAG Project

WP1: Establishment of a high performance intercontinental Grid testbed (CERN)

WP2: High performance networking (PPARC)

WP3: Bulk data transfer validations and application performance monitoring (UvA)

WP4: Interoperability between Grid domains (INFN)

Nov 2002 The DataTAG Project

Project focus

Grid related network research (WP1, WP2, WP3)
  High Performance Transport protocols
  Inter-domain QoS
  Advance bandwidth reservation

Interoperability between European and US Grids (WP4)

N.B. In principle open to other EU Grid projects as well as ESA for demonstrations

DataTAG project: Major 2.5/10 Gbps circuits between Europe & USA

[Network map: CERN (Geneva), NL/SURFnet, UK/SuperJANET4, FR/INRIA (ATRIUM/VTHD) and IT/GARR-B interconnected via GEANT in Europe; Abilene, ESNET and MREN in the US, reached through New York, STAR-TAP and STAR-LIGHT; circuits of 3*2.5G, 2.5G and 10G.]

Nov 2002 The DataTAG Project

WP1: Status

2.5 Gbps transatlantic lambda between CERN (Geneva) and StarLight (Chicago)
  Circuit in place since August 20
  Part of the Amsterdam-Chicago-Geneva "wave triangle"
  Phase 1 with Cisco ONS 15454 layer 2 muxes (August-September, iGRID2002)
  Phase 2 with Cisco 7606 routers (October)
  Phase 3 with Alcatel 1670 layer 2 muxes (November)

Also extending to the French optical testbed VTHD (2.5 Gbps to INRIA/Lyon)
  and through VTHD to EU/ATRIUM
  and, of course, to GEANT

Nov 2002 The DataTAG Project

Multi-vendor testbed with layer 3 & layer 2 capabilities

[Testbed diagram: Alcatel, Cisco and Juniper equipment at CERN (Geneva) and STARLIGHT (Chicago) joined by the 2.5 Gbps research lambda, with layer 2 muxes (M) at both ends; a Cisco 6509 and a 1.25 Gbps GbE link towards INFN (Bologna); peerings with GEANT, ESnet, Abilene and Starlight.]

Nov 2002 The DataTAG Project

Testbed deployment status

multi-vendor testbed with layer 2 and layer 3 capabilities

interesting results already achieved:
  TRIUMF-CERN 2 Gbps lightpath demo (disk to disk)
  Terabyte file transfer (Monte Carlo simulated events)
  Single stream TCP/IP (with S. Ravot/Caltech patches)
  8 Terabytes in 24 hours (memory to memory) (a quick rate check follows)
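For scale, that last figure implies roughly three quarters of a gigabit per second sustained; a minimal check, assuming decimal terabytes:

```python
# Quick check of the headline figure, assuming 1 TB = 10^12 bytes.
bits = 8 * 1e12 * 8          # 8 TB expressed in bits
seconds = 24 * 3600
print(f"{bits / seconds / 1e9:.2f} Gbit/s sustained")   # ~0.74 Gbit/s
```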

Nov 2002 The DataTAG Project

Phase I (iGRID2002)

Nov 2002 The DataTAG Project

Phase II (October 2002): Generic configuration

[Diagram: servers on a GigE switch at CERN, through a Cisco 7606, over the 2.5 Gbps circuit to a second Cisco 7606 at StarLight, again with a GigE switch and servers.]

Nov 2002 The DataTAG Project

Phase III (November 2002)

[Diagram: at CERN, servers and a GigE switch feed routers (Alcatel 7770, Cisco 7606, Juniper M10) and an Alcatel 1670 multiplexer, with n*GigE towards STARLIGHT (ditto layout there); a Cisco ONS 15454 towards Amsterdam; onward connections to GEANT, VTHD, Abilene, ESNet and Canarie.]

Nov 2002 The DataTAG Project

WP2

The deployment of network services for Grid applications across multiple domains for the optimal utilization of network resources. Network services include advance reservation and differentiated packet treatment, while optimal utilization requires the tuning of existing transport protocols, elements of traffic engineering, and the identification and test of new ones.

Task 1: Transport applications for high bandwidth-delay connections
Task 2: End-to-end inter-domain QoS
Task 3: Advanced Reservation

Nov 2002 The DataTAG Project

WP2.1

Transport applications: The goal of this task is the demonstration and deployment of high performance transport applications for efficient and reliable data exchange over high bandwidth-delay connections.

Demonstration of sustained, reliable and robust multi-gigabit/s data replication over long distances is in itself an important goal; however, it is also essential to ensure that these applications are deployed within the intercontinental testbed context, and usable from the middleware layer by applications.

Nov 2002 The DataTAG Project

WP2.1

In addition there are several other related areas which will be investigated (a transport sketch follows this list):

1. Application-controllable Transport Protocols (ATPs), for example Parallel TCP (PTCP) and Forward Error Corrected UDP (FEC-UDP), but not limited to those;

2. Protocols for high frequency but relatively small bulk data transfers: Cached Parallel TCP (CPTCP);

3. Enhanced TCP implementations in combination with (enhanced) congestion control mechanisms, for example Explicit Congestion Notification (ECN);

4. The potential for the use of bandwidth brokers, together with a review of their current specification/implementation status.
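To make item 1 concrete, here is a minimal sketch of the parallel-TCP idea: stripe one logical transfer over several TCP connections so the aggregate congestion window, and hence the throughput over a long-RTT path, grows with the stream count. This is an illustration under assumptions, not the PTCP implementation itself; host, port and chunk size are placeholders, and receiver-side framing/reassembly is omitted.

```python
import socket
import threading

CHUNK = 1 << 20  # 1 MiB per stripe (arbitrary choice)

def send_stripe(host, port, data, offsets):
    # Each stream opens its own TCP connection and sends its share.
    with socket.create_connection((host, port)) as s:
        for off in offsets:
            s.sendall(data[off:off + CHUNK])

def parallel_send(host, port, data, n_streams=4):
    # Round-robin the chunk offsets of `data` over n_streams connections.
    offsets = list(range(0, len(data), CHUNK))
    threads = [
        threading.Thread(target=send_stripe,
                         args=(host, port, data, offsets[i::n_streams]))
        for i in range(n_streams)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```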

Nov 2002 The DataTAG Project

WP2.2

End-to-end inter-domain QoS

Why QoS? Grid traffic needs:
  Critical data access
  Services lookup across WAN
  Interactive and video applications

Differentiated Services in the testbed... BUT... QoS mechanisms only work inside a single domain.

Demonstration of:
  QoS propagation across more than one domain
  QoS available from Grid middleware

Nov 2002 The DataTAG Project

Jan. 2003

Nov 2002 The DataTAG Project

WP2.3

Advance Reservation: Evaluation of the different advance reservation approaches and their interoperability between Grid domains. This should lead to the deployment of an advance reservation service in the international testbed.

From a functional point of view, the main blocks that have to be studied and defined are (a toy admission check is sketched after this list):
  the user/application protocol
  the admission control algorithm
  the intra-domain protocol
  the inter-domain protocol
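As a toy illustration of the second block, an admission-control check for advance bandwidth reservations can accept a new request only if, at every instant of its interval, already-admitted reservations plus the new one fit within the link capacity. The capacity and request figures below are made up for the example:

```python
def admits(existing, start, end, bw, capacity):
    """existing: list of already-admitted (start, end, bw) reservations."""
    # Usage only changes at reservation boundaries, so check those instants.
    events = {start, end}
    events.update(t for (s, e, _) in existing for t in (s, e))
    for t in sorted(events):
        if start <= t < end:
            used = bw + sum(b for (s, e, b) in existing if s <= t < e)
            if used > capacity:
                return False
    return True

booked = [(0, 3600, 1.0), (1800, 7200, 1.0)]          # (start s, end s, Gbps)
print(admits(booked, 900, 2700, 0.5, capacity=2.5))   # True: peak 2.5 fits
print(admits(booked, 900, 2700, 1.0, capacity=2.5))   # False: peak 3.0
```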

The DataTAG Project Nov 2002

Example of Generic AAA Architecture – RFC 2903

[Architecture diagram: three AAA servers, one at the (Virtual) User Organization, one at the Bandwidth Provider, and one at the Service Organization. Each combines an Application Specific Module, a Rule Based Engine and a Policy Repository; the provider's also drives a Bandwidth Broker. Users, contracts/budgets and the registration and purchase departments sit at the user organization; the provider's server controls the QoS-enabled network that delivers the user's service.]

The DataTAG Project Nov 2002

Generic AAA (RFC 2903) based Bandwidth on Demand (iGrid2002)

[Demo diagram: two Enterasys Matrix E5 802.1Q VLAN switches (ports A-D) linked by 1 GbE SX; an AAA server (192.168.1.5) with a policy DB receives AAA requests and reconfigures the VLANs connecting hosts 192.168.1.6, 192.168.2.3 and 192.168.2.4.]
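A hedged sketch of the request flow in that demo: an AAA server checks a request against a policy DB and, if authorized, asks the VLAN switch to bridge the two hosts' ports. Every class, rule and number here is hypothetical; the real demo used RFC 2903 AAA components driving Enterasys switches.

```python
POLICY_DB = {  # who may request how much bandwidth (Mbit/s); illustrative
    "192.168.1.6": 1000,
    "192.168.2.3": 100,
}

class VlanSwitch:
    def __init__(self):
        self.vlans = {}
    def connect(self, vlan_id, ports):
        self.vlans[vlan_id] = set(ports)   # stand-in for 802.1Q reconfiguration
        print(f"VLAN {vlan_id}: ports {sorted(ports)} bridged")

def aaa_request(src_ip, port_pair, mbps, switch, vlan_id=42):
    """Authorization is the policy lookup, accounting is the log line;
    authentication is elided in this sketch."""
    allowed = POLICY_DB.get(src_ip, 0)
    if mbps > allowed:
        print(f"deny {src_ip}: {mbps} > {allowed} Mbit/s")
        return False
    switch.connect(vlan_id, port_pair)
    print(f"account: {src_ip} granted {mbps} Mbit/s")
    return True

sw = VlanSwitch()
aaa_request("192.168.1.6", ("A", "C"), 800, sw)   # granted
aaa_request("192.168.2.3", ("B", "D"), 800, sw)   # denied by policy
```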

The DataTAG Project Nov 2002

Upcoming work:

1) Separate ASM and RBE and allow ASMs to be loaded/unloaded dynamically.
2) Implement pre-allocation mechanisms (based on GARA – collaboration with Volker Sander).
3) Create ASMs for other B/W managers (e.g. Alcatel BonD, Cisco CTM, Level-3 Ontap).
4) Create ASM to talk to other domains: OMNInet.
5) Allow RBEs to talk to each other (define messages).
6) Integrate BoD AAA client into middleware, e.g. by allowing integration with GridFTP and with the VOMS authentication and user authorization system.
7) Build WS interface abstraction for pre-allocation and subsequent usage.

Nov 2002 The DataTAG Project

WP3 objectives

Bulk data transfer and application performance monitoring: innovative monitoring tools are required to measure and understand the performance of high speed intercontinental networks and their potential for real Grid applications.

Nov 2002 The DataTAG Project

Tasks in WP3

Task 3.1 Performance validation (months 1-12)
  Create, collect and test network tools to cope with the extreme lambda environment (high RTT, high bandwidth)
  Measure basic properties and establish a baseline performance benchmark

Task 3.2 End user performance validation/monitoring/optimization (months 6-24)
  Use "out of band" tools to measure and monitor what performance a user should in principle be able to reach (a minimal probe is sketched after this list)

Task 3.3 Application performance validation, monitoring and optimization (months 6-24)
  Use diagnostic libraries and tools to monitor and optimize real applications and to compare their performance with the Task 3.2 outcome.
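In the spirit of Task 3.2, a minimal memory-to-memory throughput probe (what iperf-like tools measure, stripped to the bone). Port and sizes are arbitrary, and the timing is approximate since sendall returns once data reaches kernel buffers; run with `server` on one host and `client <host>` on the other:

```python
import socket, sys, time

PORT, BLOCK, TOTAL = 5201, 1 << 20, 256 << 20   # 1 MiB blocks, 256 MiB total

def server():
    # Sink incoming bytes as fast as they arrive.
    with socket.create_server(("", PORT)) as srv:
        conn, _ = srv.accept()
        with conn:
            while conn.recv(BLOCK):
                pass

def client(host):
    buf = bytes(BLOCK)
    with socket.create_connection((host, PORT)) as s:
        t0, sent = time.time(), 0
        while sent < TOTAL:
            s.sendall(buf)
            sent += BLOCK
        dt = time.time() - t0
    print(f"{sent * 8 / dt / 1e9:.3f} Gbit/s over {dt:.1f} s")

if __name__ == "__main__":
    server() if sys.argv[1] == "server" else client(sys.argv[2])
```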

Nov 2002 The DataTAG Project

Task 3.1 experiences using NetherLight SURFnet Lambda AMS-CHI

full duplex GE over 2.5 Gbps SDH, 100 ms RTT

single stream TCP: max. throughput 80-150 Mbps, dependent on stream duration (see the window arithmetic below)

similar for BSD and Linux and for different adaptors

UDP measurements show effects of hardware buffer size in ONS equipment when assigning lower SDH bandwidths; see:

www.science.uva.nl/~wsjouw/datatag/lambdaperf.html
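Why single-stream TCP tops out around 100 Mbit/s here: throughput is capped at window/RTT, so the observed rates imply effective TCP windows of only a megabyte or two, far below the bandwidth-delay product of the path. A hedged check of the slide's numbers:

```python
rtt = 0.100                            # s: AMS-CHI RTT from the slide
for mbit in (80, 150):                 # observed single-stream throughput
    window = mbit * 1e6 * rtt / 8      # implied effective TCP window, bytes
    print(f"{mbit} Mbit/s -> window ~ {window / 1e6:.1f} MB")
# Window needed to fill full-duplex GE (bandwidth-delay product):
print(f"GE needs ~ {1e9 * rtt / 8 / 1e6:.1f} MB")   # 12.5 MB
```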

Nov 2002 The DataTAG Project

Summary of status

Tools have been investigated, selected and, where necessary, adapted to benchmark and characterize lambda networks

New tools appear and need to be studied to add to the set

The influence of layer 1 and layer 2 infrastructure properties on TCP throughput is under study on the SURFnet Lambda testbed and NetherLight

Monitoring setup is underway; inclusion of the WP7 toolset is the next step

Application performance monitoring and optimization should start soon

Nov 2002 The DataTAG Project

WP4

Interoperability between Grid domains

To address issues of middleware interoperability between the European and US Grid domains and to enable a selected set of applications to run on the transatlantic Grid test bed.

Nov 2002 The DataTAG Project

FRAMEWORK AND RELATIONSHIPS

US partner: iVDGL

Grid middleware:
  DataGRID Release 1
  GriPhyN/PPDG VDT v1

Programme: GLUE

Applications:
  LHC experiments: ALICE, ATLAS, CMS
  Virgo (LIGO)
  CDF, D0, BaBar

Plan for each experiment

Nov 2002 The DataTAG Project

Framework

[Diagram: on the EU side, DataTAG-WP4 and DataGRID; on the US side, iVDGL and GriPhyN/PPDG; both serve the HEP experiments and are coordinated through HICB and the GLUE programme.]

Nov 2002 The DataTAG Project

Interoperability approach

Grid services scenario and basic interoperability requirements:

Common VO scenario for the experiments in EU and US

A set of application-independent mechanisms as basic grid functions: accessing storage or computing resources in a grid environment
  requires resource discovery and security mechanisms,
  requires logic for moving data reliably from place to place, scheduling sets of computational and data movement operations, and monitoring the entire system for faults and responding to them.

Specific Data Grid mechanisms are built on top of this general basic grid infrastructure.

These basic protocols are the basis for interoperability between different grid domains. One implementation of them, representing the de facto standard for Grid systems, is the Globus Toolkit, which has been adopted by DataGRID and GriPhyN/PPDG.

This situation has certainly facilitated the definition of the interoperability approach.

The DataTAG Project Nov 2002

Grid architectural model

[Layer diagram, top to bottom: Grid request submission; a collective layer with scheduler, replica management, information discovery, and security/policy (VO authorization, online CA); access and information protocols to the resource layer's compute, storage, network and catalog services; connectivity (Internet protocols, GSI/X.509, GSS-API, ...); and the fabric (computers, storage systems, networks, ...).]

Nov 2002 The DataTAG Project

Grid resource access

The first and most important requirement for grid interoperability in this scenario is the need to access grid resources wherever they are, with:
  common protocols,
  common security, authentication and authorization basic mechanisms, and
  common information describing grid resources.

The Globus Toolkit provides access protocols (GRAM), information protocols (GIS) and the public key infrastructure (PKI)-based Grid Security Infrastructure (GSI). (A job-submission sketch follows.)
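A hedged sketch of what a common access protocol buys in practice: the same GRAM job submission, via the Globus Toolkit's globus-job-run client, works against any gatekeeper regardless of domain. The hostnames are placeholders, and the sketch assumes the toolkit is installed and a GSI proxy has been initialized (grid-proxy-init):

```python
import subprocess

GATEKEEPERS = [                      # placeholder contact strings, one per domain
    "testbed.cern.example.org",
    "testbed.ivdgl.example.org",
]

for gk in GATEKEEPERS:
    # Same GRAM call regardless of which side of the Atlantic serves it.
    result = subprocess.run(
        ["globus-job-run", gk, "/bin/hostname"],
        capture_output=True, text=True)
    print(gk, "->", result.stdout.strip() or result.stderr.strip())
```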

Nov 2002 The DataTAG Project

User oriented requirements

On top of the core services, several flavours of grid scheduling, job submission, resource discovery and data handling can be developed; these must guarantee interoperability with the core services and coexistence within the same grid domain. Such services, together with sophisticated metadata catalogues, virtual data systems, etc., are of particular interest for the HEP experiment applications.

Nov 2002 The DataTAG Project

CORE services

GLUE PROGRAMME: first results:

Information System (an illustrative query is sketched below)
  CE schema defined and implemented
  SE ongoing (almost complete)
  NE not yet started

Authorization System
  VO/LDAP server in common
  Discussion and comparison between VOMS and CAS ongoing

Resource discovery systems review for future plans.

New network service (bandwidth on demand) as a grid resource, NE, with interoperable AA mechanism.
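For illustration, querying an LDAP-based information system that publishes the GLUE CE schema (MDS-style) might look like the following. The host, port, base DN and attribute names are assumptions drawn from typical deployments of the era, not from this document, and python-ldap must be installed:

```python
import ldap

conn = ldap.initialize("ldap://giis.example.org:2135")   # placeholder GIIS
results = conn.search_s(
    "mds-vo-name=local, o=grid",          # assumed MDS base DN
    ldap.SCOPE_SUBTREE,
    "(objectClass=GlueCE)",               # assumed GLUE CE object class
    ["GlueCEUniqueID", "GlueCEStateFreeCPUs"],
)
for dn, attrs in results:
    print(dn, attrs)
```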

Nov 2002 The DataTAG Project

Grid optimization or collective services

State of the art in DataGRID and GriPhyN/PPDG:
  how to schedule
  how to access distributed and replicated data

The EU DataGRID and US GriPhyN/PPDG projects provide different solutions to the above issues, as detailed below.

The DataTAG Project Nov 2002

Joint EU-US grid demos

IST2002: 4-6 November, Copenhagen
SC2002: 16-22 November, Baltimore

Goals:
  Basic collaboration between European and US grid projects
  Interoperability between grid domains for applications submitted by users from different virtual organizations
  Controlled use of shared resources subject to agreed policy
  Integrated use of heterogeneous resources from iVDGL and EDG testbed domains

Infrastructure:
  Web site: http://www.ivdgl.org/demo
  Hypernews: http://atlassw1.phy.bnl.gov/HyperNews/get/intergrid.html
  Mailing list: [email protected]
  Archive: http://web.datagrid.cnr.it/hypermail/archivio/igdemo
  GLUE testbed with common schema
  VO (DataTAG and iVDGL) LDAP servers in EU and US
  PACMAN cache with software distribution (DataTAG or iVDGL)
  Planning document outline

Nov 2002 The DataTAG Project

Joint EU-US grid demos

GLUE testbed with common GLUE schema and authorization/authentication tools
  EDG 1.2 + extensions, VDT 1.1.3 + extensions
  "Old" authentication/authorization tools: VO LDAP servers, mkgridmap, etc. We rely on the availability of the new GLUE schema, RB and Information Providers.

Concentrate on visualization: CMS/GENIUS, ATLAS/GRAPPA, EDG/MAPCENTER (EDG/WP7), farm monitoring tools (EDG/WP4), iVDGL/GANGLIA, DataTAG/NAGIOS
  Use web portals for job submission (CMS/Genius, ATLAS/Grappa)
  Provide a world map of the sites involved (EDG-WP7/MapCenter, DataTAG/Nagios)
  Monitor job status and statistics (EDG-WP4, DataTAG/Nagios); Nagios implements "top users", "top applications", "total resources", "averages over time", "per VO", ...
  Monitor farms (EDG/WP4, DataTAG/Nagios)
  Developing plugins/sensors for WP1 L&B info (only on the EDG part)
  Services monitoring (EDG/WP4, DataTAG/Nagios [using MDS info])

CMS and ATLAS demo
  ATLAS simulation jobs; GRAPPA modified to use either the RB or explicit resources
  Pythia & CMSIM simulation jobs submitted to intercontinental resources with IMPALA/BOSS interfaced to VDT/MOP, EDG/JDL and the Genius portal
  Definition of the application demos in progress

Nov 2002 The DataTAG Project

LHC experiments

The activity of the WP4 task has to be focused on what is already deployed and used by the LHC experiments for their current needs.

A coordinated plan must be settled to further develop and integrate the current tools (both from Grid projects and from specific experiment software) into a common (and/or interoperable) scenario.

The experiments' individual plans were discussed with each of them, and the strategy to be followed was agreed. High-level requirements and areas of intervention have been identified through many discussions and meetings with all of the LHC experiments.

One of the first results is that dedicated test layouts, to experiment with the integration of specific components, must be deployed. The scope of each test layout and the expected results have been preliminarily defined, and some layouts are already active (the details and resources needed having been agreed with the interested experiments).

Those test layouts are already in progress and mostly concern CMS and ATLAS. ALICE has also defined its goals and is rapidly ramping up.