CrossGrid After the First Year: A Technical Overview

Marian Bubak, Maciej Malawski, and Katarzyna Zając

Institute of Computer Science & ACC CYFRONET AGH, Kraków, Poland

CrossGrid Yearly Review, Brussels, March 12, 2003

www.eu-crossgrid.org

Main Objectives

A new category of Grid-enabled applications
• Compute- and data-intensive
• Distributed
• Near real-time response (person in a loop)
• Layered

New programming tools

Grid more user-friendly, secure and efficient

Interoperability with other Grids

Implementation of standards

CrossGrid in a Nutshell

Interactive, Compute and Data Intensive Applications
• Interactive simulation and visualization of a biomedical system
• Flooding crisis team support
• Distributed data analysis in HEP
• Weather forecasting and air pollution modeling

Tool Environment
• MPI code debugging and verification
• Metrics and benchmarks
• Interactive and semiautomatic performance evaluation tools

Application Specific Services
• User Interactive Services
• Grid Visualization Kernel

New Generic Grid Services
• Portals and roaming access
• Scheduling agents
• Application and Grid monitoring
• Optimization of data access

DataGrid Services

Globus Middleware

Fabric

Key Features of CG Applications

Data
• Data generators and databases geographically distributed
• Selected on demand

Processing
• Interactive
• Requires large processing capacity; both HPC & HTC

Presentation
• Complex data requires versatile 3D visualisation
• Support interaction and feedback to other components

Biomedical Application

Adding small modifications to the proposed structure results in immediate changes in the blood flow.

Online presentation of simulation results via a 3D environment.

The progress of the simulation and the estimated time of convergence should be available for inspection.

[Diagram labels: LB flow simulation; VE, WD, PC, PDA; Visualization; Interaction]

Basic Characteristics of Flood Simulation

Meteorological
• Intensive simulation (HPC), large input/output data sets, high availability of resources

Hydrological
• Parametric simulations (HTC) may require different models (heterogeneous simulations)

Hydraulic
• Many 1-D simulations (HTC); 2-D hydraulic simulations require HPC
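The high-throughput (HTC) pattern mentioned above, many independent runs over a parameter grid, can be sketched as follows. This is only an illustration: `run_1d_model`, its parameters and the values in the grid are hypothetical stand-ins, not part of any CrossGrid flood model.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def run_1d_model(rainfall_mm: float, roughness: float) -> float:
    """Toy stand-in for one independent simulation run (not a real model)."""
    return rainfall_mm * (1.0 - roughness)


# Hypothetical parameter grid: every combination is an independent task,
# which is what makes the workload suitable for HTC resources.
rainfall_values = [10.0, 20.0, 40.0]
roughness_values = [0.1, 0.3]
tasks = list(product(rainfall_values, roughness_values))

# pool.map returns results in the same order as the submitted tasks.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(lambda p: run_1d_model(*p), tasks))

for params, value in zip(tasks, results):
    print(params, round(value, 2))
```

A tightly coupled 2-D simulation does not decompose this way, which is why it needs HPC rather than HTC resources.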

[Diagram labels: Data sources; Meteorological simulations; Hydrological simulations; Hydraulic simulations; Output visualization; Users]

Distributed Data Analysis in HEP

Objectives
• Distributed data access
• Distributed data mining techniques with neural networks

Issues
• Typical interactive requests will run on O(TB) of distributed data
• Transfer/replication times for the whole data set are on the order of one hour
• Data are transferred once, in advance of the interactive session
• Allocation, installation and set-up of the corresponding database servers take place before the interactive session starts
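As a rough check on the "order of one hour" figure, the sustained throughput needed to move a data set in a given time can be computed directly. The sketch assumes a 1 TB data set (decimal, 10^12 bytes) and a one-hour window; both values are illustrative readings of the slide, not measurements.

```python
def required_throughput_gbit_s(size_bytes: float, seconds: float) -> float:
    """Sustained throughput (Gbit/s) needed to move size_bytes in the given time."""
    return size_bytes * 8 / seconds / 1e9


# Assumed values: 1 TB (10**12 bytes) transferred in one hour.
ONE_TB = 10**12
ONE_HOUR = 3600
rate = required_throughput_gbit_s(ONE_TB, ONE_HOUR)
print(f"{rate:.2f} Gbit/s")  # prints "2.22 Gbit/s"
```

A transfer of this size cannot be repeated for every request without losing interactivity, which is why the data are staged once before the session.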

[Diagram labels: Portal (XML in/out, on-line output); Interactive Session Resource Broker; Replica Manager; DB Installation; Interactive Session Database server; Interactive Session Manager; five Interactive Session Workers (distributed processing)]

Weather Forecasting and Air Pollution Modeling

Distributed/parallel code on Grid
• Coupled Ocean/Atmosphere Mesoscale Prediction System
• STEM-II Air Pollution Code
• Integration of distributed databases

Data mining applied to downscaling weather forecasts

Initial version of X# architecture

Applications
• 1.1 BioMed
• 1.2 Flooding
• 1.3 Interactive Distributed Data Access
• 1.3 Data Mining on Grid (NN)
• 1.4 Meteo Pollution

Supporting Tools
• 2.2 MPI Verification
• 2.3 Metrics and Benchmarks
• 2.4 Performance Analysis
• 3.1 Portal & Migrating Desktop

Applications Development Support
• MPICH-G
• 1.1, 1.2 HLA and others

App. Spec Services
• 1.1 Grid Visualisation Kernel
• 1.1 User Interaction Services
• 1.3 Interactive Session Services

Generic Services
• 3.1 Roaming Access
• 3.2 Scheduling Agents
• 3.3 Grid Monitoring
• 3.4 Optimization of Grid Data Access
• DataGrid Replica Manager
• DataGrid Job Submission Service
• Globus Replica Manager
• Replica Catalog
• GRAM, GSI, GIS / MDS, GridFTP, Globus-IO

Fabric
• Resource Manager (CE), CPU
• Resource Manager (SE), Secondary Storage
• Resource Manager, Instruments (Satellites, Radars)
• Resource Manager, 3.4 Optimization of Local Data Access, Tertiary Storage

Project Phases

M 1 - 3: requirements definition and merging

M 4 - 12: first development phase: design, 1st prototypes, refinement of requirements

M 13 - 24: second development phase: integration of components, 2nd prototypes

M 25 - 32: third development phase: complete integration, final code versions

M 33 - 36: final phase: demonstration and documentation

Tools

MPI code debugging and verification

Metrics and benchmarks for the Grid environment

Grid-enabled Performance Measurement

Performance Prediction Component

[Diagram labels: G-PM (Performance Measurement Component, High Level Analysis Component, Performance Prediction Component, User Interface and Visualization Component); Grid Monitoring; RMD; PMD; MPI Verification (MARMOT); Benchmarks; Application source code; Applications executing on Grid testbed]

MPI Verification

Verifies the correctness of parallel, distributed Grid applications (MPI)

Technical basis: the MPI profiling interface, which allows a detailed analysis of the MPI application
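The MPI profiling interface makes each MPI routine reachable under a second name with a PMPI_ prefix, so a tool can supply its own MPI_Send, inspect the call, and forward it to PMPI_Send. The sketch below mimics that interposition in Python with toy functions. The names `mpi_send`, `pmpi_send` and `WORLD_SIZE`, and the two checks, are hypothetical illustrations and not the checks performed by the actual verification tool.

```python
WORLD_SIZE = 4   # hypothetical number of ranks in the job
call_log = []    # every intercepted call is recorded for later analysis


def pmpi_send(buf, dest, tag):
    """Toy stand-in for the library routine (PMPI_Send in real MPI)."""
    return len(buf)


def mpi_send(buf, dest, tag):
    """Wrapper seen by the application: verify arguments, log, then forward."""
    problems = []
    if not 0 <= dest < WORLD_SIZE:
        problems.append(f"invalid destination rank {dest}")
    if tag < 0:
        problems.append(f"negative tag {tag}")
    call_log.append(("send", dest, tag, problems))
    if problems:
        raise ValueError("; ".join(problems))
    return pmpi_send(buf, dest, tag)


sent = mpi_send(b"abcd", dest=1, tag=0)   # a valid call is forwarded
try:
    mpi_send(b"abcd", dest=7, tag=0)      # rejected before reaching the library
except ValueError as err:
    print("verification error:", err)
```

The application source stays unchanged; only the layer between the application and the MPI library is replaced.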

[Diagram labels: Client Side (Application or Test Tool; Profiling Interface; Core Tool; MPI); Server Side (Additional Process, the Debug Server)]

Benchmark Categories

Micro-benchmarks
• For identifying basic performance properties of Grid services, sites, and constellations

Micro-kernels
• Generic HPC/HTC kernels, including general and often-used kernels in Grid environments

Application kernels
• Characteristic of representative CG applications

[Diagram labels: Portal; gbView; gbARC; gbControl; gbRMP; GridBench suite; SE storage; Embedding; Retrieval; Invocation; Invocation/Collection through G-PM; Direct Invocation; Storage/Retrieval]

Performance Measurement Tool G-PM

Components:
• performance measurement component (PMC),
• component for high-level analysis (HLAC),
• component for performance prediction (PPC) based on analytical performance models of application kernels,
• user interface and visualization component (UIVC).
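To illustrate what an analytical performance model of an application kernel can look like, here is a minimal sketch. The model form (serial part, perfectly divisible parallel part, and a latency-plus-bandwidth cost per message) and all parameter values are assumptions chosen for illustration; they are not the models used by the PPC.

```python
def predicted_time(p, t_serial, t_parallel, n_messages,
                   msg_bytes, latency_s, bandwidth_bytes_s):
    """Predicted run time on p processes under a simple analytical model."""
    compute = t_serial + t_parallel / p
    communicate = n_messages * (latency_s + msg_bytes / bandwidth_bytes_s)
    return compute + communicate


# Hypothetical kernel: 2 s serial, 80 s parallelisable work, and
# 100 messages of 1 MB per additional process over a 100 MB/s link.
for p in (1, 2, 4, 8):
    t = predicted_time(p, t_serial=2.0, t_parallel=80.0,
                       n_messages=100 * (p - 1), msg_bytes=1e6,
                       latency_s=0.01, bandwidth_bytes_s=1e8)
    print(p, round(t, 2))
```

With these numbers the gain from 4 to 8 processes is small, because communication cost grows while the compute term shrinks; a prediction component exposes that trade-off before the job is run.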

[Diagram labels: UIVC; HLAC; PMC; Measurement Interface; OCM-G Interface; OCM-G]

User Interactive Service

[Diagram labels: InteractionGridService; RTIExecGridService; SimulationGridService; VisualisationGridService; Registry; OGSA WSDL; RTI Tuple Space; functionality description + dynamic discovery of OGSA Services]

Communication channels (over TCP or UDP/IP):
• Large on-line data transfer: GridFTP
• Short messages and events: SOAP/IIOP

Enables end users to run distributed simulations in the Grid environment and to steer those simulations in near real time

Uses OGSA mechanisms to call external resource brokers and job submission services, for efficient and transparent execution of the simulation on the Grid

Grid Visualization Kernel

Addresses the problems of distributed visualization on heterogeneous devices

Allows Grid applications to be interconnected easily and transparently with existing visualisation tools (AVS, OpenDX, VTK, ...)

Handles multiple concurrent input data streams

Multiplexes compressed data and images efficiently across long-distance networks

[Diagram labels: Portal Server; GVK Visualization Planner; GVK Visualization pipeline; Simulation; Simulation Data; Init Visualization; Update Visualization; GRAM, GASS, MDS]

New Grid Services

Portals and roaming access

Grid resource management

Grid monitoring

Optimization of data access

Roaming Access – Current Design

Portal - easier access and use of the Grid by applications

Migrating Desktop - a transparent, independent user environment

Roaming Access Server - responsible for managing user profiles, job submission, file transfers and Grid monitoring

[Diagram labels: Web Browser; Command Line; Desktop Portal Server; Application Portal Server; Roaming Access Server; LDAP DataBase; Replica Manager; Scheduling Agent; Benchmarks]

Scheduling Agents - Current Design

[Diagram labels: Web Portal; JSS commands; Job monitoring; Scheduling Agent; Resource list; Resource Broker; Logging & Bookkeeping; JSS / Condor-G; CE, CE, CE]

Scheduling user jobs over the CrossGrid testbed infrastructure

Submission based on Condor-G

Support for sequential and MPI parallel jobs, batch jobs and interactive jobs

Priorities and preferences determined by the user for each job
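One simple way to honour per-job priorities is a priority queue. The sketch below is an assumption-laden illustration, not the Scheduling Agent's algorithm: the rule that interactive jobs are dispatched before batch jobs, the job names, and the `submit` and `next_job` helpers are all hypothetical.

```python
import heapq
from itertools import count

_order = count()   # tie-breaker: earlier submissions are dispatched first
queue = []


def submit(name, priority, interactive=False):
    """Queue a job; higher priority and interactive jobs come out first."""
    # heapq is a min-heap, so values that should win are made smaller:
    # False sorts before True, and the priority is negated.
    heapq.heappush(queue, (not interactive, -priority, next(_order), name))


def next_job():
    """Remove and return the name of the next job to dispatch."""
    return heapq.heappop(queue)[-1]


submit("hep-analysis", priority=5)
submit("flood-2d-mpi", priority=8)
submit("biomed-steering", priority=3, interactive=True)

dispatch_order = [next_job() for _ in range(len(queue))]
print(dispatch_order)
```

The interactive job is dispatched first despite its lower numeric priority, and the two batch jobs follow in priority order.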

Application Monitoring

OCM-G Components
• Service Managers
• Local Monitors

Application processes

Tool(s)

External name service
• Component discovery

[Diagram labels: Tool; Service Manager; Local Monitor; Application Process; Shared Memory; OMIS; External Localization]

Infrastructure Monitoring

Infrastructure monitoring
• Invasive monitoring (based on Jiro technology)
• Non-invasive monitoring (Santa-G)

[Diagram labels: Jiro Services; Information DB; System; Non-invasive Monitoring; Globus MDS; MDS info; Jiro info; Instruments; Infrastructure; Static info; Performance Information; Post-processing]

Data Access Design

Selection of specialized components best suited for data access operations

Estimation of data access latency and bandwidth inside the storage elements

Faster access to large tape-resident files through fragmentation

[Diagram labels: Application; Portal; Replica Manager; Replica Catalog; Metadata Catalog; GridFTP; GridFTP Plugin; Data Access Estimator; Component-Expert Subsystem; Components for Component-Expert Subsystem; Storage Element; Storage Configuration; HSM; Secondary Storage System; Disk Cache; MO Disk storage; Tape Storage; TRLFM]
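The benefit of fragmentation and of a latency/bandwidth estimate can be sketched with a little arithmetic. Everything concrete here is assumed for illustration: fixed-size 100 MB fragments, a 10 GB file, 60 s tape latency, 30 MB/s bandwidth, and the two helper functions themselves.

```python
def fragments_for_read(offset: int, length: int, fragment_size: int) -> range:
    """Indices of the fixed-size fragments overlapping [offset, offset + length)."""
    if length <= 0:
        return range(0)
    first = offset // fragment_size
    last = (offset + length - 1) // fragment_size
    return range(first, last + 1)


def estimated_access_time(n_bytes: int, latency_s: float,
                          bandwidth_bytes_s: float) -> float:
    """Simple estimate: fixed latency plus transfer time at a given bandwidth."""
    return latency_s + n_bytes / bandwidth_bytes_s


FRAGMENT = 100_000_000      # assumed fragment size: 100 MB
FILE_SIZE = 10_000_000_000  # assumed file size: 10 GB

# A 120 MB read starting at byte 250,000,000 touches only two fragments.
needed = fragments_for_read(offset=250_000_000, length=120_000_000,
                            fragment_size=FRAGMENT)
staged_bytes = len(needed) * FRAGMENT

print(list(needed))
print(estimated_access_time(staged_bytes, latency_s=60.0, bandwidth_bytes_s=30e6))
print(estimated_access_time(FILE_SIZE, latency_s=60.0, bandwidth_bytes_s=30e6))
```

Only the fragments that overlap the requested byte range have to be staged from tape, so the estimated access time depends on the size of the request rather than on the size of the file.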

Current status of CG Architecture

Layers: Applications; Supporting Tools; Application Specific Services; Generic Services

[Diagram components: Application; Tools; Benchmarks; Portal and Migrating Desktop; Grid Visualization Kernel; User Interaction Services; Roaming Access; Scheduling Agent; Data Access; Infrastructure Monitoring; OCM-G; DataGrid Job Management; DataGrid Data Management; Globus Toolkit]

Application-centric view

[Diagram labels: Application; Application Container; Application Plugin; Portal and Roaming Access; User Interaction Services; Grid Visualization Kernel; Data Access; Globus Toolkit]

The Current Testbed

The current CrossGrid testbed is based on:
• EDG distribution releases 1.2.2 and 1.2.3 (production)
• EDG distribution release 1.4.3 (validation)

The current infrastructure permits:
• installation of initial prototypes of CrossGrid software releases (described in M12 Deliverables)
• testing applications using:
  • Globus and EDG middleware
  • MPI
• achieving compatibility with DataGrid and therefore extending Grid coverage in Europe

Grid Service

Transient, stateful Web Service (created dynamically)

Described by WSDL

Identified by a Grid Service Handle (GSH) in the form of a URI

Can be queried for configuration and state in a standard way, via the Service Data mechanism
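The properties listed above (dynamic creation, a URI handle, queryable state) can be pictured with a small in-memory model. This is a conceptual sketch only: the class and method names and the `example.org` URI are hypothetical and do not reproduce the OGSA interfaces or any toolkit API.

```python
import itertools
import time


class GridServiceFactory:
    """Creates transient service instances and hands out their handles."""

    def __init__(self, base_uri):
        self.base_uri = base_uri
        self._ids = itertools.count(1)
        self.instances = {}

    def create(self, lifetime_s):
        # The handle is a URI that identifies exactly one instance.
        handle = f"{self.base_uri}/instance/{next(self._ids)}"
        self.instances[handle] = GridServiceInstance(handle, lifetime_s)
        return handle


class GridServiceInstance:
    """A stateful instance with a limited lifetime and queryable service data."""

    def __init__(self, handle, lifetime_s):
        self.handle = handle
        self.termination_time = time.time() + lifetime_s
        self.service_data = {"status": "created", "progress": 0.0}

    def find_service_data(self, name):
        """Uniform query for a named piece of configuration or state."""
        return self.service_data[name]


factory = GridServiceFactory("http://example.org/ogsa/simulation")
gsh = factory.create(lifetime_s=600)
service = factory.instances[gsh]
service.service_data["status"] = "running"
print(gsh, service.find_service_data("status"))
```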

Why use OGSA

Standards: "to be part of the Grid = to implement OGSA Grid protocols"

Interoperability in heterogeneous environments

Possible contribution to future Grid activities

Grid Services – where?

Dynamic service creation and lifetime management to control the state of some process, e.g.:
• user session in a portal
• data transfer
• running simulation

Service data model can be applied to monitoring systems that can be used as information providers for other services.

Service discovery to solve the bootstrap problem:
• to connect the modules of a distributed simulation
• to connect the application to a monitoring system
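The bootstrap problem mentioned above is that components must find each other before they can communicate. A registry that maps a description to a handle is one common answer; the sketch below is a hypothetical in-memory version (class name, attribute names and `example.org` URIs are all assumptions), not the OGSA registry interface.

```python
class Registry:
    """Minimal lookup service: components register handles, clients query them."""

    def __init__(self):
        self._entries = []

    def register(self, handle, service_type, **attributes):
        self._entries.append({"handle": handle, "type": service_type, **attributes})

    def find(self, service_type, **required):
        """Return handles of the given type whose attributes match all filters."""
        return [
            entry["handle"]
            for entry in self._entries
            if entry["type"] == service_type
            and all(entry.get(key) == value for key, value in required.items())
        ]


registry = Registry()
registry.register("http://example.org/sim/flow", "simulation", module="flow")
registry.register("http://example.org/sim/visual", "simulation", module="visualisation")
registry.register("http://example.org/monitor/1", "monitoring")

# A simulation module locates its peer, and an application finds a monitor,
# knowing only the registry location in advance.
print(registry.find("simulation", module="visualisation"))
print(registry.find("monitoring"))
```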

Steps towards OGSA

Using Web Service interfaces and XML where possible

Experimenting with prototyping services using OGSA alpha releases

Applying Grid Service extensions to services

Solving GT2 - GT3 transition and compatibility issues

Summary

Achievements of the first project year:
• Software Requirements Specifications together with use cases written
• CrossGrid Architecture defined
• Detailed Design documents for tools and new Grid services (OO approach, UML) written
• First prototype of software running and documented
• Detailed description of the test and integration procedures created
• Testbed set up