NorduGrid Architecture and tools

CHEP2003 – UCSD

Anders Wäänänen, waananen@nbi.dk

NorduGrid project

Launched in spring of 2001, with the aim of creating a Grid infrastructure in the Nordic countries.

Idea: a MONARC-style architecture with a Nordic Tier 1 center

Partners from Denmark, Norway, Sweden, and Finland

Initially meant to be the Nordic branch of the EU DataGrid (EDG) project

3 full-time researchers, plus a few externally funded

Motivations

NorduGrid was initially meant to be a pure deployment project

One goal was to have the ATLAS data challenge run by May 2002

Should be based on the Globus Toolkit™

Available Grid middleware: The Globus Toolkit™

A toolbox – not a complete solution

European DataGrid software: not mature at the beginning of 2002, architecture problems

Architecture requirements

No single point of failure

Should be scalable

Resource owners should have full control over their resources

As few site requirements as possible: Local cluster installation details should not be dictated

Method, OS version, configuration, etc…

Compute nodes should not be required to be on the public network

Clusters need not be dedicated to the Grid

NorduGrid components

Grid Manager: manages Grid jobs on the cluster (job control and data management)

Information system: patched Globus MDS with improved schema

User interface: job submission and personal broker

Grid monitor: Web-based interface to the information system

Globus replica catalog

Grid manager features 1

Staging of executables and input/output data

Supported protocols: local files, gridftp, ftp, http(s), Replica Catalog, Replica Location Services

Data transfer control including retries

Caching of input data: cache size control, private (per UNIX user) and shared caches, data access control based on the user's credentials

Support for runtime environments (e.g. software installations)

Full job information available for auditing, accounting and debugging
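
The "data transfer control including retries" item can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not the Grid Manager's code: the function `transfer_with_retries`, its `max_attempts` and `delay` parameters, and the `fetch` callable are hypothetical, and the slides do not describe the real retry policy.

```python
import time
from typing import Callable


def transfer_with_retries(fetch: Callable[[str], bytes], url: str,
                          max_attempts: int = 3, delay: float = 0.0) -> bytes:
    """Call fetch(url), retrying on IOError up to max_attempts times.

    All names here are hypothetical; the real Grid Manager's retry
    policy is not described in the slides.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch(url)
        except IOError as err:
            last_error = err
            if attempt < max_attempts:
                time.sleep(delay)  # wait before the next attempt
    raise IOError(f"giving up on {url} after {max_attempts} attempts") from last_error
```

Any callable that takes a URL and returns bytes can be passed as `fetch`, so the same wrapper could sit in front of a gridftp, ftp or http(s) client.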

Grid manager features 2

Globus building blocks used:

GridFTP – fast, reliable and secure data access

GASS transfer – http(s)-like data access protocol

Replica catalog

Replica Location Service (with EDG)

RSL – expandable Resource Specification Language

Limitations: data handling is currently supported only at job start and job end when cluster nodes are on a private network

Grid Manager architecture

[Figure: Grid Manager architecture. Diagram labels: NorduGrid gridftp server, submission, file access, job control, Grid Manager, downloader, uploader, stagein, stageout, cache, link or copy, job session directory, frontend, NFS, LRMS, computing node.]

User interface

The NorduGrid user interface provides a set of commands for interacting with the grid

ngsub – for submitting jobs

ngstat – for states of jobs and clusters

ngcat – to see stdout/stderr of running jobs

ngget – to retrieve the results from finished jobs

ngkill – to kill running jobs

ngclean – to delete finished jobs from the system

ngcopy – to copy files to, from and between file servers and replica catalogs

ngremove – to delete files from file servers and replica catalogs

Information system

The nervous system of the Grid: information is a critical resource!

Complications: Large number of resources -> scalability

Heterogeneous resources -> characterization

Decentralized

Efficient access to dynamic data

Quality and reliability of information

Compromise between: Up to date data vs. load on the Grid

NorduGrid information system

Use Globus MDS

Improved schemas with natural representation of resources: Clusters (queues, jobs and users)

Storage elements

Replica Catalogs

Use efficient providers

Each resource runs a GRIS

GRISes are organized into a dynamic, country-based GIIS hierarchy.

Have enough information to do brokering
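
A rough Python sketch of how a client might walk such a hierarchy to find every resource. The classes `GRIS` and `GIIS`, the function `discover_resources`, and all host names are hypothetical stand-ins; the real services are LDAP-based and are not modelled here.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class GRIS:
    """Information service of a single resource (hypothetical model)."""
    resource: str


@dataclass
class GIIS:
    """Index service holding registrations of GRISes or lower-level GIISes."""
    name: str
    registrants: List[Union["GIIS", GRIS]] = field(default_factory=list)


def discover_resources(index: GIIS) -> List[str]:
    """Walk the hierarchy top-down and return every registered resource."""
    found = []
    for entry in index.registrants:
        if isinstance(entry, GIIS):
            found.extend(discover_resources(entry))  # descend into a sub-index
        else:
            found.append(entry.resource)
    return found


# Example hierarchy: top-level index -> country indices -> resources.
denmark = GIIS("Denmark", [GRIS("cluster-a.example.dk")])
sweden = GIIS("Sweden", [GRIS("cluster-b.example.se"),
                         GRIS("storage-c.example.se")])
top = GIIS("NorduGrid", [denmark, sweden])
```

Calling `discover_resources(top)` returns the three example resources in registration order.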

DIT of a cluster

cluster
  queue
    jobs: job-01, job-02, job-03
    users: user-01, user-02
  queue
    jobs: job-04, job-05
    users: user-01, user-02, user-03
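
The DIT above can be written down as a nested structure and flattened into LDAP-style distinguished names. This is a sketch only: the cluster and queue names are placeholders, and the RDN labels (`cluster=`, `queue=`, `group=`, `job=`, `user=`) are simplified stand-ins, not the attribute names of the real NorduGrid schema.

```python
from typing import Dict, List

# The tree from the figure: cluster -> queues -> jobs/users.
# "cluster-1", "queue-1" and "queue-2" are placeholder names.
cluster_tree: Dict[str, Dict[str, Dict[str, List[str]]]] = {
    "cluster-1": {
        "queue-1": {"jobs": ["job-01", "job-02", "job-03"],
                    "users": ["user-01", "user-02"]},
        "queue-2": {"jobs": ["job-04", "job-05"],
                    "users": ["user-01", "user-02", "user-03"]},
    }
}


def list_dns(tree) -> List[str]:
    """Return one distinguished name per entry, most specific part first."""
    dns = []
    for cluster, queues in tree.items():
        cluster_dn = f"cluster={cluster}"
        dns.append(cluster_dn)
        for queue, groups in queues.items():
            queue_dn = f"queue={queue},{cluster_dn}"
            dns.append(queue_dn)
            for group, label in (("jobs", "job"), ("users", "user")):
                group_dn = f"group={group},{queue_dn}"
                dns.append(group_dn)
                dns.extend(f"{label}={name},{group_dn}"
                           for name in groups[group])
    return dns
```

Each job and user entry sits under its queue, so a query scoped to one queue's subtree sees only that queue's jobs and users.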

Cluster entry

Queue entry

Job entry

job status monitoring = information system query

Another job entry

- the job entry is generated on the execution cluster

- when the job is completed and the results are retrieved, the job disappears from the information system

Personalized information

user based information is essential on the Grid:

users are not really interested in the total number of cpus of a cluster, but how many of those are available for them!

the number of queued jobs is irrelevant if the submitted job is executed immediately

instead of total disk space the user's quota is interesting

nordugrid-authuser objectclass: freecpus, diskspace, queuelength

User entry

GIIS Hierarchy

Hierarchy of GRISes/GIISes

Grid Monitor

Brokering & job submission

● Searches through the NorduGrid Testbed for available clusters

● Loops through all the clusters and selects those queues (possible targets) where:

– The user is authorized to run

– Job requirements can be satisfied

● Selects a job destination from the matching targets:

– Randomly selects among the free resources (where user-freecpus > 0)

– If there are no free matching resources, some of the “load” attributes (e.g. user-queuelength) are taken into account
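
The selection steps above could be sketched in Python roughly as follows. The class `QueueInfo`, the function `select_target` and the `satisfies` callback are hypothetical names. The slides say only that "load" attributes are taken into account when nothing is free; picking the shortest queue is an assumption made for this sketch.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class QueueInfo:
    """Per-user view of one queue, as read from the information system."""
    name: str
    authorized: bool   # is the user allowed to run here?
    free_cpus: int     # CPUs free for this user (user-freecpus)
    queue_length: int  # load seen by this user (user-queuelength)


def select_target(queues: List[QueueInfo],
                  satisfies: Callable[[QueueInfo], bool],
                  rng: Optional[random.Random] = None) -> Optional[QueueInfo]:
    """Pick a job destination, or None if no queue matches."""
    rng = rng or random.Random()
    # Keep queues where the user is authorized and requirements are met.
    targets = [q for q in queues if q.authorized and satisfies(q)]
    if not targets:
        return None
    # Prefer a random choice among targets with free CPUs.
    free = [q for q in targets if q.free_cpus > 0]
    if free:
        return rng.choice(free)
    # Assumption: otherwise fall back to the shortest queue.
    return min(targets, key=lambda q: q.queue_length)
```

Passing a seeded `random.Random` as `rng` makes the random step reproducible in tests.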

NorduGrid job submission

[Figure: NorduGrid job submission. Diagram labels: RC, MDS, RSL, Gatekeeper, GridFTP, Grid Manager.]

Quick client installation/job run

As a normal user: retrieve nordugrid-standalone-0.3.17.rh72.i386.tgz

tar xfz nordugrid-standalone-0.3.17.rh72.i386.tgz

cd nordugrid-standalone-0.3.17

source ./setup.sh

Maybe get a certificate

grid-cert-request

install certificate per instructions

grid-proxy-init

ngsub '&(executable=/bin/echo)(arguments="Hello World")'
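
The job description passed to ngsub above is an RSL conjunction of (attribute=value) pairs. A minimal Python sketch for assembling such a string follows; `build_rsl` is a hypothetical helper, not part of the NorduGrid toolkit, and it covers only simple single-valued attributes.

```python
from typing import List, Tuple


def build_rsl(attributes: List[Tuple[str, str]]) -> str:
    """Join (name, value) pairs into an RSL conjunction: &(name=value)(name=value)

    Values containing whitespace are wrapped in double quotes, as in the
    "Hello World" example. This hypothetical helper does not escape
    embedded quotes or handle multi-valued attributes.
    """
    parts = []
    for name, value in attributes:
        if any(ch.isspace() for ch in value):
            value = f'"{value}"'
        parts.append(f"({name}={value})")
    return "&" + "".join(parts)


job = build_rsl([("executable", "/bin/echo"), ("arguments", "Hello World")])
print(job)  # prints: &(executable=/bin/echo)(arguments="Hello World")
```

The resulting string is what would be passed, single-quoted for the shell, as the argument to ngsub.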

Future development or integration

Better Authorization

Accounting

Optimize brokering

More intelligent data management and replication service

Handle network requests from running jobs on “private” networks

Grid portal interface – in testing

Move towards Grid services and improved community compatibility

Future

The committee of Nordic natural science ministers NOS-N has decided to fund a new common Nordic Grid Project based on the work done by the NorduGrid project. This project should work on a proposal/recommendation for a Nordic DataGrid facility.

Support for the toolkit in the future

This will be supported in each country by local Grid initiatives

Collaboration with the Nordic computing centers has already been initiated, with the toolkit deployed at several large centers.

Use it for future ATLAS production in the Nordic countries

Move towards OGSA and better community compatibility

Resources

Documentation and source code are available for download

Main Web site: http://www.nordugrid.org/

Repository ftp://ftp.nordugrid.org/pub/nordugrid/

The NorduGrid core group

Александр Константинов

Balázs Kónya

Mattias Ellert

Оксана Смирнова

Jakob Langgaard Nielsen

Trond Myklebust

Anders Wäänänen
