
Member of the Helmholtz Association

Using UNICORE in Computing Centres

2017-03-16 Björn Hagemeier

Part: About Us


Forschungszentrum Jülich and JSC


JUQUEEN

IBM Blue Gene/Q

28 racks, 458,752 cores

PowerPC A2 @ 1.6 GHz

16 cores per node

5.8 Petaflop/s peak

460 TByte main memory

5D network


JURECA

1872 compute nodes

Intel Haswell with 2x12 cores @ 2.5 GHz

75 compute nodes equipped with 2 NVIDIA K80 GPUs

DDR4 memory (2133 MHz)

1605 nodes with 128 GiB memory

128 nodes with 256 GiB memory

64 nodes with 512 GiB memory

12 visualization nodes

2 NVIDIA K40 per node

10 nodes with 512 GiB memory

2 nodes with 1024 GiB memory

Total of 45,216 cores

100 GiB/s storage connection


JUST: Jülich Storage Cluster

IBM GPFS

20.3 PB online storage

220 GB/s

File server for

HPC systems: JUQUEEN, JURECA

DEEP (Dynamical Exascale Entry Platform)


Tape Libraries

Actual capacity: ∼99 PB

Theoretical capacity: 141 PB (16600x8.5TB)

Tape drives: 48

Libraries: 2


Part: UNICORE


UNICORE As We See It Today

A federation software suite

Secure and seamless access to compute and data resources

Focus on scientific applications and workflows

Complies with typical HPC centre policies

Complete solutions: APIs, clients, services, ...

Java/Python based, supports UNIX, macOS, Windows and many resource management systems (Torque, Slurm, SGE, ...)

Long development history (since 1997)

Open source, BSD licensed, visit http://www.unicore.eu


Concepts

UNICORE ≡ UNIform Access to COmputing REsources

Site: a resource such as an HPC system including storage

Job: submitted through JSDL including data staging, resource requirements, executable definition and parameters

Hadoop (Yarn) jobs possible, too, in conjunction with HDFS

Resources: features of sites in terms of capacity and capability

Storages: a view into file systems at a certain base directory (mount point). Can be storage external to the site, e. g. Swift, S3, HDFS, CDMI, XtreemFS

Applications: abstractions of applications hiding site-local specificities, e. g. installation paths or module activations

Workflows: a series of job executions guided by control structures, i. e. visual programming
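
The job concept above can be sketched in a few lines of Python. This is a minimal illustration, not taken from the slides: it assumes the JSON-style job description used by UNICORE's command-line client and REST interface instead of raw JSDL. The key names (`ApplicationName`, `Arguments`, `Imports`, `Exports`, `Resources`) are assumed to follow that format and should be checked against the documentation of the installed version. The helper `build_job`, the application name and all file names are hypothetical.

```python
import json


def build_job(app_name, arguments, imports, exports, nodes, runtime_s):
    """Assemble a job description as a plain dictionary.

    The key names are assumptions modelled on UNICORE's JSON job format.
    """
    return {
        # Application abstraction: resolved to a site-local executable
        "ApplicationName": app_name,
        # Parameters passed to the application
        "Arguments": list(arguments),
        # Data staging before (Imports) and after (Exports) execution
        "Imports": [{"From": src, "To": dst} for src, dst in imports],
        "Exports": [{"From": src, "To": dst} for src, dst in exports],
        # Resource requirements
        "Resources": {"Nodes": nodes, "Runtime": runtime_s},
    }


job = build_job(
    app_name="Date",
    arguments=[],
    imports=[("input.dat", "input.dat")],
    exports=[("stdout", "result.txt")],
    nodes=1,
    runtime_s=600,
)
print(json.dumps(job, indent=2))
```

The four parts of the dictionary correspond to the four ingredients named in the job definition: executable definition, parameters, data staging and resource requirements.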


Architecture


UNICORE Main Services

Compute

TargetSystemFactory

TargetSystem

JobManagement

Reservations

Storage and Data

StorageFactory

StorageManagement

FileTransfer

Metadata

Workflow

Workflow enactment

Task execution

Resource broker

Registry


Default Setup

Access to resource manager and file system via TargetSystemInterface (TSI) daemon installed on the cluster login node(s)


Job Execution


Storage Access

The UNICORE Storage Management Service (“SMS”) provides a filesystem-like view of data

Typical functions

mkdir, delete, ls, chmod etc

Start file transfers

Import/export of data from/to the user’s local machine

Send/receive of data to/from other servers

Various supported file transfer protocols
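
The filesystem-like view can be illustrated with a toy model. The sketch below is not UNICORE code: `StorageView` is a hypothetical class that mimics the typical functions listed above (mkdir, ls, chmod, delete) and confines them to a base directory, which is the behaviour the storage concept describes. It runs on a local temporary directory only.

```python
import os
import tempfile
from pathlib import Path


class StorageView:
    """Toy stand-in for a storage: file operations confined to a base directory."""

    def __init__(self, base):
        self.base = Path(base).resolve()

    def _resolve(self, rel):
        # Reject paths that would leave the base directory (e.g. "../x").
        target = (self.base / rel).resolve()
        if target != self.base and self.base not in target.parents:
            raise PermissionError(f"{rel!r} is outside the storage")
        return target

    def mkdir(self, rel):
        self._resolve(rel).mkdir(parents=True, exist_ok=True)

    def ls(self, rel="."):
        return sorted(p.name for p in self._resolve(rel).iterdir())

    def chmod(self, rel, mode):
        os.chmod(self._resolve(rel), mode)

    def delete(self, rel):
        target = self._resolve(rel)
        if target.is_dir():
            target.rmdir()  # only empty directories, to keep the sketch safe
        else:
            target.unlink()


with tempfile.TemporaryDirectory() as tmp:
    sms = StorageView(tmp)
    sms.mkdir("run1")
    (Path(tmp) / "run1" / "out.txt").write_text("done\n")
    sms.chmod("run1/out.txt", 0o600)
    listing = sms.ls("run1")
    sms.delete("run1/out.txt")
    remaining = sms.ls("run1")
```

Resolving every path against the base directory before acting on it is what makes the storage a "view": clients never see or touch anything above the mount point.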


File Transfers


Metadata Management Service (MMS)

Automatic extraction

Manual editing of metadata

Searching


Applications

General

Identified by name and version

Site specifics

Pre and post commands for environment setup and tear down

Acquire and return licenses

MPI

Support for application metadata


Generic Applications: Automated generation of GUIs

UNICORE Rich Client and Portal support application metadata

Example

<jsdl:Argument Description="Check input file"
               Type="boolean"
               Default="..."
               ValidValues="true false"
               DependsOn="..."
               Excludes="..."
               IsEnabled="false"
               IsMandatory="false">+v$CHECK?</jsdl:Argument>

Possible types: string, boolean, int, double, filename, choice.

Used to be defined by site administrators.
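
How a client might use such argument metadata can be sketched as follows. This is an illustration under assumptions, not the Rich Client's or Portal's actual logic: the namespace URI bound to the `jsdl` prefix is assumed (the IDB file's own declaration is authoritative), `ValidValues` is assumed to be space-separated as in the example, and `check_value` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Namespace URI assumed for this sketch; only attributes with concrete
# values in the slide's example are reproduced here.
SNIPPET = """
<jsdl:Argument xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl"
    Description="Check input file" Type="boolean"
    ValidValues="true false" IsMandatory="false">+v$CHECK?</jsdl:Argument>
"""


def check_value(argument, value):
    """Return True if `value` is acceptable for the argument's metadata."""
    kind = argument.get("Type", "string")
    valid = argument.get("ValidValues")
    if kind == "boolean":
        return value in ("true", "false")
    if kind == "int":
        try:
            int(value)
            return True
        except ValueError:
            return False
    if kind == "double":
        try:
            float(value)
            return True
        except ValueError:
            return False
    if kind == "choice":
        # Assumption: valid choices are listed space-separated.
        return valid is not None and value in valid.split()
    return True  # string and filename: no further checks in this sketch


arg = ET.fromstring(SNIPPET.strip())
label = arg.get("Description")   # text a generated GUI could show
template = arg.text              # argument template from the definition
ok = check_value(arg, "true")
rejected = check_value(arg, "maybe")
```

A generated GUI would pick its input widget from `Type` (a checkbox for `boolean`, a drop-down for `choice`) and use a check like this before submitting the job.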


User-Defined Applications

Allow mixing system and user-defined applications

Encourage users to play with and develop their own application definitions

Repository of common application definitions

Realized by merging system and user-specific IDB contributions

Users cannot change a site’s resources and thus not go beyond administrator limits
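
The merge of system and user-specific IDB contributions can be pictured with a short sketch. It is a simplified model, not the server's implementation: applications are keyed by name and version as described earlier, and two rules are assumptions made for illustration only, namely that system definitions win on a name/version clash and that any resource settings in user definitions are dropped. All application names and paths are hypothetical.

```python
def merge_idb(system_apps, user_apps):
    """Merge application definitions keyed by (name, version).

    Assumptions of this sketch: system definitions take precedence on a
    clash, and resource settings are never taken from user input.
    """
    merged = {}
    for key, app in user_apps.items():
        # Drop anything that would touch the site's resource limits.
        merged[key] = {k: v for k, v in app.items() if k != "resources"}
    merged.update(system_apps)  # system entries overwrite clashing user entries
    return merged


system = {
    ("Python", "3.5"): {"executable": "/usr/bin/python3"},
}
user = {
    ("MyTool", "0.1"): {
        "executable": "$HOME/bin/mytool",
        "resources": {"max_nodes": 9999},  # ignored by the merge
    },
    ("Python", "3.5"): {"executable": "$HOME/bin/python"},
}

merged = merge_idb(system, user)
```

Keeping resources out of the user contribution is what guarantees the last point above: users can add applications but cannot go beyond administrator limits.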


Workflow Features

Simple graphs (DAGs)

Workflow variables

Loops and control constructs

while, for-each, if-else

Conditions

Exit code, file existence, file size, workflow variables

Clients: UNICORE Rich Client, command-line client
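
A DAG workflow with an exit-code condition can be sketched with Python's standard library (`graphlib`, Python 3.9 or later). This is a toy enactment loop, not the UNICORE workflow engine: the task names, the `run` stand-in and the convention of marking skipped tasks with -1 are all assumptions made for this example.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on (a DAG).
graph = {
    "preprocess": set(),
    "simulate": {"preprocess"},
    "analyse": {"simulate"},
    "archive": {"simulate"},
}


def run(task, variables):
    """Stand-in for a job execution; returns an exit code."""
    variables["log"].append(task)
    return 1 if task in variables["failing"] else 0


def enact(graph, variables):
    """Run tasks in dependency order, skipping tasks whose inputs failed."""
    exit_codes = {}
    for task in TopologicalSorter(graph).static_order():
        # Condition on exit codes: skip if any predecessor did not succeed.
        if any(exit_codes[dep] != 0 for dep in graph[task]):
            exit_codes[task] = -1  # marks "skipped" in this sketch
            continue
        exit_codes[task] = run(task, variables)
    return exit_codes


# Workflow variables shared by all tasks; "simulate" is made to fail.
variables = {"log": [], "failing": {"simulate"}}
codes = enact(graph, variables)
```

With `simulate` failing, only `preprocess` and `simulate` are executed, and both dependent tasks are skipped.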


Authentication and Authorization: AAI, for short

In addition to its own, home-grown user management solution, XUUDB, UNICORE supports SAML-based authentication.

PULL and PUSH modes are possible

Typically only a few attributes are needed

role (user, server, admin), xlogins, groups
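
The handful of attributes named above can be modelled in a few lines. The sketch is illustrative only: `UserAttributes` and `select_xlogin` are hypothetical names, and treating the first xlogin as the default when the user states no preference is an assumption of this example.

```python
from dataclasses import dataclass, field


@dataclass
class UserAttributes:
    """Attributes an attribute source returns for an authenticated user."""
    role: str                                     # "user", "server" or "admin"
    xlogins: list = field(default_factory=list)   # permitted local accounts
    groups: list = field(default_factory=list)    # permitted local groups


def select_xlogin(attrs, requested=None):
    """Pick the local account a job runs under.

    Assumption: the first listed xlogin is the default.
    """
    if not attrs.xlogins:
        raise PermissionError("no local account mapped for this user")
    if requested is None:
        return attrs.xlogins[0]
    if requested not in attrs.xlogins:
        raise PermissionError(f"xlogin {requested!r} not permitted")
    return requested


# Hypothetical account and group names
alice = UserAttributes(role="user", xlogins=["alice1", "proj42"], groups=["hpc"])
default_login = select_xlogin(alice)
chosen_login = select_xlogin(alice, "proj42")
```

Whether the attributes come from XUUDB or from a SAML assertion, the server ends up with the same small record and only needs to map it to a local account.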


UNITY IdM: Identity Relationship Management

Complete solution for identity, federation and inter-federation management

Can serve as SP and IdP at the same time.

Use SAML 2, OAuth 2, OIDC, LDAP as upstream IdPs

Serve as IdP for SAML 2 (Web SSO, SOAP, PAOS bindings), SAML 2 Web & SOAP UNICORE Profile, OIDC, OAuth 2


UNITY IdM: Infrastructures

UNICORE Portal @JSC: https://unicore-portal.fz-juelich.de:8443/

DFN AAI for authentication

Still need proper account at JSC


Part: Installation


Installation: General

Latest releases of most important components linked on main website http://www.unicore.eu/

Detailed download section at http://www.unicore.eu/download/ contains all components

Packages are hosted on SourceForge


Installation: Basic

Core Server Bundle

https://sourceforge.net/projects/unicore/files/Servers/Core/

Content

Gateway

UNICORE/X

Registry

TSI

XUUDB

Requirements

OpenJDK 8 or Oracle Java 8

Python 2.7 or 3.x for the TSI


Installation: Workflow

https://sourceforge.net/projects/unicore/files/Servers/Workflow/

Content

Workflow Engine

Resource broker aka. “Service Orchestrator”


Installation: Federation

Common Registry

All services need to publish their availability to a common registry

Can publish to multiple registries

Clients support multiple registries

Authentication

Individual registrations (certificate)

Identity federation, e. g. via UNITY
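
The common-registry idea above can be sketched as an in-memory model. It is not the UNICORE Registry API: the `Registry` class, both helper functions and the example URLs are hypothetical, and the service type names are borrowed from the main services listed earlier.

```python
class Registry:
    """Toy registry: remembers which service URLs have been published."""

    def __init__(self, name):
        self.name = name
        self.entries = {}  # service URL -> service type

    def publish(self, url, service_type):
        self.entries[url] = service_type


def publish_everywhere(registries, url, service_type):
    """Server side: a service may publish to multiple registries."""
    for registry in registries:
        registry.publish(url, service_type)


def discover(registries, service_type):
    """Client side: union of matching services over all known registries."""
    found = set()
    for registry in registries:
        found.update(
            url for url, kind in registry.entries.items() if kind == service_type
        )
    return sorted(found)


site_registry = Registry("site")
federation_registry = Registry("federation")

# Hypothetical service endpoints
publish_everywhere(
    [site_registry, federation_registry],
    "https://hpc-a.example.org/services/TargetSystemFactory",
    "TargetSystemFactory",
)
federation_registry.publish(
    "https://hpc-b.example.org/services/TargetSystemFactory",
    "TargetSystemFactory",
)

visible = discover([site_registry, federation_registry], "TargetSystemFactory")
```

A client that knows both registries sees the union of the published sites, while a client that knows only the site registry sees the first site alone.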


Acknowledgements

Most slides shamelessly copied from my colleague Bernd Schuller.

Other team members

Valentina Huber, Andre Giesler, Maria Petrova-El Sayed, Jędrzej Rybicki, Rajveer Saini and many others at JSC

Krzysztof Benedyczak, Marcelina Borcz, Rafał Kluszczyński, Piotr Bała and others at ICM / Warsaw University

Richard Grunzke and others at Technical University Dresden

Students: Burak Bengi, Maciej Golik, Konstantine Muradov

... many others who reported bugs, suggested features, contributed code and provided patches
