Introduction to Grid Technologies
Cameron Kiddle
Grid Systems Architect, WestGrid
Research Fellow, Grid Research Centre (GRC)
WestGrid Seminar Series Feb. 6, 2008

Presentation Outline/Goals
• Provide an overview of grid computing
• Introduce various grid computing technologies
• Identify the grid computing technologies currently supported by WestGrid
• Demonstrate, through examples, how researchers can benefit from these technologies

What is Grid Computing?
• Many different definitions/uses
  - Computational grids, data grids, desktop grids, campus grids, sensor grids, access grids
• Coordinated sharing of resources that can span multiple administrative domains
• Related terms
  - Utility computing, computing on demand, cloud computing, e-Science, e-Infrastructure, cyberinfrastructure

Grid Computing Goals
• Accessibility
  - Providing users with easier access to more resources
• Collaboration
  - Enabling large-scale collaborations
• Utility
  - Providing on-demand access to computing resources, similar to public utilities such as electricity or water
• Transparency
  - Providing users access to computing resources without the need to know how or where computations are taking place

Virtual Organization (VO)
• A group of people, typically spanning institutional and regional boundaries, who share resources to collaborate on a common project
[Figure: resources shared by Virtual Organization X and by Virtual Organization Y, drawn across Domain A, Domain B and Domain C]

Example Grid Projects
Name | Description
International Virtual Observatory Alliance (IVOA), http://www.ivoa.net/ | Development of standards and infrastructure to share and analyze astronomical archives from around the world
BioSimGrid, http://www.biosimgrid.org/ | Development of a grid-enabled biomolecular simulation database to make results more accessible to the biological community
Network for Earthquake Engineering Simulation (NEES), http://www.nees.org/ | A US national network of 15 facilities to study the impact of earthquakes on buildings, bridges, etc.
LHC Computing Grid, http://lcg.web.cern.ch/ | Data storage and analysis infrastructure for the high energy physics community using the Large Hadron Collider (LHC) at CERN (ATLAS Tier-1 site at TRIUMF in British Columbia)

Grid Technologies
• Grid Middleware
• Security
• VO Management Services
• Information Services
• Data Management Services
• Execution Management Services
• Web Portals/Scientific Gateways
• Virtualization

Grid Middleware
• The layer between users/applications and grid resources that glues everything together
• Example grid middleware
  - Globus Toolkit (GT)
    - GT2 – pre-standards
    - GT4 – standards based (Web Services)
  - UNICORE
  - gLite
  - ARC
  - NAREGI

Security
• Authentication
  - X.509 certificates (IETF)
    - Used to identify and authenticate users and services
    - Based on public key cryptography
    - Issued and signed by a certificate authority
    - Provide a global name space
    - Enable single sign-on
• Authorization
  - grid-mapfile
    - Maps distinguished names (found in certificates) of authorized users to local user names (e.g., Unix login)
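
To make the grid-mapfile idea concrete, here is a minimal Python sketch of the authorization lookup. It assumes the common grid-mapfile layout of one quoted distinguished name followed by one local user name per line. The sample DNs, the account names and the helper names `parse_grid_mapfile` and `local_user_for` are invented for illustration and are not taken from any WestGrid configuration.

```python
import shlex

# Hypothetical sample entries: a quoted distinguished name (DN), then a local account.
SAMPLE_GRID_MAPFILE = '''
# comment lines and blank lines are ignored
"/C=CA/O=Grid/OU=example.ca/CN=Jane Doe" jdoe
"/C=CA/O=Grid/OU=example.ca/CN=John Smith" jsmith
'''


def parse_grid_mapfile(text):
    """Return a dict mapping each certificate DN to a local user name."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = shlex.split(line)  # honours the quotes around the DN
        if len(fields) != 2:
            raise ValueError(f"unexpected grid-mapfile line: {line!r}")
        dn, local_user = fields
        mapping[dn] = local_user
    return mapping


def local_user_for(dn, mapping):
    """Authorize a DN: return its local account, or None if it is not listed."""
    return mapping.get(dn)


if __name__ == "__main__":
    users = parse_grid_mapfile(SAMPLE_GRID_MAPFILE)
    print(local_user_for("/C=CA/O=Grid/OU=example.ca/CN=Jane Doe", users))
```

Authentication (proving the certificate is valid) happens before this step; the lookup only decides which local account, if any, an already authenticated DN may use.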

VO Management Services
• Services for managing membership and roles within a Virtual Organization
• Help simplify user account management
• Examples
  - VOMS, GUMS, PRIMA
  - Shibboleth, GridShib, myVocs
  - Other – CAS, Akenti, PERMIS, SHEBANGS

Information Services
• Provide information about resources, policy, services and applications to tools and users
• Information models
  - DMTF Common Information Model (CIM)
  - GLUE Schema
  - GRC Model Schema
• Example services
  - Monitoring and Discovery System (MDS)
    - MDS 2 – LDAP based
    - MDS 4 (WS MDS) – Web Service based
  - Relational Grid Monitoring Architecture (R-GMA)
  - Berkeley Database Information Index (BDII)
  - Universal Description, Discovery, and Integration (UDDI)

Data Management Services
• Data transfer
  - GridFTP
  - Reliable File Transfer (RFT)
• Data replication
  - Replica Location Service (RLS)
• Metadata management
  - Metadata Catalog Service (MCS)
• Higher level data management services
  - Data Replication Service (DRS)
  - Storage Resource Broker (SRB)
  - Integrated Rule-Oriented Data System (iRODS)
  - Proactive Data Management System (PDMS)

Execution Management Services
• Handle placement, provisioning and lifetime management of jobs
• Job submission
  - Submission of jobs to different types of resources
  - Grid Resource Allocation and Management (GRAM)
• Meta-schedulers
  - Higher level schedulers that manage and distribute jobs between different local schedulers
  - Examples: Condor-G, CSF, GridWay, Moab Grid Scheduler (Silver)
• Workflow managers
  - Automate the management and submission of a set of jobs that have various ordering dependencies
  - Examples: DAGMan, Kepler, Triana, Taverna, Pegasus

Web Portals/Scientific Gateways
• Provide Web-based access to computing resources for communities of users
• Web portal development software
  - WebSphere
  - GridSphere
• Web 2.0 technologies
  - Social networking (Facebook), wikis (Wikipedia), blogs, …
• Example portals
  - nanoHUB
  - myExperiment

Virtualization
• Can transform a single physical machine into multiple virtual machines (VMs), each with its own OS and software stack
• Virtualization software
  - Xen, VMware
  - Supports allocation, deallocation, suspension and migration of VMs
• Benefits
  - Custom environments (root access), resource consolidation, system maintenance without disruption

WestGrid and Grid
• Is WestGrid a computational grid?
• Provides grid enabled resources
• Security services
• Data transfer tools
• Job submission services
• WestGrid resources can be part of computational grids

[Figure: GT4-based Grid Environment]

[Figure: Grid Services Status]

Grid Services Supported in WestGrid
• Security Services
  - GSI (Grid Security Infrastructure), X.509 certificates, GSI-OpenSSH, MyProxy
• Information Services
  - WS MDS, WebMDS
• Data Management Services
  - GridFTP, RFT
• Execution Management Services
  - WS GRAM, Condor-G (by request), DAGMan (by request)

Certificates in WestGrid
• A user automatically receives a certificate when applying for an account
• The certificate and password-protected private key are stored in the user's $HOME/.globus/ directory on their home site
• Certificates are issued by Grid Canada
• Certificates must be renewed annually (users receive 60 days' notice by e-mail)

GSI-OpenSSH
• GSI enabled version of OpenSSH
• Provides a single sign-on remote login and file transfer service
• Command line tools:
  - gsissh – GSI enabled ssh
  - gsiscp – GSI enabled scp
  - gsisftp – GSI enabled sftp

MyProxy
• Developed by NCSA (National Center for Supercomputing Applications)
• Credential repository
  - Allows a proxy credential to be retrieved from any machine
  - Can allow trusted services to renew proxy credentials
• WestGrid MyProxy Server - myproxy.westgrid.ca

WS MDS
• Web Services version of the Monitoring and Discovery System
• Index Service
  - Collects data and provides a query/subscription interface to the data
  - Can create a hierarchy of index services
• Trigger Service
  - Collects data and takes actions based on the data

GRC Model Schema
• Models developed to describe systems, applications and scheduler policy
[Figure: System Model Class Diagram]

WebMDS
• Customizable Web-based interface for WS MDS information

GridFTP
• Based on FTP (File Transfer Protocol)
• GSI security on control and data channels
• Supports third-party transfers
• Improved efficiency of transfers
  - Modification of TCP buffer sizes
  - Parallel transfers (multiple TCP streams)
  - Striped transfers
• Command line tools: globus-url-copy, gcp
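
TCP buffer tuning is usually reasoned about through the bandwidth-delay product: the amount of data that must be in flight to keep a link full. The sketch below works through that arithmetic in Python. The link speed, round-trip time and stream count are illustrative assumptions rather than WestGrid measurements, and dividing the buffer evenly across parallel streams is a rule of thumb, not a GridFTP requirement.

```python
def bandwidth_delay_product_bytes(bandwidth_bits_per_s, rtt_seconds):
    """Bytes that must be in flight to keep a link of this speed and RTT full."""
    return bandwidth_bits_per_s * rtt_seconds / 8


def per_stream_buffer_bytes(bandwidth_bits_per_s, rtt_seconds, streams):
    """Rule-of-thumb buffer per TCP stream when the load is split evenly."""
    if streams < 1:
        raise ValueError("streams must be at least 1")
    return bandwidth_delay_product_bytes(bandwidth_bits_per_s, rtt_seconds) / streams


if __name__ == "__main__":
    # Assumed example path: 1 Gbit/s with a 50 ms round-trip time.
    bdp = bandwidth_delay_product_bytes(1e9, 0.050)
    print(f"single stream: {bdp / 1e6:.2f} MB")
    each = per_stream_buffer_bytes(1e9, 0.050, 4)
    print(f"4 parallel streams: {each / 1e6:.2f} MB each")
```

On such a path a default buffer of a few tens of kilobytes would leave most of the link idle, which is why larger buffers and multiple streams improve throughput on long, fast links.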

[Figure: GridFTP Performance]

Reliable File Transfer (RFT)
• Manages a set of third-party GridFTP transfers
• Uses a database to checkpoint transfer state
• Recovers from
  - Source/destination server failures
  - Network failures
  - Container failures
• Transfers retried with exponential backoff
• Resumes transfers where they left off
• Command line tools: rft, rft_delete
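
Retrying with exponential backoff means doubling the wait after each failed attempt, so a struggling server or network is not hammered with immediate retries. The following Python sketch illustrates the general pattern only; it is not RFT's implementation, and the function name, the default limits and the simulated transfer are all assumptions made for the example.

```python
import time


def retry_with_backoff(operation, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call operation() until it succeeds, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except OSError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # 1 s, 2 s, 4 s, ... with the defaults


if __name__ == "__main__":
    outcomes = iter([OSError("network down"), OSError("server busy"), "transfer complete"])

    def flaky_transfer():
        # Stand-in for a real transfer: fails twice, then succeeds.
        result = next(outcomes)
        if isinstance(result, Exception):
            raise result
        return result

    waits = []  # record the delays instead of actually sleeping
    print(retry_with_backoff(flaky_transfer, sleep=waits.append), waits)
```

Passing `sleep` as a parameter keeps the example quick to run and easy to test; a real service would also persist its progress, as the checkpointing bullet above describes, so that a retry can resume rather than restart.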

WS GRAM
• Web Services version of the Grid Resource Allocation and Management protocol
• Provides a single standard interface for remote job submission and resource management
• Requires users and application developers to learn only one method to gain access to a large variety of local management systems
• Command line tools: globusrun-ws

Condor-G
• Developed at the University of Wisconsin-Madison
• An extension of Condor that makes use of Globus services to submit jobs to different sites
• Matchmaking functionality to match jobs with appropriate resources
• Available for use in WestGrid by request
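
Matchmaking can be pictured as filtering resources by a job's requirements and then ranking the survivors. The Python sketch below is a deliberately simplified illustration and does not use Condor's own ClassAd language; the attribute names, the site names and the "most free CPUs" ranking rule are assumptions chosen for the example.

```python
def matches(job, resource):
    """A resource matches when it meets every requirement the job states."""
    return (
        resource["free_cpus"] >= job["cpus"]
        and resource["memory_gb"] >= job["memory_gb"]
        and resource["arch"] == job["arch"]
    )


def matchmake(job, resources):
    """Return the name of the matching resource with the most free CPUs, or None."""
    candidates = [r for r in resources if matches(job, r)]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r["free_cpus"])["name"]


if __name__ == "__main__":
    # Hypothetical resource advertisements from three sites.
    resources = [
        {"name": "site-a", "arch": "x86_64", "free_cpus": 8, "memory_gb": 16},
        {"name": "site-b", "arch": "x86_64", "free_cpus": 64, "memory_gb": 128},
        {"name": "site-c", "arch": "ppc64", "free_cpus": 256, "memory_gb": 512},
    ]
    job = {"cpus": 16, "memory_gb": 32, "arch": "x86_64"}
    print(matchmake(job, resources))
```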

DAGMan
• Part of the Condor software
• Manages workflows that are directed acyclic graphs (DAGs), ensuring that jobs with dependencies are executed in the correct order
• Available for use in WestGrid by request
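
Executing a DAG "in the correct order" amounts to a topological sort: a job runs only after every job it depends on has finished. The Python sketch below shows that ordering with the standard library's `graphlib` module (Python 3.9 or later). It is an illustration of the idea rather than DAGMan itself, and the job names are placeholders.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical workflow: each job maps to the set of jobs that must finish first.
DEPENDENCIES = {
    "simulate": set(),
    "render": {"simulate"},
    "animate": {"render"},
    "archive": {"animate"},
}


def execution_order(dependencies):
    """Return one valid run order; raises CycleError if the graph has a cycle."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    print(execution_order(DEPENDENCIES))
    try:
        execution_order({"a": {"b"}, "b": {"a"}})
    except CycleError:
        print("not a DAG: the jobs depend on each other")
```

The cycle check matters because a workflow with circular dependencies can never start; requiring the graph to be acyclic guarantees that at least one valid order exists.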

Use Case: Life3D Simulations
• Example to illustrate the benefits of workflow management
• 3-dimensional version of The Game of Life
• Workflow includes simulation, rendering and animation

Life3D - Workflow - I
[Figure: workflow diagram with Life3D Simulation, Rendering and Animation stages and four Stage Data steps; resources shown: gridstore, lattice, grc15, octarine]

Life3D - Workflow - II
[Figure: seven numbered workflow steps, including 2. Life3D Simulation, 4. Rendering and 6. Animation, plus Data Storage; resources shown under the labels WestGrid and Grid Research Center: gridstore (SFU), lattice (UofC), grc15, octarine]

Life3D - Technologies Used
• Grid Middleware
  - GT2
• Data Management
  - GridFTP
• Execution Management
  - Job Submission - GRAM
  - Meta-scheduler - Condor-G
  - Workflow Manager - DAGMan

Use Case: Confederation Bridge ICE Force Monitoring Project
• Monitoring of forces on the Confederation Bridge
• Data analyzed by civil engineering groups at University of Calgary and Carleton University
• GRC developed a solution to automate data management as part of a CANARIE AAP project
(http://www.confederationbridge.com)

ICE Force - Technologies Used
• Grid Middleware
  - GT4
• Data Management
  - PDMS
  - Data Transfer - GridFTP, RFT
  - Replication Management - RLS
  - Metadata Management - MCS

Use Case: Molecular Dynamics Simulations
• GROMACS
  - Parallel molecular dynamics simulation application
  - Can simulate hundreds to millions of particles
  - Simulation runs can take days, weeks or months
• Issues with long running jobs
  - Fault tolerance
  - Scheduler policy constraints
(http://moose.bio.ucalgary.ca/)

GROMACS: Grid Enabled Solution
• Automated grid enabled solution developed by GRC to manage GROMACS simulations as part of a CANARIE AAP project
• Long jobs split into a series of shorter jobs
• Automates checkpointing, migration and reconfiguration of jobs
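
Splitting a long job into shorter ones can be sketched as planning segments that each fit within a scheduler's wall-time limit, with every segment restarting from the checkpoint written by the previous one. The Python example below illustrates that planning step only. It is not the GRC system, and the function name, the step counts and the 24-hour limit are assumptions made for the example.

```python
def plan_segments(total_steps, steps_per_hour, walltime_limit_hours):
    """Split a run into (start_step, end_step) segments that each fit the limit."""
    if min(total_steps, steps_per_hour, walltime_limit_hours) <= 0:
        raise ValueError("all arguments must be positive")
    steps_per_segment = int(steps_per_hour * walltime_limit_hours)
    if steps_per_segment < 1:
        raise ValueError("the wall-time limit is too short for a single step")
    segments = []
    start = 0
    while start < total_steps:
        end = min(start + steps_per_segment, total_steps)
        segments.append((start, end))  # the next segment resumes from the checkpoint at `end`
        start = end
    return segments


if __name__ == "__main__":
    # Assumed figures: 1,000,000 steps at 10,000 steps/hour under a 24-hour limit.
    for start, end in plan_segments(1_000_000, 10_000, 24):
        print(f"job covering steps {start} to {end}")
```

Because each segment depends only on the checkpoint before it, a failed segment can be rerun, or moved to another site, without repeating the earlier ones.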

[Figure: GROMACS: Portal]

GROMACS - Technologies Used
• Grid Middleware
  - GT4
• Information Services
  - WS MDS
• Data Management
  - PDMS (GridFTP, RFT, RLS, MCS)
• Execution Management
  - Custom system (Condor-G, GRAM)
• Portal
  - GridSphere

Use Case: Fire Simulation
• Developed a comprehensive environment for the Fire Dynamics Simulator (FDS) as part of a collaborative project between GRC and HP Labs
• Deployed on HP Labs Data Centre at University of Calgary
• Initial focus of project
  - Leverage Web 2.0 technologies
  - Explore use of virtualization in a utility computing environment

Fire Simulation - Technologies Used
• User level
  - Web 2.0 interface (Facebook)
• Service provider level
  - LAMP environment (Linux, Apache, MySQL, Perl/Python/PHP)
  - Simulation (FDS, Condor)
  - Visualization (Smokeview, VNC)
• Resource (utility) provider level
  - Virtualization (Xen)

Use Case: Ecosystem Modelling
• ecosys
  - An application that models ecosystems (agriculture, forests, savannah, grassland, tundra, desert)
  - Used to study ecosystem behavior under different environmental conditions (Dr. Robert Grant – UofA)
  - Individual experiments consist of several hundred simulations with common and run-specific input files
• GRC is developing a portal/experiment engine to automate execution of experiments as part of an Alberta Cyberinfrastructure (Cybera) pilot project
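
An experiment made up of many simulations with common and run-specific input files can be described as a list of runs, each pairing the shared files with its own. The Python sketch below shows that bookkeeping only. It is not the GRC experiment engine, and the function name and all file names are invented for illustration.

```python
def build_runs(common_inputs, run_specific_inputs):
    """Return one job description per run: shared files plus that run's own files."""
    return [
        {"run_id": run_id, "inputs": list(common_inputs) + list(files)}
        for run_id, files in sorted(run_specific_inputs.items())
    ]


if __name__ == "__main__":
    # Hypothetical file names; a real experiment would have several hundred runs.
    common = ["soil.dat", "climate.dat"]
    per_run = {"run002": ["site_b.in"], "run001": ["site_a.in"]}
    for run in build_runs(common, per_run):
        print(run["run_id"], run["inputs"])
```

Each resulting description could then be handed to a job submission service, with the shared files staged once per site rather than once per run.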

ecosys - Technologies Used
• Grid Middleware
  - GT4
• Information Services
  - WS MDS
• Data Management
  - GridFTP, Stork, SRB
• Execution Management
  - Custom system (Condor-G, GRAM)
• Portal
  - Web 2.0 based portal (with GSI authentication)
• Virtualization
  - Xen (in HP Labs Data Centre)

Summary
• Grid computing technologies enable sharing of resources across administrative domains
• They are aimed at improving accessibility to resources, enabling large-scale collaborations, providing computing on demand and making access to resources transparent
• There is a large variety of technologies supporting security, VO management, information services, data management, execution management, portals and virtualization
• WestGrid supports various technologies based on the GT4 grid middleware
• Developing grid solutions is not easy, but there can be substantial benefits
Contact Information
Cameron Kiddle
(WestGrid) http://www.westgrid.ca/
(GRC) http://grid.ucalgary.ca/