INFSO-RI-508833
Enabling Grids for E-sciencE
www.eu-egee.org
gLite Overview
Jean Salzemann, CNRS/IN2P3, ACGRID School, Hanoi (Vietnam), November 5th, 2007
Credits: Charles Loomis, Mike Mineter, Giuseppe Andronico, Alex Villazon, and other EGEE colleagues
Glite Overview – J. Salzemann – 05/11/2007 2
EGEE
• EGEE = Enabling Grids for E-sciencE
  – Two projects of 2 years each: EGEE I and EGEE II
  – Over 70 leading institutions in more than 40 countries, federated in regional Grids
  – Currently 40,000 CPUs
  – 5 petabytes (5 million GB) of storage
  – ~200 Virtual Organizations (VOs)
gLite – Grid middleware
• The Grid relies on advanced software, the middleware, which interfaces between the resources and the applications
• The Grid middleware:
  – Finds convenient places for the application to be executed
  – Optimises use of resources
  – Organises efficient access to data
  – Deals with authentication to the different sites that are used
  – Runs the job and monitors progress
  – Transfers the result back to the scientist
gLite Legacy
[Diagram: middleware lineage, with boxes labelled EDG, Globus, Condor, LCG, AliEn, gLite, ...]
• EDG
  – Pan-European testbed.
  – Complete, functional set of services.
  – Significant productions demonstrated.
• LCG
  – Subset of EDG services.
  – Improved robustness, scalability.
  – Worldwide production service.
• gLite
  – Re-engineering: robustness, standard interfaces.
  – Expanded set of services.
“Desktop Grid” Architecture
• Peer-to-peer architecture (BOINC, XtremeWeb)
• Volatile resources.
• Limited security (client identifies server).
• Lightweight infrastructure.
• Handles limited types of resources.
[Diagram labels: User, App. DB, Resource, Resource, ...; arrow: pull task]
LCG Architecture
• Batch-like architecture.
• Stable, well-maintained resources.
• Secured via Public Key Infrastructure (PKI).
• Heavy support infrastructure.
• Can handle large range of resources.
[Diagram labels: User, Broker, InfoSys., Resource, Resource, ...; arrows: submit, submit, publish descriptions, query]
Service Oriented Architecture
• Existing LCG system is largely service-oriented.
• EGEE evolving to a clean SOA:
  – standard interfaces
  – standard technologies
[Diagram labels: Client, Service, Registry; arrows: query, location, publish description, interaction]
Convergence of Technologies
• Web Services
  – Clean, complete specification of service APIs.
  – Supported technology:
    • Good support within commercial sector.
    • Adequate support within open-source community.
  – Very active ➔ proposed standards rapidly evolving.
• EGEE Service Evolution
  – Plain web services:
    • Avoid “proprietary” protocols and interfaces.
    • Fairly stable, will ease further evolution.
  – Adopt WSRF and/or WS-* standards as appropriate.
• Expect user-visible changes in APIs.
Middleware face to face
LCG
• Security
  – GAS (Grid Access Service)
• Job Management
  – Condor + Globus
  – CE, WN
  – Logging & Bookkeeping
• Data Management
  – LCG services
• Information & Monitoring
  – BDII
• Grid Access
  – CLI + API
gLite
• Security
  – VOMS
• Job Management
  – Condor + blahp
  – CE, WN
  – Logging & Bookkeeping
  – Job Provenance
• Data Management
  – LFC
  – AMGA
• Information & Monitoring
  – R-GMA
  – Service Discovery
• Grid Access
  – CLI + API + Web Services
gLite – Overview
• gLite
  – First release 2005 (currently gLite 3.1)
  – Next-generation middleware for grid computing
  – Developed from existing components (Globus, Condor, ...)
  – Intended to replace present middleware with production-quality services
  – Interoperability and co-existence with deployed infrastructure
  – Robust: performance and fault tolerance
  – Open-source license
The Grid stack
• Application layer
  – Grid programs
• Collective layer
  – Resource Co-allocation
  – Data Replica Management
• Resource layer
  – Resource Management
  – Information Services
  – Data Access
• Connectivity layer
  – Grid Security Infrastructure
  – High-performance data transfer protocols
• Fabric layer
  – the hardware: computers (parallel, clusters..), data storage servers
[Diagram: the Grid layers (Application, Collective, Resource, Connectivity, Fabric) shown next to the Internet protocol layers (Application, Transport, Internet, Link)]
gLite Grid Middleware Services
[Diagram of service groups]
• Access: CLI, API
• Security Services: Authentication, Authorization, Auditing
• Information & Monitoring Services: Information & Monitoring, Application Monitoring
• Workload Mgmt Services: Computing Element, Workload Management, Job Provenance, Package Manager, Accounting
• Data Management: Metadata Catalog, File & Replica Catalog, Storage Element, Data Movement
• Also labelled in the diagram: Connectivity
Main components
• User Interface (UI): the place where users log on to the Grid
• Computing Element (CE): a batch queue on a site’s computers where the user’s job is executed
• Storage Element (SE): provides (large-scale) storage for files
• Resource Broker (RB): matches the user requirements with the available resources on the Grid
• Information System: characteristics and status of CE and SE (uses the “GLUE schema”)
Current production middleware
[Diagram: job flow between the “User interface”, Resource Broker, Replica Catalogue, Logging & Book-keeping, Information Service, Computing Element and Storage Element. Arrow labels: Job Submit Event, Job Query, Job Status, Input “sandbox”, Input “sandbox” + Broker Info, Output “sandbox”, Publish, SE & CE info, DataSets info, Author. & Authen.]
gLite: Workload Management System (WMS)
• Job Management Services: related to job management/execution
  – Computing Element
    • Job management (submission, control, …)
    • Information about characteristics and status
    • Actual execution is done in a Worker Node (WN)
  – Workload Management
    • Core component (see next slides)
  – Job Provenance
    • Keeps track of job definition, execution conditions, environment
    • Important points of the job life cycle
    • Debugging, post-mortem analysis, comparison of job execution
  – Package Manager
    • Extension of a traditional package management system to a grid
    • Automates the process of installing, upgrading, configuring and removing software packages from a shared area on a grid site
gLite: WMS architecture
Information Services
• Maintains information about hardware, software, services and people participating in a Virtual Organization
  – Should scale with the Grid’s growth
  – “Find a computer with at least 2 free CPUs and with 10GB of free disk space...”
• Globus MDS (Monitoring and Discovery System)
  – Hierarchical, push based
    (pull based) showed limitations
[Diagram labels: SNMP, GRIS, NIS, NWS, LDAP, MDS API, …, GIIS, …, Data Model]
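The example query on this slide ("at least 2 free CPUs and 10GB of free disk space") can be sketched as a simple filter over published resource records. This is only an illustration: the record fields, host names and the `find_matches` helper are assumptions, not part of MDS or gLite.

```python
from dataclasses import dataclass


@dataclass
class ComputeResource:
    name: str            # hypothetical host name
    free_cpus: int
    free_disk_gb: float


def find_matches(resources, min_free_cpus, min_free_disk_gb):
    """Return the resources that satisfy both requirements."""
    return [
        r for r in resources
        if r.free_cpus >= min_free_cpus and r.free_disk_gb >= min_free_disk_gb
    ]


# Made-up snapshot of what an information service might publish.
snapshot = [
    ComputeResource("ce-a.example.org", free_cpus=1, free_disk_gb=50.0),
    ComputeResource("ce-b.example.org", free_cpus=4, free_disk_gb=8.0),
    ComputeResource("ce-c.example.org", free_cpus=2, free_disk_gb=10.0),
]

matches = find_matches(snapshot, min_free_cpus=2, min_free_disk_gb=10)
print([r.name for r in matches])
```

Only the third record meets both thresholds; a real information system would answer the same kind of question from data published by the sites.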
gLite: Information System - BDII
• Berkeley Database Information Index (BDII)
  – A Monitoring and Discovery Service (MDS) evolution
  – Based on LDAP (Lightweight Directory Access Protocol)
  – Central system
    • Queries servers/providers about status
    • Stores the retrieved information in a database
    • Provides the information following the GLUE Schema
• Commands
    lcg-infosites --vo <your_vo> all | ce | se | lfc | lfcLocal [--is <your_bdii>]

    [gliteui] /home/martin > lcg-infosites --vo dpsgltb all --is glitece.dps.uibk.ac.at
    #CPU  Free  Total Jobs  Running  Waiting  ComputingElement
    ----------------------------------------------------------
       2     2           0        0        0  glitece.dps.uibk.ac.at:2119/blah-pbs-dpsgltb
    Avail Space(Kb)  Used Space(Kb)  Type  SEs
    ----------------------------------------------------------
    3172384          4664832         n.a   gliteio.dps.uibk.ac.at
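As a rough illustration of how a script could consume this listing, the sketch below parses the Computing Element rows from the sample output shown on the slide. It assumes exactly the column layout above (five integer columns followed by the CE identifier); real `lcg-infosites` output may differ between versions, and `parse_ce_rows` is a hypothetical helper.

```python
SAMPLE_OUTPUT = """\
#CPU Free Total Jobs Running Waiting ComputingElement
----------------------------------------------------------
2 2 0 0 0 glitece.dps.uibk.ac.at:2119/blah-pbs-dpsgltb
Avail Space(Kb) Used Space(Kb) Type SEs
----------------------------------------------------------
3172384 4664832 n.a gliteio.dps.uibk.ac.at
"""


def parse_ce_rows(text):
    """Extract Computing Element rows: five integer columns, then the CE id."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6 or not all(f.isdigit() for f in fields[:5]):
            continue  # header, separator, or Storage Element row
        cpus, free, total_jobs, running, waiting = (int(f) for f in fields[:5])
        rows.append({
            "ce": fields[5],
            "cpus": cpus,
            "free": free,
            "total_jobs": total_jobs,
            "running": running,
            "waiting": waiting,
        })
    return rows


for row in parse_ce_rows(SAMPLE_OUTPUT):
    print(row["ce"], "free CPUs:", row["free"])
```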
gLite: Information System - R-GMA
• Relational Grid Monitoring Architecture (R-GMA)
  – Developed as part of the European DataGrid Project (EDG)
  – Now part of the EGEE project
  – Based on the Grid Monitoring Architecture (GMA)
• Uses a relational data model
  – There is no central repository, only a “Virtual Database”
  – Schema is a list of table definitions
    • Additional tables/schema can be defined
  – Registry is a list of data producers with all their details
[Diagram labels: Schema, Registry, Virtual table, producers (Prod ... Prod), consumers (Cons ... Cons)]
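The producer/consumer/registry idea can be sketched in a few lines. This is a toy model, not the R-GMA API: the `Registry` class, the `query` function and the `cpu_load` table are all invented for illustration. The point it shows is that no rows are stored centrally; a consumer's query is answered by asking every producer registered for the table.

```python
SCHEMA = {"cpu_load": ("site", "load")}  # hypothetical table definition


class Registry:
    """Lists the producers that publish rows for each table."""

    def __init__(self):
        self._producers = {}  # table name -> list of producer callables

    def register(self, table, producer):
        if table not in SCHEMA:
            raise KeyError(f"unknown table: {table}")
        self._producers.setdefault(table, []).append(producer)

    def producers_for(self, table):
        return list(self._producers.get(table, []))


def query(registry, table, predicate=lambda row: True):
    """Consumer side: gather matching rows from every registered producer."""
    rows = []
    for producer in registry.producers_for(table):
        rows.extend(row for row in producer() if predicate(row))
    return rows


registry = Registry()
registry.register("cpu_load", lambda: [{"site": "site-a", "load": 0.2}])
registry.register("cpu_load", lambda: [{"site": "site-b", "load": 0.9}])

busy = query(registry, "cpu_load", lambda row: row["load"] > 0.5)
print(busy)
```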
Resource Management
• Everything (or anything) is a resource
  – Physical or logical (single computer, cluster, parallel, data storage, an application...)
  – Defined in terms of interfaces, not devices
• Each site must be autonomous (local system administration policy)
• Grid Resource Allocation Manager (GRAM)
  – Defines resource layer protocols and APIs that enable clients to securely instantiate a Grid computational task (i.e. a job)
  – Secure remote job submissions
  – Relies on local resource management interfaces
[Diagram: GRAM on top of the local resource managers LSF, PBS, LL, SGE]
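"Defined in terms of interfaces, not devices" can be illustrated with a small adapter sketch: one common interface, one implementation per local batch system. The class names and `build_submission` are assumptions made for this example and are not GRAM code; the submit commands (`qsub`, `bsub`, `llsubmit`) are the usual entry points of those batch systems, with argument handling deliberately reduced to a bare script path.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class LocalResourceManager(ABC):
    """Common interface that hides which batch system a site runs."""

    @abstractmethod
    def submit_command(self, script_path: str) -> List[str]:
        """Return the local command line that would submit the script."""


class PBS(LocalResourceManager):
    def submit_command(self, script_path: str) -> List[str]:
        return ["qsub", script_path]


class LSF(LocalResourceManager):
    def submit_command(self, script_path: str) -> List[str]:
        return ["bsub", script_path]


class LoadLeveler(LocalResourceManager):
    def submit_command(self, script_path: str) -> List[str]:
        return ["llsubmit", script_path]


class SGE(LocalResourceManager):
    def submit_command(self, script_path: str) -> List[str]:
        return ["qsub", script_path]


# Hypothetical site configuration keys.
MANAGERS: Dict[str, LocalResourceManager] = {
    "pbs": PBS(), "lsf": LSF(), "ll": LoadLeveler(), "sge": SGE(),
}


def build_submission(site_manager: str, script_path: str) -> List[str]:
    """The caller never needs to know which batch system is behind the site."""
    return MANAGERS[site_manager].submit_command(script_path)


print(build_submission("lsf", "job.sh"))
```

The command is only built, never executed, so the sketch runs on a machine without any batch system installed.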
Data Management: Protocols
• Data access and transfer
  – Simple, automatic multi-protocol file transfer tools:
    • Integrated with Resource Management service
    • Move data from/to local machine to remote machine, where the job is executed (stage-in, stage-out)
    • Redirect stdin to a remote location
    • Redirect stdout and stderr to the local computer
    • Pull executable from a remote location
  – To have a secure, high-performance, reliable file transfer over modern WANs: GridFTP
gLite: Data management - Services
• Catalog
  – File and Replica Catalog
  – File Authorization Service
  – Metadata catalog
  – Distribution of catalogs, conflicts resolution
• Storage Elements (SE)
  – SRM (Storage Resource Manager) interface
  – Transfer protocols (gsiftp, rfio, …)
[Diagram: a Catalog and several Storage Elements (SE)]
File management in gLite
• Files are write-once, read-many
  – If users edit files then they manage the consequences!
• Middleware supporting
  – Replica files
    • To be close to where you want computation
    • For resilience
  – Logical filenames
  – Catalogue: maps logical name to physical storage device/file
  – Virtual filesystems, POSIX-like I/O
• Services provided:
  – storage
  – transfer
  – catalogue that maps logical filenames to replicas.
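A catalogue that maps logical filenames to replicas can be pictured as a dictionary from one logical name to several physical locations. The sketch below is a toy model under that assumption; `ReplicaCatalogue`, the `lfn:` name and the `example.org` hosts are invented for illustration and do not reflect the LFC interface.

```python
from urllib.parse import urlparse


class ReplicaCatalogue:
    """Maps a logical filename to the physical replicas that hold it."""

    def __init__(self):
        self._replicas = {}  # logical filename -> list of replica URLs

    def add_replica(self, lfn, replica_url):
        self._replicas.setdefault(lfn, []).append(replica_url)

    def replicas(self, lfn):
        return list(self._replicas.get(lfn, []))

    def pick(self, lfn, preferred_host):
        """Prefer a replica on the given host, else fall back to the first one."""
        candidates = self.replicas(lfn)
        if not candidates:
            raise KeyError(lfn)
        for url in candidates:
            if urlparse(url).hostname == preferred_host:
                return url
        return candidates[0]


catalogue = ReplicaCatalogue()
lfn = "lfn:/grid/demo/run42.dat"  # hypothetical logical filename
catalogue.add_replica(lfn, "gsiftp://se1.example.org/data/run42.dat")
catalogue.add_replica(lfn, "gsiftp://se2.example.org/store/run42.dat")

print(catalogue.pick(lfn, preferred_host="se2.example.org"))
```

Choosing the replica on the preferred host mirrors the "close to where you want computation" motivation, while the fallback reflects the resilience motivation.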
Security
• Basic security:
  – Authentication: Who are we on the Grid?
  – Authorization: Do we have access to a resource/service?
  – Protection: Data integrity and confidentiality
• But there are thousands of resources over different administration domains:
  – Single sign-on, i.e. give a password once, and be able to access all resources (to which we have access)
• Grid Security Infrastructure (GSI):
  – Grid credentials: digital certificate and private key
    • Based on Public Key Infrastructure (PKI), X.509 standard
    • Certification Authority (CA) signs certificates. Trust relationship
  – Proxy certificates:
    • Temporary certificates signed by the user (not by a CA), allowing single sign-on
    • Proxy delegation
[Diagram: CA –sign→ User –sign→ Proxy –sign→ Proxy ...]
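The CA → User → Proxy → Proxy chain can be sketched as a linked list of certificates in which each entry names its issuer. The model below is deliberately simplified: it checks only issuer/subject linkage and expiry, performs no cryptographic signature verification (which real GSI does), and the `Cert` type and subject strings are invented for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Cert:
    subject: str
    issuer: str
    not_after: datetime


def chain_is_valid(chain: List[Cert], trusted_ca: str, now: datetime) -> bool:
    """chain[0] is the user certificate; later entries are proxies."""
    if not chain or chain[0].issuer != trusted_ca:
        return False
    # Each proxy must be issued by the certificate just before it.
    for parent, child in zip(chain, chain[1:]):
        if child.issuer != parent.subject:
            return False
    # Every link must still be within its lifetime.
    return all(cert.not_after > now for cert in chain)


now = datetime(2007, 11, 5, 12, 0)
user = Cert("CN=Bob", "CN=Example CA", now + timedelta(days=300))
proxy = Cert("CN=Bob/proxy", "CN=Bob", now + timedelta(hours=12))
delegated = Cert("CN=Bob/proxy/proxy", "CN=Bob/proxy", now + timedelta(hours=6))

print(chain_is_valid([user, proxy, delegated], "CN=Example CA", now))
```

Short proxy lifetimes limit the damage if a proxy is stolen, which is why the expiry check applies to every link and not only to the user certificate.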
Conventional grid security
[Diagram labels: Bob, Certification Authority (CA), Cert request, Bob’s Grid certificate, User Interface (UI), grid-proxy-init, Grid resources (A), Grid resources (B)]
• Sysadmin A:
  – Create user “grid1”
  – Map Bob’s certificate to “grid01”
• Sysadmin B:
  – Create user “user001”
  – Map Bob’s certificate to “user001”
• Provided:
  – Single sign-on
  – Delegation through proxy certificate
• Limitations:
  – Manual user “mapping”
  – No info about VOs
gLite – Enhanced security in gLite
[Diagram labels: Bob, Certification Authority (CA), Cert request, Bob’s Grid certificate, User Interface (UI), voms-proxy-init, VO membership request, VO Service, VO Database, VO Manager, VO, Grid resources (A), Grid resources (B)]
• Grid resources (A): VO Account Pool, automatic mapping for Bob
• Grid resources (B): VO Account Pool, automatic mapping for Bob
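The difference from manual mapping can be sketched as a pool of pre-created local accounts that a site leases to any member of a VO. This is a minimal illustration only: `AccountPool`, the account naming scheme and the subject string are assumptions, not the gLite implementation.

```python
class AccountPool:
    """Leases local accounts from a fixed pool to members of one VO."""

    def __init__(self, vo, size):
        self.vo = vo
        self._free = [f"{vo}{n:03d}" for n in range(1, size + 1)]
        self._leases = {}  # certificate subject -> local account

    def map_user(self, subject, user_vos):
        """Return the local account for this user, leasing one if needed."""
        if self.vo not in user_vos:
            raise PermissionError(f"{subject} is not a member of {self.vo}")
        if subject not in self._leases:
            if not self._free:
                raise RuntimeError("account pool exhausted")
            self._leases[subject] = self._free.pop(0)
        return self._leases[subject]


pool = AccountPool("demo", size=2)   # hypothetical VO name
bob = "/O=Example/CN=Bob"            # hypothetical certificate subject
print(pool.map_user(bob, {"demo"}))
```

No administrator has to create an account for Bob by hand; the site only needs to know that he belongs to the VO.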
gLite: VOMS
• Virtual Organization Membership Service (VOMS)
  – EGEE/gLite enhancement for VO management
  – Provides information on the user's relationship with a Virtual Organization (VO)
    • Membership
    • Group membership
    • Roles of user
  – Multiple VOs
    • User can register with multiple VOs and create an aggregate proxy
    • Access resources in every registered VO
  – Backward compatibility
    • Extra VO-related information in the user's proxy certificate
    • The user's proxy can still be used with non-VOMS-aware services
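The aggregate proxy and the backward-compatibility point can be pictured with a small data model: the proxy carries the usual subject plus a list of VO attributes, and a service that does not understand VOMS simply ignores the extra list. The types and function names below are invented for illustration and are not the VOMS API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class VOAttribute:
    vo: str
    group: str
    role: Optional[str] = None


@dataclass
class Proxy:
    subject: str
    attributes: List[VOAttribute] = field(default_factory=list)


def voms_aware_check(proxy: Proxy, vo: str, role: Optional[str] = None) -> bool:
    """A VOMS-aware service authorises on VO membership and, optionally, role."""
    return any(
        attr.vo == vo and (role is None or attr.role == role)
        for attr in proxy.attributes
    )


def legacy_identity(proxy: Proxy) -> str:
    """A non-VOMS-aware service only looks at the certificate subject."""
    return proxy.subject


# Aggregate proxy for a user registered in two (hypothetical) VOs.
bob = Proxy(
    subject="/O=Example/CN=Bob",
    attributes=[
        VOAttribute(vo="vo-a", group="/vo-a/analysis", role="production"),
        VOAttribute(vo="vo-b", group="/vo-b"),
    ],
)

print(voms_aware_check(bob, "vo-a", role="production"))
print(voms_aware_check(bob, "vo-b", role="production"))
print(legacy_identity(bob))
```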
gLite: VOMS - Web interface
• Requires a valid certificate from a recognized CA, imported into the browser
• VO user can
  – Query membership details
  – Register in the VO (needs a valid certificate)
  – Track their requests
• VO manager can
  – Handle requests from users
  – Administer the VO
• Everybody can
  – Get information about the VO
QUESTIONS ??