The EDGeS project receives Community research funding
Bridging EGEE to BOINC and XtremWeb
GIN : From interoperation to interoperability
Authors : Z. Balaton, G. Caillat, Z. Farkas, G. Fedak, G. Gombas, P. Kacsuk, A. Kornafeld, J. Kovacs, H. He, O. Lodygensky, A. Marosi, E. Urbah
v2.13
Overview
• Definitions : Service Grids and Desktop Grids
• Presentation of the EDGeS project
• Bridge BOINC → EGEE
• Bridge XtremWeb → EGEE
• Bridge EGEE → BOINC
• Bridge EGEE → XtremWeb
• Architecture of the EDGeS 3G Bridge
• Desktop Grid Production Infrastructure
• OGF standards used for future interoperability
SG = Service Grid = Managed grid of managed computing clusters
[Diagram: Service Grid actors and flows. Actors: Grid User, VOMS Admin (manages the VO), Site Admin (manages the Site), Grid Admin. Components: VOMS Server, Meta-scheduler (WMS), Logging & Bookkeeping, Accounting, Site Computing Resource, Site Storage Resource. Flows: the Grid User gets an X509 proxy with VOMS extensions and submits a Job with it; available Resources are published; the WMS pushes the Job; data is accessed with the X509 proxy; events are logged; the Output Sandbox is sent back; Job status, accounting and auditing are given.]
• Computing and Storage Resources are managed by trained staff inside Sites and are authenticated by X509 certificates.
• Users are authenticated by X509 certificates or proxies.
• Users belong to VOs and get an X509 proxy from a VOMS server to :
– Access data,
– Submit jobs.
• Executables are NOT authenticated. So trust is primarily between Sites and VOs.
• Order of magnitude is typically 100 000 CPUs.
• A meta-scheduler (WMS) pushes the jobs to resources which are both suitable and available.
Examples : EGEE, NorduGrid, OSG, DEISA, …
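The push model described above can be illustrated with a small sketch. This is not the WMS matchmaking algorithm: the `Resource` and `Job` classes, the VO-based suitability test and the "most free slots" tie-break are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    free_slots: int            # availability, as published by the site
    supported_vos: frozenset   # VOs accepted by the site (hypothetical suitability criterion)


@dataclass
class Job:
    job_id: str
    vo: str


def push_job(job, resources):
    """Push the job to a resource that is both suitable and available.

    Returns the name of the chosen resource, or None if nothing matches.
    """
    candidates = [r for r in resources
                  if job.vo in r.supported_vos and r.free_slots > 0]
    if not candidates:
        return None
    # Assumed policy: prefer the resource with the most free slots.
    target = max(candidates, key=lambda r: r.free_slots)
    target.free_slots -= 1
    return target.name


sites = [
    Resource("site-a", 0, frozenset({"vo1"})),          # suitable but not available
    Resource("site-b", 2, frozenset({"vo1", "vo2"})),   # suitable and available
    Resource("site-c", 5, frozenset({"vo2"})),          # available but not suitable
]
assignment = push_job(Job("j1", "vo1"), sites)
print(assignment)
```

The point of the sketch is that the decision is taken centrally by the scheduler, which needs the published state of every resource.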
DG = Desktop Grid = Loose grid scavenging idle resources
Unit of Work = Application + Input Data
[Diagram: Desktop Grid actors and flows. The Grid User submits input data for an application to the Grid Server with Application Repository. The Application Manager certifies the Application. The Resource Owner (often a volunteer) owns the Computing Resource (often a Desktop Computer) and accepts or refuses an application on his resource. The Computing Resource requests a Unit of Work, the Grid Server sends it, and the results are sent back.]
Currently, for BOINC, both roles of ‘Application Manager’ and ‘Grid User’ are fulfilled by ‘BOINC Project Owners’.
• Computing and Storage Resources are owned by various Owners (it is often volunteer computing), but they are NOT managed and NOT authenticated.
• Grid Servers are authenticated by an X509 certificate.
• Users are authenticated by the Grid Servers, but NOT by the Computing and Storage Resources.
• Executables are certified by managers of the Grid Servers. So :
– Resource Owners have to trust the Grid Servers,
– BOINC sends each Work Unit to several Resource Owners, because BOINC does NOT fully trust them.
• Order of magnitude can be 1 000 000 CPUs.
• Starving Computing Resources pull Work Units from Grid Servers.
Examples : BOINC, XtremWeb, xGrid
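The pull model and the redundancy used against untrusted resources can be sketched as follows. This is an illustration only, not BOINC code: the `GridServer` class and the `replication` and `quorum` parameters are hypothetical names, and a real server would also avoid giving two replicas of a Work Unit to the same resource.

```python
from collections import Counter, deque


class GridServer:
    """Toy Desktop Grid server: resources pull work, results are cross-checked."""

    def __init__(self, work_units, replication=3, quorum=2):
        self.quorum = quorum
        # Each Work Unit is queued `replication` times, once per replica.
        self.pending = deque(wu for wu in work_units for _ in range(replication))
        self.results = {wu: [] for wu in work_units}

    def request_work(self):
        """Called by an idle resource (pull). Returns a Work Unit or None."""
        return self.pending.popleft() if self.pending else None

    def report_result(self, work_unit, value):
        self.results[work_unit].append(value)

    def validated(self, work_unit):
        """Return the value reported by at least `quorum` replicas, else None."""
        if not self.results[work_unit]:
            return None
        value, count = Counter(self.results[work_unit]).most_common(1)[0]
        return value if count >= self.quorum else None


server = GridServer(["wu-1"], replication=3, quorum=2)
# Three volunteers pull the same Work Unit; the third one returns a wrong result.
for answer in (42, 42, 41):
    work_unit = server.request_work()
    server.report_result(work_unit, answer)
print(server.validated("wu-1"))
```

Here the server never contacts a resource on its own initiative: it only answers requests, which is why resources need neither management nor authentication.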
Presentation of the EDGeS project
New FP7 project started on 01/01/2008
• Integrate Service Grids and Desktop Grids
• Enable a very large number of computing resources (100K-1M processors)
• Attract new scientific communities
• Provide a Grid application development environment
• Provide application repository and bridges for the execution in the SG-DG system
[Diagram: EDGeS as a bridge between grid middlewares: gLite (EGEE), ARC (NorduGrid), Unicore (DEISA), VDT (OSG), WLCG (CERN), Boinc (Berkeley), XtremWeb (INRIA/IN2P3) and Xgrid (Apple), with links marked either Current or Future.]
http://www.edges-grid.eu
Now, Interoperation :
• Ad-hoc bridges and interfaces between EGEE, BOINC and XtremWeb.
• A MoU between EDGeS and EGEE was signed on 23 Sept 2008.
• XtremWeb users must have an X509 certificate, be registered in a VO and submit their Jobs with an X509 proxy.
• BOINC Project Owners must have an X509 certificate, be registered in a VO and store a medium-term X509 proxy in a MyProxy server.
• All files must be transferred through the Input and Output sandboxes.
In the future :
• Interoperability using OGF standards, in order to bridge more Grids.
• Better support of grid file access (GFAL, lcg_utils and GridFTP).
Bridge BOINC → EGEE (WU = Work Unit)
[Diagram: components of the BOINC → EGEE bridge. The BOINC Project Owner submits Work Units to the BOINC Server and stores a medium term X509 proxy in MyProxy. A BOINC jobwrapper client (simulating a large BOINC computing resource) takes Work Units (WUi+1, WUi+2, WUi+3) from the BOINC Server. Inside the EDGeS 3G bridge: BOINC Handlers (1 for each (BOINC server, BOINC Project Owner, EGEE VO) triple), Job Handler Interface, Queue Manager & Job DB, Grid Handler Interface, EGEE Plugins (1 for each (BOINC Project Owner, EGEE VO) pair). An EGEE Plugin reads credential access information from a config. file, gets a short term X509 proxy from MyProxy and VOMS extensions from the VOMS Server, and submits Jobs (Jobi+1, Jobi+2) to the EGEE WMS. 3G job-wrappers are shown on the EGEE side.]
Solution = Inside EDGeS bridge, marshalling of the BOINC Work Units into Job collections
• For each (BOINC server, BOINC Project Owner, EGEE VO) triple, a separate Job Handler collects the BOINC Work Units and places them in a queue.
• For each (BOINC Project Owner, EGEE VO) pair, a separate EGEE plugin :
– Retrieves a short term X509 Proxy for the BOINC Project Owner from a MyProxy server, and VOMS extensions from a VOMS server,
– Periodically processes new Work Units found in the queue :
  • It converts each Work Unit into an EGEE Job,
  • In order to reduce the usage of the EGEE WMS, it uses the Collection feature of EGEE to submit many Jobs in one request described using JDL.
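The marshalling step can be sketched as a simple batching function. It is only an illustration under assumptions: the job dictionary fields and the `max_jobs` limit per collection are hypothetical, and the sketch does not generate JDL or talk to a WMS.

```python
def to_job(work_unit, owner, vo):
    """Convert one Work Unit into a job description (hypothetical fields)."""
    return {"name": work_unit, "owner": owner, "vo": vo}


def build_collections(queue, owner, vo, max_jobs=100):
    """Group the queued Work Units of one (owner, VO) pair into collections.

    Each returned list stands for one submission request to the WMS,
    so N Work Units cost ceil(N / max_jobs) requests instead of N.
    """
    jobs = [to_job(work_unit, owner, vo) for work_unit in queue]
    return [jobs[i:i + max_jobs] for i in range(0, len(jobs), max_jobs)]


queue = [f"wu-{i}" for i in range(5)]
collections = build_collections(queue, owner="project-owner", vo="vo1", max_jobs=2)
print([len(c) for c in collections])
```

With five queued Work Units and at most two Jobs per collection, the plugin would issue three requests instead of five.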
Bridge XtremWeb → EGEE
[Diagram: message flow of the XtremWeb → EGEE bridge. The XtremWeb User gets an X509 proxy with VOMS extensions from the VOMS Server and submits a User Job with this X509 proxy to the XtremWeb Server, which sends back Job Status and Results. The XtremWeb Bridge requests User Jobs, receives them with their X509 proxy, manages User Job status, and submits a mono-user Pilot Job with the X509 proxy to the gLite WMS, which gives Pilot Job Status and pushes the Pilot Job to a Computing Element inside EGEE. The mono-user Pilot Job requests only 1 User Job, receives 1 User Job with the same X509 proxy, runs it, and sends back results directly to the XtremWeb Server.]
Solution = XtremWeb bridge : Gliding with a mono-user Pilot Job
1. An XtremWeb User submits his User Job to the XtremWeb server with an X509 proxy.
2. At the request of the XtremWeb bridge, the XtremWeb server sends it the User Job with the X509 proxy.
3. The XtremWeb bridge submits to a gLite WMS a mono-user Pilot Job with this X509 proxy (job description in JDL).
4. The gLite WMS pushes the Pilot Job to a Computing Element, which executes it.
5. The mono-user Pilot Job requests 1 User Job from the XtremWeb server, and stops itself if it receives none.
6. The XtremWeb server verifies that the requested User Job has an X509 proxy, and sends the User Job and the X509 proxy to the Pilot Job.
7. The Pilot Job verifies that the received X509 proxy is the same as its own X509 proxy, and executes the User Job.
8. At the end of the User Job, the Pilot Job sends the Job results directly to the XtremWeb server, then stops itself.
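Steps 5 to 8 above can be sketched in a few lines. This is a toy model, not XtremWeb code: proxies are plain strings, the `XtremWebServer` class and `run_pilot` function are hypothetical, steps 2 to 4 (bridge and WMS) are collapsed into a direct call, and handing a job back to the server on a proxy mismatch is an assumption the slides do not state.

```python
class XtremWebServer:
    """Toy server holding User Jobs together with the proxy they were submitted with."""

    def __init__(self):
        self.pending = []   # list of (job_id, payload, proxy)
        self.results = {}

    def submit(self, job_id, payload, proxy):
        # Step 1: the user submits a User Job with an X509 proxy.
        self.pending.append((job_id, payload, proxy))

    def next_user_job(self):
        # Step 6: only a User Job that carries a proxy is handed out.
        for index, (_, _, proxy) in enumerate(self.pending):
            if proxy is not None:
                return self.pending.pop(index)
        return None


def run_pilot(server, pilot_proxy):
    """Mono-user Pilot Job: runs at most one User Job, then stops."""
    request = server.next_user_job()        # step 5: request 1 User Job
    if request is None:
        return None                         # step 5: nothing received, stop
    job_id, payload, proxy = request
    if proxy != pilot_proxy:                # step 7: proxy must match the pilot's own
        server.pending.insert(0, request)   # assumption: the job is handed back
        return None
    server.results[job_id] = payload()      # steps 7-8: execute, send results directly
    return job_id


server = XtremWebServer()
server.submit("job-1", lambda: 6 * 7, proxy="proxy-of-alice")
done = run_pilot(server, pilot_proxy="proxy-of-alice")
print(done, server.results)
```

Because the Pilot Job only ever runs a User Job carrying its own proxy, the work executed on the Computing Element stays attributable to a single identified user.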
Bridge EGEE → BOINC
[Diagram: components of the EGEE → BOINC bridge. EGEE side: EGEE User, EGEE VOMS, gLite WMS, EGEE BDII, EGEE LB. LCG-CE for EDGeS: Information provider, GRAM Job Manager for EDGeS. EDGeS Application Repository. EDGeS 3G bridge: Generic Job WS Handler, Queue Manager & Job DB, BOINC plugin (DC-API). BOINC side: BOINC Server (BOINC Service), BOINC Computing Resource. Flow labels: X509 proxy with VOMS extensions, Submits Job, Pushes job, Reports resources and performance, Checks EXE, Gets EXE, Adds job, Watches job, Logs events, Sends output, Gets output.]
Solution = Installation of an LCG-CE sending the EGEE Jobs to the EDGeS bridge, which marshals them into BOINC Work Units
• Publish information to the BDII according to GLUE 1.3
• EGEE producer
– New GRAM job manager
– Gets job information from wrapper
– Checks if exe is validated in the EDGeS application repository (GEMLCA)
– Checks if exe is supported by attached BOINC
– Gets files from WMS
– Adds job to 3G bridge job Database
– Polls status of jobs in 3G bridge job Database
– Gets results from 3G bridge and uploads to LB
• BOINC plugin (DC-API)
– Uses DC-API to generate BOINC WUs
– Jobs are read from the 3G bridge DB
– 3G DB entries are updated on events
– The plugin has already been implemented for the CancerGrid system
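The two checks made by the EGEE producer before a job enters the 3G bridge can be sketched as a small gate function. The function name, the use of plain sets for the application repository and for the applications supported by the attached BOINC server, and the status strings are all assumptions for illustration.

```python
def accept_job(exe, validated_exes, boinc_supported_exes, job_db):
    """Add the job to the bridge job database only if both checks pass."""
    if exe not in validated_exes:
        return "rejected: not validated in the application repository"
    if exe not in boinc_supported_exes:
        return "rejected: not supported by the attached BOINC server"
    job_db.append({"exe": exe, "status": "queued"})
    return "queued"


job_db = []
outcome = accept_job("dsp", validated_exes={"dsp", "blast"},
                     boinc_supported_exes={"dsp"}, job_db=job_db)
refused = accept_job("blast", validated_exes={"dsp", "blast"},
                     boinc_supported_exes={"dsp"}, job_db=job_db)
print(outcome, "|", refused)
```

The validation check matters because Desktop Grid resource owners trust the Grid Server to distribute only certified executables, whereas EGEE itself does not authenticate executables.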
Bridge EGEE → XtremWeb
Solution = Inside an LCG-CE, installation of a GRAM jobmanager to marshal the EGEE Jobs into XtremWeb Jobs.
[Diagram: components of the EGEE → XtremWeb bridge. EGEE side: EGEE User, EGEE VOMS, gLite WMS, EGEE BDII, EGEE LB. LCG-CE for XtremWeb: Information provider, GRAM Job Manager for XtremWeb. EDGeS Application Repository. XtremWeb side: XtremWeb Server, XtremWeb Computing Resource. Flow labels: X509 proxy with VOMS extensions, Submits Job, Pushes job, Reports resources and performance, Checks EXE, Gets EXE, Adds job, Watches job, Logs events, Sends output, Gets output.]
Architecture of the EDGeS 3G Bridge
[Diagram: internal structure of the EDGeS 3G Bridge.
Components: BOINC Handlers and EGEE Handler (handlers for received jobs), Job Handler Interface (generic handler for received jobs), Job Database (storage for received jobs), Queue Manager (scheduler), Grid Handler Interface (generic interface above grid plugins), BOINC Plugins (DC-API), EGEE Plugins and XtremWeb Plugins (grid plugins: submit jobs, update status, get output, ...).
Control path WU → Job: BOINC Server, Work Unit, BOINC Handlers, 3G Bridge, EGEE Plugins, Job, gLite WMS.
Control path Job → WU: User, Job with X509 proxy, gLite WMS, LCG-CE for EDGeS, EGEE Handler, 3G Bridge, BOINC Plugins (DC-API), WU, BOINC Server. XtremWeb Plugins connect to the XtremWeb Server.]
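The separation between the queue manager, the job database and the grid plugins can be sketched with an abstract plugin interface. This is a minimal model, not the 3G Bridge code: the class names, method names and status values are hypothetical, and the job database is a plain list.

```python
from abc import ABC, abstractmethod


class GridPlugin(ABC):
    """Generic interface above grid plugins (submit jobs, update status, ...)."""

    @abstractmethod
    def submit(self, job): ...

    @abstractmethod
    def update_status(self, job): ...


class RecordingPlugin(GridPlugin):
    """Stand-in for a BOINC, EGEE or XtremWeb plugin; it only records calls."""

    def __init__(self):
        self.submitted = []

    def submit(self, job):
        self.submitted.append(job["id"])
        job["status"] = "running"

    def update_status(self, job):
        job["status"] = "finished"   # a real plugin would query the target grid


class QueueManager:
    """Scheduler: walks the job database and delegates to the right plugin."""

    def __init__(self, plugins):
        self.plugins = plugins   # target grid name -> plugin
        self.job_db = []         # storage for received jobs

    def add_job(self, job_id, grid):
        # Called by a job handler when a job or Work Unit is received.
        self.job_db.append({"id": job_id, "grid": grid, "status": "new"})

    def run_once(self):
        for job in self.job_db:
            plugin = self.plugins[job["grid"]]
            if job["status"] == "new":
                plugin.submit(job)
            elif job["status"] == "running":
                plugin.update_status(job)


boinc = RecordingPlugin()
manager = QueueManager({"boinc": boinc})
manager.add_job("job-1", grid="boinc")
manager.run_once()
first_status = manager.job_db[0]["status"]
manager.run_once()
print(first_status, manager.job_db[0]["status"])
```

With this split, supporting another target grid only requires a new `GridPlugin` subclass; handlers and the queue manager stay unchanged.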
Desktop Grid Production Infrastructure
[Diagram: an EGEE User submits a Job to the gLite WMS inside EGEE. The Job reaches the BOINC Desktop Grids through the LCG-CE for BOINC and the EDGeS 3G bridge with its BOINC plugin (DC-API), and the XtremWeb Desktop Grids through the LCG-CE for XtremWeb.]
Desktop Grids shown :
• Local DG UoW Grid : 1 500 PCs
• Public DG Extremadura Grid : 70 000 PCs
• Public DG EGEE@home : planned 10 000 PCs
• Public DG SZDG : 30 000 PCs
• Public DG AlmereGrid : 3 000 PCs
• Public DG EGEE XtremWeb : 1 000 PCs
• Public DG INRIA Grid : 300 PCs
• Local DG IN2P3 Grid : 200 PCs
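As a quick check of scale, the PC counts quoted for the Desktop Grids on this slide can be added up. The dictionary below only restates those figures; treating the planned EGEE@home grid separately is a choice made here for illustration.

```python
desktop_grids = {
    "UoW Grid": 1_500,
    "Extremadura Grid": 70_000,
    "EGEE@home (planned)": 10_000,
    "SZDG": 30_000,
    "AlmereGrid": 3_000,
    "EGEE XtremWeb": 1_000,
    "INRIA Grid": 300,
    "IN2P3 Grid": 200,
}

total_with_planned = sum(desktop_grids.values())
total_without_planned = total_with_planned - desktop_grids["EGEE@home (planned)"]
print(total_with_planned, total_without_planned)
```

Both totals are in the low hundreds of thousands of PCs, which is the lower end of the 100K-1M processors range targeted by the project.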
OGF standards used for future interoperability
• GLUE 2.0 in order to publish information to the BDII : Needs implementation by gLite.
• BES to receive Job submissions : For example from GridSphere Portal.
• BES to submit Jobs : Needs availability of CREAM CE.
• JSDL to describe Jobs : Needs implementation by gLite.
Potentially :
• AUTHZ for Authentication / Authorization
• UR, RUS for Job logging and accounting
• ByteIO, SRM, GridFTP, DMI to manage data transfers
• ACS for the GEMLCA application repository
• SAGA, DRMAA for the methodology of application development
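To give an idea of what describing a Job in JSDL looks like, the sketch below builds a minimal JSDL 1.0 document with the Python standard library. It is not taken from the EDGeS bridges: the `build_jsdl` helper, the job name and the executable are hypothetical, and only a few elements of the JSDL and JSDL POSIX application schemas are used.

```python
import xml.etree.ElementTree as ET

# Namespaces of JSDL 1.0 and of its POSIX application extension.
JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"


def build_jsdl(job_name, executable, arguments):
    """Return a minimal JSDL job definition as an XML string."""
    ET.register_namespace("jsdl", JSDL)
    ET.register_namespace("jsdl-posix", POSIX)
    definition = ET.Element(f"{{{JSDL}}}JobDefinition")
    description = ET.SubElement(definition, f"{{{JSDL}}}JobDescription")
    identification = ET.SubElement(description, f"{{{JSDL}}}JobIdentification")
    ET.SubElement(identification, f"{{{JSDL}}}JobName").text = job_name
    application = ET.SubElement(description, f"{{{JSDL}}}Application")
    posix = ET.SubElement(application, f"{{{POSIX}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX}}}Executable").text = executable
    for argument in arguments:
        ET.SubElement(posix, f"{{{POSIX}}}Argument").text = argument
    return ET.tostring(definition, encoding="unicode")


document = build_jsdl("demo-job", "/bin/echo", ["hello"])
print(document)
```

A bridge that accepts such a document would no longer depend on the JDL dialect of one middleware, which is the interoperability goal stated above.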