The Use of Condor in the gLite Grid Middleware. Erwin Laure Condor Week 14-15 March 2005. Contents. Overview on EGEE and gLite gLite and Condor Future plans. The EGEE Project. EU funded (2 years until March 2006) - PowerPoint PPT Presentation
INFSO-RI-508833
Enabling Grids for E-sciencE
www.eu-egee.org
The Use of Condor in the gLite Grid Middleware
Erwin Laure
Condor Week
14-15 March 2005
Contents
• Overview on EGEE and gLite
• gLite and Condor
• Future plans
The EGEE Project
• EU funded (2 years until March 2006)
• EGEE offers the largest production grid facility in the world open to many applications (HEP, BioMedical, generic)
• Existing production service based on LCG
• Next generation open source web-services middleware being re-engineered taking into account production/deployment/management needs
• Well-defined, distributed support structure to provide eInfrastructure that is available to many application domains
• Middleware Activity
  – Re-engineer and harden Grid middleware
  – Provide production quality middleware
[Figure: EGEE project structure. Labels: Network infrastructure (GÉANT); Operations, Support and training; Collaborations; Global Grid]
EGEE Activities
• 48 % service activities (Grid Operations, Support and Management, Network Resource Provision)
• 24 % middleware re-engineering (Quality Assurance, Security, Network Services Development)
• 28 % networking (Management, Dissemination and Outreach, User Training and Education, Application Identification and Support, Policy and International Cooperation)
Emphasis in EGEE is on operating a production grid and supporting the end-users
Computing Resources: Feb 2005
• In LCG-2: 113 sites, 30 countries, >10,000 CPUs, ~5 PB storage
• Includes non-EGEE sites: 9 countries, 18 sites
[Figure: map of participating countries. Legend: country providing resources; country anticipating joining]
gLite Grid Middleware guiding principles
• Service oriented approach
  – Allow for multiple interoperable implementations
• Lightweight (existing) services
  – Easily and quickly deployable
  – Use existing services where possible: Condor, EDG, Globus, LCG, …
• Portable
  – Being built on Scientific Linux and Windows
• Security
  – Sites and Applications
• Performance/Scalability & Resilience/Fault Tolerance
  – Comparable to deployed infrastructure
• Co-existence with deployed infrastructure
  – Co-existence with LCG-2 and OSG (US) is essential for the EGEE Grid services
• Site autonomy
  – Reduce dependence on ‘global, central’ services
• Open source license
From Development to Product
• Fast prototyping approach
  – Small scale testbed (initially CERN and Wisconsin)
• Single out individual components for deployment on pre-production service (originally LCG-2/EGEE0 based)
• These components need to go through integration and testing
  – To ensure they are deployable and basically work
[Figure: timeline covering 2004 and 2005 with tracks labelled "prototyping" and "product". Labels: LCG-2 (=EGEE-0); LCG-3 / EGEE-1]
gLite Services and Responsible Clusters
[Figure: gLite service diagram, grouped as follows]
• Access Services
  – Grid Access Service
  – API
• Job Management Services
  – Job Provenance
  – Computing Element
  – Workload Management
  – Package Manager
• Data Services
  – Metadata Catalog
  – Storage Element
  – Data Management
  – File & Replica Catalog
• Security Services
  – Authorization
  – Authentication
  – Auditing
• Information & Monitoring Services
  – Information & Monitoring
  – Application Monitoring
• Further boxes in the diagram: Site Proxy, Accounting
• Responsible clusters (legend): JRA3, UK, CERN, IT/CZ
gLite Services for Release 1
[Figure: the gLite service diagram from the preceding slide (Access, Job Management, Data, Security, and Information & Monitoring Services, plus Site Proxy and Accounting; clusters JRA3, UK, CERN, IT/CZ), shown again for Release 1]
• Focus on key services
• Release date is March 31st 2005
Condor and gLite
• Design team including representatives from Middleware providers (AliEn, Condor, EDG, Globus, …) including US partners produced middleware architecture and design.
  – Takes into account input and experiences from applications, operations, and related projects
  – DJRA1.1: EGEE Middleware Architecture (June 2004) https://edms.cern.ch/document/476451/
  – DJRA1.2: EGEE Middleware Design (August 2004) https://edms.cern.ch/document/487871/
• Wisconsin is one of the sites of the development prototype
  – Using Condor pool as backend
  – Using Globus RLS
• Use VDT distribution of Condor and Globus
The current gLite CE
• Collaboration of INFN, Univ. of Chicago, Univ. of Wisconsin-Madison, and the EGEE security activity (JRA3)
[Figure: gLite CE architecture. Components inside the CE: Gatekeeper, LCAS, LCMAPS, WSS, CEMon, Condor-C, Blahpd. Local batch system: LSF, PBS/Torque, Condor. Arrow labels: Notifications; Launch Condor-C; Submit job]
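The CE described above relies on Condor-C, which forwards a job from one Condor queue to a remote one. As a rough illustration only, the sketch below generates a submit description for such a job. It assumes the present-day HTCondor grid-universe syntax (`universe = grid`, `grid_resource = condor <schedd> <collector>`), which may differ from the 2005 release used in gLite. The function name and the host names are hypothetical.

```python
def condor_c_submit_description(executable, remote_schedd, remote_collector,
                                arguments=""):
    """Return submit-description text that routes a job through Condor-C.

    Assumption: modern HTCondor grid-universe keywords. The host names
    passed in are placeholders chosen by the caller.
    """
    lines = [
        "universe = grid",
        # "condor" selects Condor-C: the job is handed to a remote schedd.
        f"grid_resource = condor {remote_schedd} {remote_collector}",
        f"executable = {executable}",
    ]
    if arguments:
        lines.append(f"arguments = {arguments}")
    lines += [
        "output = job.out",
        "error = job.err",
        "log = job.log",
        "queue",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical CE host names, for illustration only.
    print(condor_c_submit_description(
        "/bin/hostname", "ce-schedd.example.org", "ce-collector.example.org"))
```

The generated text would be passed to `condor_submit` on the submitting side; the CE then hands the job to the local batch system.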
Data Management Services
• Efficient and reliable data storage, movement, and retrieval on the infrastructure
• Storage Element
  – Reliable file storage (SRM based storage systems)
  – Posix-like file access (gLite I/O)
  – Transfer (gridFTP)
• File and Replica Catalog
  – Resolves logical filenames (LFN) to physical location of files (URL understood by SRM) and storage elements
  – Hierarchical file-system-like view in LFN space
  – Single catalog or distributed catalog (under development) deployment possibilities
• File Transfer and Placement Service
  – Reliable file transfer and transactional interactions with catalogs
  – Stork being evaluated
• Data Scheduler
  – Scheduled data transfer in the same spirit as jobs are being scheduled, taking into account e.g. network characteristics (collaboration with JRA4)
  – Under development
• Metadata Catalog
  – Limited metadata can be attached to the File and Replica Catalog
  – Interfaces to application-specific catalogs have been defined
[Figure: data management architecture. Labels: Storage Element (SRM, I/O, GridFTP); Transfer Agent; FPS; Data Scheduler; local catalogs (LocalCat); Catalog; MOM; VOs; site boundary]
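To make the File and Replica Catalog idea concrete, here is a minimal in-memory sketch of resolving a logical filename (LFN) to its replica URLs and of browsing the LFN space like a directory tree. It is a toy model under stated assumptions, not the gLite catalog interface: the class, its method names, and the `srm://` URLs are all hypothetical.

```python
class ReplicaCatalog:
    """Toy catalog mapping LFNs to replica URLs (hypothetical, not the gLite API)."""

    def __init__(self):
        self._replicas = {}  # LFN -> list of replica URLs

    def register(self, lfn, surl):
        """Record that the file named by lfn has a replica at surl."""
        if not lfn.startswith("/"):
            raise ValueError("LFN must be an absolute path")
        urls = self._replicas.setdefault(lfn, [])
        if surl not in urls:
            urls.append(surl)

    def resolve(self, lfn):
        """Return all known replica URLs for lfn; KeyError if unknown."""
        return list(self._replicas[lfn])

    def listdir(self, directory):
        """List the immediate children of a directory in LFN space."""
        prefix = directory.rstrip("/") + "/"
        children = set()
        for lfn in self._replicas:
            if lfn.startswith(prefix):
                children.add(lfn[len(prefix):].split("/", 1)[0])
        return sorted(children)


if __name__ == "__main__":
    cat = ReplicaCatalog()
    # Hypothetical storage element URLs, for illustration only.
    cat.register("/grid/vo/run1/a.dat", "srm://se1.example.org/data/a.dat")
    cat.register("/grid/vo/run1/a.dat", "srm://se2.example.org/store/a.dat")
    print(cat.resolve("/grid/vo/run1/a.dat"))
    print(cat.listdir("/grid/vo"))
```

A distributed deployment would split this mapping across sites; the single-dictionary form corresponds to the single-catalog option.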
Evolutions foreseen in 2005
Here follows a list of main topics we still need to address; details and other topics need to be worked out with operations and applications.
• WMS
  – WS Interface
    - Better support for bulk job submission
• CE
  – Head node monitoring
    - Guard and if necessary pause/resume services running on the head node
  – SUDO service
    - Currently one Condor-C instance needed per user
    - Condor-C should run under a VO user and submit jobs via a sudo service to the local batch system
  – This is being done in collaboration with Condor and Globus
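The SUDO service item above can be pictured as follows: a single Condor-C instance runs under a VO account and switches to the job owner's local account only for the batch submission itself. The sketch below merely builds the command line for that step and does not run it. The function name, the account name, and the use of `qsub` as the batch submit command are assumptions for illustration; the actual gLite design may differ.

```python
def build_sudo_submit(local_user, submit_command):
    """Build the argv for running a batch submit command as local_user via sudo.

    Assumption: the site allows the VO account to run the submit command
    as the target user without a password (configured in sudoers).
    """
    # Reject anything that is not a plain account name.
    cleaned = local_user.replace("_", "").replace("-", "")
    if not cleaned.isalnum():
        raise ValueError(f"suspicious account name: {local_user!r}")
    # -n: never prompt for a password; -u: target user; --: end of sudo options.
    return ["sudo", "-n", "-u", local_user, "--"] + list(submit_command)


if __name__ == "__main__":
    # Hypothetical local account and PBS/Torque-style submit command.
    print(build_sudo_submit("atlas001", ["qsub", "job.sh"]))
```

Passing the command as a list rather than a shell string keeps the job arguments from being reinterpreted by a shell.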
Evolutions foreseen in 2005
• Catalogs
  – Distributed and single deployment options
• FTS/FPS
  – Channel management
  – Clarify the role of Stork
• Data Scheduler
  – “broker” for data transfer “jobs”
• R-GMA Information system
  – Web service version
• Package Manager
Second major gLite release foreseen at the end of 2005
Summary
• Contributions of Condor to gLite span the whole process
  – Design, prototyping, product
• International collaborations in Grid middleware are essential
  – Grid middleware cannot be developed within a single corner
    - Effective exchange of ideas, requirements, solutions and technologies
    - Coordinated development of new capabilities
    - Open communication channels
    - Early detection of differences and disagreements
  – Condor link to OSG is very important to gLite
  – A common integration and testing project is being defined
• The gLite process, in particular through the strong interactions between European (EGEE) and US (Condor and Globus) projects, is a first step towards truly international collaborative middleware development