Basic Grid Projects. Sathish Vadhiyar. Sources/Credits: Project web pages

  • Basic Grid Projects
    Sathish Vadhiyar
    Sources/Credits: Project web pages

  • Condor

  • Condor Motivation
    - Most of the cycles (70%) of workstation pools are underutilized
    - High throughput computing
        - Large amounts of processing capacity over long periods of time
        - In contrast to High Performance Computing
    - Support systems with distributed ownership
    - Owners specify access policies

  • Condor Features
    - Specialized workload management system
    - Provides a job queueing mechanism, scheduling policy, resource monitoring, and resource management
    - Can effectively harness wasted CPU power from otherwise idle desktop workstations
    - Can checkpoint and migrate a job to a different machine

  • Condor Architecture: daemons / processes
    - master
    - Startd
        - Represents a machine to the Condor pool
        - Implements the owner's access control policies
        - Starts, stops, and suspends jobs
        - Runs on executing machines
    - Starter
        - Spawned by startd for a job
        - Coordinates with the job

  • Condor Architecture: daemons
    - Schedd
        - Represents jobs to the Condor pool
        - Maintains persistent queues of users' requests
        - Runs on submit machines
    - Shadow
        - Similar to starter functionality, but runs on the submit machine
        - Job specific
    - Manager
        - Collector
            - Collects machine and resource information from all other daemons
            - Answers queries
        - Negotiator
            - Retrieves information from the collector
            - Does matchmaking

  • Job Submission Steps

  • Idle capacity utilization
    - When the owner returns, Condor checkpoints and migrates jobs

  • ClassAds
    - Language used in Condor for describing jobs, workstations, and other resources
    - Mapping from attribute names to expressions
    - Used by the Condor central manager to decide on job scheduling

  • Steps
    - Step 1: entities express their characteristics through ClassAds, using constraints for requirements and ranks for preferences. Properties of the other party are accessed through the field "other"
    - Step 2: matchmaker matches different ClassAds
    - Step 3: matchmaker notifies matched entities
    - Step 4: matched entities establish allocation
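
The four steps above can be sketched in a few lines of Python. This is only an illustration of two-sided matching with constraints and ranks, not the ClassAd language or Condor's negotiator: the dictionary layout, the attribute names (`memory`, `mips`, `arch`, `owner`, `memory_needed`) and the helper functions are all assumptions made for the example.

```python
# Toy matchmaker. Each "ad" is a dict of attributes plus two callables:
# "constraint" (must hold for a match) and "rank" (higher is preferred).
# Both callables receive the *other* party's ad, mirroring the "other" field.
# All attribute names are hypothetical, chosen only for this sketch.

def matches(ad, other):
    """A match requires that each side's constraint accepts the other side."""
    return ad["constraint"](other) and other["constraint"](ad)

def matchmake(job, machines):
    """Return the mutually acceptable machine that the job ranks highest."""
    candidates = [m for m in machines if matches(job, m)]
    if not candidates:
        return None
    return max(candidates, key=job["rank"])

job = {
    "owner": "alice",
    "memory_needed": 512,
    "constraint": lambda other: other["memory"] >= 512 and other["arch"] == "x86",
    "rank": lambda other: other["mips"],
}

machines = [
    {"name": "m1", "arch": "x86", "memory": 256, "mips": 900,
     "constraint": lambda other: True, "rank": lambda other: 0},
    {"name": "m2", "arch": "x86", "memory": 1024, "mips": 500,
     "constraint": lambda other: other["owner"] != "mallory", "rank": lambda other: 0},
    {"name": "m3", "arch": "x86", "memory": 2048, "mips": 700,
     "constraint": lambda other: other["memory_needed"] <= 1024, "rank": lambda other: 0},
]

best = matchmake(job, machines)
print(best["name"])
```

Here `m1` is rejected by the job's memory constraint, and the job prefers `m3` over `m2` because its rank expression favours the higher `mips` value. Notification and claiming (steps 3 and 4) are left out.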

  • ClassAds

  • More on ClassAds
    - Resource owners and customers can dynamically define their own models, suitable for a distributed setting
    - Matching and claiming are 2 distinct operations
    - 5 components of the matchmaking framework:
        - ClassAd specification
        - advertising protocol
        - matchmaking algorithm
        - matchmaking protocol
        - claiming protocol
    - Constraints, i.e. queries, may be expressed as attributes of a ClassAd
    - ClassAd definition: mapping from attribute names to expressions

  • Examples

  • Examples

  • ClassAds Steps

  • Checkpointing
    - Checkpointing is used to vacate a job from one workstation and move it to another idle workstation
    - A checkpoint library is linked with the program's code
    - Stores Unix process state including text, stack, and data segments, files, pointers, etc.
    - Uses simple mechanisms including setjmp and longjmp
    - Also provides periodic checkpointing
    - Works only with homogeneous systems

  • Checkpointing
    - For scheduling and fault tolerance
    - Checkpoint library uses signals
    - A checkpoint contains the process's data and stack segments, information about open files, pending signals, and CPU state
    - Checkpoints are stored either on the local disk of the submitting machine or on checkpoint servers
    - Also provides a user interface
    - Transparent checkpointing
    - Remote files are obtained from the shadow process agent during migration
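
Condor's checkpointing is transparent and works at the process level (signals, setjmp/longjmp, memory segments). The sketch below deliberately swaps in a much simpler technique, application-level checkpointing with `pickle`, only to illustrate the periodic-checkpoint and resume-after-vacate idea. The function name, the state layout and the `stop_after` parameter (which simulates the job being vacated) are hypothetical.

```python
import os
import pickle
import tempfile

def run_with_checkpoints(total_steps, ckpt_path, every=100, stop_after=None):
    """Sum 0..total_steps-1, saving state to ckpt_path every `every` steps.

    Returns the sum, or None if the run was "vacated" after `stop_after` steps.
    """
    # Resume from the last checkpoint if one exists.
    if os.path.exists(ckpt_path):
        with open(ckpt_path, "rb") as f:
            state = pickle.load(f)
    else:
        state = {"step": 0, "acc": 0}

    executed = 0
    while state["step"] < total_steps:
        if stop_after is not None and executed >= stop_after:
            return None  # simulated vacate: in-memory progress is lost
        state["acc"] += state["step"]
        state["step"] += 1
        executed += 1
        if state["step"] % every == 0:
            # Write to a temporary file first so a crash never leaves
            # a half-written checkpoint behind.
            tmp = ckpt_path + ".tmp"
            with open(tmp, "wb") as f:
                pickle.dump(state, f)
            os.replace(tmp, ckpt_path)
    return state["acc"]

ckpt = os.path.join(tempfile.mkdtemp(), "job.ckpt")
first = run_with_checkpoints(1000, ckpt, stop_after=250)  # vacated part-way
second = run_with_checkpoints(1000, ckpt)                 # resumes from step 200
print(first, second)
```

The second run loses only the 50 steps executed after the last periodic checkpoint, which is the trade-off that the checkpoint interval controls.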

  • DAGMan
    - Meta-scheduler for Condor
    - Manages dependencies between jobs at a higher level
    - Sits on top of Condor
    - Used when the input of one program depends on another program

  • Example input file for DAGMan
    # Filename: diamond.dag
    #
    Job A A.condor
    Job B B.condor
    Job C C.condor
    Job D D.condor

    Script PRE A top_pre.csh
    Script PRE B mid_pre.perl $JOB
    Script POST B mid_post.perl $JOB $RETURN
    Script PRE C mid_pre.perl $JOB
    Script POST C mid_post.perl $JOB $RETURN
    Script PRE D bot_pre.csh

    PARENT A CHILD B C
    PARENT B C CHILD D

    Retry C 3
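
The PARENT/CHILD lines of the diamond DAG above define a partial order on the jobs. As a rough sketch of what a meta-scheduler has to compute, the following Python uses the standard-library `graphlib` module to group the jobs into waves that could run concurrently. It says nothing about how DAGMan itself is implemented, and the `parents` dictionary is a hand transcription of the example, not parsed from the file.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each job maps to the set of jobs that must finish before it starts,
# transcribed from "PARENT A CHILD B C" and "PARENT B C CHILD D".
parents = {"A": set(), "B": {"A"}, "C": {"A"}, "D": {"B", "C"}}

def run_order(parents):
    """Return a list of waves; jobs within one wave have no mutual dependencies."""
    sorter = TopologicalSorter(parents)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # jobs whose parents are all done
        waves.append(ready)
        sorter.done(*ready)                 # pretend the whole wave succeeded
    return waves

print(run_order(parents))  # [['A'], ['B', 'C'], ['D']]
```

PRE/POST scripts and the `Retry` count are not modelled here; they would wrap the execution of each job inside a wave.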

  • References / Sources / Credits
    - Condor manual
    - Condor web pages
    - Michael Litzkow, Todd Tannenbaum, Jim Basney, and Miron Livny, "Checkpoint and Migration of UNIX Processes in the Condor Distributed Processing System", University of Wisconsin-Madison Computer Sciences Technical Report #1346, April 1997.
    - James Frey, Todd Tannenbaum, Ian Foster, Miron Livny, and Steven Tuecke, "Condor-G: A Computation Management Agent for Multi-Institutional Grids", Proceedings of the Tenth IEEE Symposium on High Performance Distributed Computing (HPDC10), San Francisco, California, August 7-9, 2001.
    - Rajesh Raman, Miron Livny, and Marvin Solomon, "Matchmaking: Distributed Resource Management for High Throughput Computing", Proceedings of the Seventh IEEE International Symposium on High Performance Distributed Computing, July 28-31, 1998, Chicago, IL.
    - Michael Litzkow, Miron Livny, and Matt Mutka, "Condor - A Hunter of Idle Workstations", Proceedings of the 8th International Conference of Distributed Computing Systems, pages 104-111, June 1988.

  • Globus
    - Open source toolkit used for building Grids
    - Software for:
        - Security (GSI)
        - Information infrastructure (MDS)
        - Resource management (GRAM, job manager, gatekeeper)
        - Data management (GridFTP, DataGrid)
        - Communication (Nexus)
        - Fault detection, and
        - Portability
    - Now moving to web services: OGSA

  • Timeline
    - I-WAY experiment: 1994
    - Formal beginning: 1996
    - 1st version: 1997
    - Version 1.0: 1998
    - Version 2.0: 2002
    - Version 3.0: latest

  • GT4 Planned architecture

  • Grid Security Infrastructure (GSI)
    - Supports security across organizations; not centrally managed
    - Single sign-on and delegation of credentials
    - Digital signatures based on public key cryptography for verification of messages

  • Verification of messages / digital certificates
    - Sender: Message -> Hash(message) -> Encrypted hash (encrypted with the sender's private key) -> Encrypted hash + message
    - Receiver: Hash1 = hash(message); Hash2 = decrypt(encrypted hash)
    - Is Hash1 = Hash2?
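
The hash-then-encrypt check above can be traced with a toy example. The sketch below uses textbook RSA with no padding and a deliberately small key built from two Mersenne primes; it is insecure and is not how GSI or SSL implement signatures. The constants `P`, `Q`, `E` and the function names are assumptions chosen for illustration.

```python
import hashlib

# Toy RSA key. P and Q are the Mersenne primes 2^89-1 and 2^107-1.
# Illustration only: real keys are far larger and use padding schemes.
P, Q = 2**89 - 1, 2**107 - 1
N = P * Q
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (Python 3.8+)

def digest(message: bytes) -> int:
    """Hash(message), reduced so that it fits below the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Sender: encrypt the hash with the private key."""
    return pow(digest(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Receiver: Hash1 = hash(message), Hash2 = decrypt(signature); compare."""
    hash1 = digest(message)
    hash2 = pow(signature, E, N)       # decrypt with the public key
    return hash1 == hash2

msg = b"run job 42 on cluster A"
sig = sign(msg)
print(verify(msg, sig), verify(msg + b"!", sig))
```

A changed message produces a different Hash1, so the comparison fails even though the signature itself is untouched.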

  • GSI
    - Every resource is identified by a certificate
    - The certificate is provided and signed by a CA
    - Certificate = resource identity + public key of resource + certificate authority + digital signature of CA
    - Uses SSL for mutual authentication
    - Parties trust the CAs and possess the CAs' public keys

  • Mutual Authentication
    - Parties: A, B, and the CA
    - A to B: "I want to communicate. This is my certificate."
    - B: "Did the CA sign the certificate, or has the certificate been tampered with?" Verify the digital signature
    - B: "OK. The CA signed the certificate. Are you really A, or did you steal the certificate from A?" Send a random message

  • Authentication with Proxy and delegation
    - Private keys are stored in an encrypted file, which needs a passphrase
    - Proxy and delegation: more convenience and less security
    - Also for dynamic delegation and dynamic entities
    - Owner signs the proxy certificate
    - The proxy's private key is stored in an unencrypted file, since proxies are for short durations
    - Chain of trust is established

  • Mutual Authentication with Proxy
    - A's proxy sends B the proxy's certificate and A's certificate
    - B first validates the proxy's certificate and then the owner's certificate

  • GSS API
    - GSI is implemented on top of GSS-API
    - GSS-API provides both transport and mechanism independence
    - Provides functions for obtaining credentials, performing authentication, signing messages, and encrypting messages
    - GSI: X.509 public key certification, public key infrastructure, SSL protocol, X.509 proxy certificates

  • X.509 Proxy Certificates
    - Allow users to:
        - Create identities for new entities dynamically and in a light-weight manner
        - Delegate privileges to those entities dynamically
        - Perform single sign-on
    - Allow for the reuse of existing protocols
    - Proxy certificate:
        - Subject name (identity) scoped by the subject name of the issuer: subject name of the issuer + RDN (Relative Distinguished Name) + serial number
        - Public key different from the subject's public key
        - PCI (Proxy Certificate Information): policy method identifier + policy field

  • Proxies

  • Single sign-on and Proxies

  • Delegation over Network

  • Globus 3.2 (current)
    - WS: complies with OGSI 1.0
    - New components: CAS (Community Authorization Service), XIO
    - Other components: WS RFT (Reliable File Transfer Service), GridFTP, RLS (Replica Location Service), GRAM / WS MJS (Managed Job Service) / job manager

  • GridFTP
    - GSI and Kerberos security on control and data channels, with various levels of confidentiality and integrity
    - Multiple data channels for parallel transfers: multiple TCP streams used in parallel to improve aggregate bandwidth
    - Partial file transfers
    - Third-party (direct server-to-server) transfers
        - Adds GSSAPI security to the existing third-party data transfers in the FTP standard (transfers between 2 servers mediated by a third-party client)
        - GSSAPI operations authenticate the third party to the source and destination machines of the data transfer
    - Authenticated data channels
    - Reusable data channels
    - Command pipelining
    - Striped data transfers
    - Automatic negotiation of TCP buffer/window sizes
    - 2 libraries:
        - globus_ftp_control library: implements the control channel API
        - globus_ftp_client library: implements the GridFTP API
    - Plugin mechanisms for fault tolerance, performance monitoring, and extended data processing
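
Parallel and partial transfers both rest on the same idea: a file is addressed as byte ranges that can be moved independently. The following sketch shows that idea in memory only, with threads standing in for data channels. It does not use the GridFTP protocol or the Globus libraries; `split_ranges` and `parallel_copy` are hypothetical helper names.

```python
from concurrent.futures import ThreadPoolExecutor

def split_ranges(file_size, streams):
    """Divide [0, file_size) into up to `streams` contiguous (offset, length) ranges."""
    base, extra = divmod(file_size, streams)
    ranges, offset = [], 0
    for i in range(streams):
        length = base + (1 if i < extra else 0)
        if length:
            ranges.append((offset, length))
        offset += length
    return ranges

def parallel_copy(data: bytes, streams: int) -> bytes:
    """Copy `data` range by range, one worker thread per simulated data channel."""
    out = bytearray(len(data))

    def copy_range(byte_range):
        offset, length = byte_range
        # Each worker writes only its own slice, so ranges never overlap.
        out[offset:offset + length] = data[offset:offset + length]

    with ThreadPoolExecutor(max_workers=streams) as pool:
        list(pool.map(copy_range, split_ranges(len(data), streams)))
    return bytes(out)

print(split_ranges(10, 3))                 # [(0, 4), (4, 3), (7, 3)]
print(parallel_copy(b"abcdefghij", 3))     # b'abcdefghij'
```

A partial file transfer corresponds to requesting a single `(offset, length)` range rather than the full set.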

  • RFT (Reliable File Transfer)
    - Treats movement of multiple files as a single job
    - Accepts transfer requests and reliably manages them
    - OGSI compliant
    - Transfers data reliably between two GridFTP servers
    - Uses Grid Service Handles (GSH)
    - Acts as a proxy for the user: acts as a client on the user's behalf for third-party transfers

  • RFT
    - Client submits a SOAP description of the data transfer job
    - Maintains checkpoints in databases
    - Supports both push and pull mechanisms

  • GRAM
    - For remote job submission and resource management
    - GRAM simplifies the use of remote systems by providing a single standard interface for requesting and using remote system resources for the execution of "jobs"
    - The most common use (and the best supported use) of GRAM is remote job submission and control. This is typically used to support distributed computing applications

  • GRAM
    - Provides interfaces to local job scheduling mechanisms
    - Provides mechanisms to map GSI identities to local user accounts
    - Processes the requests for resources for remote application execution, allocates the required resources, and manages the active jobs
    - Also returns updated information regarding the capabilities and availability of the computing resources to the Metacomputing Directory Service (MDS)
    - Provides an API for submitting and canceling a job request, as well as checking the status of a submitted job
    - The specifications are written by the user in the Resource Specification Language (RSL) and are processed by GRAM as part of the job request

  • GRAM
    - A Gatekeeper runs on the remote host and creates a jobmanager for the job
    - Gatekeeper: mutually authenticates with the client, maps the requestor to a local user, starts a job manager on the local host as the local user, and passes the allocation arguments to the newly created job manager
    - Jobmanager:
        - Common component
        - Machine-specific component

  • GRAM RSL attributes
    - (directory=value)
    - (executable=value)
    - (arguments=value [value] [value] ...)
    - (jobType=single|multiple|mpi|condor)
    - (count=value)
    - (hostCount=value)
    - (two_phase=)
    - (restart=)

  • DUROC RSL attributes
    - label
    - resourceManagerContact
    - subjobCommsType
    - subjobStartType

  • Example
    (executable = a.out)
    (directory = /home/nobody )
    (arguments = arg1 "arg 2")
    (count = 1)
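
An RSL request of the kind shown in the example is a sequence of `(attribute=value)` relations, so it is easy to generate from a program. The helper below is a minimal sketch, not part of any Globus API: `to_rsl` and `quote` are hypothetical names, the leading `&` (conjunction) follows the form used in the DUROC multi-request example, and the quoting rule is simplified to "wrap values containing a space in double quotes".

```python
def quote(value: str) -> str:
    """Wrap a value in double quotes if it contains a space (simplified rule)."""
    return f'"{value}"' if " " in value else value

def to_rsl(attrs: dict) -> str:
    """Render a dict of RSL attributes as a single conjunctive request string."""
    parts = []
    for name, value in attrs.items():
        # A list or tuple becomes a space-separated value sequence.
        values = value if isinstance(value, (list, tuple)) else [value]
        rendered = " ".join(quote(str(v)) for v in values)
        parts.append(f"({name}={rendered})")
    return "&" + "".join(parts)

request = to_rsl({
    "executable": "a.out",
    "directory": "/home/nobody",
    "arguments": ["arg1", "arg 2"],
    "count": 1,
})
print(request)
# &(executable=a.out)(directory=/home/nobody)(arguments=arg1 "arg 2")(count=1)
```

Values that themselves contain double quotes or parentheses would need proper escaping, which this sketch does not attempt.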

  • DUROC
    - Dynamically-Updated Request Online Coallocator
    - The coallocator is used to coordinate transactions with each of the RMs and bring up the distributed pieces of the job
    - Multi-request example:
        +(&(resourceManagerContact=RM1)(count=3)(executable=myprog.sparc))
         (&(resourceManagerContact=RM2)(count=2)(executable=myprog.rs6000))

  • WS GRAM
    - A set of OGSI compliant services that provide remote job execution:
        - (Master) Managed Job Factory Service (MJFS)
        - Managed Job Service (MJS)
        - File Stream Factory Service (FSFS)
        - File Stream Service (FSS)
    - Resource Specification Language (RSL-2) schema is used to communicate job requirements
    - Remote jobs run under the local user's account
    - Client-to-service credential delegation is done user to user, *not* through a third party

  • RSL-2 Example
    GNS = http://www.globus.org/namespaces

  • Managed Job (Factory) Service
    - Defines an OGSI/GWSDL interface for submitting, monitoring, and controlling a job
    - MJS uses the File Stream Factory Service to manage the job's stdout and stderr file streaming
    - MJS exposes the stdout and stderr File Stream Factory Grid Service Handles (GSH) in a Service Data Element

  • The MJS instances can monitor jobs in two ways:
    - Resource Information Provider Service (RIPS)
        - A specialized notification service
        - Maintains job information from the scheduler
        - Scheduler info provider outputs queue and job data in XML
    - Poll the scheduler directly
        - Only option for FORK
    - MJS to Resource Interface: can support custom schedulers through well-defined templates

  • WS GRAM Architecture

  • MDS
    - Monitoring and directory service
    - For publishing and accessing system and application data
    - Can restrict access to MDS information by using GSI
    - Interacts with local information services: hour-glass mechanism
    - Provides caching to minimize the transfer of up-to-date information and lessen network overhead

  • MDS

  • MDS
    - MDS data:
        - System information: architecture, OS
        - Network information
        - Load status
    - Information is viewed through tools such as an LDAP browser or a web browser

  • MDS
    - Integrates existing systems while providing a uniform and extensible data model
    - Support for multiple information service providers
    - Uniform API
    - Adopts data representation and API from the LDAP directory service

  • MDS
    - Contains entries that represent instances of objects
    - Entries form a DIT (Directory Information Tree)
    - Each entry is associated with one or more attribute:value pairs
    - Each entry is associated with a distinguished name
    - Object classes are associated with entries for object types
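
The entry model above (distinguished names, attribute:value pairs, object classes, a tree of entries) can be mimicked with a small in-memory structure. This is a sketch of the data model only, not LDAP and not the MDS API. The `DirectoryTree` class, the naming attributes (`hn`, `ou`, `o`) and the object class names are assumptions made up for the example.

```python
class DirectoryTree:
    """Toy directory: entries keyed by distinguished name (DN), written leaf-first."""

    def __init__(self):
        self.entries = {}

    @staticmethod
    def _parse(dn):
        # "hn=node1, ou=lab, o=Grid" -> ("hn=node1", "ou=lab", "o=Grid")
        return tuple(part.strip() for part in dn.split(","))

    def add(self, dn, object_class, **attributes):
        """Store an entry with its object class and attribute:value pairs."""
        self.entries[self._parse(dn)] = {"objectclass": object_class, **attributes}

    def get(self, dn):
        return self.entries[self._parse(dn)]

    def search(self, base_dn, object_class=None):
        """Return DNs at or below base_dn, optionally filtered by object class."""
        base = self._parse(base_dn)
        hits = []
        for dn, entry in self.entries.items():
            in_subtree = dn[-len(base):] == base   # DN ends with the base DN
            if in_subtree and object_class in (None, entry["objectclass"]):
                hits.append(", ".join(dn))
        return sorted(hits)

tree = DirectoryTree()
tree.add("o=Grid", "organization")
tree.add("ou=lab, o=Grid", "organizationalUnit")
tree.add("hn=node1, ou=lab, o=Grid", "computeHost", os="Linux", cpus=4)
tree.add("hn=node2, ou=lab, o=Grid", "computeHost", os="Solaris", cpus=2)

print(tree.search("ou=lab, o=Grid", object_class="computeHost"))
print(tree.get("hn=node1, ou=lab, o=Grid")["os"])
```

The object class plays the role described above: it tells a client which attributes (here `os` and `cpus`) to expect on an entry.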

  • Distinguished name example

  • Distinguished names for Networks

  • MDS Unique Features
    - Multiple information providers: information providers are specified on a per-attribute basis
    - Scope of data can be defined
    - Information can be cached

  • Object Class
    - Each DIT entry has an object class
    - The object class defines what attributes an entry has and what kind of values the attributes take

  • OGSA and WS MDS Index service
    - Standard interfaces for Grid services in the form of WSDL porttypes
    - GridService porttype for querying and updating GridService data
    - The MDS index service consists of the following interfaces:
        - Factory: for creating a grid service instance and returning a GSH
        - GSH: to refer to a grid service instance
        - GSR: describes how a client can communicate with a grid service
        - Query: query language support
        - Registry: supports discovery by returning the GSHs of a set of Grid services
        - Notification: for registering interest in a service

  • GRAM
    - Designed to address 5 major problems:
        - Site autonomy (resource managers)
        - Heterogeneous substrate (resource managers)
        - Co-allocation (co-allocators)
        - Policy extensibility (RSL and resource brokers)
        - Online control (RSL and resource brokers)

  • GRAM

  • GRAM

  • RSL spec.
    E.g.:

    Multi-request

  • Local resource managers
    - Process RSL specifications
    - Enable resource monitoring and management
    - Periodically update MDS

  • GRAM

  • GRAM
    - Gatekeeper:
        - Performs mutual authentication of user and resource
        - Maps remote user to local user
        - Starts jobmanager as local user

  • Advanced reservation and co-allocation - GARA

  • Data Grid
    - Challenges:
        - Petabytes and terabytes of data
        - Query management for this huge amount of data
        - Cache management
        - Providing gigabit/sec QoS
        - Coscheduling data transfers and computation
        - Selection of dataset replicas
        - Maximizing use of scarce storage, computation, and network resources

  • Data Grid Architecture

  • Data Grid
    - Need for meta-data services. Various kinds:
        - Application metadata
        - Replica metadata
        - System configuration metadata
    - Replica management
        - For better performance or availability of accesses
        - Mainly for access to published resources: read-only model

  • Application requirements:
    - A reliable, secure, high-performance data transfer protocol
    - Management of multiple copies of files and collections of files

  • Globus Replica Management Architecture
    Functions:

    Architecture:
    - Lower level: replica catalog API
    - Higher level: replica management API

  • Replica catalog
    - Provides mapping between logical names of files/locations and physical objects on storage systems
    - Stores 3 kinds of entries:
        - Logical collection: user-defined collections of files (file aggregation)
        - Location entries: physical locations of files
        - Logical files: globally unique names
    - Replica catalog API provides operations on the replica catalog
    - Replica management API provides session management, catalog creation, file maintenance, access control
    - Implemented with LDAP
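
The three kinds of catalog entries can be pictured as two small mappings. The class below is an in-memory sketch for illustration; it is unrelated to the LDAP-based Globus implementation and to the actual replica catalog API. The class and method names, the collection name and the `gsiftp://` example URLs are all hypothetical.

```python
class ReplicaCatalog:
    """Toy catalog mapping logical names to physical locations."""

    def __init__(self):
        self.collections = {}  # logical collection -> set of logical file names
        self.locations = {}    # logical collection -> {location URL: set of file names}

    def register_collection(self, collection, logical_files):
        """Register a list of logical files as a logical collection."""
        self.collections.setdefault(collection, set()).update(logical_files)
        self.locations.setdefault(collection, {})

    def register_location(self, collection, url, logical_files):
        """Register a complete or partial physical replica of a collection."""
        files = set(logical_files)
        unknown = files - self.collections.get(collection, set())
        if unknown:
            raise KeyError(f"not in collection {collection}: {sorted(unknown)}")
        self.locations[collection].setdefault(url, set()).update(files)

    def find_replicas(self, collection, logical_file):
        """Return the physical names of every known copy of a logical file."""
        return sorted(
            f"{url}/{logical_file}"
            for url, files in self.locations.get(collection, {}).items()
            if logical_file in files
        )

catalog = ReplicaCatalog()
catalog.register_collection("climate-1998", ["jan.dat", "feb.dat"])
catalog.register_location("climate-1998", "gsiftp://siteA.example.org/data",
                          ["jan.dat", "feb.dat"])      # complete replica
catalog.register_location("climate-1998", "gsiftp://siteB.example.org/cache",
                          ["jan.dat"])                 # partial replica
print(catalog.find_replicas("climate-1998", "jan.dat"))
print(catalog.find_replicas("climate-1998", "feb.dat"))
```

Choosing among the returned copies is a separate replica selection step; the catalog only answers where the copies are.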

  • Replica Selection in Globus Data Grid (Vazhkudai et al.)
    - Replica selection uses MDS for information regarding characteristics of storage systems
    - LDAP information is organized as a DIT (Directory Information Tree)
    - Each storage resource in the Data Grid incorporates a GRIS
    - LDAP can execute shell scripts in the background to obtain various dynamic entities like availableSpace, mountPoint, etc.
    - Static attributes like seek times can be entered by the system administrator
    - Attributes like data transfer rates across networks to clients can be obtained based on past performance, i.e., historical data
    - ClassAds can also be used for expressing storage attributes

  • Directory for Storage GRIS

  • Steps
    - Application queries metadata, expressing the desired characteristics of logical files
    - A logical file is returned
    - Application queries the replica catalog for replica instances of the logical file
    - Storage broker helps to choose a particular replica

  • Storage Architecture steps
    - Application presents ClassAds regarding replica requirements to the storage broker (SB)
    - SB does the search:
        - Queries replica catalogs for the list of all replicas
        - Queries the individual GRIS of each replica about its characteristics
        - Collects all information and proceeds to matching
    - Match:
        - Converts replica capabilities to replica ClassAds
        - Matches application ClassAds to replica ClassAds
    - Accesses the file using GridFTP

  • Replica Catalog
    - Provides mappings between logical names for files or collections and one or more copies of those objects on physical systems
    - Services provided by the replica catalog:
        - Registering a list of files as a logical collection
        - Registering the physical location of a complete or partial replica of a logical collection
        - Registering information about a particular logical file in a logical collection
        - Modifying the contents of registered entities of the catalog
        - Responding to queries of the catalog
    - The Globus Replica Catalog supports replica management by providing mappings between logical names for files and one or more copies of the files on physical storage systems

  • Replica management
    - Globus Replica Management integrates the Globus Replica Catalog (for keeping track of replicated files) and GridFTP (for moving data) and provides replica management capabilities for data grids
    - The globus_replica_management library provides client functions that allow files to be registered with the replica management service, published to replica locations, and moved among multiple locations
    - Manages the copying and placement of files in a distributed computing system so as to improve the performance of data analysis

  • Replica management service - functions
    - Registration of files with the replica management service
    - Creation and deletion of replicas of previously registered files
    - Enquiries concerning the location and performance characteristics of replicas
    - Replica selection based on performance characteristics

  • Replica management
    - Replica management API combines storage system operations with calls to low-level catalog API functions
    - Replica management system controls where and when copies are created and provides information about copies
    - But it does not ensure file consistency

  • RM API
    - Session management
        - Session handles and attributes
        - Restart
        - Rollback
    - Catalog creation and file management
        - Creating catalog entries
        - Registering files
        - Publishing files
        - Copying, deleting files
    - Future ideas
        - Incorporating advance reservation
        - Automatic replica selection and creation
    - Data grid projects: http://www.globus.org/datagrid/projects.html

  • Globus References / sources / credits
    - A Resource Management Architecture for Metacomputing Systems. K. Czajkowski, I. Foster, N. Karonis, C. Kesselman, S. Martin, W. Smith, S. Tuecke. Proc. IPPS/SPDP '98 Workshop on Job Scheduling Strategies for Parallel Processing, pp. 62-82, 1998. Describes the resource management architecture implemented as part of the Globus system.
    - A Distributed Resource Management Architecture that Supports Advance Reservations and Co-Allocation. I. Foster, C. Kesselman, C. Lee, R. Lindell, K. Nahrstedt, A. Roy. Intl Workshop on Quality of Service, 1999. Describes the new Globus Architecture for Reservation and Allocation, which integrates CPU and network QoS.
    - The Data Grid: Towards an Architecture for the Distributed Management and Analysis of Large Scientific Datasets. A. Chervenak, I. Foster, C. Kesselman, C. Salisbury, S. Tuecke. Journal of Network and Computer Applications, 23:187-200, 2001 (based on conference publication from Proceedings of NetStore Conference 1999).
    - Secure, Efficient Data Transport and Replica Management for High-Performance Data-Intensive Computing. B. Allcock, J. Bester, J. Bresnahan, A. Chervenak, I. Foster, C. Kesselman, S. Meder, V. Nefedova, D. Quesnel, S. Tuecke. IEEE Mass Storage Conference, 2001. Presents the design and performance characteristics of two fundamental technologies for data management.
    - Replica Selection in the Globus Data Grid. S. Vazhkudai, S. Tuecke, I. Foster. Proceedings of the First IEEE/ACM International Conference on Cluster Computing and the Grid (CCGRID 2001), pp. 106-113, IEEE Computer Society Press, May 2001. Discusses a high-level replica selection service that uses information regarding replica location and user preferences to guide selection from among storage replica alternatives.

  • Globus References / sources / credits
    - A Security Architecture for Computational Grids. I. Foster, C. Kesselman, G. Tsudik, S. Tuecke. Proc. 5th ACM Conference on Computer and Communications Security, pp. 83-92, 1998. Describes techniques for authentication in wide area computing environments.
    - http://www.globus.org/Security/papers/pki04-welch-proxy-cert-final.pdf
    - A National-Scale Authentication Infrastructure. R. Butler, D. Engert, I. Foster, C. Kesselman, S. Tuecke, J. Volmer, V. Welch. IEEE Computer, 33(12):60-66, 2000. Describes our experience designing, developing, and deploying the Grid Security Infrastructure.

  • Idle capacity

  • Condor-G
    - Goal: to allow users to harness multi-domain resources as if they were in a single domain
    - Security and resource access in multi-domain environments
    - Management of computation and harnessing of resources within a single administrative domain
    - Inter-domain resource management protocols of the Globus Toolkit
    - Intra-domain resource and job management methods of Condor
    - Condor: job management part of Condor + resource management part of Condor
    - Condor-G: job management part of Condor + resource management part of Globus
    - Condor-G is used to execute jobs outside the Condor pool

  • Condor-G
    - Uses GSI for authentication
    - Uses GRAM for remote job submission and monitoring
    - Uses GASS for file transfer
    - Uses RSL
    - Limitations: no checkpoints, file transfers, or matchmaking

  • Condor-G
    - Security, resource discovery, and resource access by Globus in multi-domain environments
    - Management of computation and harnessing of resources within a single administrative domain
    - Demonstration: 653 procs., 950000 CPU hours, at different sites, for numerical optimization
    - 3 issues: remote access (Globus), computation management (Condor), remote execution environment (Condor)
    - Condor-G capabilities: submit jobs, query a job's status, notification of job termination, access to detailed logs

  • Condor-G
    - Fault tolerance against 4 types of failure: crash of the job manager; crash of the machine hosting the gatekeeper and job manager; crash of the machine hosting the gridmanager; crash of the network between gridmanager and gatekeeper
    - Condor-G had wrappers for credential management, refreshing proxies, etc.
    - Scheduling is really simple: let the users specify; no matchmaking

  • Example Condor submit description file for DAGMan
    # Filename: diamond_job.condor
    #
    executable = /path/diamond.exe
    output = diamond.out.$(cluster)
    error = diamond.err.$(cluster)
    log = diamond_condor.log
    universe = vanilla
    notification = NEVER
    queue