Enforcing resource allocations with the SweGrid Accounting System (SGAS)

European Grid Conference (EGC), Amsterdam
February 15, 2005

Peter Gardfjäll
Umeå University
[email protected]

Joint effort with
Erik Elmroth (Umeå University)
Lennart Johnsson (KTH)
Olle Mulmo (KTH)
Thomas Sandholm (KTH)

CENTER FOR PARALLEL COMPUTERS
DEPARTMENT OF COMPUTING SCIENCE
http://www.sgas.se

Outline

• Background
  • Grid accounting
  • SGAS in SweGrid

• SGAS
  • Architecture
  • Components

• SGAS demo

Grid accounting - tracking Grid resource usage

Maintaining a (consistent) Grid-wide view of the resources utilized by VO members

• Measure and control users’ total resource usage on the Grid
  • Assuming absence of central point of control
  • Resource owners should retain local control

Why accounting?

• Accounting information can be used for several purposes
  • Economic compensation
  • Tracking of resource usage
  • Evaluation/forecasting of resource usage
  • Resource brokering decisions
  • Assign scheduling priorities to jobs based on previous resource utilization
  • Pricing & creating economic markets for resource sharing
  • Enforcement of resource allocations
  • Etc…

SGAS in SweGrid

• SweGrid is a Swedish computational Grid
  • Connects six computer clusters (Umeå, Göteborg, Uppsala, Stockholm, Lund, Linköping) with a total of 600 processors

• Swedish National Allocation Committee
  • Allocates CPU time (measured in node hours) on SweGrid to research projects
  • Grid-wide allocations can be spent arbitrarily among Grid sites

• SGAS has been developed to
  • Enforce project allocations across all SweGrid sites
    • Prevent project members from overspending
  • Store detailed information on each Grid job’s resource usage

SweGrid Accounting System (SGAS)

• Decentralized resource allocation enforcement system

• SGAS performs soft real-time enforcement of allocations
  • Real-time enforcement: Resources can, at the time of job submission, deny access if project quota has been used up
  • Soft: enforcement is subject to local resource policies (strict enforcement not always appropriate)

• Primarily targeted towards allocation enforcement in SweGrid
  • Not restricted to SweGrid use

• Developed with an emphasis on easy integration into different Grid environments
  • In SweGrid: deployed on top of NorduGrid middleware

SGAS (cont.)

• Service-oriented architecture
  • Main components exposed as Web Services
• Java implementation
  • Based on Web and Grid services technology
  • Globus Toolkit 3 (GT3) primitives
• Based on open Grid standards (OGSA, GGF-UR)
• Transparent to (most) end-users
  • Single account fully transparent
• Single-point-of-integration
• Flexible and customizable
  • No assumptions about the types of resources accounted for
  • Abstract “currency” – Grid credits
    • Charge for arbitrary resource usage
    • Resources transform usage into Grid credits before charging account
  • System can be configured by customizing policies on three different levels
    • User: “only run jobs if sufficient quota is available”
    • Resource owner: “run quota-exceeding jobs with low priority”
    • Allocation authority: “allow 10 % account overdraft”
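
The credit conversion and the allocation-authority overdraft policy above can be illustrated with a small Python sketch. The price table, the function names and the percentage parameter are assumptions made for illustration only; SGAS itself is implemented in Java and its actual interfaces are not shown in these slides.

```python
# Hypothetical price list: Grid credits charged per unit of each resource type.
PRICES = {"node_hours": 1.0, "storage_gb": 0.5}


def to_grid_credits(usage, prices=PRICES):
    """Convert a resource's local usage figures into abstract Grid credits."""
    return sum(prices[kind] * amount for kind, amount in usage.items())


def within_allocation(allocation, spent, cost, overdraft_percent=10):
    """Check a charge against the allocation plus the permitted overdraft."""
    limit = allocation * (100 + overdraft_percent) / 100
    return spent + cost <= limit
```

With a 10 % overdraft, a project holding 10,000 credits could in this sketch be charged up to 11,000 credits before a charge is flagged as exceeding the limit.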

SGAS component overview

• Four main components
  • Bank
    • Online service
    • Manages project accounts (resource allocations)
    • Provides Grid users/resources with consistent information about resources consumed by Grid projects
  • JARM (Job Account Reservation Manager)
    • Intercepts job requests on resources
    • Makes account reservation prior to job execution
    • Charges project account after job completion
    • Single-point-of-integration
  • LUTS (Logging and Usage Tracking Service)
    • Collects and publishes usage records which can be queried by users
  • PAT (Policy Administration Tool)
    • Client tool to manage Bank and LUTS policies

Component interactions

1. Contact resource
2. Authenticate/authorize (delegate credentials)
3. Submit job request
4. JARM intercepts request
5. Make account reservation
6. Run job
7. Collect usage info
8. Charge project account and log usage info
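
Steps 4 to 8 of the interaction sequence can be sketched in Python as a reserve, run, charge and log cycle. `StubBank`, `handle_job` and the list used as a usage log are hypothetical stand-ins for the Bank, JARM and LUTS; the sketch only illustrates the order of the steps, not the real service interfaces.

```python
class StubBank:
    """Hypothetical in-memory stand-in for the Bank (one project account)."""

    def __init__(self, balance):
        self.balance = balance
        self.held = 0

    def reserve(self, amount):
        # Grant the reservation only if enough unreserved credits remain.
        if amount > self.balance - self.held:
            return False
        self.held += amount
        return True

    def charge(self, reserved, actual):
        # Release the reservation and charge the actual cost.
        self.held -= reserved
        self.balance -= actual


def handle_job(bank, usage_log, job_id, estimated_cost, run_job):
    """Sketch of what JARM does once it has intercepted a job request."""
    if not bank.reserve(estimated_cost):      # step 5: account reservation
        return "denied"
    actual_cost = run_job()                   # steps 6-7: run job, collect usage
    bank.charge(estimated_cost, actual_cost)  # step 8: charge the account ...
    usage_log.append({"job": job_id, "credits": actual_cost})  # ... and log usage
    return "completed"
```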

Bank component

• Composed of three OGSA-compliant services
  • Bank Service
    • Creates and locates Account services
  • Account Service
    • Represents a project’s resource allocation
    • Authorized users make soft-state reservations on account allocation. If granted it results in a ...
  • Hold Service
    • Time-limited reservation on the account

• Overdraft policy can be associated with each account

• Each account manages a set of time-stamped allocations
  • Each allocation valid for a limited time period
  • Allows total allocation to be spread out in time
  • Implements a "use-it-or-lose-it" policy

[Diagram: Bank <<creates>> Account; Account <<creates>> Hold (several Holds per Account)]
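
As a rough illustration of time-stamped allocations and time-limited holds, the Python sketch below models an account whose available credits depend on the current time. All class and method names, and the one-hour default hold lifetime, are assumptions for illustration; charging against a hold is left out to keep the sketch short.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Allocation:
    start: datetime
    end: datetime
    amount: int        # Grid credits granted for this period
    used: int = 0      # credits already charged against this allocation


@dataclass
class Hold:
    amount: int
    expires: datetime  # soft state: the hold lapses by itself


@dataclass
class Account:
    allocations: list = field(default_factory=list)
    holds: list = field(default_factory=list)

    def available(self, now):
        # Only allocations valid right now count ("use-it-or-lose-it").
        valid = sum(a.amount - a.used
                    for a in self.allocations if a.start <= now < a.end)
        held = sum(h.amount for h in self.holds if h.expires > now)
        return valid - held

    def request_hold(self, amount, now, lifetime=timedelta(hours=1)):
        # Grant a time-limited reservation if enough credits are available.
        if amount > self.available(now):
            return None
        hold = Hold(amount, now + lifetime)
        self.holds.append(hold)
        return hold
```

An expired hold simply stops counting against the account, so a reservation left behind by a failed job does not lock up credits indefinitely.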

Allocation strategy example

[Figure: six allocations (Alloc1 to Alloc6) of 10,000 NH each, laid out along a time axis from Jan-99 to Jul-99]

Picture from: http://www.emsl.pnl.gov/docs/mscf/gold/

Logging and Usage Tracking Service (LUTS)

• Collects and publishes usage records compliant with GGF-UR specification
  • XML-based format for storing detailed information about the resources consumed by Grid jobs
    • CPU time, memory, storage, network, …

• Authorized users are allowed to run XPath queries directly against LUTS

• URs can be extended to hold additional information only understood by a subset of users/resources without modifying LUTS

• URs can be logged in batches
  • Improved performance and scalability

• XSLT-based transformation infrastructure to allow sites to easily convert their non-XML usage data to a UR-compliant format
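
To give a feel for XPath queries over usage records, the Python sketch below runs a query against a small in-memory XML document using the XPath subset of the standard library's `xml.etree.ElementTree`. The element names (`UsageRecords`, `UsageRecord`, `JobId`, `Project`, `NodeHours`) are simplified, hypothetical names and do not follow the actual GGF-UR schema or its namespaces; the LUTS query interface itself is not shown in these slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified usage records (not the real GGF-UR schema).
RECORDS = """
<UsageRecords>
  <UsageRecord><JobId>job-1</JobId><Project>projA</Project><NodeHours>12</NodeHours></UsageRecord>
  <UsageRecord><JobId>job-2</JobId><Project>projB</Project><NodeHours>5</NodeHours></UsageRecord>
  <UsageRecord><JobId>job-3</JobId><Project>projA</Project><NodeHours>8</NodeHours></UsageRecord>
</UsageRecords>
""".strip()

root = ET.fromstring(RECORDS)


def node_hours_for(project):
    """Sum the node hours of all records belonging to one project."""
    # The predicate selects records whose <Project> child has the given text.
    matches = root.findall(f"./UsageRecord[Project='{project}']/NodeHours")
    return sum(int(element.text) for element in matches)
```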

Job Account Reserv. Manager (JARM)

• Integration-point between SGAS and underlying Grid env.
  • Decoupled from workload manager

• NorduGrid integration
  • Configuration of plug-in scripts called from NG job submission state machine

• Plugged into workload manager at each cluster
  • Makes account reservations prior to job execution
    • Done in parallel with job preparation (to incur less overhead)
  • Collects usage data from batch system when job has finished
  • Converts usage into Grid credits, charges account and logs a usage record in LUTS
    • Charging & logging of jobs usually deferred and performed in batches

• Local site policies can be enforced by overloading the default Site Policy Manager
  • Default Site Policy Manager
    • let job through even if bank cannot be reached; log and charge later
    • overdraft violation detected: run job with lower priority
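
The default site policy and a stricter site-specific replacement can be sketched in Python as follows. The class names, the `Decision` values and the `decide` helper are hypothetical; the sketch only mirrors the two default behaviours listed above and shows how a site might substitute its own rule.

```python
from enum import Enum


class Decision(Enum):
    RUN = "run"
    RUN_LOW_PRIORITY = "run-low-priority"
    DENY = "deny"


class BankUnreachable(Exception):
    """Raised by the (hypothetical) reservation call when the bank is down."""


class DefaultSitePolicyManager:
    def on_bank_unreachable(self):
        # Let the job through; logging and charging happen later.
        return Decision.RUN

    def on_overdraft(self):
        # Overdraft violation detected: run the job with lower priority.
        return Decision.RUN_LOW_PRIORITY


class StrictSitePolicyManager(DefaultSitePolicyManager):
    """Example of a site that refuses quota-exceeding jobs outright."""

    def on_overdraft(self):
        return Decision.DENY


def decide(policy, reserve):
    """Try the account reservation and let the site policy handle failures."""
    try:
        granted = reserve()
    except BankUnreachable:
        return policy.on_bank_unreachable()
    return Decision.RUN if granted else policy.on_overdraft()
```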

SGAS demo

• Self-contained demo
  • Runs out of the box

• Can be downloaded from the SGAS website
  • Download and try it out!

• Runs on Windows and Unix/Linux
  • Requires Java (JDK/JRE 1.4.2)

• Runs against actual Bank and LUTS services
  • Service container runs embedded in demo

• A sample run

Project milestones & future directions

• Sep 2003: SGAS project initiated
• Sep 2003: SweGrid site survey
• Oct 2003: SGAS white paper
  • Investigated existing work on Grid accounting
  • Presented an accounting system architecture proposal for SGAS
• Jan 2004: Finished proof-of-concept prototype
• Feb 2004: Started working on production code base
• Apr 2004: Version alpha 0.1 was released
• Nov 2004: Version beta 0.2 was released
• Very soon: Version 1.0 release
  • Improved stability/scalability
  • Simplified installation process
  • Improved administration client
• Spring/summer 2005: Planned GT4/WSRF transition
  • GT4 scheduled for release in April 2005

Authorization framework

• Fine-grained authorization framework (contributed to Globus)
  • Authorization specified on a per-operation basis
• Associate authorization policy and engine with service
  • Authz policy managed through ServiceAuthzManagement porttype
• Implemented by several SGAS services (Bank, Account, LUTS)
  • Service orthogonal: transparent to service implementation
  • Customizable: allows different backend engines/policy languages
• Current authorization engine based on XACML
  • Not SGAS-specific
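
Per-operation authorization with a pluggable engine might look roughly like the Python sketch below. `TablePolicyEngine`, `authorize_call` and the operation names are hypothetical; the table-driven engine is a deliberately simple stand-in and is not XACML, nor the ServiceAuthzManagement interface itself.

```python
class TablePolicyEngine:
    """Hypothetical engine: maps each operation to the subjects allowed to call it."""

    def __init__(self, policy):
        self.policy = {op: set(subjects) for op, subjects in policy.items()}

    def is_permitted(self, subject, operation):
        return subject in self.policy.get(operation, set())

    def set_policy(self, operation, subjects):
        # Management operation: replace the policy for a single operation.
        self.policy[operation] = set(subjects)


def authorize_call(engine, subject, operation, func, *args):
    """Check the policy first, then invoke the unmodified service operation."""
    if not engine.is_permitted(subject, operation):
        raise PermissionError(f"{subject} may not call {operation}")
    return func(*args)
```

Because the check wraps the call, the service operation itself contains no authorization logic, and a different engine exposing the same `is_permitted` method could be swapped in.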

Policy enforcement overview

[Diagram: User, Broker, Scheduler, Workload Manager (with plugin), JARM, Site Policy Manager, Bank, LUTS, Admin interface, Membership/Community Service and External Authorization Services, grouped into SGAS and Cluster (resource) and annotated with their policy roles (PAP, PIP, PEP, PDP). Legend: SGAS component, External component, Generic interface.]

PAP = Policy Administration Point - set up policies
PIP = Policy Information Point - retrieve policies
PDP = Policy Decision Point - make policy decisions/manage policy
PEP = Policy Enforcement Point - intercept request and query PDP(s)
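
The relationship between a PEP and its PDPs can be illustrated with a few lines of Python. The two decision functions, the request fields and the rule that every PDP must permit the request are assumptions for illustration; the slides do not state how decisions from several PDPs are combined.

```python
def quota_pdp(request):
    """Hypothetical bank-side decision: is enough quota left for this job?"""
    return request["cost"] <= request["remaining_quota"]


def membership_pdp(request):
    """Hypothetical community-service decision: is the user a project member?"""
    return request["user"] in request["project_members"]


def enforce(request, pdps):
    """PEP: intercept the request and permit it only if every PDP permits it."""
    return all(pdp(request) for pdp in pdps)
```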