21
Architecture and Environment for Enabling Interactive Grids Vanish Talwar, Sujoy Basu, Raj Kumar HP Laboratories Palo Alto HPL-2004-217 November 24, 2004* E-mail: [email protected] access control, data management, dynamic accounts, Grid computing, interactive sessions, QoS, resource management Traditional use of grid computing allows a user to submit batch jobs in a grid environment. We believe, next generation grids will extend the application domain to include interactive graphical sessions. We term su ch grids Interactive Grids. In this paper, we describe some of the challenges involved in building Interactive Grids. These include fine grain access control, performance QoS guarantees, dynamic account management, scheduling, and user data management. We present the key architectural concepts in building Interactive Grids viz. hierarchical sessions, hierarchical admission control, hierarchical agents, classes of dynamic accounts, application profiling, user data management, scheduling for interactive sessi ons, persistent environment settings, and exporting remote desktop. We also describe IGENV, an environment for enabling interactive grids. IGENV consists of GISH - 'Grid Interactive Shell', Controlled Desktop, SAC - 'Session Admission Control' module, GMMA - 'Grid Monitoring and Management Agents', and Policy Engine. We also present our testbed implementation for Interactive Grids using and extending Globus Toolkit 2.0 for the Grid middleware infrastructure, and VNC as the remote display technology. * Internal Accession Date Only Published in the Journal of Grid Computing, volume 1, issue 3, pp 231-250, 2003 Approved for External Publication Copyright 2004 Kluwer Academic Publishers

Architecture and Environment for Enabling Interactive Grids · Grid Computing [1, 2] envisions a future where het-erogeneous resources could be shared by users across geographical

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Architecture and Environment for Enabling Interactive Grids · Grid Computing [1, 2] envisions a future where het-erogeneous resources could be shared by users across geographical

Architecture and Environment for Enabling Interactive Grids Vanish Talwar, Sujoy Basu, Raj Kumar HP Laboratories Palo Alto HPL-2004-217 November 24, 2004* E-mail: [email protected] access control, data management, dynamic accounts, Grid computing, interactive sessions, QoS, resource management

Traditional use of grid computing allows a user to submit batch jobs in a grid environment. We believe, next generation grids will extend the application domain to include interactive graphical sessions. We term such grids Interactive Grids. In this paper, we describe some of the challenges involved in building Interactive Grids. These include fine grain access control, performance QoS guarantees, dynamic account management, scheduling, and user data management. We present the key architectural concepts in building Interactive Grids viz. hierarchical sessions, hierarchical admission control, hierarchical agents, classes of dynamic accounts, application profiling, user data management, scheduling for interactive sessions, persistent environment settings, and exporting remote desktop. We also describe IGENV, an environment for enabling interactive grids. IGENV consists of GISH -'Grid Interactive Shell', Controlled Desktop, SAC -'Session Admission Control' module, GMMA -'Grid Monitoring and Management Agents', and Policy Engine. We also present our testbed implementation for Interactive Grids using and extending Globus Toolkit 2.0 for the Grid middleware infrastructure, and VNC as the remote display technology.

* Internal Accession Date Only Published in the Journal of Grid Computing, volume 1, issue 3, pp 231-250, 2003 Approved for External Publication Copyright 2004 Kluwer Academic Publishers

Page 2: Architecture and Environment for Enabling Interactive Grids · Grid Computing [1, 2] envisions a future where het-erogeneous resources could be shared by users across geographical

Journal of Grid Computing 1: 231–250, 2003.© 2004 Kluwer Academic Publishers. Printed in the Netherlands.

231

Architecture and Environment for Enabling Interactive Grids

Vanish Talwar, Sujoy Basu and Raj KumarHewlett-Packard Labs, 1501 Page Mill Road, MS 1181, Palo Alto, CA 94304, USAE-mail: {vanish.talwar,sujoy.basu,raj.kumar}@hp.com

Key words: access control, data management, dynamic accounts, Grid computing, interactive sessions, QoS,resource management

Abstract

Traditional use of grid computing allows a user to submit batch jobs in a grid environment. We believe, nextgeneration grids will extend the application domain to include interactive graphical sessions. We term such gridsInteractive Grids. In this paper, we describe some of the challenges involved in building Interactive Grids. Theseinclude fine grain access control, performance QoS guarantees, dynamic account management, scheduling, anduser data management. We present the key architectural concepts in building Interactive Grids viz. hierarchicalsessions, hierarchical admission control, hierarchical agents, classes of dynamic accounts, application profiling,user data management, scheduling for interactive sessions, persistent environment settings, and exporting remotedesktop. We also describe IGENV, an environment for enabling interactive grids. IGENV consists of GISH – ‘GridInteractive Shell’, Controlled Desktop, SAC – ‘Session Admission Control’ module, GMMA – ‘Grid Monitoringand Management Agents’, and Policy Engine. We also present our testbed implementation for Interactive Gridsusing and extending Globus Toolkit 2.0 for the Grid middleware infrastructure, and VNC as the remote displaytechnology.

1. Introduction

Grid Computing [1, 2] envisions a future where het-erogeneous resources could be shared by users acrossgeographical and administrative boundaries, and as autility. Several efforts [3–7] are underway to archi-tect and deploy a middleware infrastructure for GridComputing. Commercial acceptance of Grid Comput-ing technology is also steadily increasing. Traditionaluse of Grid Computing has been for the executionof batch jobs in the scientific and academic commu-nity. We believe that next generation grids will extendthe application domain to include interactive sessions.Such sessions would allow the end-user to interac-tively submit interactive jobs to remote nodes in aGrid. The end-user will also be able to view the graph-ical and interactive output of the submitted jobs andapplications through such interactive sessions. Exam-ple use cases for such interactive sessions could be

for, but not limited to, graphics visualization appli-cations, engineering applications like CAD/MCAD,digital content creation, financial applications, officeapplications, e-mail applications, software develop-ment, command line interactions, streaming media,video games. We consider such Grids in an enterpriseenvironment provided either by Application ServiceProviders (ASPs) [8] or by in-house IT departments.In [8], we extended the ASP model to provide cus-tomers access to a remote computer’s desktop forinteractive use as a service. Some of the new require-ments for the design of such grids are: fine grainaccess control, performance QoS, dynamic accountmanagement, scheduling, user data management.

Most of the early work on Grids has been for batchjobs. Other work like [9] does not focus on providinginteractive job submission ‘sessions’, and [10] doesnot address the needs for graphical and multimediasessions. Existing thin client architectures [11, 12] useremote display technologies but they do not have an

Page 3: Architecture and Environment for Enabling Interactive Grids · Grid Computing [1, 2] envisions a future where het-erogeneous resources could be shared by users across geographical

232

architecture in the context of Grids. Neither do theother related works provide a comprehensive frame-work for access control, QoS, scheduling, accountmanagement, and user data management for graphicalinteractive sessions in a Grid Computing environment.In this paper, we introduce Interactive Grids and de-scribe the architecture and runtime environment forenabling such Grids. The user of an Interactive Grid isgiven an execution node in the Grid for interactive use.Keyboard and mouse events are sent from the users’submission node to the remote execution node in theGrid, and the output of the applications is viewed bythe end-user using remote display technologies likeVNC [13]. Our key contributions are:1. Key architectural concepts for Interactive Grids

consisting of:(a) Hierarchical sessions,(b) Hierarchical admission control,(c) Hierarchical agents,(d) Classes of dynamic accounts,(e) Application profiling,(f) User data management,(g) Scheduling for interactive sessions in the Grid,(h) Persistent environment settings,(i) Exporting remote desktop.

2. A runtime environment for enabling interactivegrids providing for fine grain access control andQoS for graphical interactive sessions in a Grid,consisting of the following components:(a) GISH – a ‘Grid Interactive Shell’(b) Controlled Desktop(c) SAC – a ‘Session Admission Control’ module(d) GMMA – ‘Grid Monitoring and Management

Agents’(e) Policy Engine.The above components are tightly coupled to each

other, and are designed with grid-enabled features.Our architecture is proposed as an extension to

the existing Grid middleware infrastructure. Our im-plementation uses Globus Toolkit 2.0 [3] as this Gridmiddleware infrastructure. Our design is modular andpolicy based allowing for easy extensibility and easymanageability.

The rest of the paper is organized as follows. InSection 2, we present the requirements and issuesfor building Interactive Grids. Section 3 presents theoverall system architecture. In Section 4, we describeour runtime environment IGENV in detail. Sections 5and 6 describe our hierarchical agents and dynamicaccounts framework respectively. In Section 7, we

describe the data access and data affinity schedulingalgorithm. Section 8 presents discussion and analysis.In Section 9, we describe our implementation. Sec-tion 10 presents Related Work and we conclude inSection 11.

2. Requirements and Issues

In this section, we present below the requirementsand issues in architecting Grids enabled for interactivesessions.

Fine Grain Access Control. Interactive sessions al-low a malicious user to submit unauthorized jobs to theremote node and permit end-users unspecified time ofaccess to the remote node. Further, interactive sessionsallow a malicious user to probe for vulnerabilities inthe Grid system, and launch attacks against other re-mote nodes in the Grid system. A fine grain accesscontrol in the Grid context is needed.

Performance QoS. Interactive Grids permit end-users to request and launch interactive applications.Such applications are sensitive to response time andreal time performance requirements. There is a needto guarantee quality of service for such applications.

Dynamic Account Management. Interactive Gridspose a challenge in terms of account management forarbitrary end-users of the grid.

Scheduling. Interactive Grids require a wide areascheduling system that can perform (a) discovery ofresources, (b) matching of resources to user require-ments for graphical interactive sessions, (c) globaladmission control before the launching of graphicalinteractive sessions, (d) reservation of resources forthe desired usage time, as well as fine grained reser-vations like CPU, network bandwidth reservations,(e) resource assignment, (f) job dispatching, (g) globalsession state management.

User Data Management. There is a need to providepersistent storage for the user’s data. The data needsto be made available to the user on being allocated asession with ownership given to the dynamic accountassigned to him.

3. Overall System Architecture

Figure 1 shows the conceptual overview of an Inter-active Grid computing system that we are consider-ing. The system consists of heterogeneous executionnodes distributed across multiple administrative do-mains. These nodes are managed by a Grid Distributed

Page 4: Architecture and Environment for Enabling Interactive Grids · Grid Computing [1, 2] envisions a future where het-erogeneous resources could be shared by users across geographical

233

Figure 1. Conceptual overview of the Interactive Grid computing system.

Resource Management system (DRM). An end-usersubmits a request through a submission node to theGrid DRM to obtain a remote execution node for in-teractive use. On receiving the request from the user,the Grid DRM selects a remote execution node basedon the session requirements, and reserves this node forthe requested duration of the session. The Grid DRMalso performs a reservation of fine grained resourceslike CPU, network bandwidth and storage bandwidth,for the users’ session. At the requested time, the DRMwould use remote display technologies like VNC [13]to establish an interactive desktop session betweenthis remote execution node and the end-user’s sub-mission node. The end-user then interacts directlywith this remote node,1 through the established ses-sion. During this session, the user can submit requestsdirectly to the remote node, to launch multiple appli-cations. These interactions could either be graphicalor text-based. We are more interested in addressingthe problem for graphical interactive sessions to re-mote nodes. However, the solutions being proposedand developed by us are also applicable for text-onlyinteractive sessions, as a special case. The interactionof the end-user with the remote node involves the exe-cution of both installed applications and user specifiedbinaries. It is also assumed that the user does not havea local account with the remote node apriori.

We propose hierarchical sessions in such an Inter-active Grid computing system consisting of global in-teractive sessions and per-application interactive ses-sions. A global interactive session constitutes a remotedesktop session using remote display technologieslike VNC [13]. A global session consists of many

1 The terms ‘remote node’ and ‘remote execution node’ are usedinterchangeably in this paper.

per-application sessions. A per-application interac-tive session constitutes the direct interaction betweenthe end-user and the application executing on the re-mote execution node. A per-application interactivesession occurs in the context of a global interactivesession. We are most interested in graphical applica-tion sessions. All the per-application sessions sharethe resources allocated to the global session in whichthey execute. The display for the application sessionis sent to the user’s submission nodes. The data forthe application session is accessed from the remotedata nodes. Such a hierarchical classification of globaland per-application sessions simplifies the manage-ment of remote display interactive applications in theGrid.2 Corresponding to global and per-applicationinteractive sessions, we introduce the notion of hierar-chical admission control in our framework consistingof a global admission control module at the GridDRM node, and a per-application session admissioncontrol module at the execution node. The globaland per-application session admission control modulesmake admission control decisions for global and per-application sessions respectively. For simplicity, werefer to the ‘per-application session admission controlmodule’ as just ‘session admission control module’.The following is the sequence of steps in such aproposed system:

1. The end-user creates a job request template for anew global interactive session, specifying the sta-tic resource requirements, session requirements,and the de sired list of applications to be launchedduring the global session. The user could also

2 The terms ‘global session’ and ‘global interactive session’,‘per-application session’ and ‘per-application interactive session’,are used interchangeably in the remainder of the paper.

Page 5: Architecture and Environment for Enabling Interactive Grids · Grid Computing [1, 2] envisions a future where het-erogeneous resources could be shared by users across geographical

234

specify the desired QoS requirements for the re-quested applications. The job request is submittedto the Grid DRM node.

2. The request is received by a Grid Scheduler run-ning on the Grid DRM node. In the first pass, theGrid Scheduler performs a matching of resourcesin the Grid to satisfy the coarse requirements ofthe user, for example, matching of the hardwarerequirements of the user. The Grid middlewareprovides a distributed repository (like MDS [14]),where various resources can publish their ser-vices. The scheduler queries this repository todiscover resources that match with the user’s jobneeds.

3. In the next pass, the Grid Scheduler interfaceswith the Global Admission Control system, whichperforms the admission check for the requestedglobal interactive session [15]. Application pro-files determined through historical logs is used toguide the admission control process.

4. In the final pass, the Grid Scheduler selects thebest execution node for the global interactive ses-sion based on a resource assignment policy. Inthis pass, it only considers the nodes that satisfythe Global Admission Control test in step 3. Theresource assignment tables are updated to reflectthe current assignment of the users’ request to theselected execution node.

5. At the requested time, the selected execution nodeis allocated to the end-user, and the job dispatchersends the request for the new global interactivesession to the allocated execution node along withthe SLA for the session.

6. A dynamic account is created by the Dynamic Ac-count Manager for the new global interactive ses-sion. The Dynamic Account Manager maintainspools of dynamic accounts on each resource. Un-like normal user accounts which remain perma-nently assigned to the same real-world user, a dy-namic account is assigned to a user temporarily.

7. Appropriate reservations are made on the execu-tion node for the fine grained resources like CPU,network bandwidth, etc., as specified in the SLAfor the global session.

8. A configuration process configures the system forthe user before launching the global interactivesession. This involves, for example, setting up theapplications desired by the user, and customizingthe user’s desktop environment in accordance withthe policies for this user.

9. The users’s data is now retrieved from persis-tent storage and setup for the global session withowner-ship given to the dynamic account assignedto him. We also retrieve the users’ environmentsettings, e.g., shell environment settings from thepersistent repos itory and set it up appropriately.

10. We then start a remote display server (a VNCserver in our testbed) to export the remote desk-top to the end-user. A global interactive sessionis now initiated between the allocated executionnode and the end-users’ submission node. Sessionspecific monitoring and management agents arealso started.

11. The end-user can now request for new per-appli-cation interactive sessions directly through thestarted global interactive session.

12. The requests for per-application interactive ses-sions are verified by the runtime components GridInteractive Shell (GISH) and Controlled Desk-top for access control checks, and if successfulare passed onto the Session Admission Controlsystem on the execution node.

13. The Session Admission Control system performsan admission control check to determine if the re-quested per application session can be admittedinto the global interactive session. If not, the re-quest for new per-application session is denied.Else, the per-application session is started.

14. The Resource Management Monitoring Agentsmonitor the global interactive session and per-application session utilization values. The mon-itored data is aggregated by Aggregator Agents.This aggregation is done, for example, based onapplication, or based on session. This is explainedin more detail in Section 5. Enforcement Agentsuse this data to enforce the SLA and QoS require-ments. An Application Predictor system uses theaggregated data to predict the application behaviorwhich is reflected back in the application profilesused by the Grid DRM.

15. The Enforcement Agents end the global interac-tive session at the SLA specified time. The users’data and environment settings is copied back tothe persistent storage.

16. The execution node is now freed up to execute anew global interactive session if scheduled by theGrid Scheduler.

From the above description, we summarize thekey architectural concepts as (1) hierarchical ses-sions, (2) hierarchical admission control, (3) hier-archical agents, (4) classes of dynamic accounts,

Page 6: Architecture and Environment for Enabling Interactive Grids · Grid Computing [1, 2] envisions a future where het-erogeneous resources could be shared by users across geographical

235

(5) application profiles, (6) user data management,(7) scheduling for interactive sessions, (8) persistentenvironment settings, (9) exporting remote desktop.In addition, there are runtime components providedto the end-user once the session starts. In this paper,we describe in more detail IGENV – the Interac-tive Grid runtime environment, hierarchical agents,classes of dynamic accounts, and user data manage-ment. More details on global admission control andscheduling for interactive sessions is currently a workin progress.

4. IGENV: Interactive Grid Environment

We propose IGENV: a runtime environment for graph-ical interactive sessions, in our proposed InteractiveGrid Computing system. IGENV consists of the fol-lowing components:1. Controlled Shell: GISH – ‘Grid Interactive Shell’.2. Controlled Desktop.3. SAC – ‘Session Admission Control’ module.4. GMMA – ‘Grid Monitoring and Management

Agents’.5. Policy Engine.

These components provide at runtime fine grainaccess control, and QoS in interactive grids. Figure 2shows the interaction of IGENV components in thecontext of a dynamic account created by the DynamicAccount Manager. In the next few sections, we de-scribe GISH, Controlled Desktop, GMMA, SAC, andPolicy Engine.

Figure 2. Interaction among IGENV components.

4.1. Controlled Shell

A controlled shell provides a restricted interface to theend-user to submit requests for executing applicationsand commands to the remote node interactively. Wehave designed a controlled shell called GISH – ‘Gridinteractive shell’. GISH accepts requests to executetwo kinds of commands/applications:1. Commands and applications that are already in-

stalled on the remote node by the system admin-istrator.

2. Commands and applications that are NOT alreadyinstalled on the remote node, and is a user specifiedbinary file.GISH allows grid-users to be logged on to the

remote node through two kinds of user accounts:1. Controlled normal user accounts. This corresponds

to a normal user account given by the underlyingoperating system, restricted by the access controlpolicy files.

2. Controlled super user accounts. This correspondsto a super user account given by the underlyingoperating system, restricted by the access controlpolicy files.Figure 3 shows the design of GISH. It consists of

a command interpreter interfaced to an access con-trol subsystem. The access control subsystem consistsof access control modules described in detail below.The user submits a request to start a command orapplication to GISH. The command is first parsedby the command interpreter, and then passed ontothe access control modules. Each of the access con-trol modules performs an access control check. Ifthe access control check fails for any of the mod-ules, a failure message is returned back to the userand the request to start the application/command isdenied. If the access control check succeeds for allthe modules, then the command or application isstarted by GISH and the graphical output, if any, canbe viewed through the remote graphical display. Wedescribe briefly some of these access control mod-ules below. In order to make the design modular,we choose to interface GISH with a Session Ad-mission Control system (SAC). SAC is responsiblefor making admission control decisions for sessionparameters.

4.1.1. EAM: Executables and Files Access ControlModule

This module is responsible for verifying that the re-quested command/application (i) belongs to the list of

Page 7: Architecture and Environment for Enabling Interactive Grids · Grid Computing [1, 2] envisions a future where het-erogeneous resources could be shared by users across geographical

236

Figure 3. GISH design.

allowed executables, (ii) is invoked with a list of al-lowed arguments/options, (iii) only accesses allowedfiles and directories. This verification is enforcedthrough a policy file which enumerates the list ofallowed executables, allowed executable arguments,allowed files and directories for the user. The policyfiles used by EAM is categorized based on whetherthe user is logged on as a controlled normal useror as a controlled superuser. For allowed executa-bles, EAM would not be able to determine all of thefiles and directories that an application would access.In order to restrict the applications from accessingonly the allowed files and directories at run-time, wesupplement GISH, based on a policy decision, withsystems like [16, 17] which compartmentalize the ex-ecution of processes, or with virtual machine sandboxenvironments like [18].

4.1.2. UbAM: User Binaries Access Control ModuleThis module is responsible for verifying the signaturefor user specified binaries. We assume that there existstrusted services in our grid computing environmentthat checks user specified binaries and signs non-malicious binaries. UbAM verifies such signatures. Ifsuch trusted services are unavailable to the user, weprovide based on a policy decision, a virtual machineenvironment [18] for executing the users’s binaries,or supplement the system with runtime system-callmonitoring systems [10].

4.1.3. SAM: Session Access Control ModuleThis module interfaces with SAC – ‘Session Admis-sion Control’ module, for verifying session specificparameters. SAC is explained in detail in Section 4.3.

SAM passes the requested command to the SAC. SACreplies back with an ‘Allow’ or ‘Deny’ decision for therequested command/application.

The GISH design shown in Figure 3 could be ex-tended with other access control modules as seemedappropriate for a particular implementation. We havepresented a few access control modules that we en-vision to be necessary in a Grid environment forgraphical interactive sessions.

4.2. Controlled Desktop

The controlled desktop has to be identical to GISH interms of the policies enforced. The desktop’s menusand icons is customized by a desktop configuration filethat enforces these policies. At the time of initializa-tion of the session, this file is read in for customizingthe desktop. The user is not given permission to mod-ify this file, or to add or modify menu items or icons.Only the allowed executables with allowed arguments,and allowed files for the user is accessible through thecontrolled desktop.

4.3. SAC

SAC stands for Session Admission Control module.This module is responsible for making an admissioncontrol decision for a requested application, based onService Level Agreements (SLAs), and session poli-cies. Figure 4 shows the inputs to a SAC system. Theseare explained below:(1) Requested application. The graphics application

which the user is requesting to be launched. Thisis provided to SAC by GISH through the SAMmodule.

Page 8: Architecture and Environment for Enabling Interactive Grids · Grid Computing [1, 2] envisions a future where het-erogeneous resources could be shared by users across geographical

237

Figure 4. SAC design.

(2) SLA. The Service Level Agreement for the sessionin progress. The SLA is determined prior to thestart of the session.

(3) Application profiles. The application profiles con-tain the estimated CPU and bandwidth requiredfor various classes of applications to meet theiracceptable performance levels. Example classesof applications are engineering applications, vi-sualization applications, video games, etc. Suchapplication profiles are determined by a systemadministrator, and refined by an application pre-dictor system.

(4) Data from GMMA agents. The resource usagedata gathered by GMMA monitoring agents. TheGMMA monitoring agents are explained in Sec-tion 4.4.

(5) Policies. The session policies in place for thesession.Given these inputs, SAC checks the session para-

meters to verify availability of resources in complianceto SLAs. These session parameters are:(1) Number of processes launched during a session.(2) Usage time for a session.(3) Disk quota usage for a session.(4) CPU utilization percentage for a session.(5) Network bandwidth utilization percentage for a

session.SAC compares the current values for these session

parameters with the limiting values agreed upon in theSLA. If there is a violation, or if a violation wouldoccur upon executing the application, SAC decideson a ‘Deny’ decision for executing the application.Otherwise, an ‘Allow’ decision is made for the appli-cation by SAC. Figure 5 shows an algorithm for SACto make an admission control decision, based on theCPU and network bandwidth utilization parametersfor a session.

SAC could be extended to support other sessionparameters as seemed appropriate for a particular im-plementation. We have presented a few session pa-rameters that we envision to be necessary in a Gridenvironment for graphical, interactive sessions.

Figure 5. An algorithm for SAC with CPU and network bandwidthutilization as the session parameters.

4.4. GMMA

GMMA stands for ‘Grid Monitoring and ManagementAgents’. The monitoring agents collect dynamic mon-itoring data, which is used by the management agentsto enforce session SLAs, QoS for applications, in-trusion protection, and access control policies. Someof the GMMA agents are associated with a specificsession, while some others are system wide agentsthat monitor all the sessions started through the Inter-active Grid environment. The monitoring agents logtheir information in log files,3 interface with otherpeer agents, other monitoring systems, as needed. Themanagement agents use the data gathered by monitor-ing agents for enforcement purposes. Figure 6 showsthe design of GMMA. The figure shows a few cate-gories of these agents based on the functionality to beprovided by these agents in the context of InteractiveGrids. For example, some of the functionality is tomonitor and manage (i) session specific parameterslike usage time for the session, number of processesspawned during the session, number of socket con-nections opened during the session, disk quota usagefor the session, etc.; (ii) QoS parameters like CPUutilization, network bandwidth utilization, etc., for

3 This logged data is used only for session and Grid managementpurposes. Any privacy issues regarding this information would beagreed upon as an agreement prior to the start of the session.

Page 9: Architecture and Environment for Enabling Interactive Grids · Grid Computing [1, 2] envisions a future where het-erogeneous resources could be shared by users across geographical

238

Figure 6. GMMA design.

Figure 7. Taxonomy of IGENV system policies.

the applications; (iii) security parameters for intrusiondetection and access control like IP addresses of in-coming and outgoing connections, system calls madeby applications, etc.

The GMMA design presented in Figure 6 couldbe extended with other monitoring and managementagents as seemed appropriate for a particular imple-mentation. As for GISH and SAC, we have presented afew categories of monitoring and management agentsthat we envision to be necessary in a Grid environmentfor graphical interactive sessions. However, the designcan be extended with other categories of monitoringand management agents as well. In Section 5, we de-scribe the architectural aspects for the monitoring andmanagement agents in more detail.

4.5. Policy Engine

Policy files are needed for SLA enforcement and se-curity issues. We also need rules covering differentscenarios. Thus, a process violating its SLA can be runwith a lower priority or be killed under different cir-cumstances, as dictated by policy. Instead of allowingpolicy-based decisions to be internal to other modulessuch as GISH, SAC and GMMA, we follow a modulardesign. The policy engine is driven by rules and cantake decision based on configuration parameters in thepolicy files and information collected by monitoringagents. These decisions are taken when requested, forexample, by SAC or management agents. The systempolicies can be classified into the following categoriesalso shown in Figure 7:

Page 10: Architecture and Environment for Enabling Interactive Grids · Grid Computing [1, 2] envisions a future where het-erogeneous resources could be shared by users across geographical

239

Figure 8. Example system policy files.

(1) Session policies. These specify policy informationfor each session. Examples of such policies areaccounting and pricing policies, CPU and processusage policies, file system and disk quota usagepolicies. The policies specify the default action tobe taken on a violation of the system parameters.

(2) Account policies. These specify policy informa-tion associated with account pools. There areseparate policies for controlled normal users andcontrolled superusers. Examples of such policiesare the authorization polices for executables andfiles for a user of the account pool.

(3) Application policies. These specify policy infor-mation for applications that are started by IGENV.There are two kinds of applications: installedapplications, and user specified binaries. The ex-ecution of these applications could take placein a secure environment using systems like Pit-Bull [16], HP-LX [17], or virtual machine en-vironments [18]. Such systems have their ownpolicies.

(4) QoS policies. These specify policy informationfor QoS metrics. Example policies are the QoSenforcement policies on violation of QoS metricsspecified in the application profiles.

Each of the above policies are customized for agiven grid-user of the system. Figure 8 shows exam-ples of some system policy files.

4.6. Discussion for IGENV

Figure 2 shows the interaction among the compo-nents of IGENV. As shown in the figure, there is atight coupling among the components. These com-ponents exist in the context of a dynamic account

created by a Dynamic Account Manager. Together, theIGENV components achieve access control, QoS, andManageability. We explain this below.

4.6.1. Access Control through IGENV

Access control for a grid-user is achieved through acombination of GISH, Controlled Desktop, GMMA,SAC, and system policies. Using these components,we can control the access of the user to (i) executa-bles, (ii) files, (iii) network interfaces, (iv) networkconnections, (v) resource usages decided in SLA.

4.6.2. QoS through IGENV

QoS in IGENV is achieved through a combination ofSAC, GMMA, and system policy files. We assumethat the the Grid middleware would have made CPUand network bandwidth reservation, before the ses-sion is launched on the remote node. We also assumethe existence of application profiles which contain theestimated CPU and network bandwidth required forvarious classes of applications to meet acceptable per-formance levels. GMMA monitoring agents monitorthe actual usage values of the CPU and network band-width utilization of applications. This data is used toenforce QoS for each application as specified in theapplication profiles. The data gathered by GMMAmonitoring agents is also used by the GMMA man-agement agents for enforcing the reservation limits forsessions stated in the SLA (policing). Further, beforea new application is launched, the Session Admis-sion Control System (SAC) verifies that the requestedapplication would consume resources within the reser-vation limits agreed upon for the session. This checkensures that the SLA and QoS guarantees for currently

Page 11: Architecture and Environment for Enabling Interactive Grids · Grid Computing [1, 2] envisions a future where het-erogeneous resources could be shared by users across geographical

240

Figure 9. Agents on the execution node.

Figure 10. High level overview of the coordination model for components on the execution node.

executing applications, would not be violated uponlaunching the application.4

4.6.3. Manageability through IGENV

The system policy files are associated with a pool ofdynamic user accounts. A grid user is mapped to oneof these pools of dynamic accounts based on a VO-wide policy. The user is then dynamically allocatedone of the accounts from the mapped account pool.The user is subject to the system policies associatedwith that account pool. After the session for the userexpires, the dynamic account is returned back to thepool. Such a design coupled with the access controland monitoring agents system provides for easy man-ageability through the proposed Grid Environment,IGENV.

4 Even if the SAC were not to perform this admission controlcheck, the monitoring agents would detect the violation and an ap-propriate enforcement action would be taken. However, there wouldbe a time delay before such an action can take place. Performing acheck at the SAC itself ensures no violation of SLA even for thistime delay period.

5. Hierarchical Agents

We introduced the GMMA agents as part of the run-time environment, IGENV, in Section 4.4. In thissection, we describe the architectural concepts for theagents. Figure 9 shows the agents on the executionnode. We assume a master agent that is responsiblefor all of the agents on the execution node. Figure 10shows how some of the agents coordinate and in-teract with each other. This coordination model isbased on the producer-consumer paradigm [19]. InFigure 10, some of the components act as both pro-ducers and consumers. The source data is providedby Sensor Agents like CPU sensors, memory sen-sors, network bandwidth sensors, and storage sensors.Monitoring Agents interface to these Sensor Agents,and act as a ‘Producer’ to consumers – Aggrega-tor Agents, Registration Agents, and other archivalagents. The Aggregator Agents themselves serve asproducers to an Application Predictor system, En-forcement Agents, and Session Admission ControlSystem. The agent implementations could follow openstandards like FIPA [20, 21]. The system can be ex-tended to support a registry service, which would

Page 12: Architecture and Environment for Enabling Interactive Grids · Grid Computing [1, 2] envisions a future where het-erogeneous resources could be shared by users across geographical

241

aid in supporting information publication about com-ponents, and discovery of components. Figure 10shows these components residing on a single node. Wedescribe the agents below.

5.1. Startup and Configuration Agent

This agent is responsible for launching a new globalinteractive session on the execution node. This agent isalso responsible for configuring the system appropri-ately for the launched global interactive session. Forexample, in our implementation, this agent configuresthe KDE desktop environment based on the systempolicy files corresponding to the allocated dynamicaccount. Our implemented Startup Agent also startsup a VNC server and connects to the end users’ VNCclient thus establishing a graphical, global interactivesession.

5.2. Sensor Agents

The Sensor Agents collect resource information in realtime on a continuous basis. These Sensor Agents areoff-the-shelf sensors like CPU sensors, memory sen-sors, network bandwidth sensors, and storage sensors.The Monitoring Agent interfaces with these sensors toobtain this resource information.

5.3. Monitoring Agent

The Monitoring Agent acts as a ‘Producer’ and makesresource usage data available to other components.It itself obtains the resource usage data from Sen-sor Agents. The Monitoring Agent uses the producerinterface as being defined in the Grid Monitoring Ar-chitecture [19] to send events to a consumer. Theevent data is the overall and per-application resourceusage data (CPU, network, memory, storage) ob-tained from the Sensor Agents. The Monitoring Agentcould also apply a prediction model on the gathereddata and supply the forecasted resource load valuesto the consumers, for example, the predicted CPUload assuming current set of processes. Based onimplementation choice, separate Producer interfacesand interaction channels may be required for each re-source type like CPU, memory, network bandwidth,etc. The consumers for the Monitoring Agent in ourframework are Aggregator Agents, and RegistrationAgents. These consumers subscribe to the event datamade available by the Monitoring Agent using pub-lish/subscribe model. The Monitoring Agent sends

the event data to these consumers at periodic inter-vals agreed upon in the subscription. Other interactionmodels may also be considered based on implementa-tion choice. Other consumers of the Monitoring Agentdata could be archival agents for storing the history ofresource usage information, fault detectors to detectresource aliveness. Based on implementation choice,the event data could be sent as messages to the con-sumers, or could be communicated via shared memoryparadigm. The exact protocols and data formats to beused are implementation dependent.

5.4. Aggregator Agent

The Monitoring Agent provides raw resource datafrom the Sensor Agents. However, we need a frame-work to support the aggregation of this data. Aggrega-tion is the process of consolidating the resource dataobtained from lower level Monitoring Agents. Aggre-gation allows compaction of data to minimize storage,application of filtering for interpreting data at variousgranularities, and re-organization of data for comput-ing statistics and inferring events. The AggregatorAgent behaves as a compound Producer/Consumer.It acts as a Consumer to subscribe to per-applicationresource usage data from the Monitoring Agent. Theinterval for receiving the event data is determinedthrough policies, and is agreed upon in the subscrip-tion. The Aggregator Agent aggregates this receiveddata based on an aggregation function and policies.Some examples of this aggregation are: (i) aggregationby time: all the processes running at a particular pointin time are aggregated together. This helps in inferringthe load on the system at a particular time; (ii) ag-gregation by application: for a given application, weaggregate the statistics of the application at multiplepoints in time. This helps in inferring the behavior ofapplications; (iii) aggregation by session: for a givensession, we aggregate the statistics of the applicationsbelonging to the session, for total session resourceusage values like total number of processes launchedduring the session; the total session wall-clock usagetime; total session CPU, network bandwidth, storageutilization. We also obtain information about the av-erage, minimum, and peak resource utilization valuesfor a session. Aggregating data based on session helpsin inferring the session behavior, and the conformanceto SLAs.

The Aggregator Agents can also apply samplingto the data obtained from the Monitoring Agents ata coarser time interval, thus filtering out some data

Page 13: Architecture and Environment for Enabling Interactive Grids · Grid Computing [1, 2] envisions a future where het-erogeneous resources could be shared by users across geographical

242

Figure 11. Hierarchical Aggregator nodes.

and reducing the data set. The aggregated data isalso archived into persistent storage. Prediction mod-els could also be run on the aggregated data to obtainthe forecasted resource utilization per application, orper global session, assuming current set of processes.The Aggregator Agent acts as a ‘Producer’ for the ag-gregated data to consumers. Some of the consumerswe have identified are Application Predictor system,Enforcement Agents, and Session Admission Con-trol System. The interaction between the AggregatorAgent and consumers is based on publish/subscribe orquery-response model. Similar to Monitoring Agent,the Aggregator Agent uses the producer and consumerinterface as defined in the Grid Monitoring Architec-ture [19]. The event data format, and communicationparadigm is implementation dependent. The Aggrega-tor nodes host the Aggregator Agents. Figure 11 showsa hierarchy of Aggregator nodes. The lowest level cor-responds to execution nodes. In addition to sendingthe aggregated data to the consumers on the executionnode, the Aggregator Agents on the execution nodesalso send their aggregated data to the intra-cluster Ag-gregator node. The intra-cluster Aggregator nodes inturn send their aggregated data to inter-cluster Aggre-gator node, thus forming a hierarchy. The AggregatorAgents on each of these hierarchical Aggregator nodesexecute different aggregation functions.

5.5. Enforcement Agent

The Enforcement Agent is responsible for enforcingService Level Agreements (SLAs) for global sessions,and providing guaranteed QoS for graphics applica-tions. The Enforcement Agent behaves as a ‘Con-sumer’ to receive aggregated resource usage data from

Aggregator Agent. The interval for receiving the datais determined through policies, and is agreed upon inthe subscription. These agents take as input the datafrom the Aggregator Agents, the SLAs for the globalsessions, the application profiles, and policies. Us-ing these inputs, it checks for violation of the SLAsor QoS guarantees. Once a violation is detected, anenforcement action is taken. For example, this en-forcement action could be one or combination of thefollowing: (i) decrease the priority of applications thatexceed their resource utilization levels; (ii) increasethe priority of applications falling below their desiredresource utilization levels; (iii) kill applications thathave violated their resource utilization levels by a largeamount. The enforcement process is controlled bypolicies.

6. Dynamic Accounts

We propose dynamic or template accounts in Inter-active Grids to make the resource virtualization moreappropriate for grids. The scalability and manageabil-ity of the system are enhanced if we do not requiregrid users to have their personal user accounts on allthe machines that are part of the grid. Instead thesystem administrator has to add the user once to adirectory maintained by the virtual organization inwhich the user has obtained membership. Any sitethat participates in that virtual organization (VO) willcheck the user’s membership with the directory duringauthentication, and authorize the user as a dynamicaccount if he does not have a static account. The dy-namic account is chosen from the pool of dynamic

Page 14: Architecture and Environment for Enabling Interactive Grids · Grid Computing [1, 2] envisions a future where het-erogeneous resources could be shared by users across geographical

243

Figure 12. Flowchart describing the process of account allocation,access control and session management.

accounts maintained for that VO. Each dynamic ac-count is a full-fledged Unix account created on theexecution node, but without a permanent real-worlduser associated with it.

Each pool is associated with a set of policy files,customized to the target users of that pool. Unlikenormal user accounts that belong permanently to theirreal-world owners, a dynamic account is bound to auser temporarily. The selection of a pool and the bind-ing of the user to an available dynamic account fromthat pool are based on the Grid credentials presented.After the successful selection and binding of user toa dynamic account, the graphical interactive sessionis started. The window manager, terminal windowsrunning the GISH shell, and other programs specifiedin the window manager’s startup files are all startedas processes owned by the allocated dynamic account.The entire process can be described by a flowchartshown in Figure 12. The dynamic account is freed atthe termination time agreed upon for the session thatis using the dynamic account. At the termination time,the management agents kill the processes still running

with this account as owner, and delete all files ownedby the account. The account is then returned to thepool. We could also choose to archive the files createdby the user as against deleting it, on a server main-tained by the VO. Subsequent sessions for this userretrieve the files from the archive. Section 7 describesmore about the data management aspects.

We organize the dynamic accounts into classes tosimplify the work of the system administrator. Thesystem administrator may have to add a new class ormodify the attributes of an existing class occasionally,however these activities will not be frequent and thetime required will be much less than managing thepolicy files separately for each user or group or users.Each class can correspond to a role in the real world.Thus, members of the finance department can be as-signed to one class, while members of the engineeringdepartment can have another class. In our scheme,each class of dynamic accounts is associated with itsset of policy files. This set includes, among others,a policy file for resource usage. This can specify theminimum and maximum CPU and network bandwidthutilization by the user’s session. Another policy filelists the commands the user is allowed to execute. Yetanother policy file lists the directories and files the useris allowed to access.

The different classes of dynamic accounts arearranged in a hierarchy for ease of administration. Weillustrate this hierarchy by considering the functionsof system administrators. Common functions such astesting and releasing OS patches, restoring files frombackup and monitoring and enforcing resource usagelimits can be placed in a base class so that all sys-tem administrators can perform these functions whenneeded. Beyond that, there can be groups of systemadministrators who are responsible for installing andmaintaining certain groups of applications. We keepa class of dynamic accounts for each of these appli-cation groups. The list of commands and applicationsallowed for these classes and the directories and filesthey are allowed to modify are specific to their applica-tion group. When a system administrator connects toa grid computing resource and is allocated a dynamicaccount, the set of policy files governing that accountis created by the combination of policies appropriateto this administrator’s roles. Thus, if he is responsiblefor application groups A and B, his policy files willbe obtained by merging the basic privileges of systemadministrators and the privileges of application groupsA and B obtained from their respective policy files.By organizing the classes in a hierarchy, we ensure

Page 15: Architecture and Environment for Enabling Interactive Grids · Grid Computing [1, 2] envisions a future where het-erogeneous resources could be shared by users across geographical

244

that each class inherits the policy files of its parentclasses. Conflict resolution rules are used to determinethe value of an attribute inherited from multiple par-ents. A class can add commands and applications tothose its parent classes allow to be executed. It canalso override inherited attributes.

7. Data Management

In this section, we describe the architectural featuresfor data management in Interactive Grids. Once anexecution node has been assigned for the user, theGrid middleware needs to ensure that the user’s filesare present on a filesystem accessible from the execu-tion node assigned to him. Furthermore, access rightsfor the user’s files have to be given to the dynamicaccount assigned to him. We assume that the users’files are archived in a Grid Storage Service (GSS).For interactive sessions on an execution node allocatedby the Grid DRM, the user’s files containing his ses-sion state will have to be restored at the beginning ofevery interactive session and saved at the end of thesession since the dynamic account is associated withthe user only for the duration of the session. The GridStorage Service is used to store the users’ files at theend of the session, and the files are then transferred tothe appropriate execution node at the beginning of anew interactive session. To minimize the time taken totransfer files at the beginning of an interactive session,it makes sense (i) to cache the users’ data on the exe-cution node at the end of the interactive session insteadof deleting it; (ii) for the DRM to schedule interactivesessions with affinity to the execution node used duringthe previous interactive sessions by the same user. As-suming that the execution node selected was used in aprevious session by the user, there is still no guaranteethat the files cached locally will be the latest versionsince the user might have been assigned a session atanother site by the grid middleware between two con-secutive sessions on the resource selected. During thesession at the other site, he might have modified thefiles. Hence the middleware should verify whether anyof the files cached locally are stale, and if so, invalidatethem or get their updated versions, in the background,from GSS.

7.1. Data Affinity Scheduling Algorithm

The steps involved in the data affinity scheduling algo-rithm for allocation of an execution node with dynamic

account, in response to a user’s request for allocationsent to the grid middleware, are illustrated in a flow-chart in Figure 13. Affinity scheduling is possible onlyfor frequent users for whom an execution node hasbeen allocated at this site previously and is available.This is shown in step F. All other users are assignedexecution nodes in step E. This involves checking theresource, application and session requirements speci-fied by the user in his job template when requestingan interactive session. For frequent users, this alsoinvolves checking history of previous sessions. If theuser frequently requested interactive sessions duringcertain time periods, it is best to assign executionnodes that are expected to be available during thosetime periods. After selection of the execution node,a check is done on the execution node for the ex-istence of a home directory named by mapping theuser’s grid credentials. If a directory exists and isowned by a reserved dynamic account, the dynamicaccount is assigned to the user in step K. If a non-reserved dynamic account is assigned, ownership ofthe user’s home directory tree, if it exists, is givento the dynamic account. The user is logged in withhis user ID being the dynamic account assigned. Hefinds his shell and desktop customized according tothe policies being enforced. Files that are stale will betemporarily unavailable while they are updated in thebackground from the grid storage service. At the endof the interactive session, all files that have been mod-ified during the session are updated in the grid storagesystem using the user’s grid credentials. Also, theuser’s dynamic account is put in a reserved pool if heis a frequent user. If he returns for another interactivesession before the account is reused, the ownership ofthe files do not have to change.

8. Analysis

We now provide an analysis of our solutions as de-scribed in the previous sections. While designing thesystem, we came up with a list of requirements forour solutions to satisfy. These were: (1) be applicableacross heterogeneous platforms, (2) extend existinggeneral purpose tools, (3) require minimal changesto the existing system software, (4) be extensible andmodular, (5) address needs of graphics and multimediaapplications, (6) support self-managing capabilities,(7) work for all application types, (8) be flexibleto interface and interoperate with other complemen-tary and grid solutions, (9) be driven by policies,(10) scalability.

Page 16: Architecture and Environment for Enabling Interactive Grids · Grid Computing [1, 2] envisions a future where het-erogeneous resources could be shared by users across geographical

245

Figure 13. Flowchart describing data affinity scheduling algorithm.

Towards this end, we avoided as far as possible,designing solutions specific to an OS, or in-kernel so-lutions, or solutions requiring drastically new tools.This is reflected in our design, for example, throughGISH, which was implemented by extending existingpopular bash shell. In order to make the design self-managing, we propose monitoring and managementagents, which gather run-time information, and en-force appropriate enforcement actions to honor SLAsand access control policies. We also took a two-levelapproach of (1) filtering commands before executionto verify for access and admission control, and then(2) monitoring and managing the behavior of the sys-tem at run-time. In such a two-level approach, weintroduce tight feedback loops from the monitoring

agents to policy engine and session admission con-trol module, so that dynamic run-time informationgathered can be used for subsequent admission andaccess control decisions. This helps in making thesystem self-learning and self-managing. The two-levelapproach is also expected to provide performancebenefits by eliminating some of the non-compliantapplication requests at the shell itself. Since we de-sired to design the system to work for any application,we avoided QoS solutions like QuO [22] from theQuorum project [23], that specifically address needsof distributed object systems built over CORBA-likemiddleware infrastructures. Rather, in order to provideQoS management we provide (1) functionality in theInteractive Grid DRM to allocate the most appropri-

Page 17: Architecture and Environment for Enabling Interactive Grids · Grid Computing [1, 2] envisions a future where het-erogeneous resources could be shared by users across geographical

246

ate available resource(s) that would satisfy the QoSrequirement of the application, (2) monitoring andenforcement framework to adjust fine-grain resourceallocations of applications at run-time to enforce QoSguarantees.

Our solution for the runtime components inIGENV is believed to be efficient as it proposes anintegrated set of components to together solve theproblems of fine grain access control, QoS, and man-ageability, compared to providing separate solutions toeach of these problems.

In terms of runtime performance analysis, we areinterested in reducing the overhead caused during aninteractive session in an interactive grid computingsystem. The overhead caused by Dynamic AccountManager is only at the beginning of a session, andis expected to be minimal since it would primarilyinvolve accessing the gridmapfile, and executing asimple logic to map the user to the appropriate ac-count. The overheads caused by the Controlled Shell– GISH, and Session Admission Control module –SAC, is also expected to be insignificant comparedto the high human response time expected while theuser is interactively submitting commands. The over-head caused by monitoring and management agentswould incur in terms of filesystem read access, agent-agent communication, enforcement algorithms. Thesecan be addressed through appropriate implementationtechniques and as future work, we are designing themonitoring and management agents taking these fac-tors into consideration, so as to not limit the usefulnessof the system due to these overheads.

9. Implementation

We have a working prototype implementation for In-teractive Grids. Our implementation environment con-sists of Intel x86 machines running Red Hat Linux 7.3,as the remote nodes. The end-user can request for oneof these remote nodes for interactive use from anymachine supporting a web browser. We use GlobusToolkit 2.0 [3] as the Grid middleware platform, andVNC [13] as the remote display technology for remotegraphical sessions. GPDK [24] is used to provide aweb portal to the end-user for submitting job requests.We have extended the functionality of Globus, so thatit can also be used for submitting requests for startinga graphical interactive session to the remote nodes.Globus GSI certificates are used for authentication.We have an implementation of a Scheduler that de-termines the appropriate node to be allocated for a

users’ request. Figure 14 in the Appendix shows thejob submission process. Once a node is determinedfor the users’ request, the appropriate account poolfor the user is determined. A set of policy files is as-sociated with this account pool. A dynamic accountfrom this pool is then allocated for this user. We thenstart a VNC server and GMMA agents for this session.On a successful VNC authentication, the user is pre-sented with a controlled KDE Desktop environmentcontaining only the applications and menus the useris allowed to access. The KDE desktop environmentis pre-configured by the system administrator for eachpool of accounts.

The session starts with default startup applications,including a GISH shell. The GISH shell has beenimplemented as an extension to the popular GNUbash shell for Linux and Windows. The shell sourcecode was modified so as to include the access controlmodules. GISH currently checks for list of allowedexecutables from a file, before executing commands.The GMMA Agents started at the beginning of thesession, run with super–user privileges. They recordthe session and system information and store themin predetermined files. Currently, the GMMA agentscheck for the usage time for the session, number ofspawned processes. The system policy files containinformation about the session usage time, number ofallowed processes, etc., along with their maximumallowed values, and actions to be taken on violationof these policies. The current default action is to killall the processes and end the session, on violation ofthe session policies. Implementation for other mod-ules and agents for GISH and GMMA is a work inprogress. Figure 15 shows some interaction exam-ples within GISH during a session in progress, andFigure 16 shows the session screen on termination.

To support dynamic accounts, we modifiedglobus_gatekeeper and GSI-SSH daemon from GlobusToolkit 2.0 by linking them with a modified libraryfor reading the gridmapfile. Normally the gridmap-file contains entries mapping the distinguished name(DN) of the user to the local Unix account. Asa first step, we modified the gridmapfile to re-place the local Unix account name with a prede-fined string for the user’s VO, indicating that a dy-namic account from the VO’s pool should be usedfor this user. To get our modified library for read-ing the gridmapfile, we started with a patch to theglobus_gss_assist package, distributed by the Grid forUK Particle Physics, obtained from [25]. As a sec-ond step, we extend this library further to remove

Page 18: Architecture and Environment for Enabling Interactive Grids · Grid Computing [1, 2] envisions a future where het-erogeneous resources could be shared by users across geographical

247

the requirement of having an entry in the gridmap-file. When the user authenticates himself, the DNobtained from his certificate will be queried againstthe directory maintained by the VO for member-ship. If the user is a member, he will be assigneda dynamic account from the pool customized forthat VO.

10. Related Work

Majority of the work in the area of grid computinghas been for batch jobs and hence do not addressthe problems as outlined in the paper. The sameholds with recent projects on interactive applicationslike CrossGrid [9]. Solutions developed in Punchproject [10] do not address graphical and multimediasessions, gsissh [26] provides for encryption of inter-active sessions and can be used with our solution. QoSsolutions provided through Quorum project [23, 22]address the needs of distributed object systems andprovide QoS support in underlying middleware in-frastructure like CORBA. We propose QoS support forapplications through appropriate initial resource allo-cation and subsequent monitoring and adjustment ofresource allocations by the Grid resource managementframework. We do not assume any available supportfor application adaptation through multiple applica-tion behaviors. Several thin client system architectureshave been proposed including Citrix Metaframe [11]and SunRay systems [12]. However, to our knowl-edge, none of these systems have been considered inthe context of Grids. Our solution extends the scopeof thin client architectures to become part of Grids.Most of the other related work for access controland QoS are in the non-grid context do not com-pletely satisfy the requirements for Grid. For example,traditional OS access control mechanisms do not al-low to easily enforce fine grain access control forarbitrary end-users. Sudo [27] is not a replacementfor shell and does not provide a complete solutionfor all our needs in Grid context. Our solution pro-vides for an integrated and comprehensive solutionfor problems of access control, QoS, and accountmanagement in the context of graphical interactivesessions in Grids. Our solution is not specific to anyremote display technology and can be used with sys-tems like [13, 28]. Our architecture for hierarchicalagents leverages the architectural framework beingdefined in the Grid Monitoring Architecture [19]. Un-like the Grid Monitoring systems being developed

[29–31], our monitoring infrastructure addresses thegoal of providing QoS guarantees for graphical ap-plications. We plan to leverage the existing low levelhardware and software sensors to collect CPU, mem-ory, network bandwidth measurement data. Dynamicaccount management has been described in [32, 33].However, we differ from the prior work in using thedynamic account as a component of our customizablegrid environment, by associating each pool of dy-namic accounts with its set of policy files that IGENVenforces.

11. Conclusions

In this paper, we introduced Interactive Grids whichallow end-users access to remote execution nodesbelonging to a Grid, for graphical interactive use.Interactive Grids extend the application domain forGrid computing systems from traditional batch jobsto graphical, interactive sessions. We described someof the problems posed for the design of interactivegrids, namely that of fine grain access control, perfor-mance QoS, dynamic account management, schedul-ing, and user data management. In this paper, wehave presented key architectural concepts in designingInteractive Grids. They are hierarchical sessions, hier-archical admission control, hierarchical agents, classesof dynamic accounts, application profiling, user datamanagement, wide area grid scheduling for interactivesessions, persistent environment settings, and export-ing remote desktop. We also presented IGENV: aruntime environment for enabling interactive grids.Our approach has been in building a set of compo-nents addressing the issues of fine grain access control,QoS, and manageability, in an integrated but modularmanner. We believe this leads to more efficiency. Thecomponents are GISH – ‘Grid Interactive Shell’, Con-trolled Desktop, SAC – ‘Session Admission Control’system, GMMA – ‘Grid Monitoring and ManagementAgents’, Policy Engine. While designing the system,we realized the need to satisfy important requirementsof heterogeneous platforms, easy extensibility, modu-larity, and self-managing capability. We identified theareas of overheads that would occur with a deploymentof our solution, which would be considered for op-timization in the future work. We also described ourimplementation of the system on Linux x86 machines,using and extending Globus Toolkit 2.0 for the basegrid middleware infrastructure and VNC as the remotedisplay technology.

Page 19: Architecture and Environment for Enabling Interactive Grids · Grid Computing [1, 2] envisions a future where het-erogeneous resources could be shared by users across geographical

248

Appendix

Figure 14. Job submission screen for a graphical interactive session in an Interactive Grid computing system.

Figure 15. Screen during an interactive session in progress.

Page 20: Architecture and Environment for Enabling Interactive Grids · Grid Computing [1, 2] envisions a future where het-erogeneous resources could be shared by users across geographical

249

Figure 16. Screen at a session logout.

References

1. I. Foster and C. Kesselman (eds.), The Grid: Blueprint for aNew Computing Infrastructure, Morgan Kauffman Publishers,1999.

2. I. Foster, C. Kesselman and S. Tuecke, “The Anatomy of theGrid: Enabling Scalable Virtual Organizations”, InternationalJournal of SuperComputing Applications, Vol. 15, No. 3,2001.

3. I. Foster and C. Kesselman, “Globus: A Metacomputing In-frastructure Toolkit”, International Journal of SuperComput-ing Applications, Vol. 11, No. 2, pp. 115–128, Summer 1997.

4. Nasa ipg. http://www.ipg.nasa.gov.5. Teragrid project. http://www.teragrid.org.6. Eurogrid project. http://www.eurogrid.org.7. Entropia. http://www.entropia.com.8. S. Basu, V. Talwar, B. Agarwalla and R. Kumar, “Interac-

tive Grid Architecture for Application Service Providers”, inProceedings of the International Conference on Web Services(ICWS), June 2003.

9. Crossgrid. http://www.crossgrid.org.10. A.R. Butt, S. Adabala, N. Kapadia, R. Figueiredo and

J.A.B. Fortes, “Fine-grain Access Control for Securing SharedResources in Computational Grids”, in IPDPS, April 2002.

11. http://www.citrix.com.12. http://wwws.sun.com/sunray/sunrayl/.13. T. Richardson, Q. Stafford-Fraser, K.R. Wood and A. Hop-

per, “Virtual Network Computing”, IEEE Internet Computing,Vol. 2, No. 1, pp. 33–38, 1998.

14. Globus mds. http://www.globus.org/mds.

15. P. Mundur, R. Simon and A. Sood, “Integrated AdmissionControl in Hierarchical Video-on-demand Systems”, in Pro-ceedings of the IEEE International Conference on MultimediaComputing and Systems, June 1999, pp. 220–225.

16. Pitbull lx white papers. http://www.argus-systems.com.17. N. Edwards, J. Berger and T.H. Choo, “A Secure Linux

Platform”, in 5th Annual Linux Showcase and Conference,November 2001.

18. E. Bugnion, S. Devine, K. Govil and M. Rosenblum,“Disco: Running Commodity Operating Systems on ScalableMultiprocessors”, ACM Transactions on Computer Systems,Vol. 15, No. 4, pp. 412–447, 1997.

19. R. Tierney, R. Aydt, D. Gunter, W. Smith, M. Swany, V. Tay-lor and R. Wolski, “A Grid Monitoring Architecture”, GGFDocument Series available from http://www.gridforum.org.

20. Fipa. http://www.fipa.org/repository/index.html.21. Bigus and Bigus, Constructing Intelligent Agents Using Java,

2nd edn, John Wiley, 2001.22. J. Zinky, D. Bakken and R. Scantz, “Architectural Support for

Quality of Service for Corba Objects”, Theory and Practice ofObject Systems, Vol. 3, No. 1, 1997.

23. Quorum. http://www.dist-systems.bbn.com/projects/QuOIN/.24. J. Novotny, “The Grid Portal Development Toolkit”,

Concurrency-Practice and Experience, 2000.25. http://www.gridpp.ac.uk/gridmapdir.26. Gsi-ssh. http://www.ncsa.uiuc.edu/Divisions/ACES/GSI/

openssh/.27. Sudo. http://www.courtesan.com/sudo/.28. Sgi opengl vizserver. http://www.sgi.com/software/vizserver.

Page 21: Architecture and Environment for Enabling Interactive Grids · Grid Computing [1, 2] envisions a future where het-erogeneous resources could be shared by users across geographical

250

29. B. Tierney, B. Crowley, D. Gunter, M. Holding, J. Leeand M. Thompson, “A Monitoring Sensor Management Sys-tem for Grid environments”, in Proceedings of the NinthIEEE International Symposium on High Performance Distrib-uted Computing, IEEE Computer Society, Washington, DC,pp. 97–104, August 2000.

30. A. Waheed, W. Smith, J. George and J. Yan, “An Infrastructurefor Monitoring and Management in Computational Grids”,in Selected Papers from the 5th International Workshop onLanguages, Computers, and Run-time Systems for ScalableComputers, Springer-Verlag, London, pp. 235–245, 2000.

31. R. Wolski, N. Spring and J. Hayes, “The Network WeatherService: A Distributed Performance Forecasting Servicefor Metacomputing”, Future Generation Computer Systems,Vol. 15, Nos. 5–6, pp. 757–768, 1999.

32. T.J. Hacker and B.D. Athey, “A Methodology for AccountManagement in Grid Computing Environments”, in 2nd In-ternational Workshop on Grid Computing, November 2001,Springer-Verlag.

33. “An Accounting System for the Datagrid Project Version3.0”. http://server11.infn.it/workload-grid/docs/DataGrid-01-TED-0115-3_0.pdf.