Source: cct.lsu.edu/~gallen/Students/Zhang_2006.pdf

A Grid-enabled Science Portal for Collaborative

Coastal Modeling

Master of Science, Systems Science

Project Report

submitted to

Department of Computer Science,

Louisiana State University

Chongjie Zhang ∗

March 28, 2006

∗Department of Computer Science, Louisiana State University, Baton Rouge, LA 70803.


Abstract

The Southeastern United States is regularly impacted by severe ocean-driven events, such as hurricanes, that affect the lives of hundreds of thousands of citizens. It is urgent to improve our ability to predict critical coastal phenomena within the region. The SCOOP program is intended to create an open-access, distributed laboratory for scientific research and coastal operations by integrating the coastal data, computing resources, and research efforts of its partners. A Grid computing infrastructure is deployed to support researchers' work by allowing expertise, software, and data to be shared across multiple institutions. This report presents the SCOOP portal, a Grid-enabled science portal developed for the coastal research community to reduce the complexity of high performance computing for end users and to enable advanced application scenarios. Since Grid portal toolkits are increasingly being adopted as a means to speed up Grid portal development, we extensively investigate and compare two major Grid portal toolkits, OGCE and GridPortlets. We then describe the development of the SCOOP portal in detail, including use cases, design, and implementation. The SCOOP portal, built with the GridSphere framework, currently integrates customized Grid portlet components for data access, job submission, resource management, and notification.


Acknowledgements

I express my sincere gratitude to everyone connected in any way to the work presented in this report. First and foremost, I would like to thank Gabrielle Allen for her constant support, guidance, encouragement, and the invaluable time that she put in to help me make progress towards the completion of this project. I also would like to thank the rest of my program committee, Jianhua Chen and Bert R. Boyce, for their helpful comments and guidance. My work benefited greatly from a friendly and productive cooperation with other members of the SCOOP team and the Grid research group at the Center for Computation and Technology. My report makes use of two papers for which I am the primary author, but which contain constructive comments and suggestions from other authors. I am especially grateful to Jon MacLaren for his effective management of the SCOOP project and his work on archive services; to Ian Kelley for his comments and suggestions on Grid portal development; and to Chirag Dekate for his ideas and work on coastal modeling scenarios. I would like to sincerely thank Michael Russell, Jason Novotny, and Oliver Wehrens for their support on GridSphere and GridPortlets during the project. Finally, I would like to thank Xiaoxi Xu for keeping me happy and sane throughout the process, my father and my mother for teaching me early on to love learning, and my brother, whose enthusiasm and support helped me to reach this point.


This project was carried out as a component of the SURA Coastal Ocean Observing and Prediction (SCOOP) Program, an initiative of the Southeastern Universities Research Association (SURA). Funding support for SCOOP has been provided by the Office of Naval Research, Award #N00014-04-1-0721, and by the National Oceanic and Atmospheric Administration's NOAA Ocean Service, Award #NA04NOS4730254. Additional support was provided by the Center for Computation & Technology at Louisiana State University.


Table of Contents

1 Introduction
  1.1 Major Contributions
  1.2 Outline

2 Grid Portal Development
  2.1 Grid Computing
  2.2 Grid Portal Toolkits
    2.2.1 GridPortlets
    2.2.2 OGCE
    2.2.3 Comparison between GridPortlets and OGCE

3 Collaborative Coastal Modeling
  3.1 Goals
  3.2 Science Scenarios
  3.3 CCT Role
  3.4 Infrastructure

4 SCOOP Portal
  4.1 Requirements
  4.2 Choice of Grid Portal Toolkits
  4.3 Design
  4.4 Implementation
    4.4.1 Archive
    4.4.2 SCOOP Job Management

5 Conclusion


List of Figures

2.1 GridPortlets architecture
2.2 OGCE 2 architecture
3.1 SCOOP Grid infrastructure
4.1 SCOOP portal architecture based on GridSphere
4.2 Archive portlet using metadata for querying and retrieving SCOOP data files from storage
4.3 Copy retrieved data files to a remote machine
4.4 Manage files through the physical file management portlet
4.5 Job submission portlet for running ensembles of simulation models on different types of data
4.6 Job submission portlet for running ensembles of simulation models on different types of data
4.7 Show output of each sub-job run


List of Tables

2.1 Comparison between GridPortlets and OGCE 2


Chapter 1

Introduction

The Southeastern United States hosts roughly 100 million citizens and supports five naval bases, over a dozen major ports, essential commercial shipping and fishing enterprises, major oil and natural gas reserves, and thriving tourist enterprises. It is also a region that frequently suffers from severe ocean-driven events such as hurricanes that affect the lives of hundreds of thousands of citizens yearly. The recent devastation to coastal Louisiana by Hurricanes Katrina and Rita cost the lives of over 1000 people and severely damaged the economy and the environment. Hence there is an urgent need for accurate models of hurricanes and other severe weather events within the region. Such accurate models are needed to predict the path and effect of impending hurricanes for evacuation and preparation, to design better coastal defense systems, and to understand the physics and trends of hurricanes.

In an effort to improve model fidelity, the Southeastern Universities Research Association, or SURA, has initiated and funded the SURA Coastal Ocean Observing and Prediction Program (SCOOP) (1), partnering with ten institutions near the Gulf and Atlantic coasts, including Louisiana State University (LSU). SCOOP is representative of a growing class of geographically distributed collaborations whose members have realized the need for new infrastructures, such as Grid computing (2), to support their research work in today's world of complex applications, which require sharing expertise, software, and data across multiple institutions. Building the necessary infrastructures for collaborative projects such as SCOOP involves integrating multiple Grid middleware packages to provide a holistic approach to collaborative problem solving.

Portals have become a popular way to integrate applications and content, providing groups of users (or virtual organizations) with a single entry point to interact with their applications, data, colleagues, and services, all the while maintaining a consistent and uniform interface. As new applications and technologies, such as Grid computing, become increasingly complex and difficult to configure and use, Grid portals have come to be recognized as a useful tool to enable the work of scientists and engineers without burdening them with the low-level details of the underlying technologies.

1.1 Major Contributions

The Coastal Studies Institute and the Center for Computation & Technology (CCT) represent LSU in the SCOOP program. As a research assistant at CCT, I worked closely with other members of the SCOOP team at LSU. I designed and built a Grid-enabled science portal for the coastal research community to reduce the complexity of Grid computing for end users and to enable advanced application scenarios. The resulting SCOOP portal uses new collaborative tools to better access ocean data and computational resources. The development and deployment of the SCOOP portal also illustrate how portal technologies can complement Grid middleware to provide a community with an easy-to-use collaborative infrastructure that is tailored to its particular needs and has the ability to incrementally introduce and test new capabilities and services. My paper (3) based on this work won the best paper award at GCE05: Workshop on Grid Computing Portals, held at the Supercomputing 2005 conference.

Before designing the SCOOP portal, I performed an extensive investigation of the current state of integration of Grid technologies with portal frameworks, culminating in a review paper (4) which discusses two of the major Grid portal solutions, the Open Grid Computing Environments Collaboratory (OGCE) (5) and GridPortlets (6). That paper investigates and compares what each of these packages provides, discusses their advantages and disadvantages, and identifies missing features vital for Grid portal development. Our main purpose is to identify what current toolkits provide, reveal some of their limitations, and provide motivation for the evolution of Grid portal solutions. The paper can also serve as a reference for application groups choosing a Grid portal toolkit that fulfills the needs of their application.

1.2 Outline

The rest of this report is organized as follows. Chapter 2 introduces Grid computing technologies and Grid portal toolkits. Chapter 3 lists several requirements from coastal modeling researchers and describes the Grid infrastructure currently deployed to meet these needs. Chapter 4 elaborates use cases from the coastal community, discusses the choice of Grid portal toolkits and the design and architecture of the SCOOP portal, provides implementation details and information about the different services the portal provides, and also looks to future development. Finally, Chapter 5 presents the conclusions of this work.


Chapter 2

Grid Portal Development

The term “the Grid” was coined in the mid-1990s to denote a proposed distributed computing infrastructure to enable resource sharing within scientific collaborations. Much progress has since been made on the construction of such an infrastructure and its applications to both scientific and industrial computing problems. However, current Grids are generally difficult to use, so the development and deployment of Grid portals has been a popular way to simplify the usage of Grid services. This chapter introduces the basic concepts and technologies of Grid computing and the toolkits used to develop Grid portals.

2.1 Grid Computing

Grid computing is a form of distributed computing that addresses the need for coordinated resource sharing and problem solving across multiple dynamic and geographically dispersed organizations. Resources include not only data, computer cycles, applications, and networks, but also specialized scientific instruments, such as telescopes, ocean sensors, and earthquake shake tables. The sharing is highly controlled, with resource providers defining clearly and carefully what resources are shared, who can access those resources, and how those resources are used. Generally, Grids contain heterogeneous administrative domains with different operating systems and hardware architectures. One feature distinguishing Grids from traditional distributed systems is that they aim to use standard, open, general-purpose protocols and interfaces.

The technologies used to construct Grids have evolved over time. The Globus Alliance (7) released the Globus Toolkit 2 (GT2) in 1998, and the Globus Toolkit thereafter became “the de facto standard” for Grid computing. With the rapidly increasing uptake of Grid technologies, the first Global Grid Forum (GGF) (8) was held as a formal standards body in March 2001. Since then, GGF has produced numerous standards and specification documents, including the Open Grid Services Architecture (OGSA) (9). OGSA is based on web services and provides a well-defined suite of standard interfaces and behaviours that serve as a common framework for all Grid-enabled systems and applications. The Globus Toolkit 4 (GT4) (10) has implemented OGSA and other GGF-defined protocols to provide functionality for resource management, information services, security services, and data management. In addition, a number of tools function alongside the Globus Toolkit to make Grids a more robust platform, such as Gridbus (11), Grid portal toolkits, Condor (12; 13), and MPICH-G2 (14).

Grid technologies have been applied to a range of problems from both academia and industry. These applications cover compute-intensive, data-intensive, and equipment-centered scenarios. Locally, we can see examples of these applications in projects at the Center for Computation & Technology at Louisiana State University. The UCoMS (15) project is using Grid technologies to advance reservoir simulation and drilling analysis studies, coordinating the use of large-scale compute and data resources at LSU and the University of Louisiana at Lafayette. The GridChem (16) project is building a production system based on Grid infrastructure for chemists to launch and monitor computational chemistry calculations on CCG supercomputers from remote sites. To support Grid applications in astrophysics, coastal modeling, and petroleum engineering, the Enlightened (17) project focuses on developing dynamic, adaptive, coordinated, and optimized use of networks connecting geographically distributed high-end computing resources and specific scientific instrumentation.

2.2 Grid Portal Toolkits

The Web has proven to be an effective way of integrating and delivering content and applications. Web portals, like Yahoo or MSN, offer users a single point of access to various information, services, and applications. The Grid is still a complex distributed system that has, at its roots, many differing software packages that form the underlying Grid middleware and infrastructure. Grid portals are gaining momentum in the scientific research community as a way to expose Grid services and functionality in a friendly, easy-to-use, and collaborative interface.

When designing a Grid portal for a particular application domain, developers often have the choice of developing a new solution from the ground up or leveraging an existing toolkit. Although building a solution from scratch may prove successful in the short term, as technologies evolve and the demands on portal applications become more intense, more sophisticated solutions are needed. Grid portal toolkits are increasingly being adopted as a means to speed application development, since they can provide much of the high-level functionality that is needed to manage the multi-institutional and multi-virtual-organization issues that arise when designing and maintaining a production portal.

A defining moment in the evolution of portal development toolkits occurred in October 2003 with the introduction of JSR-168 (18), an industry-led effort to specify how components within Java-based portals, called portlets, should interact with their hosting environment, or container. By clearly defining the methods a portlet must implement to be standards compliant, JSR-168 is the first step towards allowing true interoperability of portlets across different portlet containers. A true JSR-168-compliant portlet that does not have any container-related dependencies will be able to run, without any modification, in any number of portal servers, including IBM WebSphere, BEA WebLogic, uPortal, and GridSphere (19). In such a case, the container becomes irrelevant in choosing a Grid portal toolkit, so long as it is portable.
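The portlet-container contract that JSR-168 standardizes can be illustrated with a small, self-contained model. The sketch below is deliberately simplified and hypothetical: the real specification lives in the javax.portlet package (GenericPortlet, RenderRequest, RenderResponse), and the Portlet, HelloPortlet, and ToyContainer types here are illustrative stand-ins for it, not the actual API.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for the interface a JSR-168 container calls on a portlet.
interface Portlet {
    void init(Map<String, String> config);   // container supplies configuration
    String render(String mode);              // container requests a markup fragment per mode
    void destroy();                          // container tears the portlet down
}

// A portlet written only against the interface above, with no container dependencies.
class HelloPortlet implements Portlet {
    private String greeting;

    public void init(Map<String, String> config) {
        greeting = config.getOrDefault("greeting", "Hello from the portal");
    }

    public String render(String mode) {
        // A real portlet writes markup fragments via RenderResponse; this returns a string.
        return "view".equals(mode) ? "<p>" + greeting + "</p>" : "<p>unsupported mode</p>";
    }

    public void destroy() { }
}

// A toy container: any Portlet runs here unchanged, which is the point of the standard.
class ToyContainer {
    static String host(Portlet p, Map<String, String> config) {
        p.init(config);
        String markup = p.render("view");
        p.destroy();
        return markup;
    }

    public static void main(String[] args) {
        Map<String, String> cfg = new HashMap<>();
        cfg.put("greeting", "SCOOP portal");
        System.out.println(host(new HelloPortlet(), cfg));
    }
}
```

Because the portlet depends only on the contract, a different container could host it unmodified; that is the interoperability property the paragraph above describes.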

There are a number of Grid portal toolkits, including GridPortlets, the GridPort Toolkit (20), and OGCE. Since the GridPort Toolkit, initially started at the Texas Advanced Computing Center (TACC), is now gradually being integrated into OGCE, I will mainly introduce GridPortlets and OGCE in the following subsections.

2.2.1 GridPortlets

GridPortlets is an open-source toolkit that was developed at the Albert Einstein Institute as part of the European GridLab (21) project's GridSphere work package. GridPortlets provides a high-level interface for portal developers to access a range of Grid services, including resource information, credential management, job submission, and file browsing.

Figure 2.1 shows the general architecture of GridPortlets, where a common and clearly defined API abstracts developers from the underlying services and middleware. For example, to launch an executable using GridPortlets, a simple execute task is constructed that does not require any details about the underlying implementation, which may be the Globus Resource Allocation Manager (GRAM) (22) via the Java Commodity Grid Toolkit (Java CoG) (23) or an advanced resource brokering system (24) such as the GridLab Resource Management System (GRMS) (25).

Figure 2.1: GridPortlets architecture.

In addition to providing a high-level API for Grid operations, GridPortlets contains many reusable user interface (UI) components that can easily be exploited to develop other portlet-based applications. These UI components allow developers to customize the generic JSP pages used by GridPortlets and incorporate them into their own applications. GridPortlets itself reuses many of these components for its various portlets, including the file browsing dialog and the resource information interfaces.

Portal services in GridPortlets are managed through a Service Registry that allows developers to “plug in” new implementations to the existing framework. Both services and portlets in GridPortlets can leverage other services to build more sophisticated applications and to share data. For example, the Job Submission Portlet accesses the MyProxy (26) service to determine whether a user has a valid credential and, if not, refers the user to the MyProxy portlet to retrieve a valid proxy.
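The registry-based lookup described above can be sketched in plain Java. The names below (ServiceRegistry, CredentialService, InMemoryCredentialService) are hypothetical stand-ins for illustration, not the actual GridPortlets classes; the flow mirrors the example in the text, where a portlet looks up a credential service and redirects the user if no valid credential is found.

```java
import java.util.HashMap;
import java.util.Map;

// Service contract a portlet programs against; implementations are pluggable.
interface CredentialService {
    boolean hasValidCredential(String user);
}

// Registry mapping service interfaces to implementations, so new backends can be
// "plugged in" without touching the portlets that consume them.
class ServiceRegistry {
    private final Map<Class<?>, Object> services = new HashMap<>();

    <T> void register(Class<T> type, T impl) { services.put(type, impl); }

    <T> T lookup(Class<T> type) { return type.cast(services.get(type)); }
}

// Stand-in implementation; a real one would query a MyProxy server, this one a cache.
class InMemoryCredentialService implements CredentialService {
    private final Map<String, Boolean> cache = new HashMap<>();
    void store(String user) { cache.put(user, true); }
    public boolean hasValidCredential(String user) { return cache.getOrDefault(user, false); }
}

class JobSubmissionFlow {
    public static void main(String[] args) {
        ServiceRegistry registry = new ServiceRegistry();
        InMemoryCredentialService creds = new InMemoryCredentialService();
        registry.register(CredentialService.class, creds);

        // The portlet sees only the interface; the implementation can be swapped.
        CredentialService service = registry.lookup(CredentialService.class);
        if (!service.hasValidCredential("alice")) {
            System.out.println("redirect alice to the credential retrieval portlet");
        }
        creds.store("alice");                     // user retrieves a proxy
        System.out.println("alice valid: " + service.hasValidCredential("alice"));
    }
}
```

The design choice being illustrated is the decoupling: the job submission code never names a concrete credential backend, so a MyProxy-backed implementation can replace the in-memory one at configuration time.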


GridPortlets is packaged with five generic JSR-168-compliant portlets that can be used without any additional development:

• Resource Registry Portlet - Administrators can manage the set of resources their Grid portal makes available to its users.

• Resource Browser Portlet - Users can browse the resources to which they have access, including the services, job queues, and accounts that are available on remote computers.

• Credential Retrieval Portlet - Users can retrieve credentials from a MyProxy server and gain single sign-on access to both the portal and Grid resources.

• Job Submission Portlet - Users can submit and manage jobs on their resources using Globus. Resource brokering systems are also supported through the same API, although currently only GRMS has been implemented.

• File Browser Portlet - Users can browse and manage physical files on Grid resources. Logical files registered with a logical file service can be accessed if such a service is available.

2.2.2 OGCE

The OGCE project was established in Fall 2003 with funding from the National Science Foundation Middleware Initiative. OGCE is an open-source collaborative project that leverages Grid portal research and development from the University of Chicago, Argonne National Laboratory, Indiana University, the University of Michigan, the National Center for Supercomputing Applications, San Diego State University, and the Texas Advanced Computing Center.


Figure 2.2: OGCE 2 Architecture

The basis of the OGCE architecture, as shown in Figure 2.2, is pluggable components in the form of services and portlets. OGCE uses the Java CoG as its main service API for accessing Grid resources. GridPort's Grid Portal Information Repository (GPIR) is used to retrieve local resource information, such as machine loads and queue status.

OGCE comes packaged with services to support Grid operations, such as workflow and queue management with the Open GCE Runtime Engine (OGRE) (27) and the Berkeley-Illinois-Maryland Association (BIMA) QueueManager. OGCE provides container-independent mechanisms that allow portlets to share data. For example, the MyProxy Manager allows other portlets accessing Grid resources to use credentials retrieved by the MyProxy portlet.

The OGCE team has delivered OGCE release 2 (OGCE 2), whose portlets are compliant with JSR-168 and can currently be deployed into either GridSphere or uPortal. The following are the core Grid portlets in OGCE 2:

• Proxy Management Portlet - Enables users to retrieve credential proxies from a MyProxy server specified by the user and allows them to remove retrieved credential proxies.

• Job Submission Portlet - Provides a simple user interface for submitting and tracking jobs on a Globus-based Grid.

• File Transfer Portlet - Uses GridFTP (28) for managing files among Grid machines, supporting both uploading and downloading files.

• GPIR Browser Portlet - Allows users to browse Grid and portal-related resource information that has been aggregated and cached by the GPIR web service. Resource data is categorized into compute resources, storage resources, and visualization resources.

• OGRE Events Viewer Portlet - Allows users to monitor OGRE events on a specified server. OGRE enables users to write workflow scripts that execute a flow of controlled tasks.

• BIMA Queue Viewer Portlet - Allows users to monitor queue status and the status of jobs in queues on a specified server using BIMA. The BIMA QueueManager is a Java application that supports the execution of multi-stage jobs, where individual jobs may be dispatched to a Globus-based Grid through the Java CoG.

• Viscosity Portlet - Allows users to access the central data repository developed by the Network for Earthquake Engineering Simulation (NEES) to store or retrieve files and associated metadata.

2.2.3 Comparison between GridPortlets and OGCE

Grid middleware and portal technologies are constantly evolving as new technologies are developed and more stakeholders become involved. It is only natural in this evolution that the packages supporting the Grid will also change rapidly. I therefore focus on the functionality provided by GridPortlets v1.0 and OGCE 2 RC4. This section gives a relatively brief comparison between them; further details can be found in our review paper.

Table 2.1: Comparison between GridPortlets and OGCE 2

Feature | GridPortlets | OGCE 2
Service API | A uniform and consistent high-level portal service API | A heterogeneous set of service APIs
Grid Middleware | GT2, GT3, GT4, and GridLab middleware | GT2, GT3, and GT4
Persistence Layer | Uses Hibernate to persist information about resources and jobs | Not known
Presentation Layer | JSP and a UI component model | JSP and Velocity
Core Grid Portlets | Credential Management, Resource Information Provider, Job Management (supports GRAM and GRMS), File Management | Credential Management, Resource Information Provider, Job Management (supports GRAM and Condor), File Management
Portability | Compliant with JSR 168, but only ready for GridSphere | Compliant with JSR 168 and ready for GridSphere and uPortal

Table 2.1 summarizes the main features of the two Grid portal toolkits. Both GridPortlets and OGCE provide information services, job and file management services, and authentication services for portal developers. While overlapping in the basic functionality of their Grid services, GridPortlets and OGCE differ greatly in the service APIs they provide for developers. One key feature of GridPortlets is that it defines a single, consistent, high-level API between portlets and the underlying Grid services. This uniform API abstracts developers from the underlying Grid technologies and infrastructure while supporting multiple implementations. The API can be implemented with Grid programming frameworks, such as Java CoG and the Java Grid Application Toolkit (GAT) (29), or with Grid middleware toolkits, such as the Globus Toolkit and the GridLab middleware services. Developers choose the implementations of particular services during portal configuration, or in some cases users are given the choice of which service to use. The current implementation of the GridPortlets API supports GT2, GT3, and GT4 through Java CoG 1.1, as well as the Grid middleware services developed in the GridLab project. It uses the GridSphere persistence layer, an object/relational API implemented with Hibernate, to store job and resource information. By contrast, OGCE provides a heterogeneous set of service APIs, aggregated mainly from other projects, such as the Java CoG and GridPort. For job submission and physical file management, the Java CoG provides a uniform API that abstracts developers from the underlying implementations, including Condor, SSH, and different versions of the Globus Toolkit.
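The uniform-API idea described above can be sketched in plain Java. This is an illustrative pattern only, not the actual Java CoG or GridPortlets API: portal code targets a single JobService interface, while the backend implementation (GRAM-like or Condor-like stand-ins here) is selected at configuration time.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// The one interface portal code is written against, regardless of backend.
interface JobService {
    String submit(String executable);   // returns a backend-specific job id
}

// Stand-in backends; real ones would call GRAM or Condor submission machinery.
class GramJobService implements JobService {
    public String submit(String executable) { return "gram-job:" + executable; }
}

class CondorJobService implements JobService {
    public String submit(String executable) { return "condor-job:" + executable; }
}

class PortalConfig {
    // Backend selection happens here, at configuration time, not in portal code.
    static JobService forBackend(String name) {
        Map<String, JobService> backends = new LinkedHashMap<>();
        backends.put("gram", new GramJobService());
        backends.put("condor", new CondorJobService());
        JobService s = backends.get(name);
        if (s == null) throw new IllegalArgumentException("unknown backend: " + name);
        return s;
    }

    public static void main(String[] args) {
        // The submitting code is identical for either backend; only the name differs.
        System.out.println(PortalConfig.forBackend("gram").submit("stormsurge-model"));
        System.out.println(PortalConfig.forBackend("condor").submit("stormsurge-model"));
    }
}
```

Swapping "gram" for "condor" changes which middleware handles the job without touching the code that submits it, which is the abstraction both toolkits aim to give portal developers.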

GridPortlets and OGCE both use JavaServer Pages (JSP) for generating web presentation. Additionally, the GridPortlets UI component model provides a number of reusable JSP components, including file browsing and job submission components. These reusable UI components help developers build interactive, friendly web interfaces. Although OGCE does not supply reusable JSP UI components, it provides tools to support Velocity-based portlets, which may help developers port Jetspeed-based portlets into JSR-168 containers.

GridPortlets and OGCE both offer Grid-related portlets that can be integrated into Grid portals without additional programming, including credential management, resource information browsing, job management, and file management. Their functionalities are similar; the main differences lie in the user interfaces and some minor features.

Since GridPorlets’ services are based on the service framework of GridSphere, it is

not completely independent and, as a result, portlets based on GridPortlets, even though


technically compliant with JSR-168, cannot easily be deployed into portlet containers

other than GridSphere. By contrast, OGCE provides container-independent services and

its portlets can be deployed in either GridSphere or uPortal. In addition, OGCE includes a

suite of HttpUnit tests for its Grid-related portlets, which can be used to verify the OGCE

installation in a portlet container.


Chapter 3

Collaborative Coastal Modeling

3.1 Goals

SCOOP is working towards an integrated and coordinated coastal ocean observation and

prediction system, leveraging emerging regional efforts and cutting edge information tech-

nologies. It aims to integrate distributed real-time ocean observing stations and regional

coastal modeling entities, to run ensembles of numerical hydrodynamic models for the pre-

diction, verification and visualization of critical storm surge and wave behaviour during

severe storms and hurricanes. SCOOP is addressing several needs in reaching its goals, including:

• Ubiquitous and easy access to data of all types, including sensor, satellite, model and

visualization data.

• Automated deployment of models across heterogeneous resources, including com-

plex workflows and ensembles.

• Creation of data standards and interoperability of model codes.


• Capabilities for coupled and multi-scale models.

• Operational procedures which can provide GIS visualization and notification to emer-

gency management personnel.

Building an infrastructure to meet these needs and supply timely information about se-

vere events requires attention to reliability, fault tolerance, scheduling, as well as end user

presentation and interaction.

3.2 Science Scenarios

The current SCOOP membership is a combination of research institutions, university programs

and national agencies, including Louisiana State University, University of Alabama at

Huntsville, Texas A&M University, GoMOOS, University of Florida, University of North

Carolina at Chapel Hill, the National Oceanic and Atmospheric Administration (NOAA),

University of Maryland, University of Miami, and Virginia Institute of Marine Science.

This broad collaboration engages researchers with diverse skill sets and varying degrees

of technical expertise. One motivating collaborative scenario (3) for SCOOP is the follow-

ing:

An evolving hurricane in the tracking region begins the complex process of predict-

ing and validating a hurricane path and its corresponding hydrodynamic impacts. Feeds

from the National Hurricane Center trigger analytical models at the University of Florida

which generate appropriate wind fields. Additionally, atmospheric models such as NOGAPS, COAMPS, MM5, and NCEP-NAM, among others, are pushed by diverse modeling entities in a non-deterministic manner. A brief description of these models can be found at

the National Hurricane Center page (30). These winds are pushed into a SCOOP Storage


archive using automated mechanisms as they become available.

Researchers from across the Nation, alerted to the impending event by notifications

sent via Email, SMS and IM, authenticate to the SCOOP portal from wherever they are

located. From an Ensemble Model interface, they query for all possible atmospheric mod-

els available for the corresponding temporal specifications, in addition to the analytical

winds. Based on the query results a matrix of possible hydrodynamic models vs. the avail-

able atmospheric datasets is generated. For datasets from atmospheric models that have

yet to arrive in the archive watchdogs are deployed tagged with the associated coupled

ocean modeling workflow.

The portal interface provides researchers with the ability to configure models as needed,

and prioritize the order in which they should be run. The matrix of models and currently

available datasets is then used to automatically stage data and schedule the ensemble of

hydrodynamic models on the SCOOP Grid, comprised of machines across the region. As

results become available, notifications are dispatched to collaborators and output data are

pushed back into the archive.

GIS-driven visualization services (31) allow the modelers and end users to analyze

results using interfaces that provide an overlay of results obtained from all the models.

Interfaces pinpoint the location of available sensors, allowing modelers to compare the model ensemble with real-time data from sensor stations. Such correlated comparison of results allows for model data validation and verification, producing improved forecasts of storm surge and inundation displayed on context-relevant maps, e.g. street overlays or artificial levee diversion structures.


3.3 CCT Role

The SCOOP group at CCT is contributing to the deployment of a cyber infrastructure for

SCOOP. Our role includes:

1. Surveying the coastal research community to identify hardware and software requirements.

2. Providing a data archive (32) for storing observational data from satellite and buoy

sources and results from coastal and atmospheric model simulations.

3. Designing a Grid testbed, the SCOOP Grid, for model deployment and data archiving.

4. Building a Grid-enabled science portal to provide the coastal research community

with easy access to data, models, and resources in the SCOOP Grid.

5. Providing a command-line toolkit (33) for accessing the data archive.

6. Showing prototype examples of ensemble scenarios where multiple wind input datasets can be automatically located and fed into multiple wave or surge models, and the resulting data staged to the data archive.

The CCT SCOOP team includes Gabrielle Allen (Lead), Jon MacLaren (Manager and

Data Archive), Andrei Hutanu (Visualization), Ian Kelley (GridSphere), Chirag Dekate

(Models and SCOOP Grid), Chongjie Zhang (SCOOP Portal), Dayong Huang (Data clients),

Zhou Lei (Grid Application Toolkit), Archit Kulshrestha (Condor), Sasanka Madiraju (SCOOP

Grid) and Edward Jerome Tate (Visualization).


3.4 Infrastructure

Currently, coastal researchers typically access data from multiple sources (e.g. wind fields

from NCEP or USGODAE, hurricane tracks from NHC, observation data from coastal ob-

servatories like WAVCIS or SEACOOS) using HTTP, FTP or more recently the LDM (34)

protocols. Operational workflows are deployed using “cron” type scripts, which are hard

to adapt to address unreliable file arrival or fault tolerance. Usually the models involved

in SCOOP (e.g. ADCIRC, WWIII, SWAN, WAM, CH3D, ELCIRC) are run only at local

sites, and may require many different configuration and input files. The various institutions

deploy their own web servers, delivering results at different times in varying data formats

and data descriptions.

Activities in SCOOP and other projects are addressing the complexity of dealing with

different data sources and formats. A prime need is to develop data standards to facil-

itate sharing and collaboration. In lieu of a formal standard, SCOOP has developed a

file-naming convention throughout the project that encodes enough information to serve

as primary metadata. LSU has established an advanced Grid-enabled data storage archive service, providing essential features such as ingestion and retrieval of data via multiple protocols (including GridFTP and HTTP), a logical file catalog, and general event-based notification. Compute resources for researchers are available in the form of the SCOOP Grid

which is comprised of resources distributed across multiple institutions (Louisiana State

University, University of North Carolina, MCNC, University of Florida, Virginia Institute

of Marine Sciences). Basic Grid middleware such as Globus Toolkit 3 (GT3) is deployed

across the resources. Condor is deployed across the LSU-SCOOP Grid for prototyping

scheduling and job management scenarios. Figure 3.1 shows the current Grid infrastruc-

ture we deployed for SCOOP. An ongoing goal is to be able to coordinate the deployment


Figure 3.1: SCOOP Grid Infrastructure

and scheduling of operational and research simulations across the SCOOP Grid.
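The file-naming convention mentioned above lets a structured file name stand in for primary metadata. The sketch below illustrates only the general idea; the field names and layout here are hypothetical, not the actual SCOOP convention.

```python
import re

# Hypothetical layout: <model>_<provider>_<region>_<YYYYMMDDHH>.<ext>
# The real SCOOP convention differs; this only shows how primary
# metadata can be recovered from a convention-structured file name.
FILENAME_PATTERN = re.compile(
    r"^(?P<model>[A-Za-z0-9]+)_"
    r"(?P<provider>[A-Za-z0-9]+)_"
    r"(?P<region>[A-Za-z0-9]+)_"
    r"(?P<timestamp>\d{10})\."
    r"(?P<ext>\w+)$"
)

def parse_scoop_name(filename: str) -> dict:
    """Extract primary metadata fields from a conforming file name."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"not a convention-conforming name: {filename}")
    return match.groupdict()
```

A name such as `ADCIRC_LSU_GulfOfMexico_2005082912.nc` (an invented example) would then yield the model, providing site, region, and timestamp without any catalog lookup.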


Chapter 4

SCOOP Portal

The SCOOP Portal provides the SCOOP user community with a centralized gateway mech-

anism to submit and manage coastal model simulations and keep track of a large amount

of data files. To better understand portal requirements from the community, we worked

closely with coastal researchers and developed use-case scenarios which have driven the

design of the portal.

4.1 Requirements

After consulting with the SCOOP coastal researchers and modelers, we identified a number of requirements for the SCOOP portal development. One notable requirement was that although the scientists wanted to restrict access to data to those in the collaboration (for one reason, to address potential problems with casual interpretation of severe

storm data), no finer-grained access control was required. Additionally, all machines in

the SCOOP Grid are shared resources among scientists, simplifying authorization needs

considerably. In implementing this first version of the SCOOP portal, we concentrated on


the following two user scenarios.

Archive Access

The SCOOP project has set up an archive service to store the source atmospheric data, the

wave/surge data generated by model simulations, and also other observed data that might be

used to verify the model accuracy. The SCOOP Portal is required to provide functionality

to facilitate modelers and researchers in querying and retrieving datasets. The steps for

accessing data files are as follows:

1. A user selects a class of data and specifies corresponding metadata to query a meta-

data catalog service to discover datafiles of interest, e.g. “Output datafiles from AD-

CIRC model simulations performed at Louisiana State University for the Gulf of

Mexico region, during August 2005”. A list of Logical File Names (LFNs) is returned from the query to the user.

2. The user can select one or more LFNs of interest, and then the portal contacts the

archive’s logical file service to return the physical file locations to the user.

3. The user can choose either to download the data file to the local machine or to

perform a third-party transfer via a range of protocols.
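The steps above can be sketched end to end. The dictionary-backed catalog and logical file service below are stand-ins for the portal's metadata query and logical file lookup, used only to illustrate the LFN-to-physical-location resolution flow; they are not the actual SCOOP archive interfaces.

```python
def query_lfns(metadata_catalog, **metadata):
    """Step 1: return logical file names (LFNs) whose metadata match
    every attribute the user specified."""
    return sorted(
        lfn for lfn, attrs in metadata_catalog.items()
        if all(attrs.get(key) == value for key, value in metadata.items())
    )

def resolve_pfns(logical_file_service, lfns):
    """Step 2: map each selected LFN to its physical file locations,
    which the user can then download or transfer (step 3)."""
    return {lfn: logical_file_service.get(lfn, []) for lfn in lfns}
```

In the deployed portal the second lookup is answered by the archive's logical file service rather than an in-memory mapping, but the query-then-resolve shape is the same.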

Model Simulations

One of the scientific objectives of SCOOP is to run an ensemble of hydrodynamic models

driven by input conditions from a range of different atmospheric models. The steps for

running hydrodynamic model simulations are as follows:

1. A user is required to retrieve a proxy credential to authenticate to Grid resources.


2. The user specifies metadata describing atmospheric data and a hydrodynamic model.

The SCOOP Portal contacts a metadata catalog service with specified metadata and

the archive’s logical file service to get atmospheric data files of interest. Based on

the data files, the SCOOP Portal constructs a list of possible simulations, depending

on the available input files in the archive. Each of these simulations will then be

submitted to a job scheduler.

3. The user can then track the progress of each simulation via the SCOOP Portal or

use the portal’s notification services, which include AIM and email.

4. Upon successful completion of each simulation, the results are pushed into the archive

for dissemination and further processing.
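The list of possible simulations constructed in step 2 is essentially a cross product of the selected hydrodynamic models with the atmospheric datasets found in the archive. A minimal sketch (model and dataset names here are only examples, and the task dictionary is an illustrative stand-in for the portal's job records):

```python
from itertools import product

def build_ensemble(hydro_models, wind_datasets):
    """Construct one simulation task per (hydrodynamic model,
    atmospheric dataset) pair; each task would then be handed to
    the job scheduler."""
    return [
        {"model": model, "input": dataset, "status": "pending"}
        for model, dataset in product(hydro_models, wind_datasets)
    ]
```

Two hydrodynamic models driven by three wind datasets thus yield an ensemble of six independent simulation tasks.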

4.2 Choice of Grid Portal Toolkits

The requirements from the SCOOP community are still evolving. The use of a mature por-

tal framework and a well-designed Grid portal toolkit was necessary to be able to focus on

the business logic of the SCOOP use-case scenarios and to allow for extensibility to future

requirements. Based on our comparative analysis of GridPortlets and OGCE, we chose

GridSphere and GridPortlets as our main toolkits to speed up the process of developing

and deploying an application portal for SCOOP modelers and researchers.

GridSphere is a free, open-source portal framework developed by the European Grid-

Lab project, which focused on developing Grid application tools and middleware. Grid-

Sphere provides a well documented set of functionality, including portlet management,

user management, layout management, and role-based access control. Its portlet-based

architecture offers flexibility and extensibility for portal development and facilitates soft-


ware component sharing and code reuse. GridSphere is compliant with the JSR-168 port-

let specification which allows portlets to be developed independently of a specific portal

framework. GridSphere’s portlet service model provides developers with a way to encap-

sulate reusable business logic into services that may be shared between many portlets.

The advantages of using GridSphere come not only from its core functionalities, but

also from its accompanying Grid portal toolkit, GridPortlets. GridPortlets abstracts the de-

tails of underlying Grid technologies and offers a consistent and uniform high-level service

API, enabling developers to easily create custom Grid portal web-applications. The Grid-

Portlets services provide functionalities for managing proxy credentials, resources, jobs,

and remote files, and support persisting information about credentials, resources, and jobs

submitted by users. The GridPortlets service API currently supports both GT2 and GT3.

In addition, GridPortlets delivers five well-designed, easy-to-use portlets, which include:

resource registry, resource browser, credential management, job submission, and file man-

agement.

4.3 Design

The architecture of the SCOOP Portal is based on the GridSphere framework. Figure 4.1

shows the SCOOP Portal software components and their interactions and relationships.

From the simplified diagram, it can be seen that SCOOP portlets use SCOOP services for

application-specific functionality and business logic. The SCOOP Portal services them-

selves use or extend services built into the GridSphere framework, the GridPortlets pack-

age, as well as some third-party packages. For example, the SCOOP portal services mainly

use the GridPortlets service API to interact with Grid resources, such as submitting jobs

and moving remote files. Most portlets are independent of one another; however, they


[Figure 4.1 diagram: end users interact with portlets in the SCOOP Portal (based on GridSphere): archive, model, credential, resource browser, job management, and request portlets. These call the SCOOP Portal services (SCOOP job management, download tracking, request tracking, credential management) and GridPortlets services (file management, resource browser), which in turn use the Java CoG API and the GridLab service APIs (iGrid, GRMS, File Service) over GT2/GT3 Grid middleware (GSI, GRAM, GridFTP, MDS, RLS) to reach computing resources and the archival storage.]

Figure 4.1: SCOOP Portal architecture based on GridSphere.

can communicate with each other via the service layer. For example, the credential portlet

calls the credential service to retrieve proxy credentials from a MyProxy server, and later a

job submission portlet can get the retrieved credentials to authenticate with Grid resources.

This portlet-based, service-oriented architecture greatly speeds up portal development and

exhibits high extensibility.
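The inter-portlet communication described above can be sketched as a shared service holding per-user state: one portlet stores a retrieved proxy, and another later fetches it to authenticate. This is an illustrative stand-in only, not the GridPortlets credential management API.

```python
class CredentialService:
    """Shared portal service: the credential portlet stores a user's
    proxy credential, and the job submission portlet later retrieves
    it for Grid authentication. Names here are hypothetical."""

    def __init__(self):
        self._proxies = {}  # user name -> active proxy credential

    def store(self, user, proxy):
        self._proxies[user] = proxy

    def retrieve(self, user):
        if user not in self._proxies:
            raise LookupError(f"no active proxy for {user}")
        return self._proxies[user]
```

Routing shared state through a service rather than portlet-to-portlet calls is what keeps the portlets independently deployable.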

4.4 Implementation

The SCOOP Portal is implemented with GridSphere version 2.0.3 and GridPortlets version

1.0, which together provide a coherent core set of relevant functionality. When evaluating

the SCOOP community requirements, we found several common functionalities had al-

ready been implemented in other Grid portal projects. To avoid reinventing these, the fully


deployed SCOOP Portal contains not only portlets developed specifically for SCOOP,

but also shared portlets from GridSphere, GridPortlets, and GridLab Testbed (35). The

following list illustrates the functionalities of portlets and services that are specific to the

SCOOP project:

• Archive – enables users to retrieve, either by HTTP to the local machine or GridFTP

to a remote machine, SCOOP data files from archive storage. The interface provides queries using metadata, and custom interfaces to specific data formats such as

OpenDAP.

• SCOOP Job Submission – provides custom interfaces and options for users to launch

coastal models. The interface matches models to available data files in the archive.

• Simulation Tracking and Notification – allows users to track the progress of active

simulations via the SCOOP Portal and to receive notification via email or AIM.

• Request Tracking – coordinates team work by allowing users to manage and track

the status of tasks and defects in various SCOOP sub-projects.

• Download Tracking – tracks downloads of software tools distributed through the

portal.

The following list illustrates the Grid-related functionalities of portlets and services that

are deployed to the SCOOP Portal but developed by other Grid projects. All other portlets

are from the GridPortlets distribution except the Grid Resource Status Monitoring portlet

which was developed by the GridLab project.

• Credential Management – enables users to retrieve, renew, and delete proxy creden-

tials from a MyProxy server.


• Resource Registry – enables portal administrators to register or unregister Grid resources, such as computing resources or services.

• Resource Browser – enables users to view available Grid resources, including information about hardware configuration, services, job queues, and accounts on remote

machines.

• Grid Resource Status Monitoring – enables users to view whether particular services and software components are installed and available on each machine, and the possible reasons why services are not available.

• Physical File Management – enables users to browse and manage files on remote

machines.

4.4.1 Archive

The current archive storage contains three classes of data files: source atmospheric data,

simulated wave/surge data, and other observed data to verify the model accuracy. The three

classes of data are associated with class-specific metadata attributes and query interfaces.

Figure 4.2 shows the archive portlet for querying for simulated wave/surge data files.

The archive portlet gathers metadata information from user requests and retrieves a list

of logical file names with matching metadata from the archive portal service. Currently, SCOOP does not have an appropriate metadata service; the logical file name used in queries is generated from metadata information contained in the SCOOP file-naming convention. The logical file name may contain wildcard characters to accommodate unknown

metadata or to select a range of files. The mappings of physical file names and logical file

names can be obtained by querying the archive’s logical file services, currently provided


Figure 4.2: Archive portlet using metadata for querying and retrieving SCOOP data files from storage.


Figure 4.3: Copy retrieved data files to a remote machine.


Figure 4.4: Managing files through the physical file management portlet.

by an instance of Globus Replica Location Service (RLS) version 2.2 (36). To provide

performance scalability, the query results are shown using dynamic paging techniques,

because the RLS API does not support merely returning the size of matched results. Users

can retrieve SCOOP data files via HTTPS to their local machine, or perform a third-party

transfer via GridFTP. The transfer service is built on the GridPortlet file services and re-

source services for directory selection and file copy. Figure 4.3 shows the user interface to

allow users to specify a location on a remote machine when copying retrieved data files.

Users can manage copied files through the physical file management portlet, as shown in Figure 4.4.
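Because the RLS API offers no way to ask only for the number of matches, the dynamic paging mentioned above must work without a known total. A common fetch-one-extra sketch of that idea, independent of the actual RLS client calls:

```python
def page_of(results, page, page_size=25):
    """Return one page of query results plus a flag telling whether a
    further page exists, without knowing the total result count.
    Fetching one row beyond the page is enough to answer 'more?'."""
    start = page * page_size
    window = results[start:start + page_size + 1]  # one extra row
    return window[:page_size], len(window) > page_size
```

The extra row is discarded from the displayed page; its mere presence tells the interface to render a "next page" control.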

Currently, the logical file entries in the RLS point to physical file entries available locally. Efforts are underway to incorporate distributed storage resources, including SCOOP

data store instances at TAMU. The LSU SCOOP archive is also being expanded to leverage


SRB-based terabyte-scale storage at SDSC. Evolving versions of the SCOOP archive access

portlet will address issues of accessing federated data stores.

4.4.2 SCOOP Job Management

From the user scenarios, SCOOP models can run on different input data types (e.g. using

wind data generated either by analytic means or by other models). As shown in Figure 4.5,

the SCOOP job submission portlet allows users to select multiple different types of wind

data for a particular model. Using specified metadata, the job submission portlet queries

the logical file service and generates a file list for each selected data type, and constructs a

list of such tasks. The SCOOP job service submits each task to Condor via Globus GRAM

to run the model on each file list. Hence one SCOOP simulation job may contain several

sub-job runs. The SCOOP job submission is mainly built on the GridPortlets service API.

To store custom job information and the parent-child job hierarchy, we provide a persistence

layer for SCOOP job management using Hibernate. Later, we can track the job information

via this portlet, as shown in Figure 4.6.
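One SCOOP simulation job containing several sub-job runs gives a simple parent-child data model. The sketch below only mirrors the shape of what the Hibernate-backed persistence layer stores; the class, field, and state names are hypothetical.

```python
class SubJob:
    """One model run on a single input file list."""

    def __init__(self, name):
        self.name = name
        self.state = "PENDING"

class SimulationJob:
    """Parent record aggregating its sub-job runs, mirroring the
    parent-child hierarchy kept in the portal's persistence layer."""

    def __init__(self, name):
        self.name = name
        self.sub_jobs = []

    def add_sub_job(self, name):
        sub = SubJob(name)
        self.sub_jobs.append(sub)
        return sub

    def status(self):
        """Aggregate status: DONE only when every sub-job is DONE."""
        states = {sub.state for sub in self.sub_jobs}
        if not states or states == {"PENDING"}:
            return "PENDING"
        if states == {"DONE"}:
            return "DONE"
        return "RUNNING"
```

Tracking the parent job then reduces to aggregating the persisted states of its children, which is what the job tracking portlet displays.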

An active proxy credential is required for job submission. Users need to delegate their

Grid credentials to the MyProxy server. The SCOOP Portal allows users to retrieve cre-

dentials from a MyProxy server via GridPortlets’ credential management portlet. The job

submission service will automatically use retrieved credentials for authenticating with Grid

resources.

The SCOOP job portlet allows users to view the status and output of each sub-job

run, as shown in Figure 4.7. Notification of the job status, currently by AIM or email,

is implemented by another service: Simulation Tracking and Notification. Each sub-job

registers itself with and continuously sends updated information to the notification service


Figure 4.5: Job submission portlet for running ensembles of simulation models on different types of data.


Figure 4.6: Tracking the status and information of submitted simulation jobs.


Figure 4.7: Show output of each sub-job run.


via XML-RPC. The notification service collects and sends out this information via email

or AIM, depending on the user preference.
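The registration-and-update flow above can be sketched as follows. In the portal the update calls arrive over XML-RPC, which is omitted here, and the method, channel, and field names are illustrative only, not the actual notification service interface.

```python
class NotificationService:
    """Collects sub-job status updates (delivered over XML-RPC in the
    portal) and formats outgoing notifications per user preference."""

    def __init__(self):
        self.updates = {}      # job id -> latest reported status
        self.preferences = {}  # user -> preferred channel

    def register(self, job_id):
        """A sub-job registers itself before sending updates."""
        self.updates[job_id] = "REGISTERED"

    def update(self, job_id, status):
        """Record the latest status sent by a registered sub-job."""
        if job_id not in self.updates:
            raise KeyError(f"unregistered job: {job_id}")
        self.updates[job_id] = status

    def notify(self, user, job_id):
        """Format a message for the user's preferred channel."""
        channel = self.preferences.get(user, "email")
        return f"[{channel}] {user}: job {job_id} is {self.updates[job_id]}"
```

Keeping delivery (email or AIM) behind a single formatting point is what lets the user preference switch channels without touching the sub-jobs.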

Currently, the functionality of the SCOOP job management interfaces is limited by the

lack of interoperability of the underlying models and data sources. As ongoing work is

completed, more complex workflows and ensemble runs will be implemented.


Chapter 5

Conclusion

This report has investigated state-of-the-art Grid portal toolkits represented by GridPortlets

and OGCE, and described the background, design and implementation of a Grid-enabled

science portal for the SCOOP coastal ocean observing and modeling community. The

SCOOP portal, built with the GridSphere Framework, currently integrates customized Grid

portlet components for data access, job submission, resource management and notification,

and provides researchers and modelers with easy access to integrated data and compute

resources.

While the portal interfaces have thus far been well received by the SCOOP community,

the challenge now is to make the portal an essential part of the scientists’ usual working

environment. This requires adding new scenario- and use-case-driven features to the portal.

These enhancements will include:

• advanced ensemble simulation interfaces to allow modelers to run a spectrum of

hydrodynamic models, each with different input conditions

• integration of GIS technologies into the portal to provide geo-referenced interactive


interfaces and allow end-users to compare ensembles using time series graphs

• metadata catalog services to provide more powerful data retrieval

• a new generic service able to inform coastal modelers of impending hurricanes, trig-

ger coastal modeling workflow, and provide status updates from triggered workflows.

In addition to functionality, several other issues also need attention: first, most scientists find it hard to deal with Grid credentials. For instance, our current implementation

of the SCOOP portal assumes that a valid credential is held on a MyProxy server, which

requires client software to be installed on local machines and of course procedures and poli-

cies for issuing Grid certificates. Projects such as PURSE (37) and GAMA (38; 39) are

developing mechanisms for authenticating solely through a portal. Incorporating authentication mechanisms inspired by such efforts would go a long way toward improving the usability

and adoption of portal frameworks as versatile interfaces to utilize distributed resources.

In general, the development of portlets for new model scenarios needs to be simple

enough so that computer-savvy coastal modelers are able to customize and produce portlets

that cater to their demands. Finally, a full range of Grid services should be easily accessible

through the portal; the GridPortlets API is a step in this direction, and the GGF SAGA

working group is developing a general API for application oriented access to Grid services.

Overall, the development of the SCOOP Portal has shown how the integration of

Grid technologies with portal frameworks can provide a community with new collaborative

tools to better access data, resources and information, with the end goal of enabling better

science and easier information dissemination. It can serve as a concrete example for other

science projects to adopt portal technologies as a means of exposing Grid services and

building collaborative environments.


Bibliography

[1] SURA Coastal Ocean Observing Program (SCOOP). February 20, 2006,

http://scoop.sura.org/.

[2] I. Foster, C. Kesselman, S. Tuecke. “The Anatomy of the Grid: Enabling Scalable

Virtual Organizations”. International J. Supercomputer Applications, 15(3), 2001.

[3] Chongjie Zhang, Chirag Dekate, Gabrielle Allen, Ian Kelley, Jon MacLaren. “An

Application Portal for Collaborative Coastal Modeling”, to appear in the special is-

sue GCE05 of Concurrency and Computation: Practice and Experience, 2006. [Best

Paper Award in GCE05: Workshop on Grid Computing Portals at Supercomputing

Conference 2005]

[4] Chongjie Zhang, Ian Kelley, Gabrielle Allen. “Grid Portal Solutions: A Comparison

of GridPortlets and OGCE”, to appear in the special issue GCE05 of Concurrency

and Computation: Practice and Experience, 2006.

[5] Open Grid Computing Environments Collaboratory. http://www.ogce.org/, cited

in May 2005.

[6] Michael Russell, Jason Novotny, Oliver Wehrens. “The Grid

Portlets Web Application: A Grid Portal Framework”. March 2006.

http://www.gridsphere.org/gridsphere/html/publications/GridPortlets.pdf

[7] The Globus Alliance Home Page. February 20, 2006, http://www.globus.org.

[8] The Global Grid Forum Home Page. February 20, 2006, http://www.ggf.org.

[9] The Open Grid Services Architecture Working Group Home Page. February 20, 2006

https://forge.gridforum.org/projects/ogsa-wg.

Page 47: A Grid-enabled Science Portal for Collaborative Coastal ...cct.lsu.edu/~gallen/Students/Zhang_2006.pdf · members in the SCOOP team and Grid research group at the Center for Computation

BIBLIOGRAPHY 39

[10] I. Foster. “Globus Toolkit Version 4: Software for Service-Oriented Systems. IFIP In-

ternational Conference on Network and Parallel Computing, Springer-Verlag LNCS

3779, pp 2-13, 2005.

[11] The Gridbus Project Home Page. February 20, 2006, http://www.gridbus.org/

[12] D. Thain, T. Tannenbaum, and M. Livny. “Condor and the Grid”, in Grid Computing:

Making The Global Infrastructure a Reality, edited by F. Berman and G. Fox and T.

Hey. John Wiley, 2002.

[13] Douglas Thain, Todd Tannenbaum, Miron Livny. “Condor and the Grid”. In Grid

Computing, John Wiley & Sons, May 2003

[14] N. Karonis, B. Toonen, and I. Foster. “MPICH-G2: A Grid-Enabled Implementation

of the Message Passing Interface”. Journal of Parallel and Distributed Computing,

2003.

[15] The UCoMS Project Home Page. February, 2006, http://www.ucoms.org/.

[16] R. Dooley, K. Milfield, C. Guiang, S. Parmidighantum, G. Allen. “From Proposal to

Production: Lessons Learned Developing the Computational Chemistry Grid Cyber-

infrastructure”. Journal of Grid Computing. [To be published in International Journal

of Grid Computing]

[17] The Enlightened Project Home Page, March 2006,

http://www.enlightenedcomputing.org.

[18] The Java Community Process. “JSR 168: Portlet Specification v1.0”. 2003.

http://www.jcp.org/en/jsr/detail?id=168.

Page 48: A Grid-enabled Science Portal for Collaborative Coastal ...cct.lsu.edu/~gallen/Students/Zhang_2006.pdf · members in the SCOOP team and Grid research group at the Center for Computation

BIBLIOGRAPHY 40

[19] J. Novotny, M. Russell, and O. Wehrens. “GridSphere: A Portal Framework for Build-

ing Collaborations”, in Proceedings of 1st International Workshop on Middleware for

Grid Computing, Rio de Janeiro, 2003.

[20] M. Dahan, M. Thomas, E. Roberts, A. Seth, T. Urban, D. Walling, J.R. Boisseau.

“Grid Portal Toolkit 3.0 (GridPort)”, in Proceedings. 13th IEEE International Sym-

posium on High performance Distributed Computing, 4-6, pp.272 - 273, June 2004

[21] GridLab: A Grid Application Toolkit and Testbed Project Home Page. August 12,

2005. http://www.gridlab.org.

[22] Globus Grid Resource Allocation and Management (GRAM), The Globus Alliance.

May 2005. http://www.globus.org/grid software/computation/gram.php

[23] Gregor von Laszewski, Ian Foster, Jarek Gawor, and Peter Lane. “A Java Commodity

Grid Kit”, in Concurrency and Computation: Practice and Experience, vol. 13, no.

8-9, pp. 643-662, 2001

[24] Jarek Nabrzyski, Jennifer M. Schopf, and Jan Weglarz. Grid Resource Management:

State of the Art and Future Trends. Kluwer Publishing, Fall 2003

[25] J Brzezinski, J Nabrzyski, J Puckacki, T Piontek, et al. Technical Specification of the

GridLab Resource Management System, July 2002.

[26] J. Basney, M. Humphrey, and V. Welch. “The MyProxy Online Credential Reposi-

tory”. In Software: Practice and Experience, 2005

[27] Open GCE (Grid Computing Environment) Runtime Engine.

http://corvo.ncsa.uiuc.edu/ncsa-tools/ncsa-tools-ogre-2.1.0/manual/

Page 49: A Grid-enabled Science Portal for Collaborative Coastal ...cct.lsu.edu/~gallen/Students/Zhang_2006.pdf · members in the SCOOP team and Grid research group at the Center for Computation

BIBLIOGRAPHY 41

[28] W. Allcock, Editor, “GridFTP: Protocol Extensions to FTP for

the Grid”, Global Grid Forum Draft Standard (April 2003),

http://www-isd.fnal.gov/gridftp-wg/draft/GridFTPRev2.pdf.

[29] G. Allen, K. Davis, T. Goodale, A. Hutanu, H. Kaiser, T. Kielmann, A. Merzky,

R. Van Nieuwpoort, A. Reinefeld, F. Schintke, T. Schuett, E. Seidel and B. Ullmer,

The Grid Application Toolkit: Toward Generic and Easy Application Programming

Interfaces for the Grid, Proceedings of the IEEE, 93(3), 2005.

[30] Description of Atmospheric Models. NOAA National Hurricane Center. February 20,

2006. http://www.nhc.noaa.gov/aboutmodels.shtml

[31] Gabrielle Allen, Philip Bogden, Gerald Creager, Chirag Dekate, Carola Jesch, Hart-

mut Kaiser, Jon MacLaren, Will Perrie, Gregory Stone, Xiongping Zhang, “GIS and

integrated coastal ocean forecasting”, submitted to Concurrency and Computation:

Practice and Experience, 2006.

[32] J. MacLaren et al.. “Shelter from the Storm: Building a Safe Archive in a Hostile

World”, to appear in Proceedings of the The Second International Workshop on Grid

Computing and its Application to Data Analysis (GADA’05), 2005.

[33] Dayong Huang, Gabrielle Allen, Chirag Dekate, Hartmut Kaiser, Zhou Lei and Jon

MacLaren, “GetData: A Grid Enabled Data Client for Coastal Modeling”, accepted

for HPC 2006.

[34] Unidata Local Data Manager. Unidata. February 20, 2006.

http://www.unidata.ucar.edu/software/ldm/index.html

Page 50: A Grid-enabled Science Portal for Collaborative Coastal ...cct.lsu.edu/~gallen/Students/Zhang_2006.pdf · members in the SCOOP team and Grid research group at the Center for Computation

BIBLIOGRAPHY 42

[35] P. Holub, M. Kuba, L. Matyska, and M. Ruda. “Grid Infrastructure Monitoring as Re-

liable Information Service”, in The 2nd European Across Grids Conference, Nicosia,

Cyprus, January 2004. To be published by Springer Verlag as a part of Lecture Notes

in Computer Science during summer 2004.

[36] Ann L. Chervenak, Naveen Palavalli, Shishir Bharathi, Carl Kesselman, Robert

Schwartzkopf. “Performance and Scalability of a Replica Location Service,” in

13th IEEE International Symposium on High Performance Distributed Computing

(HPDC-13 ’04), vol. 00, pp. 182-191, 2004.

[37] PURSe: Portal-Based User Registration Service. The GRIDS Center. September

2005. http://www.grids-center.org/solutions/purse/

[38] Bhatia, K., Lin, A., Link, B., Mueller, K. and Chandra, S. “Geon/Telescience Secu-

rity Infrastructure”. San Diego Supercomputer Center, Technical Report TR-2004-5,

2004.

[39] GAMA: Grid Account Management Architecture. San Diego Supercomputer Center.

December 2005. http://grid-devel.sdsc.edu/gama