Implementation of Product Virtualization in a Geospatial Grid

CSISS: Center for Spatial Information Science and Systems

09/12/2006

Center for Spatial Information Science and Systems (CSISS)
George Mason University (GMU)

Product Virtualization in A Geospatial Grid

Liping Di, Aijun Chen, Wenli Yang, Yaxing Wei

[email protected]

Presented at CEOS-WGISS on 09/12/2006

The Geospatial Discipline

The geospatial discipline deals with collecting, archiving, managing, analyzing, and distributing spatially explicit or implicit data.

This covers any data related to events or phenomena on Earth.

Large volumes of geospatial data are available

The data are highly diverse

The data repositories are located around the world

The geospatial community has developed a set of standards for interoperating and sharing geospatial data, most notably from:

ISO TC 211

Open Geospatial Consortium (OGC)

Federal Geographic Data Committee (FGDC)

Geospatial Grid

• A Geospatial Grid is the application and extension of Grid technology to the geospatial discipline.
• Some features a Geospatial Grid has to provide:
  – Able to handle the specifics of geospatial data, e.g., data formats, projections, multi-dimensionality, etc.
  – Use the geospatial standards and interfaces that the geospatial community is familiar with.
• Ideally, users should be able to access the Grid-managed geospatial resources using existing standard-compliant geospatial Web clients without either knowing a Grid is running or needing to modify the client (e.g., through the Web portal on top of the Grid).

The GMU Geospatial Grid Project

• The GMU Geospatial Grid project has been funded by the NASA Advanced Information Systems Technology (AIST) program.
• The project goals are:
  – Making Grid technology geospatially enabled and OGC standards compliant, and making OGC technology Grid enabled.
  – Allowing researchers to focus on science rather than on issues with computing, storage, and bandwidth resources, or with data receipt, data format, and data set manipulation.
  – Achieving the integration of NASA EOSDIS and ESG (Earth System Grid).
• The project has two major work items:
  – Enable access to real data in the Grid environment with OGC protocols.
  – Study the approach for geospatial product virtualization in the Grid environment and develop a prototype.
• This presentation concentrates on the concept, approach, technology, and prototype implementation of product virtualization.

Standard Products Approach

• Geospatial data collected by instruments are called raw data.
  – To make geospatial data useful, processing steps have to be carried out to extract information from the data and then convert it to knowledge for applications and decision support.
• Standard products
  – Currently most data providers produce some advanced geospatial products, called standard products, in advance, anticipating that such products can meet the requirements of most users.
  – The standard products are provided as is to the users: a one-size-fits-all approach.
• Problems with the standard-products-only approach
  – Users' requirements are so different that a pre-produced product cannot meet the requirements of many different users.
  – Time and resources are wasted: some standard products are never fully used.

Virtual Geospatial Products

• A virtual geospatial product is a geospatial product that does not yet exist, but that the data system knows how to create when a user requests it.
• Advantages of product virtualization
  – The data centers only need to archive low-level data.
  – Significantly reduces the requirements for computing resources.
  – Provides more products to users than the non-virtual standard product approach.
• Currently many data providers are implementing the standard products as virtual products.
  – Standard products provided on demand: not customizable.
  – Customizable virtual products: definable by expert users, user-modifiable.

Customizable Virtual Geospatial Products

• Allow expert users to easily define a geospatial product and, conceptually, how to produce it (a geospatial processing model).
  – The models are constructed with types of processing functions instead of concrete processing functions.
  – Models are applicable within a specific spatial-temporal domain.
• Advertise the model as a type of geospatial product available in the system.
  – Not specified as data granules.
• Implemented in a service environment
  – Individual processing functions in the models are service types.
• Models are instantiated to become a specific workflow only when a user requests the product with the specifications provided:
  – Spatial, temporal, format, projection, spatial/temporal resolution, etc.
  – Service types and data types in the models are replaced with service and data instances.
• Advantages
  – A system with such capability can have an unlimited number of data-product types, with an unlimited number of data granules unique to each user.
  – Flexible, sharable, easily customizable.

Some Terminology

• Geo-object – a geospatial product describing geospatial phenomena or events.
  – Why not call them datasets?
• User geo-object – a geo-object defined and requested by a user.
• Archive geo-object – a geo-object that exists in a geospatial archive.
• Geo-object type – an abstraction of individual geo-objects that share the same features (e.g., spatial, temporal, content).
• Geospatial process module – a process that produces one or more geo-objects through manipulation of input geo-objects.
• Geospatial service – a geospatial process module implemented in the service environment.
• Geospatial service type – an abstraction of individual geospatial services that provide the same type of geospatial process.

Modeling Approach to Product Virtualization: GeoTree

• Geospatial process model – describes conceptually the steps to take to generate one or more geo-object types from a set of geo-object types.
• A geospatial process model represents the knowledge for generating a specific type of geo-object.
• Geo-tree – the conceptual/graphic representation of a geospatial process model.
• Two types of nodes in a geo-tree: process node and geo-object node.
• The process node is a geospatial service type.
• The geo-object node contains a geo-object type.
• Defining geospatial models at the abstract level:
  – Allows the development of general models that can be used to generate an unlimited number of instances.
  – Makes it easier for domain experts to create models by eliminating the need for details on product specifications.
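A minimal Python sketch of a geo-tree as a data structure may help make the two node kinds concrete. The class names, the `leaf_types` helper, and the landslide example with its type names are illustrative assumptions, not the project's actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProcessNode:
    """Process node: names a geospatial service type, not a running service."""
    service_type: str
    inputs: List["GeoObjectNode"] = field(default_factory=list)


@dataclass
class GeoObjectNode:
    """Geo-object node: names a geo-object type, not a concrete granule."""
    object_type: str
    # Process that generates this type; None marks a leaf of the tree.
    produced_by: Optional[ProcessNode] = None


def leaf_types(node: GeoObjectNode) -> List[str]:
    """Collect the geo-object types at the leaves of the tree, left to right."""
    if node.produced_by is None:
        return [node.object_type]
    found: List[str] = []
    for child in node.produced_by.inputs:
        found.extend(leaf_types(child))
    return found


# Hypothetical model: landslide susceptibility from terrain slope and NDVI.
dem = GeoObjectNode("DEM")
landsat = GeoObjectNode("LandsatReflectance")
slope = GeoObjectNode("Slope", ProcessNode("SlopeService", [dem]))
ndvi = GeoObjectNode("NDVI", ProcessNode("NDVIService", [landsat]))
root = GeoObjectNode(
    "LandslideSusceptibility",
    ProcessNode("LandslideSusceptibilityService", [slope, ndvi]),
)

print(leaf_types(root))  # ['DEM', 'LandsatReflectance']
```

Because the tree holds only type names, the same model can be reused for any region, date, or output format that a later request specifies.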

Geo-object, Geo-tree, Virtual products, Geospatial Models

[Figure: archived, intermediate, and user geo-objects connected by geospatial web/Grid services and automated data transformation services (WCS/WFS). Legend labels: no service; data service; modeling and virtual data services. Annotations: User Requested, User Obtained.]

Virtual geo-objects and virtual geo-object type

• A virtual geo-object is a geo-object that:
  – does not exist in a geospatial information system, but
  – the system knows how to create on demand.
• All metadata items for a virtual geo-object have defined values.
  – It uses the same metadata as the real geo-object (archive geo-object).
  – The only difference is that the data part does not exist yet.
  – The client/data user will not know the difference between a real and a virtual geo-object.
• Virtual geo-object type – an abstraction of virtual geo-objects that share the same features.
  – All geo-object nodes in a geo-tree, except for the nodes for archive geo-objects, are virtual geo-object types.
  – The root node in a geo-tree is a virtual geo-object type the model can generate.
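The idea that a client cannot tell a real geo-object from a virtual one can be sketched as a single class whose data part is produced lazily. This is only an illustration under assumed names (`GeoObject`, `producer`, `materialized`); the prototype's real interfaces are not described in the slides.

```python
from typing import Callable, Dict, Optional


class GeoObject:
    """Same metadata and read() interface for archived and virtual geo-objects."""

    def __init__(
        self,
        metadata: Dict[str, str],
        data: Optional[bytes] = None,
        producer: Optional[Callable[[], bytes]] = None,
    ) -> None:
        if data is None and producer is None:
            raise ValueError("need either archived data or a producer")
        self.metadata = dict(metadata)
        self._data = data          # present for an archive geo-object
        self._producer = producer  # how to create the data on demand

    @property
    def materialized(self) -> bool:
        """True once the data part exists."""
        return self._data is not None

    def read(self) -> bytes:
        """Return the data, creating it first if the object is still virtual."""
        if self._data is None:
            self._data = self._producer()
        return self._data


# Hypothetical metadata values, identical for both objects.
meta = {"type": "NDVI", "format": "HDF-EOS", "year": "2000"}
archived = GeoObject(meta, data=b"\x01\x02\x03")
virtual = GeoObject(meta, producer=lambda: b"\x01\x02\x03")
```

A client calling `read()` on either object receives the same bytes; only the system knows that one of them was generated at request time.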

From Geospatial model to user geo-object

[Figure: Geospatial Model -> Virtual geo-object -> Logical Workflow -> Concrete Workflow -> Workflow execution -> user geo-object. The steps are grouped into three phases: knowledge capture phase, user query phase, and user retrieval phase.]

Knowledge Capture: Creation of Geospatial Models by Users/Experts

• A user-requested geo-object may not exist, either virtually or non-virtually.
• If the user knows the thought process for creating the geo-object from lower-level inputs step by step (the logical geospatial modeling):
  – With the help of a good user interface and the availability of service modules and models/submodels, the user can construct a geospatial model/virtual data product interactively.
  – The system can then produce the virtual data product for the user.
  – The user-created model can be incorporated into the system as a part of the virtual datasets the system can provide.
• This allows the system's capabilities to grow with time.
• Advantages
  – Allows users to obtain ready-to-use scientific information instead of raw data, significantly reducing the data traffic between the users and the geospatial Grid.
  – Allows users to explore the huge resources available in a data Grid and to conduct tasks that they were never able to conduct before.

Knowledge Capture: From Geo-tree to virtual geo-object

• The root node of a geo-tree is a virtual geo-object type.
• When a user/client requests a geo-object, the user provides a description of the geo-object they want.
• If the type of the user geo-object matches the virtual geo-object type in a geo-tree:
  – The geo-tree is selected.
  – The root node is instantiated with the descriptions provided by the client.
  – The root node now becomes a virtual user geo-object.
• The next step is to determine whether the virtual user geo-object can be materialized on the fly.
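The selection step above can be sketched in a few lines of Python. The dictionary layout, the `instantiate_root` function, and the request fields are hypothetical; they only illustrate matching a requested type against the root types of the registered geo-trees.

```python
from typing import Any, Dict, Optional


def instantiate_root(
    geo_trees: Dict[str, Dict[str, Any]],
    requested_type: str,
    description: Dict[str, Any],
) -> Optional[Dict[str, Any]]:
    """Select the geo-tree whose root type matches and attach the user's description."""
    tree = geo_trees.get(requested_type)
    if tree is None:
        return None  # no registered model can generate this type
    return {
        "type": requested_type,
        "description": dict(description),  # spatial, temporal, format, ...
        "geo_tree": tree,
    }


# Hypothetical registered model and user request.
geo_trees = {"NDVI": {"process": "NDVIService", "inputs": ["LandsatReflectance"]}}
request = {"bbox": (-125.0, 32.0, -114.0, 42.0), "year": 2000, "format": "GeoTIFF"}
virtual_user_object = instantiate_root(geo_trees, "NDVI", request)
```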

User query phase: Logical Instantiation of Geo-Tree

• Check whether a virtual user geo-object can be materialized by instantiating the whole geo-tree.
  – Push the description of the virtual user geo-object down to each node of the geo-tree (e.g., spatial coverage, format, etc.).
  – Discover instances of services and geo-objects by searching both the service instance catalog and the geo-object instance catalog.
  – If an archive geo-object is found as the input of a process, the push-down stops for that branch of the tree.
• The logical instantiation does not create an actual workflow but, conceptually, it creates a logical workflow.
• If a geo-tree can be instantiated logically, the virtual user geo-object can be materialized.
  – A logical ID is created and returned to the client to indicate that the user-requested geo-object is found in the system.
  – The logical ID is used by the client/user to request the geo-object.
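The push-down check can be sketched as a recursive walk over the tree. This is a simplified illustration, not the prototype's algorithm: it assumes the same description is pushed unchanged to every node, represents both catalogs as in-memory Python collections, and uses a random string as the logical ID.

```python
import uuid
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Set


@dataclass
class Node:
    object_type: str
    service_type: Optional[str] = None  # None: no process modeled for this node
    inputs: List["Node"] = field(default_factory=list)


def matches(item: Dict[str, Any], description: Dict[str, Any]) -> bool:
    """An archived item matches if it agrees with every requested property."""
    return all(item.get(key) == value for key, value in description.items())


def can_materialize(node, description, geo_objects, services) -> bool:
    archived = geo_objects.get(node.object_type, [])
    if any(matches(item, description) for item in archived):
        return True  # archive geo-object found: push-down stops for this branch
    if node.service_type not in services or not node.inputs:
        return False  # no service instance, or nothing to compute from
    return all(
        can_materialize(child, description, geo_objects, services)
        for child in node.inputs
    )


def logical_instantiation(root, description, geo_objects, services) -> Optional[str]:
    """Return a logical ID if the request can be met, else None. No workflow is built."""
    if can_materialize(root, description, geo_objects, services):
        return "virtual-" + uuid.uuid4().hex
    return None


# Hypothetical catalogs and request.
tree = Node("NDVI", "NDVIService", [Node("LandsatReflectance")])
geo_objects: Dict[str, List[Dict[str, Any]]] = {
    "LandsatReflectance": [{"region": "California", "year": 2000}]
}
services: Set[str] = {"NDVIService"}

found_id = logical_instantiation(
    tree, {"region": "California", "year": 2000}, geo_objects, services
)
missing_id = logical_instantiation(
    tree, {"region": "California", "year": 1985}, geo_objects, services
)
```

In this example `found_id` is a string the client could later present at retrieval time, while `missing_id` is `None` because no archived input covers 1985.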

User retrieval phase: Physical Instantiation of Geo-Tree

• When a client requests a user geo-object, the geo-object ID indicates whether the user geo-object is virtual.
• If the geo-object is virtual, the geo-tree associated with the virtual geo-object is instantiated to create a concrete workflow.
  – A workflow language is used to encode the workflow.
  – The workflow is executable in a workflow execution engine.

User retrieval phase: Creation of user geo-object

• The workflow engine executes the workflow and generates the user geo-object.
• The user geo-object is returned to the user/client.
• The above-mentioned steps reflect a two-stage process:
  – User query
  – User retrieval
• The two stages can also be merged into a one-stage process: when a user query can be met, the resulting user geo-object is pushed back to the user automatically, without the user initiating the retrieval.
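The retrieval phase can be illustrated with a toy sketch in which a geo-tree is flattened into ordered steps and then executed. A plain Python list stands in for the encoded workflow and local functions stand in for Grid services; the tuple layout, service names, and numeric "data" are all assumptions made for the example.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

# A tree node is (object_type, service_type or None, child nodes).
Tree = Tuple[str, Optional[str], list]
# A workflow step is (service_type, input object types, output object type).
Step = Tuple[str, List[str], str]


def build_workflow(node: Tree, steps: Optional[List[Step]] = None) -> List[Step]:
    """Post-order traversal: inputs are scheduled before the process that uses them."""
    if steps is None:
        steps = []
    object_type, service_type, children = node
    for child in children:
        build_workflow(child, steps)
    if service_type is not None:
        steps.append((service_type, [c[0] for c in children], object_type))
    return steps


def execute(
    steps: List[Step],
    services: Dict[str, Callable[..., Any]],
    archive: Dict[str, Any],
) -> Dict[str, Any]:
    """Run each step in order, feeding earlier outputs into later steps."""
    products = dict(archive)
    for service_type, input_types, output_type in steps:
        args = [products[t] for t in input_types]
        products[output_type] = services[service_type](*args)
    return products


tree: Tree = (
    "LandslideSusceptibility",
    "CombineService",
    [
        ("Slope", "SlopeService", [("DEM", None, [])]),
        ("NDVI", "NDVIService", [("Landsat", None, [])]),
    ],
)
# Toy stand-ins: integers play the role of data, lambdas the role of services.
archive = {"DEM": 10, "Landsat": 4}
services = {
    "SlopeService": lambda dem: dem * 2,
    "NDVIService": lambda scene: scene + 1,
    "CombineService": lambda slope, ndvi: slope + ndvi,
}

steps = build_workflow(tree)
products = execute(steps, services, archive)
print(products["LandslideSusceptibility"])  # 25
```

In the merged one-stage variant, the same `execute` call would simply run immediately after a successful query instead of waiting for a separate retrieval request.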

The Implementation: Testbed Environment – Virtual Organization

[Figure: testbed virtual organizations (VOs), with authentication among the different VOs]

GMU LAITS VO (GMU CA center):
– GMU (Solaris) (laits.gmu.edu), Globus 4.0.1 with GMU Certs.
– GMU (Mac) (geobrain.laits.gmu.edu), Globus 4.0.1 with GMU Certs.
– GMU (Linux) (data.laits.gmu.edu), Globus 4.0.1 with GMU Certs.

NASA IPG VO (IPG CA center):
– Ames ipg05 (Linux) (ipg05.ipg.nasa.gov), Globus 4.0.1 with IPG Certs.

CEOS VO:
– NASA SGT (Linux) (arao2.sgt-inc.com), Globus 3.2 with CEOS Certs.
– NASA (Linux) (former.intl-interfaces.net), Globus 3.0 with CEOS Certs.

LLNL ESG VO (ESG CA center):
– LLNL esg2 (Linux) (esg2.llnl.gov), Globus 4.0.1 with ESG Certs.

Testbed Environment - Hardware

The testbed has been established with 5 machines in 3 organizations: CSISS/GMU, Ames/NASA, and LLNL/DOE.

The flagship machine in the testbed is GMU’s Apple cluster server:
– 6 Apple G5 server nodes: 3 with dual 2.5 GHz CPUs and 3 with dual 2.0 GHz CPUs, with a total of 12 GB RAM.
– 22.6 TB RAID storage.
– 1GB network to Internet II and 100 MB to Internet I.
– Hosted at the ESDIS network lab of GSFC.

Testbed Environment - Software

Globus 4.0.1 was installed at all nodes.

The geospatial Grid software developed by GMU has been installed at all nodes.

Set up the CAs, issued certificates, and established the VO as the testbed:
– Set up the LAITS CA and issued LAITS certificates to the Mac, Solaris, and Linux machines of GMU LAITS.
– Set up the IPG CA and issued IPG certificates to the Linux machine at NASA Ames.
– Used the ESG CA and issued an ESG certificate to the Linux machine at DOE LLNL.
– Tested and debugged the authentication between any two different CAs’ certificates among all of the above machines.

The prototype system for Grid-based geospatial processing modeling and Virtual Geospatial Products generation is set up at GMU.

Testbed Environment - Data

Populated the G5 server with:
– Landsat data covering the globe for the years 1975, 1990, and 2000. Currently a total of 15 TB of data has been ingested.
– Shuttle DEM data covering the globe for the year 2000. The size of the DEM is about 1 TB.
– Other sample EOSDIS data (e.g., MODIS, ASTER, etc.).

Converted part of the DOE LLNL netCDF modeling data from ESG to HDF-EOS format and loaded it into the LLNL node of the testbed.

Replicated some typical EOSDIS data at the NASA Ames node.

The total size of data in the testbed is now around 17 TB.

Overall Architecture of the Geospatial Grid for both Real and Virtual Data Servings

[Figure: diagram of user request and data workflow across three tiers (User Tier, OGC Tier, Grid Tier), built on Globus Toolkit 4.0/4.0.1 with GSI. Components shown: Client; LAITS WCS Portal, LAITS WMS Portal, and CSW Portal (with a default WCS/WMS portal IP); Other WCS; ECHO Catalog; LAITS GridCSW, GCSF, GESGCS; iGSM, ROS, RLS, MDS, Ames DTS; LAITS GridWCS, Ames GridWCS, LLNL GridWCS; VGVWCS/Instantiator, VGWES, GridWICS, GridWCTS; HDF-EOS data, NetCDF data, and other data. Request paths are labeled, including real data requests.]

CSW, WCS, WMS Portals

The portals allow OGC-compliant Web clients to access the resources managed by the Geospatial Grid without knowing that the Grid exists.

To OGC clients, the portals are standard OGC servers; to the Grid underneath, the portals are authorized Grid users.

CSW portal for catalog access

WCS portal for access to both virtual and real coverage data

WMS portal for access to both virtual and real maps
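Because the portals present themselves as standard OGC servers, an ordinary client request should be all that is needed. As a hedged illustration, the sketch below builds a key-value GetCoverage URL in the style of WCS 1.0.0; the slides do not state which WCS version the portal speaks, and the base URL and coverage name are placeholders.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def get_coverage_url(base_url, coverage, bbox, width, height,
                     crs="EPSG:4326", fmt="GeoTIFF"):
    """Build a WCS 1.0.0-style GetCoverage request as a key-value-pair URL."""
    params = {
        "SERVICE": "WCS",
        "VERSION": "1.0.0",  # assumed version, not stated in the slides
        "REQUEST": "GetCoverage",
        "COVERAGE": coverage,
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": fmt,
    }
    return base_url + "?" + urlencode(params)


# Placeholder endpoint and coverage name.
url = get_coverage_url(
    "http://example.org/wcs",
    coverage="NDVI_2000",
    bbox=(-125.0, 32.0, -114.0, 42.0),
    width=1100,
    height=1000,
)
query = parse_qs(urlsplit(url).query)
```

The same request would be valid whether `NDVI_2000` is an archived coverage or a virtual one, which is the point of placing the portal in front of the Grid.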

GCSW, GWMS, and GWCS

The OGC Catalog Service for Web (CSW), Web Map Service (WMS), and Web Coverage Service (WCS) are fundamental services in geospatial data access.

These services have been implemented as Grid services for managing and accessing geospatial data in the Grid. They are the fundamental services in any geospatial Grid.

The services were originally implemented as Web services; for this project we ported them to the Globus 4.0.1 environment as Grid services.

Implementation: Grid Geospatial Registries

• The scheme presented here depends strongly on type matching.
• Registries have to be created to register:
  – Service types
  – Geo-object types
  – Service instances
  – Geo-objects
• A registry must be able to:
  – Find instances based on search criteria.
  – Search instances based on types.
  – Find types given an instance.
  – Find the association between geo-object types and service types.
• Metadata standards are needed to describe both types and instances of geo-objects and services.
• The OGC Catalog Service for Web (CSW) is used to construct the geospatial registry.
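The four lookups a registry must support can be sketched with an in-memory class. This is only an illustration of the required queries; the actual registry is built on CSW, and the class, method names, and sample entries here are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List


class Registry:
    """Toy registry for types and instances of geo-objects and services."""

    def __init__(self) -> None:
        self._type_of: Dict[str, str] = {}
        self._metadata: Dict[str, Dict[str, Any]] = {}
        self._instances = defaultdict(list)    # type name -> instance ids
        self._services_for = defaultdict(set)  # geo-object type -> service types

    def register(self, instance_id: str, type_name: str, **metadata: Any) -> None:
        self._type_of[instance_id] = type_name
        self._metadata[instance_id] = metadata
        self._instances[type_name].append(instance_id)

    def associate(self, service_type: str, geo_object_type: str) -> None:
        """Record that a service type can generate a geo-object type."""
        self._services_for[geo_object_type].add(service_type)

    def find(self, **criteria: Any) -> List[str]:
        """Find instances whose metadata matches every search criterion."""
        return [
            instance_id
            for instance_id, md in self._metadata.items()
            if all(md.get(k) == v for k, v in criteria.items())
        ]

    def instances_of(self, type_name: str) -> List[str]:
        return list(self._instances.get(type_name, []))

    def type_of(self, instance_id: str) -> str:
        return self._type_of[instance_id]

    def service_types_for(self, geo_object_type: str) -> List[str]:
        return sorted(self._services_for.get(geo_object_type, set()))


registry = Registry()
registry.register("granule-001", "LandsatReflectance", year=2000, region="California")
registry.register("ndvi-service-1", "NDVIService", host="example.org")
registry.associate("NDVIService", "NDVI")
```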

Geospatial Metadata Standards

• We are using ISO 19115 + ISO 19115.2 to describe the geo-objects.
  – ISO 19115 is an international standard for geospatial metadata.
  – ISO 19115.2 is an imagery extension of ISO 19115 that provides more metadata elements for describing geospatial imagery.
  – Not all elements are used in our project: only the mandatory elements plus those needed for type matching (about 50 elements).
• There is no standard for describing the algorithms used in geospatial services.
  – An accurate description of the individual algorithms used in geospatial services is important so that users can create geospatial models.

Grid Catalog Federation - GCSF

GCSF harmonizes user queries among the GMU CSW, ESG CS, and ECHO.

– Communicated with NASA ECHO through the GMU GCSF.

Software Development for Integration - iGSM

The intelligent Grid Service Mediator (iGSM) supports the WCS portal and the WMS portal in distributing their requests to the proper GWCS and GWMS.

[Figure: iGSM integration diagram. Components shown: WCS Portal, WMS Portal, iGSM, GCSW, GWCS, GWMS, ROS, MDS, DTS, GCSF.]

Geo-Tree Instantiation Service (Instantiator)

• An Instantiator has been developed to work with the CSW services to convert a GeoTree into a physical workflow based on the user's specifications for the requested products.
• The Instantiator works as a Grid service.
• It has two operational modes:
  – Logical instantiation: checks whether a user's request for a product can be met by a virtual product.
    • Works during the catalog search phase to ensure the virtual product can be materialized if the user does request such a product.
    • Does not actually create a physical workflow.
  – Physical instantiation
    • Works in the data retrieval phase, when the user actually retrieves a product that happens to be a virtual one.
    • Creates an executable workflow in BPEL.

The Grid Workflow Execution Service (GWES) – The Grid-enabled BPELPower

BPEL engine architecture: executes Grid services with standard BPEL workflows.

[Figure: BPELPower architecture. Labels: Activities, BPEL Process Instances, WSDL Services, BPEL Process Manager, BPELPower, Instantiation, Logic Process Model, Deployment, As a server or middleware, As a service, Browser-oriented clients, Service-oriented clients.]

The Replica and Optimization Service (ROS) works with Globus RLS and other services to find the best resources in the VO.

[Figure: ROS architecture. Labels: Globus RLS as Grid Service; Replica and Optimization Service (ROS); Globus Index service; Globus MDS scripts modification; LRC (Laits), LRC (Laits-data), LRC (Ames/LLNL); RLI (Laits), RLI (Laits-data); RLS Index Service (MDS).]

Grid-enabled Geo-Services (Instances)

• The availability of a large number of service modules (i.e., service instances) is one of the keys to a powerful customizable virtual product system.
• Two types of services: geo-access and geoprocessing.
• Geo-access/management:
  – GCSW
  – GWCS
  – GWMS
• We did not develop another fundamental geo-access service, the Web Feature Service (WFS), since the project only deals with coverage data.

Grid-enabled Geoprocessing Services

• Geoprocessing services come from two sources:
  – Developed from scratch, or converted from existing geoprocessing software packages.
• We have developed the following Grid-enabled geoprocessing services:
  – GWICS (Grid-enabled WICS)
  – GWCTS (Grid-enabled WCTS)
  – GridSlope
  – GridAspect
  – GridCalifornia_WHR3_Classification
  – GridNDVI
  – GridLandslide_Susceptibility_2i
  – GridLandslide_Susceptibility_4i
• GRASS is an open-source standalone GIS and image processing package with about 230 geospatial process modules.
  – We have wrapped the modules with service interfaces (SOAP) and created WSDL descriptions of those services.
  – Those modules are ported to Globus 4.0.1 to become Grid services.
• All services are registered at GCSW.
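To give a sense of what a small geoprocessing module computes before it is wrapped as a service, here is a sketch of the standard NDVI formula, (NIR - Red) / (NIR + Red), in plain Python. It is not the code behind GridNDVI, whose implementation the slides do not show; the function names and the choice to return `None` for empty pixels are assumptions.

```python
from typing import List, Optional


def ndvi(nir: float, red: float) -> Optional[float]:
    """Normalized Difference Vegetation Index for one pixel."""
    total = nir + red
    if total == 0:
        return None  # undefined where both bands are zero
    return (nir - red) / total


def ndvi_grid(
    nir_band: List[List[float]], red_band: List[List[float]]
) -> List[List[Optional[float]]]:
    """Apply NDVI pixel by pixel to two equally sized bands."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Small made-up reflectance values for illustration.
result = ndvi_grid([[0.75, 0.0]], [[0.25, 0.0]])
print(result)  # [[0.5, None]]
```

A service wrapper would expose a function like `ndvi_grid` behind a described interface so that a workflow engine can call it with coverage inputs.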

Modeling Virtual Products

[Screenshot: graphical user interface for users to model virtual data products (VDP) with the support of ontologies.]

Registry of Abstract Model of VDP

Registration of Abstract Model to GridCSW

Conclusions

• This presentation discussed the concepts and approach for virtualization of customizable geospatial products in a Geospatial Grid.
  – Current technologies, interoperability standards, and network infrastructure allow building a distributed service-oriented geospatial virtual product system.
  – Systems built on such a scheme are more flexible and scalable and can provide much better services to the user community than traditional non-service-based systems.
• The Grid-based prototype virtual product system has demonstrated that the concept and approach discussed in this presentation are feasible.
  – The prototype system has successfully simulated the operational environment of large geospatial data repositories.