Page 1
CSISSCenter for Spatial Information Science and Systems
09/12/2006
Center for Spatial Information Science and Systems (CSISS), George Mason University (GMU)
Product Virtualization in A Geospatial Grid
Liping Di, Aijun Chen, Wenli Yang, Yaxing Wei
Presented at CEOS-WGISS on 09/12/2006
Page 2
The Geospatial Discipline
The geospatial discipline deals with collecting, archiving, managing, analyzing, and distributing spatially explicit or implicit data.
This covers any data related to events or phenomena on Earth.
Large volumes of geospatial data are available
The data are highly diverse
The data repositories are located around the world
The geospatial community has developed a set of standards for interoperating and sharing geospatial data, most notably from:
ISO TC 211
Open Geospatial Consortium (OGC)
Federal Geographic Data Committee (FGDC)
Page 3
Geospatial Grid
• A Geospatial Grid is the application and extension of Grid technology to the geospatial discipline.
• Features a Geospatial Grid has to provide:
– Handle the specifics of geospatial data, e.g., data formats, projections, multi-dimensionality.
– Use the geospatial standards and interfaces that the geospatial community is already familiar with.
• Ideally, users should be able to access Grid-managed geospatial resources using existing standards-compliant geospatial Web clients, without knowing that a Grid is running and without modifying the client (e.g., through a Web portal on top of the Grid).
Page 4
The GMU Geospatial Grid Project
• The GMU Geospatial Grid project is funded by the NASA Advanced Information System Technology (AIST) program.
• Project goals:
– Make Grid technology geospatially enabled and OGC standards compliant, and make OGC technology Grid enabled.
– Allow researchers to focus on science rather than on computing, storage, and bandwidth resources, or on data receipt, data formats, and data set manipulation.
– Integrate NASA EOSDIS and ESG (Earth System Grid).
• Two major work items:
– Enable access to real data in the Grid environment through OGC protocols.
– Study approaches to geospatial product virtualization in the Grid environment and develop a prototype.
• This presentation concentrates on the concept, approach, technology, and prototype implementation of product virtualization.
Page 5
Standard products Approach
• Geospatial data collected by instruments are called raw data.
– To make geospatial data useful, processing steps have to be carried out to extract information from the data and then convert it into knowledge for applications and decision support.
• Standard products
– Currently, most data providers produce some advanced geospatial products, called standard products, in advance, anticipating that such products will meet the requirements of most users.
– The standard products are provided to users as is: a one-size-fits-all approach.
• Problems with a standard-products-only approach
– Users' requirements differ so much that a pre-produced product cannot meet the requirements of many different users.
– Time and resources are wasted: some standard products are never fully used.
Page 6
Virtual Geospatial Products
• A virtual geospatial product is a geospatial product that does not yet exist, but that the data system knows how to create when a user requests it.
• Advantages of product virtualization:
– Data centers only need to archive low-level data.
– Significantly reduces the requirements for computing resources.
– Provides more products to users than the non-virtual standard-product approach.
• Many data providers are currently implementing their standard products as virtual products.
– Standard products provided on demand: not customizable.
– Customizable virtual products: definable by expert users, modifiable by users.
Page 7
Customizable Virtual Geospatial Products
• Allow expert users to easily define, at a conceptual level, a geospatial product and how to produce it (a geospatial processing model).
– Models are constructed from types of processing functions rather than concrete processing functions.
– Models are applicable within a specific spatial-temporal domain.
• Advertise the model as a type of geospatial product available in the system.
– It is not specified as data granules.
• Implemented in a service environment.
– Individual processing functions in the models are service types.
• A model is instantiated into a specific workflow only when a user requests the product and provides specifications:
– Spatial and temporal extent, format, projection, spatial/temporal resolution, etc.
– Service types and data types in the model are replaced with service and data instances.
• Advantages:
– Such a system can offer an unlimited number of data product types, with an unlimited number of data granules unique to each user.
– Flexible, sharable, easily customizable.
Page 8
Some Terminologies
• Geo-object: a geospatial product describing geospatial phenomena or events.
– Why not call them datasets?
• User geo-object: a geo-object defined and requested by a user.
• Archive geo-object: a geo-object that exists in a geospatial archive.
• Geo-object type: an abstraction of individual geo-objects that share the same features (e.g., spatial, temporal, content).
• Geospatial process module: a process that produces one or more geo-objects by manipulating input geo-objects.
• Geospatial service: a geospatial process module implemented in a service environment.
• Geospatial service type: an abstraction of individual geospatial services that provide the same type of geospatial process.
Page 9
Modeling Approach to Product Virtualization: GeoTree
• Geospatial process model: describes conceptually the steps needed to generate one or more geo-object types from a set of geo-object types.
• A geospatial process model represents the knowledge for generating a specific type of geo-object.
• Geo-tree: the conceptual/graphical representation of a geospatial process model.
• A geo-tree has two types of nodes: process nodes and geo-object nodes.
• A process node is a geospatial service type.
• A geo-object node contains a geo-object type.
• Defining geospatial models at the abstract level:
– Allows the development of general models that can be used to generate an unlimited number of instances.
– Makes it easier for domain experts to create models by eliminating the need to detail product specifications.
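The two node types of a geo-tree can be sketched as a small data structure. This is only an illustration under stated assumptions: the class names, field names, and the example model (slope and land cover feeding a landslide susceptibility product) are hypothetical and are not taken from the prototype.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class GeoObjectNode:
    """Geo-object node: holds a geo-object type (hypothetical field names)."""
    object_type: str
    # Process node that produces this type; None marks an archive (leaf) type.
    produced_by: Optional["ProcessNode"] = None


@dataclass
class ProcessNode:
    """Process node: holds a geospatial service type, not a concrete service."""
    service_type: str
    inputs: List[GeoObjectNode] = field(default_factory=list)


def archive_types(node: GeoObjectNode) -> List[str]:
    """Collect the leaf geo-object types the model ultimately depends on."""
    if node.produced_by is None:
        return [node.object_type]
    leaves: List[str] = []
    for child in node.produced_by.inputs:
        leaves.extend(archive_types(child))
    return leaves


# Illustrative model only: all type names below are assumptions.
dem = GeoObjectNode("DEM")
slope = GeoObjectNode("Slope", ProcessNode("SlopeComputation", [dem]))
landcover = GeoObjectNode("LandCover")
root = GeoObjectNode(
    "LandslideSusceptibility",
    ProcessNode("SusceptibilityModel", [slope, landcover]),
)
```

Because every node holds a type rather than an instance, the same tree can be reused for any request that matches the root type.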
Page 10
Geo-object, Geo-tree, Virtual products, Geospatial Models
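[Figure: diagram relating geo-objects, geo-trees, virtual products, and geospatial models. Legend: archived geo-object; user geo-object; intermediate geo-object; automated data transformation service (WCS/WFS); geospatial web/Grid services. Service levels shown: no service, data service, modeling and virtual data services. Labels: User Requested, User Obtained.]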
Page 11
Virtual geo-objects and virtual geo-object type
• A virtual geo-object is a geo-object that:
– Does not exist in a geospatial information system.
– The system knows how to create on demand.
• All metadata items for a virtual geo-object have defined values.
– It uses the same metadata as a real geo-object (archive geo-object).
– The only difference is that the data part does not exist yet.
– The client/data user cannot tell the difference between a real and a virtual geo-object.
• Virtual geo-object type: an abstraction of virtual geo-objects that share the same features.
– All geo-object nodes in a geo-tree, except those for archive geo-objects, are virtual geo-object types.
– The root node of a geo-tree is the virtual geo-object type the model can generate.
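A minimal sketch of the idea that a virtual and a real geo-object carry the same metadata and differ only in whether the data part exists. The field names, the sample values, and the file path are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class GeoObject:
    object_type: str
    bbox: Tuple[float, float, float, float]  # (west, south, east, north)
    time: str
    data_format: str
    # None means the data part has not been produced yet (virtual geo-object).
    data_location: Optional[str] = None

    @property
    def is_virtual(self) -> bool:
        return self.data_location is None

    def catalog_record(self) -> dict:
        """Metadata as a client sees it: the same fields for real and virtual."""
        return {
            "type": self.object_type,
            "bbox": self.bbox,
            "time": self.time,
            "format": self.data_format,
        }


# Hypothetical example values.
box = (-125.0, 32.0, -114.0, 42.0)
archived = GeoObject("NDVI", box, "2000-07", "HDF-EOS", "/archive/ndvi_2000_07.hdf")
virtual = GeoObject("NDVI", box, "2000-07", "HDF-EOS")
```

The catalog record deliberately omits the data location, so a client comparing the two records sees no difference.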
Page 12
From Geospatial model to user geo-object
[Figure: Geospatial model → virtual geo-object → logical workflow → concrete workflow → workflow execution → user geo-object. Phases marked along the chain: knowledge capture phase, user query phase, user retrieval phase.]
Page 13
Knowledge Capture: Creation of Geospatial Models by Users/Experts
• A user-requested geo-object may exist neither as a real (non-virtual) product nor as a virtual one.
• If the user knows the step-by-step thought process for creating the geo-object from lower-level inputs (logical geospatial modeling):
– With the help of a good user interface and the available service modules and models/submodels, the user can construct a geospatial model (virtual data product) interactively.
– The system can then produce the virtual data product for the user.
– The user-created model can be incorporated into the system as part of the virtual datasets the system provides.
• This allows the system's capabilities to grow over time.
• Advantages:
– Users obtain ready-to-use scientific information instead of raw data, significantly reducing data traffic between users and the geospatial Grid.
– Users can explore the huge resources available in a data Grid and conduct tasks they were never able to conduct before.
Page 14
Knowledge Capture: From Geo-tree to virtual geo-object
• The root node of a geo-tree is a virtual geo-object type.
• When a user/client requests a geo-object, they provide a description of the geo-object they want.
• If the type of the user geo-object matches the virtual geo-object type of a geo-tree:
– The geo-tree is selected.
– The root node is instantiated with the description provided by the client.
– The root node becomes a virtual user geo-object.
• The next step is to determine whether the virtual user geo-object can be materialized on the fly.
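The type match and root instantiation described above might look roughly like this. It is a sketch, assuming a simple in-memory lookup; the tree identifiers, type names, and request fields are all hypothetical.

```python
from typing import Dict, Optional

# Hypothetical catalog: geo-tree id -> virtual geo-object type of its root node.
ROOT_TYPES: Dict[str, str] = {"ndvi_tree": "NDVI", "slope_tree": "Slope"}


def instantiate_root(request: Dict[str, object]) -> Optional[Dict[str, object]]:
    """Select the geo-tree whose root type matches the request and attach
    the client's description, giving a virtual user geo-object."""
    for tree_id, root_type in ROOT_TYPES.items():
        if root_type == request["type"]:
            description = {k: v for k, v in request.items() if k != "type"}
            return {"tree": tree_id, "type": root_type, "description": description}
    return None  # no model in the system can generate this type


# Hypothetical client request.
request = {"type": "Slope", "bbox": (-125.0, 32.0, -114.0, 42.0), "format": "GeoTIFF"}
virtual_user_object = instantiate_root(request)
```

At this point nothing has been checked against the catalogs; the result only records which model could in principle produce the requested type.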
Page 15
User query phase: Logical Instantiation of Geo-Tree
• Check whether a virtual user geo-object can be materialized by instantiating the whole geo-tree:
– Push the description of the virtual user geo-object (e.g., spatial coverage, format) down to each node of the geo-tree.
– Discover service and geo-object instances by searching both the service instance catalog and the geo-object instance catalog.
– If an archive geo-object is found as the input of a process, the push-down stops for that branch of the tree.
• Logical instantiation does not create an actual workflow; conceptually, it creates a logical workflow.
• If a geo-tree can be instantiated logically, the virtual user geo-object can be materialized.
– A logical ID is created and returned to the client to indicate that the requested geo-object is found in the system.
– The client/user uses the logical ID to request the geo-object.
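The push-down check can be sketched as a recursive walk over the geo-tree. This is an illustration, not the prototype's algorithm: the tree, both catalogs, the use of a region name as the only description item, and the logical ID format are all assumptions.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical geo-tree: virtual geo-object type -> (service type, input types).
GEO_TREE: Dict[str, Tuple[str, List[str]]] = {
    "LandslideSusceptibility": ("SusceptibilityModel", ["Slope", "LandCover"]),
    "Slope": ("SlopeComputation", ["DEM"]),
}
# Hypothetical catalogs: service types that have a registered instance, and
# the regions for which archive geo-objects of each type exist.
SERVICE_CATALOG: Set[str] = {"SusceptibilityModel", "SlopeComputation"}
ARCHIVE_CATALOG: Dict[str, Set[str]] = {
    "DEM": {"California", "Nevada"},
    "LandCover": {"California"},
}


def can_materialize(object_type: str, region: str) -> bool:
    """Push the requested region down the tree and check every branch."""
    if region in ARCHIVE_CATALOG.get(object_type, set()):
        return True  # archive geo-object found: stop descending this branch
    if object_type not in GEO_TREE:
        return False  # not archived, and no model produces it
    service_type, input_types = GEO_TREE[object_type]
    if service_type not in SERVICE_CATALOG:
        return False  # no service instance for this process node
    return all(can_materialize(t, region) for t in input_types)


def logical_instantiate(object_type: str, region: str) -> Optional[str]:
    """Return a logical ID if the request can be met; no workflow is built."""
    if can_materialize(object_type, region):
        return f"virtual:{object_type}:{region}"
    return None
```

With these sample catalogs, a California request succeeds, while a Nevada request fails because no land-cover archive object covers Nevada.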
Page 16
User retrieval phase: Physical Instantiation of Geo-Tree
• When a client requests a user geo-object, the geo-object ID indicates whether the user geo-object is virtual.
• If the geo-object is virtual, the geo-tree associated with it is instantiated to create a concrete workflow.
– A workflow language is used to encode the workflow.
– The workflow is executable in a workflow execution engine.
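A rough sketch of physical instantiation: service types and geo-object types are replaced by instances, and the steps are emitted in dependency order. A plain Python list stands in for the workflow-language encoding here. The tree, the type-to-service mapping, and the granule names are hypothetical; the two service names are borrowed from the prototype's service list purely as labels.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical geo-tree: virtual geo-object type -> (service type, input types).
GEO_TREE: Dict[str, Tuple[str, List[str]]] = {
    "LandslideSusceptibility": ("SusceptibilityModel", ["Slope", "LandCover"]),
    "Slope": ("SlopeComputation", ["DEM"]),
}
# Assumed results of catalog discovery: a service instance per service type
# and an archive geo-object per leaf type.
SERVICE_INSTANCES: Dict[str, str] = {
    "SlopeComputation": "GridSlope",
    "SusceptibilityModel": "GridLandslide_Susceptibility_2i",
}
ARCHIVE_OBJECTS: Dict[str, str] = {"DEM": "dem_granule", "LandCover": "landcover_granule"}


def build_workflow(object_type: str, steps: Optional[List[dict]] = None) -> List[dict]:
    """Post-order walk: inputs are scheduled before the step that uses them."""
    steps = [] if steps is None else steps
    if object_type in ARCHIVE_OBJECTS:
        return steps  # archive geo-object: nothing to compute
    service_type, input_types = GEO_TREE[object_type]
    for input_type in input_types:
        build_workflow(input_type, steps)
    steps.append({
        "service": SERVICE_INSTANCES[service_type],
        "inputs": [ARCHIVE_OBJECTS.get(t, f"tmp:{t}") for t in input_types],
        "output": f"tmp:{object_type}",
    })
    return steps


workflow = build_workflow("LandslideSusceptibility")
```

A real system would serialize these steps into the chosen workflow language and hand them to the execution engine.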
Page 17
User retrieval phase: Creation of user geo-object
• The workflow engine executes the workflow and generates the user geo-object.
• The user geo-object is returned to the user/client.
• The steps above reflect a two-stage process:
– User query
– User retrieval
• The two stages can also be merged into a one-stage process: when a user query can be met, the resulting user geo-object is pushed back to the user automatically, without the user initiating the retrieval.
Page 18
The Implementation: Testbed Environment – Virtual Organization
[Figure: testbed nodes grouped by virtual organization (VO), with authentication among the different VOs.]
GMU LAITS VO (GMU CA center):
– GMU (Solaris), laits.gmu.edu: Globus 4.0.1 with GMU certs.
– GMU (Mac), geobrain.laits.gmu.edu: Globus 4.0.1 with GMU certs.
– GMU (Linux), data.laits.gmu.edu: Globus 4.0.1 with GMU certs.
NASA IPG VO (IPG CA center):
– Ames ipg05 (Linux), ipg05.ipg.nasa.gov: Globus 4.0.1 with IPG certs.
CEOS VO:
– NASA SGT (Linux), arao2.sgt-inc.com: Globus 3.2 with CEOS certs.
– NASA (Linux), former.intl-interfaces.net: Globus 3.0 with CEOS certs.
LLNL ESG VO (ESG CA center):
– LLNL esg2 (Linux), esg2.llnl.gov: Globus 4.0.1 with ESG certs.
Page 19
Testbed Environment - Hardware
The testbed has been established with 5 machines in 3 organizations: CSISS/GMU, Ames/NASA, and LLNL/DOE.
The flagship machine in the testbed is GMU's Apple cluster server:
– 6 Apple G5 server nodes (3 with dual 2.5 GHz CPUs and 3 with dual 2.0 GHz CPUs), with a total of 12 GB RAM.
– 22.6 TB RAID storage.
– 1 Gbps network link to Internet II and 100 Mbps to Internet I.
– Hosted at the ESDIS network lab of GSFC.
Page 20
Testbed Environment - Software
Globus 4.0.1 was installed at all nodes.
The geospatial Grid software developed by GMU has been installed at all nodes.
Set up CAs, issued certificates, and established the VOs as the testbed:
– Set up the LAITS CA and issued LAITS certificates to the Mac, Solaris, and Linux machines of GMU LAITS.
– Set up the IPG CA and issued IPG certificates to the Linux machine at NASA Ames.
– Used the ESG CA to issue an ESG certificate to the Linux machine at DOE LLNL.
– Tested and debugged authentication between certificates from any two different CAs across all of the above machines.
The prototype system for Grid-based geospatial process modeling and virtual geospatial product generation is set up at GMU.
Page 21
Testbed Environment - Data
Populated the G5 server with:
– Landsat data covering the globe for the years 1975, 1990, and 2000. A total of 15 TB of data has been ingested so far.
– Shuttle DEM data covering the globe for the year 2000. The size of the DEM is about 1 TB.
– Other sample EOSDIS data (e.g., MODIS, ASTER).
Converted part of the DOE LLNL netCDF modeling data from ESG to HDF-EOS format and loaded it into the LLNL node of the testbed.
Replicated some typical EOSDIS data at the NASA Ames node.
The total size of data in the testbed is now around 17 TB.
Page 22
Overall Architecture of the Geospatial Grid for both Real and Virtual Data Servings
Diagram of user request and data workflow
[Figure: three-tier architecture (User Tier, OGC Tier, Grid Tier) on Globus Toolkit 4.0/4.0.1 with GSI. Components shown: Client; CSW Portal; LAITS WCS Portal; LAITS WMS Portal (default WCS/WMS portal IP); other WCS; LAITS GridCSW; GCSF; GESGCS; ECHO Catalog; iGSM; ROS; RLS; MDS; Ames DTS; LAITS GridWCS; Ames GridWCS; LLNL GridWCS; VGVWCS/Instantiator; VGWES; GridWICS; GridWCTS. Data stores: HDF-EOS data, NetCDF data, other data. Arrow labels include "Real data request" and "V"/"V+".]
Page 23
CSW, WCS, WMS Portals
The portals allow OGC-compliant Web clients to access the resources managed by the Geospatial Grid without knowing that the Grid exists.
To OGC clients, the portals are standard OGC servers; to the Grid underneath, the portals are authorized Grid users.
CSW portal for catalog access
WCS portal for access to both virtual and real coverage data
WMS portal for access to both virtual and real maps
Page 24
GCSW, GWMS, and GWCS
The OGC Catalog Service for Web (CSW), Web Map Service (WMS), and Web Coverage Service (WCS) are fundamental services for geospatial data access.
These services have been implemented as Grid services for managing and accessing geospatial data in the Grid. They are the fundamental services in any geospatial Grid.
The services were originally implemented as Web services; for this project we ported them to the Globus 4.0.1 environment as Grid services.
Page 25
Implementation: Grid Geospatial Registries
• The scheme presented here depends strongly on type matching.
• Registries have to be created to register:
– Service types
– Geo-object types
– Service instances
– Geo-objects
• A registry must be able to:
– Find instances based on search criteria.
– Search for instances by type.
– Find the type of a given instance.
– Find the association between geo-object types and service types.
• Metadata standards are needed to describe both types and instances of geo-objects and services.
• The OGC Catalog Service for Web (CSW) is used to construct the geospatial registry.
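The four lookups a registry must support can be sketched with an in-memory class. This is a toy stand-in, not a CSW implementation: the class, its method names, and the sample entries are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional


class Registry:
    """Toy in-memory registry; a real one would sit behind a catalog service."""

    def __init__(self) -> None:
        self._metadata: Dict[str, Dict[str, str]] = {}   # instance id -> metadata
        self._by_type: Dict[str, List[str]] = defaultdict(list)
        self._service_for: Dict[str, str] = {}            # geo-object type -> service type

    def register(self, instance_id: str, type_name: str, **metadata: str) -> None:
        self._metadata[instance_id] = {"type": type_name, **metadata}
        self._by_type[type_name].append(instance_id)

    def associate(self, object_type: str, service_type: str) -> None:
        self._service_for[object_type] = service_type

    def find(self, **criteria: str) -> List[str]:
        """Find instances whose metadata matches every search criterion."""
        return [
            instance_id
            for instance_id, meta in self._metadata.items()
            if all(meta.get(key) == value for key, value in criteria.items())
        ]

    def instances_of(self, type_name: str) -> List[str]:
        """Search instances by type."""
        return list(self._by_type.get(type_name, []))

    def type_of(self, instance_id: str) -> str:
        """Find the type of a given instance."""
        return self._metadata[instance_id]["type"]

    def service_type_for(self, object_type: str) -> Optional[str]:
        """Find the service type associated with a geo-object type."""
        return self._service_for.get(object_type)


# Hypothetical entries.
registry = Registry()
registry.register("GridSlope", "SlopeComputation", kind="service")
registry.register("dem_ca", "DEM", kind="geo-object", region="California")
registry.register("dem_nv", "DEM", kind="geo-object", region="Nevada")
registry.associate("Slope", "SlopeComputation")
```

Keeping the type-to-service association separate from the instance records is what lets a model be defined before any matching instance is registered.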
Page 26
Geospatial Metadata Standards
• We use ISO 19115 and ISO 19115.2 to describe geo-objects.
– ISO 19115 is an international standard for geospatial metadata.
– ISO 19115.2 is an imagery extension of ISO 19115 that provides additional metadata elements for describing geospatial imagery.
– Not all elements are used in our project: only the mandatory elements plus those needed for type matching (about 50 elements).
• There is no standard for describing the algorithms used in geospatial services.
– Users need accurate descriptions of the individual algorithms used in geospatial services in order to create geospatial models.
Page 27
Grid Catalog Federation - GCSF
GCSF harmonizes user queries among the GMU CSW, ESG CS, and ECHO.
– Queries to NASA ECHO are issued through the GMU GCSF.
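One way to picture a catalog federation is a fan-out search that merges results from several back ends. This sketch makes strong assumptions: the stub catalogs, record fields, and de-duplication rule are invented for illustration, and a real federation would also translate each query into the vocabulary of the target catalog.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
CatalogSearch = Callable[[Dict[str, str]], List[Record]]


def federated_search(query: Dict[str, str],
                     catalogs: Dict[str, CatalogSearch]) -> List[Record]:
    """Send one query to every catalog and merge the hits, dropping duplicates."""
    merged: List[Record] = []
    seen = set()
    for name, search in catalogs.items():
        for record in search(query):
            if record["id"] in seen:
                continue  # same geo-object already reported by another catalog
            seen.add(record["id"])
            merged.append({**record, "source": name})
    return merged


def stub_catalog(records: List[Record]) -> CatalogSearch:
    """Hypothetical back end that filters its records by geo-object type."""
    def search(query: Dict[str, str]) -> List[Record]:
        return [r for r in records if r["type"] == query["type"]]
    return search


# Stub back ends standing in for the three catalogs; contents are made up.
catalogs = {
    "GMU CSW": stub_catalog([{"id": "a1", "type": "DEM"}]),
    "ESG CS": stub_catalog([{"id": "b7", "type": "ClimateModel"}]),
    "ECHO": stub_catalog([{"id": "a1", "type": "DEM"}, {"id": "c3", "type": "DEM"}]),
}
hits = federated_search({"type": "DEM"}, catalogs)
```

The client sees a single result list and does not need to know which catalog answered.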
Page 28
Software Development for Integration - iGSM
The intelligent Grid Service Mediator (iGSM) supports the WCS and WMS portals by distributing their requests to the proper GWCS and GWMS.
[Figure: components shown: WCS Portal, WMS Portal, iGSM, GCSW, GWCS, GWMS, ROS, MDS, DTS, GCSF.]
Page 29
Geo-Tree Instantiation Service (Instantiator)
• An Instantiator has been developed to work with the CSW services to convert a GeoTree into a physical workflow based on the user's specifications for the requested product.
• The Instantiator works as a Grid service.
• It has two operational modes:
– Logical instantiation: checks whether a user's request for a product can be met by a virtual product.
  • Works during the catalog search phase to ensure the virtual product can be materialized if the user does request it.
  • Does not actually create a physical workflow.
– Physical instantiation:
  • Works in the data retrieval phase, when the user actually retrieves a product that happens to be virtual.
  • Creates an executable workflow in BPEL.
Page 30
The Grid Workflow Execution Service (GWES): The Grid-enabled BPELPower
BPEL engine architecture: executes Grid services with standard BPEL workflows.
[Figure: BPELPower architecture. Components shown: BPEL Process Manager; BPEL processes, instances, and activities; WSDL services; logic process model; instantiation; deployment. Usage modes shown: as a server or middleware, and as a service. Client types shown: browser-oriented clients and service-oriented clients.]
Page 31
The Replica and Optimization Service (ROS) works with Globus RLS and other services to find the best resources in the VO.
[Figure: ROS on top of Globus RLS (as a Grid service) and the Globus Index Service (MDS, with modified Globus MDS scripts). RLS components shown: LRC (Laits), LRC (Laits-data), LRC (Ames/LLNL), RLI (Laits), RLI (Laits-data).]
Page 32
Grid-enabled Geo-Services (Instances)
• The availability of a large number of service modules (i.e., service instances) is one of the keys to a powerful customizable virtual product system.
• Two types of services: geo-access and geoprocessing.
• Geo-access/management services:
– GCSW
– GWCS
– GWMS
• We did not develop the other fundamental geo-access service, the Web Feature Service (WFS), since the project deals only with coverage data.
Page 33
Grid-enabled Geoprocessing Services
• Geoprocessing services come from two sources:
– Developed from scratch
– Converted from existing geoprocessing software packages
• We have developed the following Grid-enabled geoprocessing services:
– GWICS (Grid-enabled WICS)
– GWCTS (Grid-enabled WCTS)
– GridSlope
– GridAspect
– GridCalifornia_WHR3_Classification
– GridNDVI
– GridLandslide_Susceptibility_2i
– GridLandslide_Susceptibility_4i
• GRASS is an open-source standalone GIS and image processing package with about 230 geospatial process modules.
– We have wrapped the modules with service interfaces (SOAP) and created WSDL descriptions of those services.
– The modules have been ported to Globus 4.0.1 as Grid services.
• All services are registered in GCSW.
Page 34
Modeling Virtual Products
Graphical user interface for users to model virtual data products (VDP) with the support of ontologies.
Page 35
Registry of Abstract Model of VDP
Registration of Abstract Model to GridCSW
Page 36
Conclusions
• This presentation discussed the concepts and approach for virtualizing customizable geospatial products in a Geospatial Grid.
– Current technologies, interoperability standards, and network infrastructure allow a distributed, service-oriented geospatial virtual product system to be built.
– Systems built on such a scheme are more flexible and scalable, and can serve the user community much better than traditional non-service-based systems.
• The Grid-based prototype virtual product system has demonstrated that the concept and approach discussed in this presentation are feasible.
– The prototype system has successfully simulated the operational environment of large geospatial data repositories.