
Integrating Geographical Information Systems and Grid Services for Earthquake Forecasting

Marlon Pierce
Community Grids Lab
Indiana University
May 4, 2005

A Big Picture for Crisis Grid

Figure 1: Science, Critical Infrastructure Protection (CIP) and Emergency Preparedness and Response (EPR) Grids built as a Grid of Web Service (WS) Grids

[Figure labels: Earthquake Forecast WS; Earthquake Response WS; Science Grid; EPR/CIP Grid; Tsunami EPR/CIP; Portals; Visualization Grid; Collaboration Grid; Sensor/Database Grid; Compute Grid; GIS Grid; Core Grid Services (Registry, Metadata, Data Access/Storage, Security, Workflow, Notification, Messaging); Physical Network.]

The Problem: Integrating Data, Applications, and Client Devices

The key issue we try to solve is building the distributed computing infrastructure that can connect
• Legacy data archives
• Executable codes
• Real-time data sources
• Collaboration services (http://www.globalmmcs.org)
• Client tools for collaboration
    • Audio/video systems, whiteboard annotators, etc.

Various application-specific grids can be built out of the common infrastructure
• Science Grids (described here)
• Emergency planning, crisis response

We choose certain fixed points for our foundations
• Web Service standards: SOAP and WSDL
• Other standards where available: GIS standards
• Universal messaging substrate for SOAP and other messages: http://www.naradabrokering

Pattern Informatics (PI)

PI is a technique developed at University of California, Davis for analyzing earthquake seismic records to forecast regions with high future seismic activity.
• They have correctly forecast the locations of 15 of the last 16 earthquakes with magnitude > 5.0 in California.

See Tiampo, K. F., Rundle, J. B., McGinnis, S. A., & Klein, W. Pattern dynamics and forecast methods in seismically active regions. Pure Appl. Geophys. 159, 2429-2467 (2002).
• http://citebase.eprints.org/cgi-bin/fulltext?format=application/pdf&identifier=oai%3AarXiv.org%3Acond-mat%2F0102032

PI is being applied to other regions of the world, and John Rundle has gotten a lot of press.
• Google “John Rundle UC Davis Pattern Informatics”

Pattern Informatics in a Grid Environment

PI in a Grid environment:
• Hotspot forecasts are made using publicly available seismic records.
    • Southern California Earthquake Data Center
    • Advanced National Seismic System (ANSS) catalogs
• Code location is unimportant; the code can be offered as a service through remote execution.
• Results need to be stored, shared, and modified.
• Grid/Web Services can provide these capabilities.

Problems:
• How do we provide programming interfaces (not just user interfaces) to the above catalogs?
• How do we connect remote data sources directly to the PI code?
• How do we automate this for the entire planet?

Solutions:
• Use GIS services to provide the input data and plot the output data.
• Use the HPSearch tool to tie together and manage the distributed data sources and code.

CGL Work on GIS Services

Some example OGC services include
• Web Feature Service (WFS): for retrieving GML-encoded features, like faults, roads, county boundaries, GPS station locations, ...
• Web Map Service (WMS): for creating maps out of Web Features

Problems with current GIS services
• Not (yet) Web Service compliant
    • Efforts are underway to provide this within OGC.
    • But current specs are “pre” Web Service: no SOAP or WSDL.
    • They use HTTP GET/POST conventions instead.
• Often define general Web Service services as specialized standards
    • Information services
    • Notification services in sensor grids
• Can’t use other Web Service standards for reliability, security, etc.

CGL is developing Web Service versions of OGC standard services
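To make the "pre" Web Service convention concrete, here is a minimal Python sketch of how a client might assemble OGC-style HTTP GET requests from key-value pairs. The endpoint address, the feature type name "fault", and the bounding box are assumptions for illustration only; they do not describe any CGL service.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint; substitute the address of a real WFS.
WFS_ENDPOINT = "http://example.org/wfs"


def wfs_get_url(request, **params):
    """Build an OGC-style HTTP GET URL from key-value pairs."""
    query = {"SERVICE": "WFS", "VERSION": "1.0.0", "REQUEST": request}
    query.update(params)
    return WFS_ENDPOINT + "?" + urlencode(query)


capabilities_url = wfs_get_url("GetCapabilities")

# "fault" is an assumed feature type name; BBOX is minx,miny,maxx,maxy.
features_url = wfs_get_url(
    "GetFeature", TYPENAME="fault", BBOX="-119.0,34.0,-118.0,35.0"
)

# Decode the query string again to show what the server would receive.
decoded = parse_qs(urlsplit(features_url).query)

print(capabilities_url)
print(features_url)
print(decoded)
```

Nothing here involves SOAP or WSDL: the whole request lives in the URL, which is why the generic Web Service standards for reliability and security cannot be layered on top of it.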

Tying It All Together: HPSearch

HPSearch is an engine for orchestrating distributed Web Service interactions
• It uses an event system and supports both file transfers and data streams.
• Legacy name

HPSearch flows can be scripted with JavaScript
• The HPSearch engine binds the flow to a particular set of remote services and executes the script.

HPSearch engines are Web Services, so they can be distributed and interoperate for load balancing.
• Boss/Worker model

ProxyWebService: a wrapper class that adds notification and streaming support to a Web Service.

[Figure: PI workflow managed by HPSearch. Components (host in parentheses): WMS; WFS (Gridfarm001); Data Filter (Danube); PI Code Runner (Danube): Accumulate Data, Run PI Code, Create Graph, Convert RAW -> GML; GML (Danube); WS Context (Tambora); HPSearch (TRex); HPSearch (Danube). Legend: actual data flow; virtual data flow.]

• WMS submits a script execution request (URI of script, parameters).
• HPSearch hosts an AXIS service for remote deployment of scripts.
• HPSearch controls the Web Services.
• HPSearch engines communicate using the NB messaging infrastructure.
• NaradaBroker network: used by HPSearch engines as well as for data transfer.
• Data can be stored in and retrieved from the third-party repository (Context Service).
• Final output is pulled by the WMS.


Some Challenges

Performance: Are GIS services suitable for non-trivial data transfers?
• Entire California seismic record since 1932 is 12 MB.
• Global records are obviously larger.
• This is not really suitable for HTTP downloads.
• We are currently pursuing streaming data transfers for higher performance.

Adoption: We must get the tools and services to the point where science application developers want to use them early in the development process rather than later.
• Web Service client libraries to remote GIS data
• Develop codes to work with data streams rather than files.

Security: A global version of this has interesting security requirements
• Authentication, authorization, federation for different countries
• Time/event-dependent security for crisis response

More Information

Contact: [email protected]
GIS Work at CGL: www.crisisgrid.org
HPSearch at CGL: www.hpsearch.org
SERVOGrid Web Sites
• Our fine parent project
• http://servo.jpl.nasa.gov/
• http://quakesim.jpl.nasa.gov/

Status and Software

Web Feature Service 1.x software available now
• www.crisisgrid.org

Our SERVO WFS includes
• Fault data
• GPS records
• Seismic records now for most areas of the globe
• Note these are Web Services, so you can build your own clients to connect to our running services.

Web Map Service
• Client portlets (shown) and services to be public soon.
• Software downloads available soon.

HPSearch
• Currently available, www.hpsearch.org

WS-Context and Information Services Work
• http://grids.ucs.indiana.edu/~maktas/fthpis/

Acknowledgements

Prof. John Rundle and his group at UC-Davis, particularly James Holliday.
Prof. Geoffrey Fox, director of the Community Grids Lab, and CGL grad students
• Ahmet Sayar: Web Map Server development
• Galip Aydin: Web Feature Service development
• Mehmet Aktas: Information and Context Services
• Harshawardhan Gadgil: HPSearch Workflow development
Satellite images from NASA OnEarth WMS
This work is supported by the NASA Advanced Information Systems Technology Program.

Back up slides

A Screen Shot From the WMS Client

When you select (i) and click on a feature in the map

SERVO Apps and Their Data

Pattern Informatics: using seismic records, forecasts “hotspot” areas of high future seismic activity.

Disloc: handles multiple arbitrarily dipping dislocations (faults) in an elastic half-space.
• Relies upon geometric fault models.

GeoFEST: Three-dimensional viscoelastic finite element model for calculating nodal displacements and tractions. Allows for realistic fault geometry and characteristics, material properties, and body forces.
• Relies upon fault models with geometric and material properties.

Virtual California: Program to simulate interactions between vertical strike-slip faults using an elastic layer over a viscoelastic half-space.
• Relies upon fault and fault friction models.

RDAHMM: Time series analysis program based on Hidden Markov Modeling. Produces feature vectors and probabilities for transitioning from one class to another.
• Used to analyze GPS and seismic catalogs.

Where Is the Data?

QuakeTables Fault Database
• SERVO’s fault repository for California.
• Compatible with GeoFEST, Disloc, and VirtualCalifornia
• http://infogroup.usc.edu:8080/public.html

GPS Data sources and formats (RDAHMM and others).
• JPL: ftp://sideshow.jpl.nasa.gov/pub/mbh
• SOPAC: ftp://garner.ucsd.edu/pub/timeseries
• USGS: http://pasadena.wr.usgs.gov/scign/Analysis/plotdata/

Seismic Event Data (RDAHMM and others)
• SCSN: http://www.scec.org/ftp/catalogs/SCSN
• SCEDC: http://www.scecd.scec.org/ftp/catalogs/SCEC_DC
• Dinger-Shearer: http://www.scecdc.org/ftp/catalogs/dinger-shearer/dinger-shearer.catalog
• Hauksson: http://www.scecdc.scec.org/ftp/catalogs/hauksson/Socal

What Are Web Services?

The Web Services framework is a way of doing distributed computing with XML.
• WSDL: Defines interfaces to functions of remote components. This is the programming interface.
• SOAP: Defines the message format that you exchange between components. This carries the messages over the network.
• Web Services are not web pages, CGI, servlets, or applets.

XML provides cross-language support.
Suitable for both human and application clients.
We built SERVOGrid’s execution grid services this way.
We are building data grid components in the same fashion.
• Allows us to build general-purpose client environments like Web portals and command line tools.
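As a rough illustration of what "SOAP defines the message format" means, the following Python sketch builds a SOAP 1.1 envelope with the standard library. The application namespace, the operation name runDisloc, and its inputFile argument are hypothetical and are not taken from the SERVOGrid WSDL.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical application namespace, used only for this illustration.
APP_NS = "http://example.org/servo"

ET.register_namespace("soapenv", SOAP_NS)


def build_soap_request(operation, arguments):
    """Wrap one remote call in an Envelope/Body pair and return it as text."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in arguments.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Operation and argument names are assumptions, not a real service contract.
message = build_soap_request("runDisloc", {"inputFile": "fault.input"})
print(message)
```

In a real deployment the operation and argument names would come from the service's WSDL, and a toolkit such as Apache Axis would generate this message rather than hand-written code.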

[Figure labels: Browser; Web Server; SOAP; WSDL; Application; JDBC; DB.]

[Figure labels: Browser Interface; HTTP(S); User Interface Server; SOAP; WSDL; DB Service 1 (JDBC, DB); Job Sub/Mon and File Services (Operating and Queuing Systems); Viz Service (IDL, GMT); Host 1, Host 2, Host 3.]

GIS for Data Grids

Geographic Information Systems
• ESRI: commercial company with many popular GIS products.
• Open Geospatial Consortium (formerly OpenGIS Consortium).
• We will focus on OGC since they define open and interoperable standards.

What are the characteristics of a GIS?
• Need data models to represent information.
• Need services for remotely accessing data.
• Need metadata for determining what is stored in the services.

GML: A Data Model For GIS

GML 3.x is an interconnected suite of over 20 XML schemas.
GML is an abstract model for geography.
With GML, you can encode
• Features: abstract representations of map entities.
• Geometry: encode abstractly how to represent a feature pictorially.
• Coordinate reference systems
• Topology
• Time, units of measure
• Observation data.

Using GML to Encode GPS

Collected GPS data for each site is made available through online catalogs.
• Using various text formats.

We have developed GML descriptions of GPS using the Feature.xsd schema.
Full GML GPS schema: http://www.crisisgrid.org/files/gps.xsd.
Supports JPL, SOPAC, and USGS formats.
Other efforts: Feature encodings of Seismic and Fault catalogs.
• http://www.crisisgrid.org/html/servo_-_gps.html

Anatomy of Web Feature Service

GML provides the data model, but we must still provide a remote programming interface to the data.
Web Feature Service provides three major services as described in the OGC specification:
• GetCapabilities: The client (a WMS server or a user) starts by requesting a document from the WFS that describes its capabilities. When a GetCapabilities request arrives, the server dynamically creates a capabilities document and returns it.
    • This is OGC’s formalization of metadata, so important to GEON, ESG, etc.
• DescribeFeatureType: After receiving the capabilities document, the client can request a more detailed description of any of the features listed in it.
    • The WFS returns an XML schema that describes the requested feature.
    • Metadata about a specific entry.
• GetFeature: The client can ask the WFS to return a particular portion of any feature data.
    • GetFeature requests contain some property names of the feature and a Filter element to describe the query.
    • The WFS extracts the query and bounding box from the filter and queries the feature databases.
    • The results obtained from the DB query are converted to the feature’s GML format and returned to the client as a FeatureCollection.
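The GetFeature round trip can be sketched in Python as follows: a client encodes property names and a bounding-box Filter, and a server pulls the bounding box back out and turns it into a database query. The element layout loosely follows WFS 1.0.0 and is not schema-validated; the geometry property name, the faults table, and its lon/lat columns are assumptions, not the SERVO WFS implementation.

```python
import xml.etree.ElementTree as ET

NS = {
    "wfs": "http://www.opengis.net/wfs",
    "ogc": "http://www.opengis.net/ogc",
    "gml": "http://www.opengis.net/gml",
}


def q(prefix, tag):
    """Return a namespace-qualified tag name for ElementTree."""
    return f"{{{NS[prefix]}}}{tag}"


def build_get_feature(type_name, properties, bbox):
    """Client side: encode property names and a bounding-box Filter."""
    request = ET.Element(q("wfs", "GetFeature"), {"service": "WFS", "version": "1.0.0"})
    query = ET.SubElement(request, q("wfs", "Query"), {"typeName": type_name})
    for name in properties:
        ET.SubElement(query, q("ogc", "PropertyName")).text = name
    flt = ET.SubElement(query, q("ogc", "Filter"))
    bbox_el = ET.SubElement(flt, q("ogc", "BBOX"))
    # "geometry" is an assumed name for the feature's geometry property.
    ET.SubElement(bbox_el, q("ogc", "PropertyName")).text = "geometry"
    box = ET.SubElement(bbox_el, q("gml", "Box"))
    minx, miny, maxx, maxy = bbox
    ET.SubElement(box, q("gml", "coordinates")).text = f"{minx},{miny} {maxx},{maxy}"
    return ET.tostring(request, encoding="unicode")


def to_sql(request_xml):
    """Server side: extract the bounding box and build a parameterized query."""
    root = ET.fromstring(request_xml)
    query = root.find("wfs:Query", NS)
    coords = query.find("ogc:Filter/ogc:BBOX/gml:Box/gml:coordinates", NS).text
    (minx, miny), (maxx, maxy) = [
        tuple(float(v) for v in pair.split(",")) for pair in coords.split()
    ]
    # Table names come from a fixed mapping (assumed), never from raw input.
    table = {"fault": "faults"}[query.get("typeName")]
    sql = f"SELECT * FROM {table} WHERE lon BETWEEN ? AND ? AND lat BETWEEN ? AND ?"
    return sql, (minx, maxx, miny, maxy)


request_xml = build_get_feature("fault", ["name"], (-119.0, 34.0, -118.0, 35.0))
sql, params = to_sql(request_xml)
print(sql, params)
```

Mapping the requested type name through a fixed table lookup, and passing coordinates as query parameters, keeps client-supplied text out of the SQL string.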

Sample Feature: CA Fault Lines

<gml:featureMember>
  <fault>
    <name>Northridge2</name>
    <segment>Northridge2</segment>
    <author>Wald D. J.</author>
    <gml:lineStringProperty>
      <gml:LineString srsName="null">
        <gml:coordinates>-118.72,34.243 -118.591,34.176</gml:coordinates>
      </gml:LineString>
    </gml:lineStringProperty>
  </fault>
</gml:featureMember>

After receiving a GetFeature request, the WFS decodes the request, creates a DB query from it, and queries the database.

The WFS then retrieves the features from the database and converts them into GML documents.

Each feature instance is wrapped as a gml:featureMember element.

The WFS returns a wfs:FeatureCollection document which includes all featureMembers returned in the query result.
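A minimal Python sketch of the conversion step, assuming database rows arrive as dictionaries with name, segment, author, and a list of (longitude, latitude) points. The row layout is an assumption for illustration; the element names mirror the sample feature above, and the srsName attribute is omitted.

```python
import xml.etree.ElementTree as ET

GML = "http://www.opengis.net/gml"
WFS = "http://www.opengis.net/wfs"
ET.register_namespace("gml", GML)
ET.register_namespace("wfs", WFS)


def to_feature_collection(rows):
    """Wrap each row as a gml:featureMember inside one wfs:FeatureCollection."""
    collection = ET.Element(f"{{{WFS}}}FeatureCollection")
    for row in rows:
        member = ET.SubElement(collection, f"{{{GML}}}featureMember")
        fault = ET.SubElement(member, "fault")
        for field in ("name", "segment", "author"):
            ET.SubElement(fault, field).text = row[field]
        prop = ET.SubElement(fault, f"{{{GML}}}lineStringProperty")
        line = ET.SubElement(prop, f"{{{GML}}}LineString")
        # GML 2 style coordinates: "x,y" pairs separated by spaces.
        coords = " ".join(f"{lon},{lat}" for lon, lat in row["points"])
        ET.SubElement(line, f"{{{GML}}}coordinates").text = coords
    return ET.tostring(collection, encoding="unicode")


# One row shaped like the Northridge2 sample; the dict layout is assumed.
rows = [
    {
        "name": "Northridge2",
        "segment": "Northridge2",
        "author": "Wald D. J.",
        "points": [(-118.72, 34.243), (-118.591, 34.176)],
    }
]
document = to_feature_collection(rows)
print(document)
```

A client such as a WMS can parse the returned document with the same library and read the coordinates back out of each featureMember.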

[Figure: a client or WMS sends GetFeature requests to the WFS server and receives FeatureCollections. The WFS server issues SQL queries against feature databases: Railroads [a-b], Rivers [a-d], Bridges [1-5], Interstate Highways [12-18].]

• A Web Feature Service can serve data for multiple feature types.
• WFS returns the results of GetFeature requests as GML documents (Feature Collections).
• Clients may include other services as well as humans.
• Services include Web Map Servers.

Web Map Services

OGC Web Map Servers create digital maps from abstract data sets retrieved from Web Feature Services.
• Layer features are overlaid.

WMS servers can be daisy chained.
Our research efforts here are to build a Web Service version of the WMS as well as portlet clients.
We are developing backward compatibility with existing WMS.
• Purdue CAAGIS for Indiana maps.
• NASA JPL OnEarth WMS.

Future plans are to replace current GMT portal tools with WMS and to combine maps with GeoFEST and Virtual California application output.

[Figure: California boundaries and streams overlaid with QuakeTables faults.]
[Figure: Indiana map data proxied through legacy WMS.]

SERVOGrid Portal Screen Shots

Conclusion and Other Topics

We have reviewed Community Grids Lab efforts to build a GIS-compatible set of data grid services.
• This provides a unified architecture for grid execution and data grid services.

Two other important topics:
• Managing messaging between components.
    • NaradaBrokering project led by Dr. Shrideep Pallickara.
    • www.naradabrokering.org
• Managing client environments
    • Component-based portals: www.collab-ogce.org.
    • Command line and scripting interfaces for web services: www.hpsearch.org.

More Information

My email: [email protected]
Overview (submitted to ACES special issue of PAGEOPH).
• http://www.servogrid.org/slide/GEM/SERVO/ISERVO_ACES_PAGEOPH.doc
QuakeSim Project Page:
• http://quakesim.jpl.nasa.gov/
QuakeSim Portal:
• http://www.complexity.ucs.indiana.edu:8282.
QuakeTables Fault Database:
• http://infogroup.usc.edu:8080/public.html
Community Grids Lab GIS Grid Development:
• http://www.crisisgrid.org/.
• See particularly http://www.crisisgrid.org/html/servo.html for GML information.
• See http://www.crisisgrid.org/html/wfs.html for more on Web Feature Service.
• WMS and Information Services information coming soon.

Client Environments for Web Services

Web Services can support multiple types of clients.
Tools such as Apache Axis are service hosting environments.
• Manage deployed Web Services

Questions now:
• How do I manage several client applications for cooperating services?
• How do I manage multiple clients that collaborate through the same services?

G. Erlebacher, et al.

Client environments for Web Services
• Component-based Web portals: www.collab-ogce.org
• Shell environments:
    • Program and prototype complicated interactions between data and execution services.
    • Manage data streams.

Data flow in applications

Real-time processing required.
Typically, data transfer involves temporary storage of data. This data may be transferred using files (e.g., GridFTP).
• Every component of the chain processes data from an input file and writes processed data to an output file.
• Time and space are critical in real-time applications, so file-based transfer is undesirable for them.

Tools to automate data transfer and invoke applications (e.g., Grid Ant, Karajan)

HPSearch

Binds URI to a scripting language
• We use Mozilla Rhino (a JavaScript implementation; refer to http://www.mozilla.org/rhino), but the principles may be applied to any other scripting language.

Every resource may be identified by a URI, and HPSearch allows us to manipulate the resource using the URI.
• For example, read from a web address and write it to a local file:

    x = "http://trex.ucs.indiana.edu/data.txt";
    y = "file:///u/hgadgil/data.txt";
    r = new Resource("Copier");
    r.port[0].subscribeFrom(x);  /* read from */
    r.port[0].publishTo(y);      /* write to */

    f = new Flow();
    f.addStartActivities(r);
    f.start("1");

Support for the WS-Addressing construct is under investigation.
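For readers without an HPSearch installation, the same "read from one URI, write to another" idea can be approximated in plain Python. This is only an analogue built on the standard library, not the HPSearch API: the copy_resource helper is hypothetical, and the demonstration uses local file: URIs so that it runs without network access.

```python
import tempfile
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import url2pathname, urlopen


def copy_resource(source_uri, target_uri, block_size=8192):
    """Read from any URI that urllib can open and write to a file: URI."""
    target = urlparse(target_uri)
    if target.scheme != "file":
        raise ValueError("this sketch only writes to file: URIs")
    copied = 0
    with urlopen(source_uri) as src, open(url2pathname(target.path), "wb") as dst:
        # Copy block by block so large resources never sit fully in memory.
        while block := src.read(block_size):
            dst.write(block)
            copied += len(block)
    return copied


# Demonstration with temporary local files standing in for remote resources.
workdir = Path(tempfile.mkdtemp())
(workdir / "data.txt").write_bytes(b"1932 01 02 5.4\n")
copied = copy_resource(
    (workdir / "data.txt").as_uri(), (workdir / "copy.txt").as_uri()
)
print(copied, "bytes copied")
```

Replacing the source with an http: URI would exercise the same code path, since urlopen chooses the handler from the URI scheme.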

Architecture Components

SHELL
• Front end to scripting.

TASK_SCHEDULER (FLOW_ENGINE)
• Distributes tasks among co-operating engines for load-balancing purposes.

WSPROXY
• An AXIS web service that wraps an actual service. The behavior of the service can be controlled by making simple WS calls to this proxy.
    • Can be controlled by any workflow engine.
    • WSProxy handles streaming data communication on behalf of the service.
• The service only sees input and output streams. These could be files, a remote data stream, a file transferred via HTTP/FTP, or results from a database query.
• Can be deployed in standard Web Service containers (such as Tomcat).

WSProxy Interfaces

Runnable
• More control over execution (start, suspend, resume, stop…)
• Basic idea: read a block of data, process it, write it out.
• Ideal for designing quick filtering applications that process data in streams.

Wrapped
• Wraps an existing service (executables [*.exe], Matlab scripts, shell/Perl scripts, etc.)
• Less control: can only start and stop.
• Ideal for wrapping existing programs/services to expose as a pluggable component/web service.
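The Runnable idea (read a block, process it, write it out, with external control over execution) can be sketched in Python as below. The BlockFilter class and its methods are hypothetical names for illustration; the slides do not show WSProxy's actual Java interface, and suspend/resume are left out for brevity.

```python
import io


class BlockFilter:
    """Read a block, process it, write it out, until stopped or out of data."""

    def __init__(self, process, block_size=4096):
        self.process = process
        self.block_size = block_size
        self.running = False

    def stop(self):
        # Takes effect before the next block is read.
        self.running = False

    def run(self, source, sink):
        """Pump data from source to sink and return the number of blocks handled."""
        self.running = True
        blocks = 0
        while self.running:
            block = source.read(self.block_size)
            if not block:
                break
            sink.write(self.process(block))
            blocks += 1
        self.running = False
        return blocks


# In-memory streams stand in for files, sockets, or broker topics.
source = io.BytesIO(b"gps estimate and error values")
sink = io.BytesIO()
blocks = BlockFilter(bytes.upper, block_size=8).run(source, sink)
print(blocks, sink.getvalue())
```

Because the filter only sees a readable source and a writable sink, the same loop works whether the data comes from a file or a live stream, which is the property the proxy relies on.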

HPSearch Architecture Overview

[Figure: each HPSearch Kernel contains a Request Handler, JavaScript Shell, Task Scheduler, Flow Handler, Web Service EP, and other objects, together with URIHandler, DBHandler, WSDLHandler, and WSProxyHandler. Several kernels connect through the Broker Network to a database, a Web Service, files, sockets, topics, and WSProxy-wrapped services.]

Applications: Streaming Data Filtering

[Figure: GPS data from a sensor source passes through (distributed) services: a Data Filter, which filters the input data to get only the estimate and error values; RDAHMM, which analyzes the data; and a Matlab plotting script, which produces a graph. HPSearch Kernel (TSE) instances manage the services.]

Acknowledgements

I have presented work from the following collaborators
• Prof. Geoffrey Fox, Community Grids Lab director
• Harshawardhan Gadgil: HPSearch
• Galip Aydin: Web Feature Service, GML, Sensor Grid
• Ahmet Sayar: Web Map Service and Clients
• Mehmet S. Aktas: GIS Information and Discovery Services
This work is supported by NASA Advanced Information Systems Technology (AIST)

SERVOGrid Basics

SERVOGrid is our project to build a distributed computing infrastructure to support earthquake simulation codes.

Distributed Computing Infrastructure
• Services for remotely executing applications, managing distributed files in logical units, monitoring applications, archiving user projects, and so on.
• Variously called “Grids” (DOE), “Cyberinfrastructure” (NSF), “e-Science” (UK e-Science program), ….

The above is sometimes referred to as an “Execution Grid”.
• We designed this around a web service architecture.
• See links at the end for more information.

But what about the data?
• Need ways to access remote data sources through programming interfaces.
• Practice often is to download (via FTP) entire catalogs and hand edit to get the data you want in the format you want.
• We need to automate this.

Japan-1

Japan-2

FeatureInfo Popup (non-geometric attributes of the selected features)

Turkey-2

Sample-1

Sample-1 (PI Output Plotting)

Sample-2 (PI Output Plotting)

Sample-3