High Performance Web Service Architecture for Sensors and Geographic Information Systems Galip Aydin


High Performance Web Service Architecture for Sensors and Geographic Information Systems

Galip Aydin


Geographic Information Systems

A Geographic Information System is a system for creating, storing, sharing, analyzing, manipulating and displaying spatial data and associated attributes.

GIS has evolved from mainframe GIS to desktop GIS to distributed GIS.

Modern GIS require:
- Distributed data access for spatial databases.
- Use of remote analysis, simulation, or visualization tools.


Traditional Distributed GIS Approach

Problems with traditional approaches:
- Distributed nature of the geo-data; various client-server models, databases, HTTP, FTP, RDBs, XML DBs, etc.
- Data format problems, conversion overheads.
- Data processing issues, hardware and software requirements, COM+/ActiveX, CORBA/IIOP frameworks.

These introduce three challenges:
- Assembling data from distributed repositories.
- Adoption of universal standards for format interoperability.
- Interoperable services for better utilization of computational resources.


Open Geographic Standards

- Open GIS standards bodies aim to make geographic information and services neutral and available across any network, application, or platform.
- Two major standards bodies: OGC and ISO/TC211, the former being the more popular.
- OGC specifications are widely accepted:
  - Data format specs: GML, SensorML, O&M
  - Service specs: WFS, WMS, WCS
- OGC services are HTTP GET/POST based, with limited data transport capabilities (HTTP, FTP, files, etc.).
- They are not Web Services; tightly coupled, point-to-point communication results in centralized, synchronous applications.


Motivations

- Lack of service orchestration capabilities
  - Complex problems require GIS applications to collaborate.
  - Coupling data sources to scientific applications
  - Data transport requirements
- Proliferation of sensors
  - Ability to analyze data on the fly, continuous streaming support, scalable systems for addition of new sensors.
- High performance and high rate messaging
  - Real-time data access, rapid response systems, crisis management, etc.
- From the Grids perspective
  - To apply general Grid/distributed computing principles to GIS
  - Investigate how to integrate with geophysical and other scientific applications


Motivating Use Cases

- Pattern Informatics
  - Earthquake forecasting code developed by Prof. John Rundle (UC Davis) and collaborators; uses seismic archives.
- Regularized Dynamic Annealing Hidden Markov Method (RDAHMM)
  - Time series analysis code; can be applied to GPS and seismic archives, and to real-time data.
- Interdependent Energy Infrastructure Simulation System (IEISS)
  - Models infrastructure networks (e.g. electric power systems and natural gas pipelines) and simulates their physical behavior and the interdependencies between systems.
- SOPAC GPS networks provide real-time messages.


Research Issues 1

- Applying Web Service principles to GIS data services
  - Orchestration of services, workflows; simple services are not suitable for large data sets or where quick response is required.
  - High performance support in GIS services.
- Interoperability
  - The system should bridge the GIS and Web Service communities by adapting standards from both.
  - Other GIS applications should be able to consume data without having to do costly format conversions.


Research Issues 2

- Scalability
  - The system should be able to handle high volume and high rate data transport and processing.
  - Plugging in new sensors, data sources, or geoprocessing applications should not degrade the system's overall performance.
- Flexibility and extensibility
  - How to develop real-time services to process sensor data on the fly.
  - Ability to add new filters without system failures.
- Quality of Service issues
  - Is the latency introduced by services in processing real-time sensor data acceptable?


SOA for GIS – Geophysical Data Grid

We use Web Services to realize a Service Oriented Architecture, and OGC data formats and application interfaces for interoperability at both the data and application levels.

GIS Data Grid Properties
- Based on its source, geospatial data can be seen as archival or real-time data. The architecture provides standard control and access interfaces for both types.
- Supports alternate transport and representation schemes; uses a topic-based messaging infrastructure for large volume data transport.
- UDDI-based FTHPIS as the services registry.
- Streaming and non-streaming services to access archived data.
- Real-time and near real-time services for accessing sensor metadata and sensor measurements.


Geophysical Data Grid Architecture

[Figure: architecture diagram with two parts, the Archival Data Grid and the Real-Time Data Grid.]


GIS Grid 1 - Archival Data Services

Web Feature Service is the default OGC specification for vector data.

We have built a Web Service version of the WFS for accessing geospatial data on distributed databases.

The first Web Service version of the WFS has been successfully used in several scientific workflows with other services (WMS, HPSearch, FTHPIS).

The WFS can access multiple distributed databases and can query other WFSs for remote features.

Problems with the Web Service version of the WFS:
- Request-response, not asynchronous.
- Performance: GI services are not designed to handle non-trivial data transfers. Large data requests, SOAP overhead.
- XML encoding: the size of the geospatial data increases with GML encoding, which increases transfer times or may cause exceptions.


WFS Performance Improvements: Streaming WFS

To improve performance of the WFS:
- Utilized a publish/subscribe messaging system for high performance data transfer. The streaming WFS is similar to the WFS but separates the data and control channels, which allows one-to-many data distribution.
- Used a streaming database connection (MySQL) for faster retrieval of the query results and lower GML creation overhead.
- Integrated Binary XML frameworks to reduce XML payload size, which improves transfer times.
- Binding data transfer to Grid messaging middleware reduces SOAP creation overhead.
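The separation of control and data channels can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the actual Streaming WFS or NaradaBrokering code: the in-memory `Broker` class, the `streaming_get_feature` function, the topic name `wfs/results/42`, and the simplified GML-like fragment are all hypothetical. The request returns a small control response immediately, while the features are published one by one to a topic that any number of clients can subscribe to.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


class Broker:
    """Minimal in-memory topic-based publish/subscribe broker (illustrative only)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[str], None]) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: str) -> None:
        # Every subscriber of the topic receives the message (one-to-many).
        for callback in list(self._subscribers[topic]):
            callback(message)


def feature_to_gml(feature_id: int, x: float, y: float) -> str:
    """Encode one point feature as a tiny GML-like fragment (simplified, not schema-valid)."""
    return (
        f'<gml:featureMember><Feature fid="{feature_id}">'
        f"<gml:Point><gml:coordinates>{x},{y}</gml:coordinates></gml:Point>"
        f"</Feature></gml:featureMember>"
    )


def streaming_get_feature(
    broker: Broker, rows: Iterable[Tuple[int, float, float]], data_topic: str
) -> dict:
    """Control channel: the returned dict. Data channel: messages on data_topic."""
    count = 0
    for feature_id, x, y in rows:  # rows may be a lazily evaluated database cursor
        broker.publish(data_topic, feature_to_gml(feature_id, x, y))
        count += 1
    broker.publish(data_topic, "END")  # hypothetical end-of-stream marker
    return {"topic": data_topic, "features": count}


broker = Broker()
client_a: List[str] = []
client_b: List[str] = []
broker.subscribe("wfs/results/42", client_a.append)
broker.subscribe("wfs/results/42", client_b.append)

# A generator stands in for a streaming database result set.
rows = ((i, -117.0 + i, 32.0 + i) for i in range(3))
response = streaming_get_feature(broker, rows, "wfs/results/42")
```

Because each feature is encoded and published as soon as it is read, the service never has to hold the whole FeatureCollection in memory, and both subscribed clients receive the same stream.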


WFS Interaction with services and data sources


GIS Grid Example – IEISS Integration

WMS – Ahmet Sayar
UDDI, Context Service – Mehmet Aktas


Streaming WFS Performance

[Figure: three panels titled "NB Transfer Time Comparison (TCP)", for NB servers at Bloomington, Indianapolis, and La Jolla, CA. x-axis: Number of Features (500 to 10,000); y-axis: Time (ms); series: XML, BNUX, FI.]

- We test the system with up to 10,000 features.
- The tests reveal the performance of the streaming service with and without Binary XML integration.
- We use the BNUX and Fast Infoset (FI) Binary XML frameworks for compressing the GML FeatureCollection documents.
- The BNUX and FI timings include encoding and decoding costs.


GIS Grid 2 - Real-Time Data Services

Sensors and sensor networks are being deployed for measuring various geo-physical entities.

Sensors and GIS are closely related. Sensor measurements are used by GIS for statistical or analytical purposes.

With the proliferation of the sensors, data collection and processing paradigms are changing.

Most scientific geo-applications are designed to work with archived data.

Critical Infrastructure Systems and Crisis Management environments require fast and accurate access to real-time sources and a flexible/pluggable architecture for geoprocessing of the data.


SensorGrid Architecture

Major components:
- Real-time filters
- Grid messaging substrate
- Information service

Filters can be run as Web Services to create workflows.

Filter Chains can be deployed for complex processing.

Streaming messaging provides high-performance transfer options.


Real-Time Filters

Real-time data processing is supported by employing filters around a publish/subscribe messaging system.

The filters are extended from a generic class to inherit publish and subscribe capabilities.

They can be connected in parallel or serial as chains to solve complex problems.
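A generic filter class of this kind might look like the following sketch. It is an assumption-laden illustration rather than the SensorGrid implementation: the in-memory `Broker`, the `Filter` base class, the two subclasses, and the topic names are all hypothetical. Each filter subscribes to an input topic and publishes its result to an output topic, so a serial chain is formed by matching one filter's output topic to the next filter's input topic, and parallel operation by letting several filters subscribe to the same input topic.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List


class Broker:
    """Minimal in-memory publish/subscribe broker (illustrative only)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[Any], None]) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: Any) -> None:
        for callback in list(self._subscribers[topic]):
            callback(message)


class Filter:
    """Generic filter: subscribes to in_topic, processes, publishes to out_topic."""

    def __init__(self, broker: Broker, in_topic: str, out_topic: str) -> None:
        self.broker = broker
        self.out_topic = out_topic
        broker.subscribe(in_topic, self._on_message)

    def process(self, message: Any) -> Any:
        return message  # identity by default; subclasses override this

    def _on_message(self, message: Any) -> None:
        result = self.process(message)
        if result is not None:  # returning None drops the message
            self.broker.publish(self.out_topic, result)


class OffsetFilter(Filter):
    """Subtracts a reference value from each measurement."""

    def __init__(self, broker: Broker, in_topic: str, out_topic: str, reference: int) -> None:
        self.reference = reference
        super().__init__(broker, in_topic, out_topic)

    def process(self, message: int) -> int:
        return message - self.reference


class ThresholdFilter(Filter):
    """Passes on only measurements whose magnitude exceeds a limit."""

    def __init__(self, broker: Broker, in_topic: str, out_topic: str, limit: int) -> None:
        self.limit = limit
        super().__init__(broker, in_topic, out_topic)

    def process(self, message: int) -> Any:
        return message if abs(message) > self.limit else None


broker = Broker()
archive: List[int] = []
alerts: List[int] = []

# Parallel operation: two filters subscribe to the same raw topic.
Filter(broker, "gps/raw", "gps/archive")
OffsetFilter(broker, "gps/raw", "gps/offset", reference=100)
# Serial operation: the threshold filter consumes the offset filter's output.
ThresholdFilter(broker, "gps/offset", "gps/alerts", limit=5)

broker.subscribe("gps/archive", archive.append)
broker.subscribe("gps/alerts", alerts.append)

for value in (101, 120, 90):
    broker.publish("gps/raw", value)
```

Since filters only know topic names, a new filter can be attached to a running chain without modifying the filters already deployed.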

[Figure: a filter takes an input signal and produces an output signal.]


Filter Metadata and Chains

[Figure: filter chains in parallel operation and in serial operation.]


Use Case - GPS Sensors

GPS station networks are a good example of scientific sensors. GPS measurements are used for determining post-seismic deformation, understanding long-term crustal movement, etc.

SOPAC GPS networks:
- 8 networks covering 80 stations produce 1 Hz high-resolution data.
- Socket-based real-time access in the binary RYO format is available, but not utilized!
- We developed filters to provide real-time streaming access in multiple formats (RYO, ASCII, GML).
- OHIO principle and chain of filters.

We use the publish/subscribe based NaradaBrokering for managing real-time streams, with topics providing a hierarchical organization of the sensors.
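One way to picture a hierarchical topic organization is a path-like naming scheme, sketched below. The layout `/SOPAC/GPS/<network>/<format>` and the prefix-matching rule are assumptions made for illustration; they are not taken from the slides and do not describe NaradaBrokering's actual topic matching.

```python
from typing import List


def topic_for(network: str, fmt: str, root: str = "/SOPAC/GPS") -> str:
    """Build a hypothetical topic name for one GPS network and one data format."""
    return f"{root}/{network}/{fmt}"


def _segments(topic: str) -> List[str]:
    return [part for part in topic.split("/") if part]


def matches(subscription: str, topic: str) -> bool:
    """True if topic equals the subscription or lies beneath it in the hierarchy."""
    sub_parts = _segments(subscription)
    return _segments(topic)[: len(sub_parts)] == sub_parts


ryo_topic = topic_for("CRTN_01", "RYO")
ascii_topic = topic_for("CRTN_01", "ASCII")
```

With this scheme a client interested in a single format subscribes to the full path, while a client interested in everything a network produces subscribes to the network-level prefix.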


SOPAC Real-Time Filters for GPS Streams


Application Integration with Real-Time Filters

Station Monitor Filter records real-time positions for 10 minutes and calculates position changes

Graph Plotter Application creates visual representation of the positions.

RDAHMM Filter records real-time positions for 10 minutes and invokes RDAHMM application which determines state changes in the XYZ signal.

Graph Plotter Application creates visual representation of the RDAHMM output.
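The record-then-compute pattern used by the Station Monitor Filter can be sketched as a fixed-size window. The slides do not say how position changes are calculated, so this sketch assumes the simplest definition, the difference between the last and first sample of each window; the class name `StationMonitor` and its parameters are hypothetical. At 1 Hz, a 10-minute window holds 600 samples.

```python
from collections import deque
from typing import Deque, Optional, Tuple

Position = Tuple[float, float, float]  # (x, y, z)


class StationMonitor:
    """Buffers positions for a fixed window, then reports the net change."""

    def __init__(self, window_seconds: int = 600, rate_hz: int = 1) -> None:
        self.capacity = window_seconds * rate_hz  # 600 samples for 10 minutes at 1 Hz
        self.samples: Deque[Position] = deque()

    def add(self, position: Position) -> Optional[Position]:
        """Record one sample; return (dx, dy, dz) when the window fills, else None."""
        self.samples.append(position)
        if len(self.samples) < self.capacity:
            return None
        first, last = self.samples[0], self.samples[-1]
        self.samples.clear()  # start a fresh window
        # Assumed definition of "position change": last sample minus first sample.
        return tuple(b - a for a, b in zip(first, last))


# A 3-second window keeps the example short.
monitor = StationMonitor(window_seconds=3)
changes = [monitor.add(p) for p in [(0, 0, 0), (1, 1, 1), (2, 4, 6), (9, 9, 9)]]
```

The same buffering step would precede a call to an external analysis code such as RDAHMM, with the buffered window passed on instead of a simple difference.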


AJAX and Real-Time positions on Google maps


Recording and Replaying Sensor Streams

- Filters can be used to record and replay scenarios, such as earthquakes in the GPS case.
- We developed RYO Recorder and RYO Publisher filters.
- The RYO Recorder creates daily archives of the GPS streams.
- The RYO Publisher can be used to play back daily records or certain segments of them.
- We replayed the 2004 Southern California earthquake using the Parkfield GPS network archive.
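The record and replay idea can be sketched as follows. This is a minimal in-memory illustration, not the RYO Recorder or RYO Publisher themselves: the `Recorder` class, the `replay` function, and the one-byte payloads are hypothetical, and a real recorder would write each day's archive to a file. Messages are grouped into per-day archives, and replay yields either a whole day or only the segment between two times of day.

```python
from collections import defaultdict
from datetime import date, datetime, time
from typing import Dict, Iterator, List, Tuple

Record = Tuple[datetime, bytes]  # (arrival time, raw binary message)


class Recorder:
    """Groups incoming messages into one archive per calendar day."""

    def __init__(self) -> None:
        self.archives: Dict[date, List[Record]] = defaultdict(list)

    def record(self, timestamp: datetime, payload: bytes) -> None:
        self.archives[timestamp.date()].append((timestamp, payload))


def replay(
    archive: List[Record], start: time = time.min, end: time = time.max
) -> Iterator[bytes]:
    """Yield recorded payloads whose time of day lies within [start, end]."""
    for timestamp, payload in archive:
        if start <= timestamp.time() <= end:
            yield payload


recorder = Recorder()
for second in range(5):
    recorder.record(datetime(2000, 1, 1, 12, 0, second), bytes([second]))
recorder.record(datetime(2000, 1, 2, 0, 0, 0), b"\xff")

day = date(2000, 1, 1)
whole_day = list(replay(recorder.archives[day]))
segment = list(replay(recorder.archives[day], time(12, 0, 1), time(12, 0, 3)))
```

A publisher filter would hand each yielded payload to the messaging system at the original rate, so downstream filters cannot tell a replayed stream from a live one.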


SensorGrid Performance Tests

Two major goals: system stability and scalability
- Ensuring stability of the distributed Filter Services for continuous operation.
- Finding the maximum number of publishers (sensors) and clients that can be supported with a single broker.
- Investigating whether the system scales for a large number of sensors and clients.


Test Methodology

The test system consists of a NaradaBrokering server and a three-filter chain for publishing, converting and receiving RYO messages.

We take 4 timings for determining mean end-to-end delivery times of GPS measurements.

The tests were run for at least 24 hours. GridFarm001-008 servers were used in these tests.

$T_{\text{transfer}} = (T_2 - T_1) + (T_4 - T_3)$
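The transfer-time formula and the mean and standard deviation plotted in the test charts can be computed as in the sketch below. The function names and the sample values are made up for illustration, and the sketch makes no assumption about where in the filter chain the four timestamps are taken. Note that the formula sums two intervals and therefore leaves out whatever happens between T2 and T3.

```python
from statistics import mean, pstdev
from typing import Iterable, Tuple

Timings = Tuple[float, float, float, float]  # (T1, T2, T3, T4) in milliseconds


def transfer_time(t1: float, t2: float, t3: float, t4: float) -> float:
    """T_transfer = (T2 - T1) + (T4 - T3); the T2..T3 interval is excluded."""
    return (t2 - t1) + (t4 - t3)


def summarize(samples: Iterable[Timings]) -> Tuple[float, float]:
    """Return (mean, population standard deviation) of the transfer times."""
    times = [transfer_time(*sample) for sample in samples]
    return mean(times), pstdev(times)


# Two made-up measurements: transfer times of 5 ms and 7 ms.
average, deviation = summarize([(0, 2, 10, 13), (0, 3, 10, 14)])
```

For a 24-hour run reported every 30 minutes, `summarize` would be applied to each half-hour batch of measurements separately.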


1 – System Stability Test

- The basic system with three filters and one broker.
- The figure shows average results for every 30 minutes.
- The average transfer time shows that continuous operation does not degrade the system performance.

[Figure: System Stability Test. x-axis: Time of the Day; y-axis: Time (ms), 0 to 6; series: Transfer Time, Standard Deviation.]


2 – Multiple Publishers Test

We add more GPS networks by running more publishers.

The results show that 1000 publishers can be supported with no performance loss. The cap of about 1000 comes from an operating system limit.

[Figure: Multiple Publishers Test. x-axis: Time of the Day; y-axis: Time (ms), 0 to 6; series: Transfer Time, Standard Deviation.]


3 – Multiple Clients Test

We add more clients by running multiple Simple Filters which subscribe to the same ASCII topic.

The system can support as many as 1000 clients with very little performance degradation.

[Figure: transfer times while adding clients, up to 1000 clients.]


Extending Scalability

The limit of the basic system appears to be 1000 clients or publishers.

This is due to an operating system restriction on open file descriptors (1024 for Red Hat Linux).

To overcome this limit we create NaradaBrokering networks by linking multiple brokers.
- We run 2 brokers to support 1500 clients.
- The number of brokers can be increased indefinitely, so we can potentially support any number of publishers and subscribers.
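The arithmetic behind scaling out with more brokers can be sketched as below. The helper names are hypothetical, the default of 1000 connections per broker is the limit observed in the tests above, and the calculation ignores the descriptors consumed by links between brokers. `open_file_limit` reads the current process limit with Python's standard `resource` module, which is available only on Unix-like systems.

```python
import math
from typing import Optional


def brokers_needed(connections: int, per_broker_limit: int = 1000) -> int:
    """Smallest number of brokers that keeps each one at or under its limit."""
    if connections < 0 or per_broker_limit <= 0:
        raise ValueError("connections must be >= 0 and per_broker_limit > 0")
    return math.ceil(connections / per_broker_limit)


def open_file_limit() -> Optional[int]:
    """Soft limit on open file descriptors, or None where it cannot be read."""
    try:
        import resource  # Unix-only standard library module
    except ImportError:
        return None
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


brokers_for_test = brokers_needed(1500)  # the two-broker configuration above
```

Each client or publisher connection holds one socket, and each socket is a file descriptor, which is why a 1024-descriptor limit translates into roughly 1000 usable connections per broker process.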


4 – Multiple Brokers Test

- Messages published to the first broker can be received from the second broker.
- We take timings on each broker.
- We connect 750 clients to each broker and run for 24 hours.
- The results show that the performance is very good and similar to the single broker test.


4 – Multiple Brokers Test

[Figure: two panels, "Multiple Broker Test Broker 1" and "Multiple Broker Test Broker 2", each with 750 clients. x-axis: Time of the Day (0:00 to 22:30); y-axis: Time (ms); series: Transit Time (Broker 1) or Transfer Time (Broker 2), and Standard Deviation.]


Real-Time Filters Test Results

The RYO Publisher filter runs at 1 Hz and publishes a 24-hour archive of the CRTN_01 GPS network, which contains 9 GPS stations.

The single broker configuration can support 1000 clients or publishers (1000 GPS networks of 9 stations each, i.e. 9000 individual stations).

The system can be scaled up by creating NaradaBrokering broker networks.

Message order was preserved in all tests.


Contributions

- A SOA approach to create a common platform to support both archival and real-time geospatial data in data-centric Grids.
- Merging Web Services and Open Geographic Standards to support interoperability at both data and application levels.
- We have shown that GIS services can be implemented as streaming services.
- Integration of Binary XML frameworks with the streaming services shows performance gains over long network distances.
- We have shown that Sensor Grids can be built on top of publish/subscribe middleware.
- Real-time continuous data support is realized in a Service Architecture.
- Scalable architecture implementation for a large number of sensor networks.