Part I DataFed An Agile Distributed Air Quality Data System Rudolf B. Husar Washington University, St. Louis Seminar Presented at University of Alabama, Dept. of Atmospheric Sciences December 6, 2006

Page 1

Part I

DataFed

An Agile Distributed Air Quality Data System

Rudolf B. Husar

Washington University, St. Louis

Seminar Presented at University of Alabama, Dept. of Atmospheric Sciences

December 6, 2006

Page 2

Revolution in Sensing (1990s) & Delivery (2000+)

Real-time Air Pollution Sensing and Reporting

High Resolution Satellite Data

Surface PM2.5 and Ozone Data

Smoke Plumes

Page 3

Changes in Air Quality Management

Command & Control

Weight of Evidence

Flexible NAAMS

Rigid Monitoring

Page 4

Data Acquisition and Usage Activities

The focus is on data usage activities

• The data life cycle consists of the acquisition and the usage parts
• The acquisition part processes the sensory data by firmly linked procedures
• The collected and cleaned data are stored in the repository
• The usage activities are more iterative, dynamic procedures
• The usage cycle transforms data into knowledge for decision making

[Diagram labels: Data Acquisition; Data Repository; Usage Activities; Decisions]

Page 5

Stages of AQ Data Flow and Value-Adding Processes

[Diagram labels: Obs. & Models; Data Sharing; Std. Interface; Gen. Processing; Std. Interface; Domain Processing; Reporting; Decision Support System; Data; Control; Reports]

Value-Adding Processes

• Analyzing: Filter/Integrate, Aggregate/Fuse, Custom Analysis
• Organizing: Document, Structure/Format, Interfacing
• Characterizing: Display/Browse, Compare/Fuse, Characterize
• Reporting: Inclusiveness, Iterative/Agile, Dynamic Report

Page 6

Generic Decision Support for Air Quality Decisions

Global Earth Observation System of Systems (GEOSS) Architecture Framework

[Diagram labels: Observations; Models; Reports: Model Forecasts, Obs. Evidence; Knowledge into the Minds of Technical Analysts; Knowledge into the Minds of Regulatory Analysts; Knowledge into the Minds of Decision-making Managers; Decisions; Decision Support System]

Page 7

Characteristics of System of Systems (SoS)

• Autonomous constituents managed/operated independently
• Independent evolution of each constituent
• Displays emergent behavior

Must recognize, manage, exploit the characteristics:

• No stakeholder has complete SoS insight
• Central control is limited; distributed control is essential
• Users must be involved throughout the life of a SoS

Page 8

Differentiating System Engineering and System of Systems Engineering

“System of Systems Engineering for Air Force Capability Development”, 2005

Page 9

Air Quality Cluster

TechTrack

Earth Science Information Partners

Partners

• NASA
• NOAA
• EPA
• (?)
• USGS
• DOE
• NSF
• Industry…

Air Quality Information System Architecture

Flow of Data

Flow of Control

Air Quality Data

Meteorology Data

Emissions Data

Informing Public

AQ Compliance

Status and Trends

Network Assess.

Tracking Progress

Data to Knowledge Transformation

Data Products

Knowledge Products

Mediators

Page 10

Information Providers: Geography, Content, Agency, Form

• Data are distributed geographically by autonomous providers
• Data include emissions, ambient data, satellite data and model output
• Data are provided by multiple agencies: EPA, NOAA, NASA and others
• Furthermore, data are provided in varied formats and access protocols
• Data on the Internet are geography-independent and can be ‘linearized’

[Diagram axes: Content (Emission, Ambient, Satellite, Model) | Agency (EPA, NOAA, NASA, Other) | Form]

Content | Providers
Emission | EPA NEI, EPA NEISGEI, FS FireInv, State/Local Emission
Ambient | NOAA ASOS, RPO VIEWS, EPA AIRNow, EPA-AQS AIRS / DataMart
Satellite | NASA DAACs, NOAA GASP, NASA IDEA, NASA Missions
Model | NOAA WeaMod, EPA AQModel, NASA GloModel, NOAA Forecast

[Diagram, final build: providers lined up on the Internet: NASA DAACs, EPA R&D Model, EPA AIRNow, others]

Page 11

Users: By Types, Agency, Info Needs

• Users are distributed geographically
• Users include policy makers, the public, AQ managers and scientists
• Users are affiliated with multiple agencies: EPA, NOAA, NASA, as well as others
• Furthermore, users need various types of information provided in multiple formats
• Since the users are also on the Internet, their geographic location is irrelevant

[Diagram axes: Stakeholder (Policy, Manager, Public, Scientist) | Agency (EPA, NOAA, NASA, Other) | Form]

[Diagram, final build: users lined up on the Internet: Public, Manager, Scientist, other]

Page 12

Agile Information System: Data Access, Processing and Products

• The info system transforms the data into info products for each user
• In the first stage the heterogeneous data are prepared for uniform access
• The second stage performs filtering, aggregation, fusion and other operations
• The third stage prepares and delivers the needed info products
• While the data flow from the provider, the flow of control is from the user

[Diagram, built up step by step over pages 12 to 16: Data Acquisition; Provider (NASA DAACs, EPA Model, EPA AIRNow, others); Uniform Access; Data Processing (Web Service Chain, Custom Processing; DataFed, SciFlo); Products and Reports (Forecast, Compli., Science, Other); Users (Public, Manager, Scientist, other); arrows labeled Data and Control]
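The last point, that data flow from the provider while control flows from the user, can be illustrated with a pull-based pipeline. The following is only a minimal sketch using Python generators; the stage names, the sample values and the threshold are made up for illustration and are not taken from DataFed.

```python
def provider(log):
    # Data source: yields a value only when a downstream consumer asks for one.
    for value in [5.0, 40.0, 12.0, 55.0]:
        log.append(f"provider sent {value}")
        yield value


def uniform_access(stream):
    # Stage 1: wrap each raw value in a uniform record.
    for value in stream:
        yield {"parameter": "PM25", "value": value}


def processing(stream, threshold):
    # Stage 2: keep only the records above a threshold.
    for rec in stream:
        if rec["value"] > threshold:
            yield rec


log = []
products = processing(uniform_access(provider(log)), threshold=35.0)

# Nothing has been read from the provider yet; the user's request drives the chain.
first = next(products)
print(first, log)
```

Building the chain moves no data. Only the user's `next()` call pulls values through the stages, and the provider is read just far enough to satisfy that request.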

Page 17

The next three slides describe the key technologies used in the creation of an adaptable and responsive air quality information system.

OGC data access protocols and standard formats facilitate loose coupling between data on the Internet and processing services.

For air quality, the Web Coverage Service (WCS) provides a simple, universal query language for requesting data by where, when and what: that is, by geographic extent (3D bounding box), time range and parameter.

The Web Map Service (WMS) and Web Feature Service (WFS) are also useful.

The use of standard physical data formats and naming conventions improves syntactic and semantic interoperability.

Within DataFed, all data access services are implemented as WCS or WMS, and optionally WFS. General format adapter components permit data requests in a variety of standard formats.

Loosely Coupled Data Access through Standard Protocols

[Diagram: Client (Front End) and Server (Back End), each behind a Std. Interface. Requests: GetCapabilities; GetData (Where? When? What? Which Format?). Responses: Capabilities, ‘Profile’; Data. Time labels: T1, T2]

Query | GetData | Standards
Where? | BBOX | OGC, ISO
When? | Time | OGC, ISO
What? | Temperature | CF
Format | netCDF, HDF.. | CF, EOS, OGC
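The where/when/what/format query can be sketched as a key-value request URL. This is a minimal illustration only: the endpoint and coverage name are hypothetical, the parameter names follow the WCS 1.0.0 key-value convention as commonly used, and a real server may require further parameters (for example grid size) or a different time syntax, so its capabilities document should be checked first.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_getcoverage_url(endpoint, coverage, bbox, time_range, fmt="NetCDF"):
    """Assemble a key-value-pair GetCoverage request URL.

    bbox is (min_lon, min_lat, max_lon, max_lat); time_range is (start, end)
    as ISO 8601 strings. All argument values used below are illustrative.
    """
    params = {
        "SERVICE": "WCS",
        "VERSION": "1.0.0",
        "REQUEST": "GetCoverage",
        "COVERAGE": coverage,                     # What?
        "CRS": "EPSG:4326",
        "BBOX": ",".join(str(v) for v in bbox),   # Where?
        "TIME": "/".join(time_range),             # When? (syntax is an assumption)
        "FORMAT": fmt,                            # Which format?
    }
    return endpoint + "?" + urlencode(params)


url = build_getcoverage_url(
    "http://example.org/wcs",        # hypothetical endpoint
    "pm25",                          # hypothetical coverage name
    (-125.0, 24.0, -66.0, 50.0),
    ("2006-12-01", "2006-12-06"),
)
query = parse_qs(urlsplit(url).query)
print(url)
```

Because the whole request is carried in a handful of standard keys, the client needs no knowledge of how the server stores its data, which is the loose coupling the slide refers to.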

Page 18

Web Services and Workflow for Loose Coupling

Web Service Interaction

[Diagram: Service Provider, Service Broker and Service User, linked by Publish, Find and Bind]

Web Services Triad: Publish – Find – Bind

Service Chaining & Workflow

Workflow Software: Dynamic Programming
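The publish, find, bind triad can be sketched with an in-memory registry standing in for the service broker. This is an illustrative toy, not DataFed's or any real registry's API; the class, the method names and the sample service are all assumptions.

```python
class ServiceBroker:
    """In-memory stand-in for a service registry (illustrative only)."""

    def __init__(self):
        self._registry = {}

    def publish(self, name, description, endpoint):
        # Provider side: advertise a callable under a name and a description.
        self._registry[name] = (description, endpoint)

    def find(self, keyword):
        # User side: list service names whose description mentions the keyword.
        return sorted(
            name for name, (desc, _) in self._registry.items()
            if keyword.lower() in desc.lower()
        )

    def bind(self, name):
        # User side: obtain the callable so it can be invoked directly.
        return self._registry[name][1]


def daily_mean(values):
    # Hypothetical processing service.
    return sum(values) / len(values)


broker = ServiceBroker()
broker.publish("daily_mean", "Aggregate hourly values to a daily mean", daily_mean)

found = broker.find("aggregate")
service = broker.bind(found[0])
result = service([10.0, 20.0, 30.0])
print(found, result)
```

After binding, the user calls the service directly and the broker is no longer involved, so providers and users stay loosely coupled.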

Page 19

Collaborative Reporting and Dynamic Delivery

Co Writing - Wiki

ScreenCast

Analysis Reports:

• Information supplied by many
• Needs continuous program feedback
• Report needs many authors
• Wiki technologies are for collaborative writing

Dynamic Delivery:

• Much of the content is dynamic
• Animated presentations are compelling
• Movies and screencasts are for dynamic delivery

Page 20

Approach: Mediation Between Users and Data Providers

• DataFed assumes spontaneous, autonomous data providers
• Non-intrusively wraps datasets for access by web services
• Mediates, homogenizes data views, e.g. geo-spatial, time...
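The wrapping idea can be sketched as small adapter functions that map each provider's native records into one uniform view without changing the provider's data. The two record layouts below are invented for illustration, and the `Observation` type is an assumption rather than DataFed's actual data model.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class Observation(NamedTuple):
    """Uniform view of one data point: where, when, what, value."""
    lat: float
    lon: float
    time: datetime
    parameter: str
    value: float


def wrap_csv_row(row):
    # Hypothetical provider A: "lat,lon,YYYY-MM-DD HH:MM,parameter,value" in UTC.
    lat, lon, stamp, param, value = row.split(",")
    when = datetime.strptime(stamp, "%Y-%m-%d %H:%M").replace(tzinfo=timezone.utc)
    return Observation(float(lat), float(lon), when, param, float(value))


def wrap_dict_record(rec):
    # Hypothetical provider B: a dict with epoch seconds and its own key names.
    when = datetime.fromtimestamp(rec["epoch"], tz=timezone.utc)
    return Observation(rec["y"], rec["x"], when, rec["species"], rec["conc"])


a = wrap_csv_row("38.63,-90.20,2006-12-06 12:00,PM25,14.2")
b = wrap_dict_record({"x": -86.8, "y": 33.5, "epoch": 1165363200,
                      "species": "PM25", "conc": 11.0})

# Once wrapped, records from both providers can be handled together.
merged = sorted([a, b], key=lambda obs: obs.time)
print(merged)
```

The providers keep their own formats; only the thin wrapper functions know about them, so adding a new provider means adding one more wrapper.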

Page 21

‘Stovepipe’ and Federated Usage Architectures Landscape

Data Providers | Info System | Info Users
DAACs | Science | Scientist
AIRNow | AIRNow | Public
Model | Compliance | Manager

• Current info systems are project/program oriented and provide end-to-end solutions
• Part of the data resources of any project can be shared for re-use through DataFed
• Through the Federation, the data are homogenized into multi-dimensional cubes
• Data processing and rendering can then be performed through web services
• Each project/program can be augmented by Federation data and services
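The multi-dimensional cube mentioned above can be sketched as a mapping from a point in (parameter, time, latitude, longitude) space to a value, from which time series and spatial views are simple slices. The records and values below are made up, and the dictionary layout is an assumption chosen for brevity, not DataFed's storage format.

```python
# Homogenized records: (parameter, day, lat, lon, value); values are made up.
records = [
    ("PM25", "2006-12-05", 38.6, -90.2, 12.0),
    ("PM25", "2006-12-06", 38.6, -90.2, 18.0),
    ("PM25", "2006-12-06", 33.5, -86.8, 10.0),
    ("O3",   "2006-12-06", 38.6, -90.2, 31.0),
]

# The "cube": each key is a point in (parameter, time, lat, lon) space.
cube = {(p, t, lat, lon): v for p, t, lat, lon, v in records}


def time_series(cube, parameter, lat, lon):
    """Slice along the time dimension at a fixed parameter and location."""
    return sorted((t, v) for (p, t, la, lo), v in cube.items()
                  if p == parameter and la == lat and lo == lon)


def spatial_slice(cube, parameter, day):
    """Slice along the spatial dimensions at a fixed parameter and time."""
    return {(la, lo): v for (p, t, la, lo), v in cube.items()
            if p == parameter and t == day}


series = time_series(cube, "PM25", 38.6, -90.2)
snapshot = spatial_slice(cube, "PM25", "2006-12-06")
print(series)
print(snapshot)
```

Once every dataset shares these dimensions, the same slicing functions serve all of them, which is what allows generic processing and rendering services to be applied.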

Page 22

DataFed User Tools

– Data Catalog
– Data Browser
– PlumeSim, Animator
– Combined Aerosol Trajectory Tool (CATT)

Consoles: Data from diverse sources are displayed to create a rich context for exploration and analysis

CATT: Combined Aerosol Trajectory Tool for browsing back-trajectories for specified chemical conditions

Viewer: General purpose spatio-temporal data browser and view editor applicable for all DataFed datasets

Page 23

Federated Datasets

• Data are accessed from autonomous, distributed providers
• DataFed ‘wrappers’ provide uniform geo-time referencing
• Tools allow space/time overlay, comparisons and fusion

[Panel titles: Near Real Time Data Integration; Delayed Data Integration]

Surface Air Quality
AIRNOW | O3, PM2.5
ASOS_STI | Visibility, 300 sites
METAR | Visibility, 1200 sites
VIEWS_OL | 40+ Aerosol Parameters

Satellite
MODIS_AOT | AOT, Idea Project
GASP | Reflectance, AOT
TOMS | Absorption Indx, Refl.
SEAW_US | Reflectance, AOT

Model Output
NAAPS | Dust, Smoke, Sulfate, AOT
WRF | Sulfate

Fire Data
HMS_Fire | Fire Pixels
MODIS_Fire | Fire Pixels

Surface Meteorology
RADAR | NEXRAD
SURF_MET | Temp, Dewp, Humidity…
SURF_WIND | Wind vectors
ATAD | Trajectory, VIEWS locs.

Page 24

A Sample of Datasets Accessible through ESIP Mediation

Near Real Time (~ day)

It has been demonstrated (project FASTNET) that these and other datasets can be accessed, repackaged and delivered by AIRNow through ‘Consoles’

[Image panels: MODIS Reflectance; MODIS AOT; TOMS Index; GOES AOT; GOES 1km Reflec; NEXRAD Radar; MODIS Fire Pix; NRL MODEL; NWS Surf Wind, Bext]

Page 25

Web Services: Building Blocks of DataFed Programming

Access, Process, Render Data by Service Chaining

Web Service Composition

[Diagram: Data Access; Data Processing; Layer Overlay (LAYERS). Example layers: NASA SeaWiFS Satellite; NOAA ATAD Trajectory; OGC Map Boundary; RPO VIEWS Chemistry]
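Service chaining of the access, process, render kind can be sketched as plain function composition, where the output of each service is the input of the next. The three services below are local stand-ins with made-up data; in a real deployment each step would be a web service call, and none of these names come from DataFed.

```python
from functools import reduce


def chain(*services):
    """Compose services left to right: the output of one feeds the next."""
    return lambda data: reduce(lambda acc, svc: svc(acc), services, data)


def access(request):
    # Stand-in for a data access service; returns made-up hourly values.
    return {"parameter": request["parameter"], "values": [8.0, 12.0, 16.0, None]}


def process(dataset):
    # Filter out missing values, then aggregate to a mean.
    valid = [v for v in dataset["values"] if v is not None]
    return {"parameter": dataset["parameter"], "mean": sum(valid) / len(valid)}


def render(summary):
    # Stand-in for a rendering service; produces a text "layer".
    return f"{summary['parameter']}: mean {summary['mean']:.1f}"


workflow = chain(access, process, render)
output = workflow({"parameter": "PM25"})
print(output)
```

Because each service only depends on the shape of its input, a step can be swapped or a new one inserted by changing the argument list of `chain`, without touching the other services.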

Page 26

Interoperability

Wrappers and Adapters