A Data Quality Screening Service for Remote Sensing Data

Christopher Lynnes, NASA/GSFC (P.I.)
Edward Olsen, NASA/JPL
Peter Fox, RPI
Bruce Vollmer, NASA/GSFC
Robert Wolfe, NASA/GSFC

Contributions from R. Strub, T. Hearty, Y-I Won, M. Hegde, S. Zednik, P. West, N. Most, S. Ahmad, C. Praderas, K. Horrocks, I. Tchered, and A. Rezaiyan-Nojani

Advancing Collaborative Connections for Earth System Science (ACCESS) Program

Project Page: http://tw.rpi.edu/web/project/DQSS



Page 2: Outline

Why a Data Quality Screening Service?

Making Quality Information Easier to Use via the Data Quality Screening Service (DQSS)

DEMO

DQSS Under the Hood

DQSS Status

DQSS Plans

Page 3: Why a Data Quality Screening Service?

Page 4: The quality of data can vary considerably

AIRS Variable               Best (%)   Good (%)   Do Not Use (%)
Total Precipitable Water    38         38         24
Carbon Monoxide             64         7          29
Surface Temperature         5          44         51

Version 5 Level 2 Standard Retrieval Statistics

Page 5: Quality schemes can be relatively simple…

[Figure: Total Column Precipitable Water (kg/m2) shown next to its Qual_H2O quality flag (Best, Good, Do Not Use)]

Page 6: …or they can be more complicated

Hurricane Ike, viewed by the Atmospheric Infrared Sounder (AIRS)

PBest: Maximum pressure for which quality value is “Best” in temperature profiles

Air Temperature at 300 mbar
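A minimal sketch of how a PBest-style flag could be applied to a single temperature profile. It assumes that every level whose pressure is at or below PBest counts as "Best" and that rejected levels are replaced with a fill value. The function name, the fill value of -9999.0, and the sample numbers are illustrative assumptions, not taken from the AIRS product.

```python
FILL_VALUE = -9999.0  # assumed fill value, for illustration only


def screen_profile_by_pbest(pressures_mbar, temperatures_k, pbest_mbar):
    """Keep temperature only at levels whose pressure does not exceed PBest."""
    if len(pressures_mbar) != len(temperatures_k):
        raise ValueError("pressure and temperature profiles must have equal length")
    return [
        temp if pressure <= pbest_mbar else FILL_VALUE
        for pressure, temp in zip(pressures_mbar, temperatures_k)
    ]


# Hypothetical profile: pressure levels (mbar) and air temperature (K).
pressures = [100.0, 300.0, 500.0, 850.0]
temperatures = [210.0, 240.0, 262.0, 285.0]

# With PBest = 500 mbar, only the 850 mbar level is screened out.
screened = screen_profile_by_pbest(pressures, temperatures, pbest_mbar=500.0)
print(screened)
```

Because PBest varies from profile to profile, a single threshold on a quality level is not enough here; the screening has to compare each level's pressure against that profile's own PBest.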

Page 7: Quality flags are also sometimes packed together into bytes

Cloud Mask Status Flag: 0 = Undetermined, 1 = Determined

Cloud Mask Cloudiness Flag: 0 = Confident cloudy, 1 = Probably cloudy, 2 = Probably clear, 3 = Confident clear

Day/Night Flag: 0 = Night, 1 = Day

Sunglint Flag: 0 = Yes, 1 = No

Snow/Ice Flag: 0 = Yes, 1 = No

Surface Type Flag: 0 = Ocean, deep lake/river; 1 = Coast, shallow lake/river; 2 = Desert; 3 = Land

Big-endian arrangement for the Cloud_Mask_SDS variable in atmospheric products from the Moderate Resolution Imaging Spectroradiometer (MODIS)
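As a rough illustration of what unpacking such a byte involves, the sketch below splits one byte into the six flags listed above. The bit positions are an assumption: they follow the order in which the flags are listed, with the status flag in the least significant bit. Check the product documentation before relying on them. The function and key names are hypothetical.

```python
def decode_cloud_mask_byte(byte_value):
    """Split one packed quality byte into its individual flag values."""
    if not 0 <= byte_value <= 0xFF:
        raise ValueError("expected a single byte (0..255)")
    return {
        "cloud_mask_status": byte_value & 0b1,     # bit 0 (assumed position)
        "cloudiness": (byte_value >> 1) & 0b11,    # bits 1-2
        "day_night": (byte_value >> 3) & 0b1,      # bit 3
        "sunglint": (byte_value >> 4) & 0b1,       # bit 4
        "snow_ice": (byte_value >> 5) & 0b1,       # bit 5
        "surface_type": (byte_value >> 6) & 0b11,  # bits 6-7
    }


# Value labels copied from the flag list on the slide.
CLOUDINESS = {
    0: "Confident cloudy",
    1: "Probably cloudy",
    2: "Probably clear",
    3: "Confident clear",
}

# Hypothetical byte: ocean, no snow/ice, no sunglint, day,
# probably cloudy, mask determined.
flags = decode_cloud_mask_byte(0b00111011)
print(flags, CLOUDINESS[flags["cloudiness"]])
```

A user who wants only confidently clear daytime pixels would then test `flags["cloudiness"] == 3 and flags["day_night"] == 1` for every pixel, which is the kind of per-user routine the service is meant to replace.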

Page 8: Current user scenarios...

Nominal scenario
  Search for and download data
  Locate documentation on handling quality
  Read & understand documentation on quality
  Write custom routine to filter out bad pixels

Equally likely scenario (especially in user communities not familiar with satellite data)
  Search for and download data
  Assume that quality has a negligible effect

Repeat for each user

Page 9: The effect of bad quality data is often not negligible

[Figure: Total Column Precipitable Water (kg/m2) and its quality (Best, Good, Do Not Use) for Hurricane Ike, 9/10/2008]

Page 10: Neglecting quality may introduce bias (a more subtle effect)

AIRS Relative Humidity Comparison against Dropsonde with and without Applying PBest Quality Flag Filtering

Boxed data points indicate AIRS RH data with dry bias > 20%

From a study by Sun Wong (JPL) on specific humidity in the Atlantic Main Development Region for Tropical Storms

Page 11: Percent of Biased Data in MODIS Aerosols Over Land Increases as Confidence Flag Decreases

[Figure: bar chart showing, for each confidence flag (Bad, Marginal, Good, Very Good), the percentage (0% to 100%) of retrievals that are Compliant*, Biased Low, or Biased High]

*Compliant data are within ± 0.05 ± 0.2τAeronet

Statistics derived from Hyer, E., J. Reid, and J. Zhang, 2010, An over-land aerosol optical depth data set for data assimilation by filtering, correction, and aggregation of MODIS Collection 5 optical depth retrievals, Atmos. Meas. Tech. Discuss., 3, 4091–4167.
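The compliance criterion in the footnote can be sketched as a small classifier. This reads the criterion as "the MODIS optical depth lies within 0.05 + 0.2 × τAeronet of the AERONET value", which is an interpretation of the slide's shorthand. The function name and the sample optical depths are hypothetical.

```python
def classify_aod(tau_modis, tau_aeronet):
    """Label a MODIS retrieval relative to the AERONET reference value."""
    envelope = 0.05 + 0.2 * tau_aeronet  # half-width of the compliant band
    difference = tau_modis - tau_aeronet
    if difference > envelope:
        return "Biased High"
    if difference < -envelope:
        return "Biased Low"
    return "Compliant"


# Hypothetical retrievals against an AERONET optical depth of 0.20
# (compliant band: 0.11 to 0.29).
labels = [classify_aod(tau, 0.20) for tau in (0.05, 0.25, 0.40)]
print(labels)
```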

Page 12: Making Quality Information Easier to Use via the Data Quality Screening Service (DQSS)

Page 13: DQSS Team

P.I.: Christopher Lynnes

Software Implementation: Goddard Earth Sciences Data and Information Services Center
  Implementation: Richard Strub
  Local Domain Experts (AIRS): Thomas Hearty and Bruce Vollmer

AIRS Domain Expert: Edward Olsen, AIRS/JPL

MODIS Implementation
  Implementation: Neal Most, Ali Rezaiyan, Cid Praderas, Karen Horrocks, Ivan Tchered
  Domain Experts: Robert Wolfe and Suraiya Ahmad

Semantic Engineering: Tetherless World Constellation @ RPI
  Peter Fox, Stephan Zednik, Patrick West

Page 14: The DQSS filters out bad pixels for the user

Default user scenario
  Search for data
  Select science team recommendation for quality screening (filtering)
  Download screened data

More advanced scenario
  Search for data
  Select custom quality screening parameters
  Download screened data

Page 15: DQSS replaces bad-quality pixels with fill values

Original data array (Total column precipitable water)

Mask based on user criteria (Quality level < 2)

Good quality data pixels retained

Output file has the same format and structure as the input file (except for extra mask and original_data fields)
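The masking step described above can be sketched in a few lines. This is an illustration under stated assumptions, not the DQSS implementation: it assumes lower quality numbers are better (0 = Best, 1 = Good, 2 = Do Not Use), uses -9999.0 as a stand-in fill value, and returns a plain dictionary in place of an output file. The function name and sample values are hypothetical; only the `mask` and `original_data` field names come from the slide.

```python
FILL_VALUE = -9999.0  # assumed fill value; real products define their own


def screen_pixels(data, quality, threshold=2):
    """Replace pixels whose quality level is not below `threshold` with fill.

    Assumes lower quality numbers are better, so `level < threshold`
    marks the pixels to retain.
    """
    mask = [[level < threshold for level in row] for row in quality]
    screened = [
        [value if keep else FILL_VALUE for value, keep in zip(data_row, mask_row)]
        for data_row, mask_row in zip(data, mask)
    ]
    # Mirror the slide: the output carries the screened array plus the mask
    # and an untouched copy of the original data.
    return {
        "data": screened,
        "mask": mask,
        "original_data": [list(row) for row in data],
    }


# Hypothetical 2x3 grid of total column precipitable water (kg/m2).
water = [[12.5, 30.1, 44.0], [8.2, 51.3, 27.9]]
qual = [[0, 1, 2], [2, 0, 1]]

result = screen_pixels(water, qual)
print(result["data"])
```

Keeping the array shape unchanged and writing fill values, rather than dropping pixels, is what lets the output keep the same format and structure as the input.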

Page 16: Visualizations help users see the effect of different quality filters

Page 17: DQSS can encode the science team recommendations on quality screening

AIRS Level 2 Standard Product
  Use only Best for data assimilation uses
  Use Best+Good for climatic studies

MODIS Aerosols
  Use only VeryGood (highest value) over land
  Use Marginal+Good+VeryGood over ocean
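One simple way to picture encoded recommendations is a lookup table keyed by product and context. The sketch below uses a plain Python dictionary as a stand-in; DQSS itself holds this knowledge in an ontology, so the structure, key strings, and function name here are illustrative assumptions. Only the quality labels and the recommendations themselves come from the slide.

```python
# Accepted quality labels per (product, context), copied from the slide.
RECOMMENDATIONS = {
    ("AIRS Level 2 Standard Product", "data assimilation"): {"Best"},
    ("AIRS Level 2 Standard Product", "climatic studies"): {"Best", "Good"},
    ("MODIS Aerosols", "land"): {"VeryGood"},
    ("MODIS Aerosols", "ocean"): {"Marginal", "Good", "VeryGood"},
}


def is_retained(product, context, quality_label):
    """Return True if the recommendation keeps data with this quality label."""
    key = (product, context)
    if key not in RECOMMENDATIONS:
        raise KeyError(f"no recommendation recorded for {product!r} / {context!r}")
    return quality_label in RECOMMENDATIONS[key]


# The same "Good" retrieval is kept over ocean but screened out over land.
over_ocean = is_retained("MODIS Aerosols", "ocean", "Good")
over_land = is_retained("MODIS Aerosols", "land", "Good")
print(over_ocean, over_land)
```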

Page 18

Initial settings are based on Science Team recommendation.

(Note: “Good” retains retrievals that are Good or better.)

You can choose settings for all parameters at once...

...or variable by variable

Or, users can select their own criteria...

Page 19: DEMO

http://mirador.gsfc.nasa.gov (Search for ‘AIRX2RET’)

Page 20: DQSS Under the Hood

Page 21: DQSS Flow

[Flow diagram. Components: End User, Ontology Query, Quality Ontology, Screener, Masker. Items passed between them: user selection, ancillary file, data file, quality mask, data file w/ mask, screened data file.]

Page 22: DQSS Ontology (The Whole Enchilada)

Page 23: DQSS Ontology (Zoom)

Page 24: DQSS Status

Page 25: Significant Accomplishments

Release 1.0 of DQSS for AIRS Level 2 Standard Retrieval
  Technology Readiness Level (TRL) = 9 (for AIRS L2)
  Includes Quality Impact views
  Disclosure of Invention (NF 1679) filed
  Announced to AIRS Registered Data Product User Community (~770)

Papers and Presentations
  Managing Data Quality for Collaborative Science workshop (peer-reviewed paper)
  Sounder Science meeting
  ESDSWG Poster
  A-Train User Workshop (part of GES DISC presentation)
  AGU: Ambiguity of Data Quality in Remote Sensing Data

Ontology (v. 2.4) to accommodate both AIRS and MODIS L2

Page 26: Current Status

DQSS is operational for AIRS L2 Standard Products
  DQSS is offered through the Mirador data search interface* at the GES DISC

DQSS has been refactored to work in LAADS/MODAPS environment
  2 1/2 month delay due to refactoring and (successful) audit of NPR 7150.2 compliance by NASA’s Office of the Chief Engineer
  Schedule no longer has room for client-side screening
  But maybe...(to be continued)

* http://mirador.gsfc.nasa.gov

Page 27: Metrics

DQSS is included in web access logs sent to EOSDIS Metrics System (EMS)
  Tagged with “Protocol” DQSS
  EMS -> ESDS bridge implemented

Metrics from EMS:
  Users: 50
  Downloads: 9381

Page 28: DQSS Refactoring

                               GES DISC               MODAPS/LAADS
Synchronous vs. Asynchronous   Synchronous            Asynchronous batch post-process
Screening machines             Archive/Distribution   Processing minions (no outward connections allowed)

Page 29: Refactored DQSS encapsulates ontology and screening parameters

[Diagram. Components: LAADS Web GUI, DQSS Ontology Service, MODAPS Database, MODAPS Minions. Items passed between them: data product, screening options, selected options, screening parameters (XML).]

Encapsulation enables reuse in other data centers with diverse environments.
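To make the idea of passing screening parameters as XML more concrete, here is a sketch that parses a small parameter document with Python's standard library. The slides do not show the actual DQSS schema, so every element name, attribute name, and the variable name below is hypothetical and chosen only for illustration. `AIRX2RET` and `Qual_H2O` appear elsewhere in this presentation.

```python
import xml.etree.ElementTree as ET

# Hypothetical screening-parameter document; NOT the real DQSS schema.
SCREENING_XML = """
<screening product="AIRX2RET">
  <variable name="TotalPrecipitableWater" qualityField="Qual_H2O" maxQualityLevel="1"/>
</screening>
"""


def parse_screening_parameters(xml_text):
    """Turn the (hypothetical) XML into a dict of per-variable criteria."""
    root = ET.fromstring(xml_text.strip())
    return {
        variable.get("name"): {
            "quality_field": variable.get("qualityField"),
            "max_quality_level": int(variable.get("maxQualityLevel")),
        }
        for variable in root.findall("variable")
    }


params = parse_screening_parameters(SCREENING_XML)
print(params)
```

A self-describing document like this is all that a component with no outward connections needs in order to apply the user's choices, since it never has to query the ontology itself.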

Page 30: DQSS Plans

Page 31: 2010 Plans

Adding more products
  MODIS L2 Water Vapor – 2/2011
  MODIS L2 Clouds and Aerosols – 5/2011
  Microwave Limb Sounder (MLS) L2 – 6/2011
  Ozone Monitoring Instrument (OMI) L2 – 8/2011

Integrate with other services
  AIRS Variable Subsetting & NetCDF Reformatting – 3/2011
  OPeNDAP – 6/2011
  MODAPS Web Services – 10/2011

Refactor for reusability & sustainability – 9/2011

Release – 11/2011

No longer planned: client-side screening
  Except...refactored DQSS may make this possible...

Page 32: Combine DQSS with Other Services

AIRS
  Subsetting, quality screening and reformatting to netCDF are currently mutually exclusive
  Combining them will also raise awareness of quality screening
  DQSS will also be more usable to communities more comfortable with netCDF (e.g., modeling community)

OPeNDAP
  OPeNDAP, Inc. is working to support back-end calls to REST services
  Gateway is based on ACCESS-05 project (“CEOP Satellite Data Server”)
  Will allow access to DQSS via analysis and visualization clients (e.g., Panoply, IDV)
  May be reused for other back-end REST services

Page 33: Refactoring for Release

Proposal schedule included refactoring for long-term sustainability
  A key element of successful technology infusion

Current refactoring for MODIS environment...
  Gives us a head start
  May also give us an avenue to client-side capabilities

Page 34: New Client Side Concept

[Diagram. Parties: GES DISC, Data Provider, End User. Components: Ontology, Web Form, Screener, Masker. Items passed between them: dataset-specific instance info, screening criteria, XML files, data files.]

Page 35: DQSS Recap

Screening satellite data can be difficult and time consuming for users

The Data Quality Screening Service provides an easy-to-use service

The result should be:
  More attention to quality on users’ part
  More accurate handling of quality information…
  …With less user effort

Page 36: Backup Slides

Page 37: DQSS Target Uses and Users

                       Routine   Visualization   Quick Recon.   Machine-level   Metrics
Interdisciplinary      X         X
Educational            X         X
Expert/Power           ?                         X              X
Applications                                                    X
Algorithm Developers                                                            X

Page 38: ESDSWG Activities

Contributions to all 3 TIWG subgroups

Semantic Web:
  Contributed Use Case for Semantic Web tutorial @ ESIP
  Participation in ESIP Information Quality Cluster

Services Interoperability and Orchestration
  Combining ESIP OpenSearch with servicecasting and datacasting standards
  Spearheading ESIP Federated OpenSearch cluster
    5 servers (2 more soon); 4+ clients
    Now ESIP Discover cluster (see above)

Processes and Strategies:
  Developed Use Cases for Decadal Survey Missions: DESDynI-Lidar, SMAP, and CLARREO

Page 39: Projected TRL Breakdown

[Figure: projected Technology Readiness Level (TRL, axis from 5 to 9) by month from Dec-06 to Oct-07 for AIRS L2, MODIS, OPeNDAP + DQSS, and Client-Side Screening]