Upload
liliana-mart
View
217
Download
3
Tags:
Embed Size (px)
Citation preview
A Data Quality Screening Service for Remote Sensing
Data
Christopher Lynnes, NASA/GSFC (P.I.)Edward Olsen, NASA/JPL
Peter Fox, RPIBruce Vollmer, NASA/GSFCRobert Wolfe, NASA/GSFC
Contributions fromR. Strub, T. Hearty, Y-I Won, M. Hegde, S. Zednik, P. West,
N. Most, S. Ahmad, C. Praderas, K. Horrocks, I. Tchered, and A. Rezaiyan-Nojani
Advancing Collaborative Connections for Earth System Science (ACCESS) Program
Project Page: http://tw.rpi.edu/web/project/DQSS
2
Outline
Why a Data Quality Screening Service?
Making Quality Information Easier to Use via the Data Quality Screening Service (DQSS)
DEMO
DQSS Under the Hood
DQSS Status
DQSS Plans
3
Why a Data Quality Screening Service?
4
The quality of data can vary considerably
AIRS Variable Best (%)
Good (%)
Do Not Use (%)
Total Precipitable Water
38 38 24
Carbon Monoxide 64 7 29Surface Temperature
5 44 51Version 5 Level 2 Standard Retrieval Statistics
5
Quality schemes can be relatively simple…
Total Column Precipitable Water
Qual_H2O
Best Good Do Not Usekg/m2
6
…or they can be more complicated
Hurricane Ike, viewed by the Atmospheric Infrared Sounder (AIRS)
PBest : Maximum pressure for which quality value is
“Best” in temperature profiles
Air Temperatureat 300 mbar
7Quality flags are also sometimes packed together
into bytes
Cloud Mask Status Flag0=Undetermined1=Determined
Cloud Mask Cloudiness Flag0=Confident cloudy1=Probably cloudy2=Probably clear3=Confident clear
Day / Night Flag0=Night1=Day
Sunglint Flag0=Yes1=No
Snow/ Ice Flag0=Yes1=No
Surface Type Flag0=Ocean, deep lake/river1=Coast, shallow lake/river2=Desert3=Land
Big-endian arrangement for the Cloud_Mask_SDS variable in atmospheric products from Moderate Resolution Imaging
Spectroradiometer (MODIS)
8
Current user scenarios...
Nominal scenario Search for and download data Locate documentation on handling quality Read & understand documentation on quality Write custom routine to filter out bad pixels
Equally likely scenario (especially in user communitiesnot familiar with satellite data) Search for and download data Assume that quality has a negligible effect
Repeat for
each user
9The effect of bad qualitydata is often not
negligible
Total Column Precipitable Water Quality
Best Good Do Not Usekg/m2
Hurricane Ike, 9/10/2008
10Neglecting quality may introduce bias (a more subtle
effect)AIRS Relative Humidity Comparison against Dropsonde with and without Applying PBest Quality Flag Filtering
Boxed data points indicate AIRS RH data with dry bias > 20%
From a study by Sun Wong (JPL) on specific humidity in the Atlantic Main Development Region for Tropical
Storms
11Percent of Biased Data in MODIS Aerosols Over Land Increases as Confidence Flag
Decreases
Bad
Marginal
Good
Very Good
0% 20% 40% 60% 80% 100%
Compliant*Biased LowBiased High
*Compliant data are within + 0.05 + 0.2τAeronet
Statistics derived from Hyer, E., J. Reid, and J. Zhang, 2010, An over-land aerosol optical depth data set for data assimilation by filtering, correction, and aggregation of MODIS Collection 5 optical depth retrievals, Atmos. Meas. Tech. Discuss., 3, 4091–4167.
12
Making Quality Information Easier to Use
via the Data Quality Screening Service (DQSS)
.
13
DQSS Team
P.I.: Christopher Lynnes
Software Implementation: Goddard Earth Sciences Data and Information Services Center Implementation: Richard Strub Local Domain Experts (AIRS): Thomas Hearty and Bruce
Vollmer
AIRS Domain Expert: Edward Olsen, AIRS/JPL
MODIS Implementation Implementation: Neal Most, Ali Rezaiyan, Cid Praderas,
Karen Horrocks, Ivan Tchered Domain Experts: Robert Wolfe and Suraiya Ahmad
Semantic Engineering: Tetherless World Constellation @ RPI Peter Fox, Stephan Zednik, Patrick West
14The DQSS filters out bad pixels for the user
Default user scenario Search for data Select science team recommendation for quality
screening (filtering) Download screened data
More advanced scenario Search for data Select custom quality screening parameters Download screened data
15
DQSS replaces bad-quality pixels with fill values
Mask based on user criteria(Quality level
< 2)
Good quality data pixels
retained
Output file has the same format and structure as the input file (except for extra mask and original_data fields)
Original data array(Total column precipitable water)
16Visualizations help users see the effect of different quality filters
17DQSS can encode the science team recommendations on
quality screening
AIRS Level 2 Standard Product Use only Best for data assimilation uses Use Best+Good for climatic studies
MODIS Aerosols Use only VeryGood (highest value) over
land Use Marginal+Good+VeryGood over
ocean
18
Initial settings are based on Science Team recommendation.
(Note: “Good” retains retrievals that Good or better).
You can choose settings for all parameters at once...
...or variable by variable
Or, users can select their own criteria...
19
DEMO
http://mirador.gsfc.nasa.gov (Search for ‘AIRX2RET’)
20
DQSS Under the Hood
21
DQSS Flow
user selection
ancillary file
Screener
Quality Ontology
data file
quality mask
screened data file
EndUser
data file w/ mask
Masker
Ontology Query
22DQSS Ontology(The Whole Enchilada)
23
DQSS Ontology (Zoom)
24
DQSS Status
25Significant Accomplishments
Release 1.0 of DQSS for AIRS Level 2 Standard Retrieval Technology Readiness Level (TRL) = 9 (for AIRS L2) Includes Quality Impact views Disclosure of Invention (NF 1679) filed Announced to AIRS Registered Data Product User Community
(~770)
Papers and Presentations Managing Data Quality for Collaborative Science workshop (peer-
reviewed paper) Sounder Science meeting ESDSWG Poster A-Train User Workshop (part of GES DISC presentation) AGU: Ambiguity of Data Quality in Remote Sensing Data
Ontology (v. 2.4) to accommodate both AIRS and MODIS L2
26
Current Status
DQSS is operational for AIRS L2 Standard Products DQSS is offered through the Mirador data
search interface* at the GES DISC
DQSS has been refactored to work in LAADS/MODAPS environment 21/2 month delay due to refactoring and
(successful) audit of NPR 7150.2 compliance by NASA’s Office of the Chief Engineer
Schedule no longer has room for client-side screening
But maybe...(to be continued)* http://mirador.gsfc.nasa.gov
27
Metrics
DQSS is included in web access logs sent to EOSDIS Metrics System (EMS) Tagged with “Protocol” DQSS EMS -> ESDS bridge implemented
Metrics from EMS: Users: 50 Downloads: 9381
28
DQSS Refactoring
GES DISC MODAPS/LAADS
Synchronous vs. Asynchronous
Synchronous Asynchronous batch post-process
Screening machines
Archive/Distribution
Processing minions (no outward connections allowed)
29Refactored DQSS encapsulates ontology and
screening parameters
MODAPS Minions
LAADSWeb GUI
DQSS Ontolog
y Service
data product
screening options
MODAPSDatabase
screeningparameters
(XML)
selected options
screening parameters
(XML)
screening parameters
(XML)
Encapsulation enables reuse in other data centers with diverse environments.
30
DQSS Plans
31
2010 Plans
Adding more products MODIS L2 Water Vapor – 2/2011 MODIS L2 Clouds and Aerosols – 5/2011 Microwave Limb Sounder (MLS) L2 – 6/2011 Ozone Monitoring Instrument (OMI) L2 – 8/2011
Integrate with other services AIRS Variable Subsetting & NetCDF Reformatting – 3/2011 OPeNDAP – 6/2011 MODAPS Web Services – 10/11
Refactor for reusability & sustainability – 9/2011
Release – 11/2011
No longer planned: client-side screening Except...refactored DQSS may make this possible...
32
Combine DQSS with Other Services
AIRS Subsetting, quality screening and reformatting to netCDF are currently mutually exclusive Combining them will also raise awareness of quality
screening DQSS will also be more usable to communities more
comfortable with netCDF (e.g., modeling community)
OPeNDAP OPeNDAP, Inc. is working to support back-end calls to
REST services Gateway is based on ACCESS-05 project (“CEOP
Satellite Data Server”) Will allow access to DQSS via analysis and
visualization clients (e.g., Panoply, IDV) May be reused for other back-end REST services
33
Refactoring for Release
Proposal schedule included refactoring for long-term sustainability A key element of successful technology
infusion
Current refactoring for MODIS environment... Gives us a head start May also give us an avenue to client-side
capabilities
34
New Client Side Concept
Ontology
Web Form
Data Provider
Masker
Screener
dataset-specific
instance infodata files
screening criteria
XML File
XML File
data files
GES DISC End User
35
DQSS Recap
Screening satellite data can be difficult and time consuming for users
The Data Quality Screening System provides an easy-to-use service
The result should be: More attention to quality on users’ part More accurate handling of quality information… …With less user effort
36
Backup Slides
37DQSS Target Uses and Users
RoutineVisual-ization
Quick Recon.
Machine-level Metrics
Interdisciplinary X X
Educational X X
Expert/Power ? X X
Applications X
Algorithm Developers X
38
ESDSWG Activities
Contributions to all 3 TIWG subgroups Semantic Web:
Contributed Use Case for Semantic Web tutorial @ ESIP Participation in ESIP Information Quality Cluster
Services Interoperability and Orchestration Combining ESIP OpenSearch with servicecasting and
datacasting standards Spearheading ESIP Federated OpenSearch cluster
5 servers (2 more soon); 4+ clients Now ESIP Discover cluster (see above)
Processes and Strategies: Developed Use Cases for Decadal Survey Missions:
DESDynI-Lidar, SMAP, and CLARREO
39
Projected TRL Breakdown
Dec-0
6
Jan-
07
Feb-
07
Mar
-07
Apr-0
7
May
-07
Jun-
07
Jul-0
7
Aug-
07
Sep-
07
Oct-0
75
6
7
8
9
AIRS L2MODISOPeNDAP + DQSSClient-Side Screening