18
Ambiguity of Quality in Ambiguity of Quality in Remote Sensing Data Remote Sensing Data Christopher Lynnes, NASA/GSFC Christopher Lynnes, NASA/GSFC Greg Leptoukh, NASA/GSFC Greg Leptoukh, NASA/GSFC Funded by: NASA’s Advancing Collaborative Connections for Earth System Science (ACCESS)

Ambiguity of Quality in Remote Sensing Data Christopher Lynnes, NASA/GSFC Greg Leptoukh, NASA/GSFC Funded by : NASA’s Advancing Collaborative Connections

Embed Size (px)

Citation preview

Page 1: Ambiguity of Quality in Remote Sensing Data Christopher Lynnes, NASA/GSFC Greg Leptoukh, NASA/GSFC Funded by : NASA’s Advancing Collaborative Connections

Ambiguity of Quality in Ambiguity of Quality in Remote Sensing DataRemote Sensing Data

Christopher Lynnes, NASA/GSFCChristopher Lynnes, NASA/GSFCGreg Leptoukh, NASA/GSFCGreg Leptoukh, NASA/GSFC

Funded by: NASA’s Advancing Collaborative Connections for Earth System Science (ACCESS)

Page 2: Ambiguity of Quality in Remote Sensing Data Christopher Lynnes, NASA/GSFC Greg Leptoukh, NASA/GSFC Funded by : NASA’s Advancing Collaborative Connections

Data Quality GoalData Quality Goal

Describe data quality so that users understand: What it means for the data How to use quality information for data

selection and analysis

Why so difficult? Data Quality depends on intended use Data Quality = Fitness for Purpose

Page 3: Ambiguity of Quality in Remote Sensing Data Christopher Lynnes, NASA/GSFC Greg Leptoukh, NASA/GSFC Funded by : NASA’s Advancing Collaborative Connections

Satellite data: swaths and gridsSatellite data: swaths and grids

Level 2Swath

Level 3Grid

Temporal and Spatial

Aggregation

Page 4: Ambiguity of Quality in Remote Sensing Data Christopher Lynnes, NASA/GSFC Greg Leptoukh, NASA/GSFC Funded by : NASA’s Advancing Collaborative Connections

A Survey of Data Quality at A Survey of Data Quality at Different Levels of AggregationDifferent Levels of Aggregation

Type of Data

Data Value Data File Dataset

Level 1-2 Swath

•Quality Control flags•Residual errors

Statistical summary of QC info

Aggregate accuracy relative to validation sites

Level 3 Grid

•Cell std. dev.•Number of input values

Statistical summary of QC info

??

Page 5: Ambiguity of Quality in Remote Sensing Data Christopher Lynnes, NASA/GSFC Greg Leptoukh, NASA/GSFC Funded by : NASA’s Advancing Collaborative Connections

Contrasting dataset-level vs. Contrasting dataset-level vs. data-value-level Use Casesdata-value-level Use Cases

Give me just the good- quality data

values

Tell me how good the dataset is

Data Provider

Quality Control

Quality Assessment

aka Quality Assurance

QA

Page 6: Ambiguity of Quality in Remote Sensing Data Christopher Lynnes, NASA/GSFC Greg Leptoukh, NASA/GSFC Funded by : NASA’s Advancing Collaborative Connections

Data-ValueData-ValueQuality ControlQuality Control

Page 7: Ambiguity of Quality in Remote Sensing Data Christopher Lynnes, NASA/GSFC Greg Leptoukh, NASA/GSFC Funded by : NASA’s Advancing Collaborative Connections

The Data Quality Screening Service: a The Data Quality Screening Service: a straightforward example of data-value quality straightforward example of data-value quality

controlcontrol

Mask based on user criteria(Quality level

< 2)

Good quality data pixels

retained

Output file has the same format and structure as the input file, with fill values replacing the low-quality data

Original data array(Total column precipitable water)

Page 8: Ambiguity of Quality in Remote Sensing Data Christopher Lynnes, NASA/GSFC Greg Leptoukh, NASA/GSFC Funded by : NASA’s Advancing Collaborative Connections

Or, maybe not so Or, maybe not so straightforward...straightforward...

AIRS* Quality Indicators

MODIS** Confidence Flags

0 Best1 Good2 Do Not Use

3 Very Good2 Good1 Marginal0 Bad

?

?

* Atmospheric Infrared Sounder ** Moderate Resolution ImagingSpectroradiometer

Page 9: Ambiguity of Quality in Remote Sensing Data Christopher Lynnes, NASA/GSFC Greg Leptoukh, NASA/GSFC Funded by : NASA’s Advancing Collaborative Connections

Match up by recommendation?Match up by recommendation?

AIRS Quality Indicators

MODIS Aerosols Confidence Flags

0 Best Data Assimilation

1 Good Climatic Studies2 Do Not Use

Use these flags in order to stay within expected error

bounds

Use these flags in order to stay within expected error

bounds

3 Very Good2 Good1 Marginal0 Bad

3 Very Good2 Good1 Marginal0 Bad

Ocean Land

±0.05 ± 0.15 ±0.03 ± 0.10 Ocean Land

Page 10: Ambiguity of Quality in Remote Sensing Data Christopher Lynnes, NASA/GSFC Greg Leptoukh, NASA/GSFC Funded by : NASA’s Advancing Collaborative Connections

How do quality control indicators How do quality control indicators relate to dataset quality relate to dataset quality

assessment?assessment?AIRS Relative Humidity Comparison against Dropsonde with and without Applying PBest Quality Flag Filtering

Boxed data points indicate AIRS RH data with dry bias > 20%

From a study by Sun Wong (JPL) on specific humidity in the Atlantic Main Development Region for Tropical

Storms

Page 11: Ambiguity of Quality in Remote Sensing Data Christopher Lynnes, NASA/GSFC Greg Leptoukh, NASA/GSFC Funded by : NASA’s Advancing Collaborative Connections

Aggregate QualityAggregate Quality

Page 12: Ambiguity of Quality in Remote Sensing Data Christopher Lynnes, NASA/GSFC Greg Leptoukh, NASA/GSFC Funded by : NASA’s Advancing Collaborative Connections

The Dubious Meaning of File-The Dubious Meaning of File-Level Quality StatisticsLevel Quality Statistics

Study Area

Percent Cloud Cover?

Page 13: Ambiguity of Quality in Remote Sensing Data Christopher Lynnes, NASA/GSFC Greg Leptoukh, NASA/GSFC Funded by : NASA’s Advancing Collaborative Connections

Level 3 grid cell standard deviation is Level 3 grid cell standard deviation is difficult to interpret due to its dependence difficult to interpret due to its dependence

on magnitudeon magnitude

Page 14: Ambiguity of Quality in Remote Sensing Data Christopher Lynnes, NASA/GSFC Greg Leptoukh, NASA/GSFC Funded by : NASA’s Advancing Collaborative Connections

Neither pixel count nor standard deviation Neither pixel count nor standard deviation fully fully express how representative the grid express how representative the grid

cell value iscell value isMODIS Aerosol Optical Thickness at 0.55 microns

Level 3 Grid AOT Mean

Level 2 Swath

Level 3 Grid AOT Standard Deviation

Level 3 Grid AOT Input Pixel Count

0 1 2 0 10.5

1 122

Page 15: Ambiguity of Quality in Remote Sensing Data Christopher Lynnes, NASA/GSFC Greg Leptoukh, NASA/GSFC Funded by : NASA’s Advancing Collaborative Connections

Potential Solutions to Data Potential Solutions to Data Quality AmbiguityQuality Ambiguity

Page 16: Ambiguity of Quality in Remote Sensing Data Christopher Lynnes, NASA/GSFC Greg Leptoukh, NASA/GSFC Funded by : NASA’s Advancing Collaborative Connections

Solution Part 1: Harmonize Solution Part 1: Harmonize Quality TermsQuality Terms

ISO 19115

Committee on Earth Observation Satellites Quality Assurance for Earth Observations (QA4EO) Working Group on Cal/Val (WGCV)

ESIP Information Quality Cluster

Quality Ontology Data Quality Screening Service: Data-value Quality

Control Aerostat: Level 2 Bias Multi-sensor Data Synergy Advisor: Level 2+3 Bias,

accuracy

Page 17: Ambiguity of Quality in Remote Sensing Data Christopher Lynnes, NASA/GSFC Greg Leptoukh, NASA/GSFC Funded by : NASA’s Advancing Collaborative Connections

Solution Part 2: Address more Solution Part 2: Address more dimensions of Qualitydimensions of Quality

Accuracy: bias + dispersion Accuracy of data with low-quality flags Accuracy of grid cell aggregations

Consistency Temporal: trends, discontinuities Spatial Observing Conditions

Completeness Temporal Spatial: Coverage and Grid Cell

Representativeness Observing conditions

Page 18: Ambiguity of Quality in Remote Sensing Data Christopher Lynnes, NASA/GSFC Greg Leptoukh, NASA/GSFC Funded by : NASA’s Advancing Collaborative Connections

Solution Part 3: Address Fitness Solution Part 3: Address Fitness for Purpose Directlyfor Purpose Directly

Standardize terms of recommendation

Enumerate more positive realms and examples

Enumerate negative examples

For a given quality term or quantity, does it help the user assess fitness for

purpose?