MPARWG Deborah K Smith DISCOVER MEaSUREs Project Remote Sensing Systems



First Stab at Program Level Quality Metrics


Deborah K Smith, DISCOVER MEaSUREs Project

Remote Sensing Systems

ESDSWG Mtg, New Orleans, Oct 20-22 2010

Previous Program Level Quality Metrics Dialog

Dr Frouin’s Criteria Table

Strawman tables in Greg Hunolt’s background notes

Telecon in August, which suggested answering a series of questions as a means of determining quality. Do the answers to those questions lead to a metric of overall quality?

Peter’s presentation this morning on quality feedback from the survey, and his opinions.


A "straw-man proposal", a brainstormed simple proposal intended to generate discussion of its disadvantages and to provoke the generation of new and better proposals. Often, a straw man document will be prepared by one or two people prior to kicking off a larger project. In this way, the team can jump start their discussions with a document that is likely to contain many, but not all the key aspects to be discussed. -Wikipedia


A project with heritage. We’ve been producing microwave ocean data products and making them available since 1996 under NASA Pathfinder funding.

We have produced, and continue to produce and distribute, many ocean products. Each product is assessed for quality before release to the public.

We are currently releasing F16 and F17 SSMIS ocean products. I will use these as an example in this talk.



What Questions Do We Answer Before Distributing Data Products?

1. Is the data set complete?

2. Are any gaps confirmed and documented?

3. Are the data acceptably intercalibrated to previous data?

4. Do the data products look as expected? (Has a human eye checked the data set?)

5. Are overall statistics within expected range?

6. Are statistics for sub-regions or sub-time frames consistent with expectations and previous data?

7. Are comparison statistics with “truth” or other data (such as buoy, ship or model winds) within expected range?

8. Is the data format consistent with previous data and what users expect?

9. Are files read correctly by read routines, and if not, have changes been made?

10. Have we completed or updated product documentation?


11. Have we informed the users of file format, processing steps, algorithm changes/specifics?

12. Is a data validation file produced?

13. Has all web and ftp text been updated?

14. Have images been made and do the web tools to display them work correctly?

15. Have we described to users the differences to expect?

16. Is the data product like any other available, and if so, how does it compare?


17. For whom have we created this data set, and will it meet those users’ needs?

18. Do our tools work on the new data products?

19. Are any new tools needed?

20. What advances have occurred since we last asked these questions, and should we change anything in response?


Important Climate Questions

Have we checked the data within the extended time series?

Are there any spurious trends in the data?
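Several of the questions above (overall statistics, comparisons with “truth”, spurious trends) come down to a few summary numbers. As a rough sketch only, the Python below computes the mean bias and spread of satellite-minus-reference differences, then fits a least-squares line to those differences to expose any drift. The function names and sample values are made up for illustration; they are not the project's code or real SSMIS or buoy data, and what counts as an "expected range" is left to the product team.

```python
from statistics import mean, stdev


def comparison_stats(satellite, reference):
    """Return (mean bias, standard deviation) of satellite-minus-reference differences."""
    diffs = [s - r for s, r in zip(satellite, reference)]
    return mean(diffs), stdev(diffs)


def linear_trend(times, values):
    """Return the ordinary least-squares slope of values against times."""
    t_bar, v_bar = mean(times), mean(values)
    num = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    den = sum((t - t_bar) ** 2 for t in times)
    return num / den


# Hypothetical collocated wind speeds (m/s), one pair per year; illustrative only.
years = [0.0, 1.0, 2.0, 3.0, 4.0]
sat_wind = [7.1, 7.3, 7.2, 7.6, 7.8]
buoy_wind = [7.0, 7.1, 6.9, 7.2, 7.3]

bias, spread = comparison_stats(sat_wind, buoy_wind)
# A slope in the differences well away from zero would suggest a spurious trend.
drift = linear_trend(years, [s - b for s, b in zip(sat_wind, buoy_wind)])
print(f"bias={bias:.2f} m/s, spread={spread:.2f} m/s, drift={drift:.2f} m/s per year")
```

In this toy series the satellite-minus-buoy difference grows steadily from year to year, which is the kind of pattern the spurious-trend question is meant to catch.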


How to Develop Program Level Quality Metrics?

But what do the answers to these questions mean?

What does the program want to know?

Is an external body needed to determine the quality? If so, who? (program? DAAC? Other scientists? General public?)





| Level | To what degree have the data been validated? | To what degree do the data fit within already existing products? | To what degree is the data set complete and consistently processed? | To what degree is the data set used in the community? | To what degree are the data accurate and precise? |
| --- | --- | --- | --- | --- | --- |
| 4 | Comparisons by both the project and other scientists find similar results | Time series analyses demonstrate the quality of intercalibration | Data set complete, with documentation of known gaps, and consistently processed; extends previous values | Some are redistributing the data / Power users | Measurements are both accurate and precise, and derived values show expected and accepted ranges |
| 3 | Comparisons to known references | Extensive intercalibration performed | Data set complete and consistently processed | Many routine users | Accuracy and precision have been demonstrated by many |
| 2 | Comparisons to other data products | Preliminary intercalibration performed | Data set consistently processed but has gaps | Many users | Preliminary point comparisons show data are accurate and precise |
| 1 | No comparisons made by the project | No intercalibration, but biases are known | Data set has gaps and is not consistently processed | Few users | No assessment has been made |


Document Quality Level

| Level | To what degree is the data format described and usable? | To what degree are the algorithm and processing steps described? | To what degree can a new user begin using the data? |
| --- | --- | --- | --- |
| 4 | Data format is a commonly used standard accessible by 3rd party software | Peer-reviewed publication describes algorithm and data production | Can be using the data within 15 minutes and can confirm the data are accurate |
| 3 | Data format is described and read routines provided to users | ATBD provided by data producer | Can use the data accurately within a reasonable amount of time |
| 2 | Data format is described | Informal description provided with data | Time required to understand how to use the data |
| 1 | No data format information is provided | No description provided | Many hours and support emails needed to make sense of the data |


Accessibility / Support Services

| Level | How easy is it for users to read the data? | How easy is it for users to obtain the data? | To what degree can the user get help? |
| --- | --- | --- | --- |
| 4 | Widely used standard file format | User can access data with clients such as OpenDAP | Full support services available |
| 3 | Limited use file format, some common tools can be used | User can view and obtain the data using common tools | User can email the data producer |
| 2 | Simple binary format with read routines | User can view the data on the web | Online FAQ |
| 1 | Format described but no read routines available | User must download and write own program to access data | No help |


Overall Quality Level

| Level | To what degree are users satisfied with the data product? | What is the score from the sub-tables? | To what degree is the targeted community using the data?** | To what degree is the broader community using the data? |
| --- | --- | --- | --- | --- |
| 4 | Few questions to support services | 10-12 | Significant use | Routinely used by journalists, students, applications |
| 3 | Some questions about data | 7-10 | Routine use | Used by journalists |
| 2 | Questions about data and format | 4-7 | Some use | Some use by universities and researchers etc |
| 1 | Many questions | 0-3 | No use | No use is evident |

** this can be obtained from user metrics already being collected and from citation metrics
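One way to read the "score from sub-tables" column is as the sum of the levels (1 to 4) assigned in the three preceding tables, which gives a maximum of 12. That reading is an assumption, not something the slides state. The function names below are likewise hypothetical, and so is the rule for shared boundaries: the bands on the slide (0-3, 4-7, 7-10, 10-12) overlap at 7 and 10, and this sketch assigns a boundary score to the higher level.

```python
def sub_table_score(data_quality, documentation, accessibility):
    """Sum three sub-table levels, each expected to be an integer from 1 to 4.

    Assumption: the sub-table score is a plain sum; the slides do not define it.
    """
    levels = (data_quality, documentation, accessibility)
    for level in levels:
        if level not in (1, 2, 3, 4):
            raise ValueError(f"level must be 1, 2, 3 or 4, got {level!r}")
    return sum(levels)


def overall_level(score):
    """Map a sub-table score to an overall level using the bands on the slide.

    Assumption: a score on a shared boundary (7 or 10) goes to the higher level.
    """
    if not 0 <= score <= 12:
        raise ValueError(f"score must be between 0 and 12, got {score!r}")
    if score >= 10:
        return 4
    if score >= 7:
        return 3
    if score >= 4:
        return 2
    return 1


# Example: strong data quality, good documentation, good accessibility.
score = sub_table_score(4, 3, 3)
print(score, overall_level(score))
```

A product team would still need to agree on whether a plain sum is appropriate or whether some sub-tables should carry more weight.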


What’s Necessary if We are Going to Create Quality Metrics?

Agreed-upon definitions of all terms

An understanding of how (if at all) questions carry meaning across projects

An understanding of what Martha wants

An understanding of the value of this information and how to communicate it

Agreement on what the questions will be
