

A Platform to Provide International and Inter-Agency Support for Data and Information Quality Solutions and Best Practices

David F. Moroni 1 ([email protected]); H. K. (Rama) Ramapriyan 2 ([email protected]); Ge Peng 3

1 Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA; 2 Science Systems and Applications, Inc., Earth Science Data Information System Project, Goddard Space Flight Center, Greenbelt, MD; 3 Cooperative Institute for Climate and Satellites - North Carolina (CICS-NC), NC State University and NOAA National Centers for Environmental Information (NCEI), Asheville, NC, USA

Overview: The NASA Data Quality Working Group (DQWG) was initiated in March 2014 and continued through March 2017, completing three years of activities. While the efforts within this working group have been substantial from NASA's perspective, others outside of NASA and in the international arena have expressed a desire to participate in related activities. This helped to foster the reactivation of the Earth Science Information Partners (ESIP) Information Quality Cluster (IQC) in 2014, which continues to this day. The ESIP IQC has a broader scope than the DQWG, both in terms of context and its ability to provide open membership, which has resulted in growing collaboration from inter-agency and international participants. During this time, the IQC has evaluated new use cases, developed a technical manuscript summarizing its activities and plans for future work, and facilitated the review of use cases and recommendations from the DQWG and other groups. The IQC has formally introduced definitions of four aspects of information and data quality: scientific, product, stewardship, and service. The IQC has also defined high-level roles and responsibilities of major players, including data producers, for ensuring and improving data quality and usability. The IQC continues to advocate for use case submission and evaluation as an effective way to capture and better understand the needs, challenges, and capabilities of the Earth science data community.

Beginning in 2017 and moving into 2018, the DQWG and IQC are working together on a number of activities, including: identifying emerging technologies, practices, and solutions for information quality; engaging inter-agency and international communities; obtaining more direct feedback from Earth observation missions; extracting additional recommendations from new use cases beyond the NASA perspective; and publishing our findings and recommendations in white papers, conferences, and peer-reviewed journals.

Presented at International Ocean Vector Winds Science Team Meeting on May 2-4, 2017 in San Diego, CA. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise, does not constitute or imply its endorsement by the United States Government or the Jet Propulsion Laboratory, California Institute of Technology. These activities were carried out across multiple United States government-funded institutions (noted above) under contracts with the National Aeronautics and Space Administration (NASA) and the National Oceanic and Atmospheric Administration (NOAA). Government sponsorship acknowledged.

• Uncertainty/Error Assessment and Characterization: 18.2%
• Documentation and Metadata Accuracy and Completeness: 18.2%
• Improving/Defining Standards and Best Practices: 9.1%
• Improving Data Stewardship: 27.3%
• Enhancing Usability: 18.2%
• Enhancing Searchability/Distinguishability: 9.1%

Figure 2: DQ Interest Survey Primary Interest Area for 2017-2018 (survey shares paired with categories in the order shown on the chart)

              Capture   Describe   Facilitate Discovery   Enable Use
Science          9         16               9                 5
Product         11         18              10                 5
Stewardship      7         11               6                 6
Service          5         10               6                 5

Figure 4: ESIP IQC Use Case Evaluation Summary from 2016. Annotation: 20 use cases evaluated (including 16 legacy DQWG use cases) across DQ Management Phases (columns) and IQ aspects (rows). Some use cases overlap.
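The Figure 4 tallies can be sanity-checked with a short sketch. This is purely illustrative (not part of the poster): the counts are copied from the figure, and because use cases overlap, row and column totals exceed the 20 evaluated cases.

```python
# Illustrative sketch: the Figure 4 tallies encoded as a mapping from
# IQ aspect to its counts across the four DQ management phases.
phases = ["Capture", "Describe", "Facilitate Discovery", "Enable Use"]
counts = {
    "Science":     [9, 16, 9, 5],
    "Product":     [11, 18, 10, 5],
    "Stewardship": [7, 11, 6, 6],
    "Service":     [5, 10, 6, 5],
}

# Row totals: how many (overlapping) use cases touch each IQ aspect.
aspect_totals = {aspect: sum(row) for aspect, row in counts.items()}

# Column totals: how many use cases touch each DQ management phase.
phase_totals = {phase: sum(row[i] for row in counts.values())
                for i, phase in enumerate(phases)}

# Totals exceed the 20 evaluated use cases because use cases overlap.
print(aspect_totals)
print(phase_totals)
```

"Describe" is the most heavily exercised phase (55 touches), consistent with the survey emphasis on documentation and metadata.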

2014-2015: Sixteen Use Cases Relevant to the NASA Earth Science Data and Information System; 12 Prioritized Recommendations; 4 Low Hanging Fruit Recommendations

2015-2016: Implementation Strategies & Solutions Assessment Report; Solutions Master List; Data Call Template; ACCESS/AIST Engagement

2016-2017: Use Case Evaluation via Summer ESIP Session; Re-Use Readiness Framework

Figure 1: DQWG Historical Legacy of Milestones

Figure 6: Scope of Mutual Influence and Domain Knowledge

Figure 5: IQC Activities for 2015-2016

Organizations (Capability):
• Repository Processes Maturity (e.g., CMMI Data Management Maturity)
• Repository Procedures Maturity (e.g., ISO 16363:2012 trustworthiness)

Portfolios (Asset Management):
• Asset Management Maturity (e.g., National Geospatial Dataset Asset Maturity Model)

Individual Datasets (Practices):
• Stewardship Practices Maturity (e.g., NCEI/CICS-NC Data Stewardship Maturity Matrix)

Figure 8: Tiers of Scientific Data Stewardship Maturity

References:

Arndt, D. and M. Brewer, 2016: Assessing service maturity through end user engagement and climate monitoring. ESIP 2016 Summer Meeting, July 19-23, 2016, Durham, NC, USA.

Bates, J. J. and J. L. Privette, 2012: A maturity model for assessing the completeness of climate data records. Eos, Transactions of the AGU, 93(44), 441. doi:10.1029/2012EO440006

Bates, J. J., J. L. Privette, E. J. Kearns, W. J. Glance, and X. Zhao, 2015: Sustained Production of Multidecadal Climate Records: Lessons from the NOAA Climate Data Record Program. Bulletin of the American Meteorological Society. doi:10.1175/BAMS-D-15-00015.1

EUMETSAT, 2013: CORE-CLIMAX Climate Data Record Assessment Instruction Manual. Version 2, 25 November 2013.

EUMETSAT, 2015a: GAIA-CLIM Measurement Maturity Matrix Guidance: Gap Analysis for Integrated Atmospheric ECV Climate Monitoring: Report on system of systems approach adopted and rationale. Version: 27 Nov 2015.

EUMETSAT, 2015b: CORE-CLIMAX European ECV CDR Capacity Assessment Report. Version 1, 26 July 2015.

Peng, G., J. L. Privette, E. J. Kearns, N. A. Ritchey, and S. Ansari, 2015: A unified framework for measuring stewardship practices applied to digital environmental datasets. Data Science Journal, 13. doi:10.2481/dsj.14-049

Peng, G., J. Lawrimore, V. Toner, C. Lief, R. Baldwin, N. Ritchey, D. Brinegar, and S. A. Del Greco, 2016: Assessing Stewardship Maturity of the Global Historical Climatology Network-Monthly (GHCN-M) Dataset: Use Case Study and Lessons Learned. D-Lib Magazine, 22. doi:10.1045/november2016-peng

Ramapriyan, H., G. Peng, D. Moroni, and C.-L. Shie, 2016: Ensuring and Improving Information Quality for Earth Science Data and Products: Role of the ESIP Information Quality Cluster. SciDataCon 2016, 11-13 September 2016, Denver, CO, USA.

Zhou, L. H., M. Divakarla, and X. P. Liu, 2016: An Overview of the Joint Polar Satellite System (JPSS) Science Data Product Calibration and Validation. Remote Sensing, 8(2). doi:10.3390/rs8020139

Data/Product Maturity Matrix (EUMETSAT 2013; 2015a)
• Developed for assessing the capability of measurement and production systems for climate data records of essential climate variables
• Applied to 37 EU data records of essential climate variables (EUMETSAT 2015b)

Stewardship Maturity Matrix (Peng et al. 2015)
• Developed for assessing maturity of stewardship practices of environmental datasets
• Applied to over 750 NOAA Earth Science datasets (e.g., Peng et al. 2016)

Service Maturity Matrix (Arndt and Brewer 2016)
• Developed for assessing use and service maturity of environmental datasets
• Under development by the NOAA/NCEI Service Maturity Matrix Working Group

Science Maturity Matrix (Bates and Privette 2012)
• Developed for assessing the completeness of satellite climate data record (CDR) datasets
• Applied to 32 NOAA CDRs (Bates et al. 2015)

Figure 7: Dataset Lifecycle-Stages-Based Maturity Assessment Models
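A dataset-level maturity assessment of the kind these matrices produce is straightforward to represent in software. The sketch below is illustrative only and is not an official tool: the record type is in the spirit of the stewardship maturity matrix (Peng et al. 2015), the component names echo that framework, and the 1-5 level scale is assumed; the dataset name and ratings are invented.

```python
# Illustrative sketch of a dataset-level stewardship maturity record.
# Component names loosely follow Peng et al. (2015); the 1-5 scale is
# assumed, and the dataset name and ratings are invented for illustration.
from dataclasses import dataclass

@dataclass
class MaturityAssessment:
    dataset_id: str
    ratings: dict  # component name -> maturity level (1 = ad hoc ... 5 = highest)

    def validate(self) -> bool:
        # Levels outside 1-5 indicate a data-entry error.
        return all(1 <= level <= 5 for level in self.ratings.values())

    def summary(self) -> dict:
        levels = list(self.ratings.values())
        return {"min": min(levels), "max": max(levels),
                "mean": round(sum(levels) / len(levels), 2)}

assessment = MaturityAssessment(
    dataset_id="example-dataset",
    ratings={"Preservability": 4, "Accessibility": 3, "Usability": 3,
             "Data Quality Assessment": 2, "Transparency/Traceability": 4},
)
assert assessment.validate()
print(assessment.summary())
```

Keeping per-component levels rather than a single score matches how the matrices above are reported: the minimum-level component flags where stewardship practice needs attention.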

Mission Statement
Discover and assess data quality recommendations and solutions in the inter-agency and international arena to improve upon existing technologies, practices, and standards in support of end-to-end data lifecycle stewardship in the NASA Earth science domain.

Stakeholders
• NASA HQ
• ESDIS
• DAACs
• SIPSs
• ACCESS
• AIST
• MEaSUREs
• NASA Earth science instrument teams
• ESIP Information Quality Cluster (IQC)

Approach
• Conduct volunteer pilot study for "Data Call Template".
• Continued volunteer use case contribution and evaluation under leadership of the ESIP IQC.
• Identify new development concepts (e.g., AIST and ACCESS projects) that can be leveraged to facilitate data quality recommendations.
• Use the "Re-use Readiness Framework" to assess the DQWG-endorsed solutions provided in the Solutions Master List.
• Publish previous work in ESDIS-approved outlets.

Outcomes, Deliverables, Milestones
• Operational readiness assessment of "Data Call Template".
• An updated "Solutions Master List".
• Data Quality Section Template for DMP.
• Publish one or more of the highest-priority, most actionable of the previous recommendations/solutions as an ESO document.
• Publish consolidated recommendations from DQWG annual reports for persistent and public domain access and citation.

Figure 3: DQWG Action Plan for 2017-2018

Four aspects of information quality mapped to data lifecycle phases:
• Science: Define/Develop/Validate
• Product: Produce/Evaluate/Obtain
• Stewardship: Maintain/Preserve/Access
• Service: Use/User Service