
MATTHEW MAYERNIK PROJECT SCIENTIST / RESEARCH DATA


Page 1

SURVEY OF COMMONALITY WITH OTHER DISCIPLINES
WORKSHOP 2 – JULY 25, 2013
INDIANAPOLIS, INDIANA

MATTHEW MAYERNIK
PROJECT SCIENTIST / RESEARCH DATA SERVICES SPECIALIST
NCAR LIBRARY / UCAR INTEGRATED INFORMATION SERVICES
NATIONAL CENTER FOR ATMOSPHERIC RESEARCH (NCAR)
UNIVERSITY CORPORATION FOR ATMOSPHERIC RESEARCH (UCAR)
[email protected]

PRIMARY RESEARCH OR PRACTICE AREA(S):
• DATA CURATION
• DATA PUBLICATION & CITATION
• METADATA

PREVIOUS EXPERIENCE
• PH.D. UCLA – INFORMATION STUDIES

RELATED WORK (PROJECTS SPECIFIC TO WORKSHOP WITH WEB-SITES)
• Data citations within NCAR/UCP (http://dx.doi.org/10.5065/D6ZC80VN)
• Peer REview for Publication & Accreditation of Research Data in the Earth sciences (PREPARDE, http://www2.le.ac.uk/projects/preparde)

CONTACT INFORMATION:
P.O. Box 3000
Boulder, CO 80307-3000

Page 2

Geosciences

• Some very robust and high visibility data collections
  • Atmospheric/Oceanic: NOAA, NCAR
  • Geophysical: USGS, IRIS
  • NASA: Many DAACs
  • International: DKRZ (Germany), ECMWF (UK), NERC Environmental Data Centres (UK)
• Some established and widely used standards
  • Data – NetCDF, HDF, GRIB, BUFR, SEED, SAC…
  • Metadata – ISO (19139, 19115, 19119), FGDC, GCMD…
• Most importantly: tremendous diversity!

Page 3

NCAR/UCAR Scope

Image: copyright University Corporation for Atmospheric Research

Page 4

NCAR/UCAR Computing Facilities

Image: copyright University Corporation for Atmospheric Research

Page 5

NCAR/UCAR Data Services
http://www2.ucar.edu/research-resources/data-archive-services

Page 6

1. Guidelines – Near term

As a service, create and maintain community-developed data and software management guidelines.

Ø Too many ad hoc systems
Ø Complete systems: standards-based, sustainable, cost effective
Ø Understand Principles:
  Ø Preservation
  Ø Data lifecycle
  Ø Data and metadata standards
  Ø Data management planning
  Ø Provenance
  Ø Data citations and DOIs

Page 7

2. Archiving and Access – Long term

Create, adapt or identify an archiving and access system for research data and software that need, but do not currently have, sufficient, secure and publicly accessible repositories.

Ø Too many orphan datasets – not in managed repositories
Ø Scholarly publications and data, rightfully so, are becoming more tightly linked, e.g. DOIs
Ø Cost is a major consideration
Ø Flexible systems; what is the appropriate level of service?
Ø Assess what we have, then build or adapt to meet additional needs

Near term: quantitative and comparative assessment of existing systems, and their capacity for expansion

Page 8

3. Discovery – Long term

Create and maintain a unified and flexible system for discovery of UCAR publications, data, software, and services.

Ø Past, did well with centralized method for data (Community Data Portal), 10-yr old effort
Ø Need a new approach:
  Ø Expand, more data, publications, more software, and services
  Ø Convert to a distributed method
  Ø Add richness to the metadata standard
  Ø Sustainable!

Near term: Pilot projects that connect pairs of systems

Page 9

4. Preservation – Long term

Develop appropriate digital data preservation solution(s).

Ø Responsibility to preserve digital assets related to science
Ø Easily overlooked
Ø Cost is not always considered
Ø Need a suite of approaches
  Ø Dedicated UCAR archives
  Ø Cloud storage solutions (at UCAR or commercial)
Ø Challenge: sustaining management personnel and/or developing reliable self management tools

Page 10

5. External Integration – Long term

Prepare UCAR systems for greater integration with external distributed and federated systems.

Ø Federated? Data is mutually discoverable and accessible from multiple data systems.
  Ø From a single point a user can reach into multiple systems, either interactively (GUI) or interoperably (scripted, web service)
Ø Future, standards are key, participation is a must
Ø Challenges – many
  Ø Sharing storage access and computing
  Ø Authorization, authentication for users: security concerns
  Ø Managing large numbers of datasets

Near term: increase people involved with developing federations – progressively track and report on trends, etc.
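
The idea of federation on this slide, where a user reaches into multiple data systems from a single point, can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the `Catalog` class, the `federated_search` function, the catalog names, and the sample records are illustrative assumptions, not actual UCAR systems or APIs.

```python
from dataclasses import dataclass, field


@dataclass
class Catalog:
    """A hypothetical data system holding dataset records (title -> identifier)."""
    name: str
    records: dict = field(default_factory=dict)

    def search(self, keyword):
        """Return (title, identifier) pairs whose title contains the keyword."""
        kw = keyword.lower()
        return [(t, i) for t, i in self.records.items() if kw in t.lower()]


def federated_search(catalogs, keyword):
    """Query every catalog from one entry point and tag each hit with its source."""
    hits = []
    for catalog in catalogs:
        for title, identifier in catalog.search(keyword):
            hits.append({"source": catalog.name, "title": title, "id": identifier})
    return hits


# Made-up catalogs and records, for illustration only.
atmos = Catalog("atmos-archive", {"Global Reanalysis Winds": "ds-001"})
ocean = Catalog("ocean-archive", {"Ocean Surface Winds": "ds-102",
                                  "Sea Ice Extent": "ds-103"})

results = federated_search([atmos, ocean], "winds")
for hit in results:
    print(hit["source"], hit["title"], hit["id"])
```

In a real federation each `search` call would presumably go through a shared web service or metadata standard rather than an in-memory dictionary, which is one reason the slide stresses standards and participation.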

Page 11

6. Make Data Open and Machine Readable the New Default at UCAR – Long term

Make data accessible and machine readable by publishing an API(s) and providing a data service to the general public, entrepreneurs, policy & decision makers, and as supplementary data for our scientists in the field.

Ø Exploit data for mobile and web applications
Ø Establish and maintain robust API
Ø Collaborate extensively with public and private developers
Ø Supports societal needs, improves UCAR name recognition

Near term: Conferences, workshops, or visitor programs which bring together data providers and application developers.

Page 12

Conclusions:

Ø Many improvements to demonstrate our leadership in data services!
Ø Challenges:
  Ø Finding resources, largely human
  Ø Setting priorities – cannot do it all
  Ø Developing the most fruitful implementation plans
  Ø Creating sustainable methods and procedures
  Ø Things will change – are we planning in a flexible/adaptable manner?
Ø These data service issues are not exclusive to UCAR; they exist in all organizations to some degree.

Page 13

Thanks

Ø UCAR Information Technology Council, Data Services Working Group
  Ø Steve Worley – Lead
  Ø Matt Mayernik – co-lead
  Ø Mike Wright
  Ø Steve Williams
  Ø Gary Strand
  Ø Peter Schmitt
  Ø Marcos Hermida
  Ø Eric Nienhouse
  Ø Kelly Keene

Email: [email protected]