User Working Group 2013
Data Management System Status
12 March 2013
http://podaac.jpl.nasa.gov
Slide 2
PO.DAAC Functional Areas
- Data Management & Stewardship: Preserve NASA's data for the benefit of future generations.
- Data Access: Provide intuitive services to discover, select, extract and utilize data.
- Science Information Services: Provide a knowledgebase to help a broad user community understand and interpret satellite ocean data and related information.
Slide 3
PO.DAAC Functional Areas
- Data Access: Provide intuitive services to discover, select, extract and utilize data.
- Science Information Services: Provide a knowledgebase to help a broad user community understand and interpret satellite ocean data and related information.
- Data Management & Stewardship: Preserve NASA's data for the benefit of future generations.
Slide 4
High-Level Functions
[Diagram. Functions: Ingest, Archive, Inventory, Web Services & Publishing, Direct Data Access, High-Level Access Tools, Web Portal, Visualization. Actors: Data Providers, Information Providers, Consumers. Flows: Data In, Information In, Data Out, Information Out.]
Slide 6
Applicable 2012 UWG Recommendations
- Recommendation 4, Annual dataset gap analysis and prioritization: Closed. (Just noting that the DGAP adoption process is referenced in the DSLP.)
- Recommendation 6, Create public webpage that documents PO.DAAC's best practices: Open. (The Dataset Lifecycle Policy captures PO.DAAC best practices, but we have not yet gone public.)
- Recommendation 7, Creation of a dataset lifecycle policy: Closed. (Done; it is being applied to every new dataset, and we are refining and improving it as we uncover lessons learned.)
- Recommendation 8, Work with GHRSST on metadata practices: Closed. (Just noting that GHRSST is aware of our Dataset Lifecycle Policy, and is incorporating its basic constructs into their own approach.)
Slide 7
Outline
Purpose of this presentation: report the status of the Data Management System, covering business processes and IT infrastructure.
Presentation outline:
- Data Management System = BP + IT
- Catching up
- IT Infrastructure Status
- Business Process Status
Slide 8
BP + IT
PO.DAAC Data Management System = Business Processes + IT Infrastructure

Business Processes: includes policies, process descriptions, templates, procedures, etc. Examples:
- Overarching Dataset Lifecycle Policy
- Memorandum of Understanding template
- Data Management Plan template
- Database Audit procedure
- File Audit procedure
- Data Acceptance Policy
- Data Dictionary management process
- Dataset Types definitions
- Remote Dataset Policy/Approach

IT Infrastructure: includes hardware, software, networks, interfaces, etc. Examples:
- Site Crawler capability
- Data Handler capability
- Data Reader capability
- Data Dictionary implementation
- Data Catalog capability
- Data Archive capability
- Server Infrastructure
- Network Infrastructure
- Storage Infrastructure
Slide 9
System Integration: Business Process and IT Services
[Chart: maturity of BP and IT by year, 2009 to 2015.]
Slide 10
Infrastructure System Deliveries
- Already put significant effort into IT
- Solid baseline established in the first evolution
- Major pieces are in place and working smoothly
- That stability provides opportunities
- Reaping the benefits of a solid foundation
Slide 11
Business Practices
- IT is stabilized, so turning attention to business practices / processes
- Began with the Dataset Lifecycle Policy
- Working on tallying a list of all needed processes
Slide 12
Data Management System Business Processes (BPs)
- Overarching Dataset Lifecycle Policy
- Use Case / User Story Process
- DOI Process
- Handling Remote datasets
- Data Dictionary Management
- Data Types (definition and flow)
- Database Auditing
- Assessment & Characterization
- File Auditing
- Standard Documentation Process
- Memorandum of Understanding (template)
- Dataset Gap Analysis and Prioritization
- Integrated Schedule Development
- Data Management Plan (template)
- Interface Control Document (template)
- Operational Readiness Checklist (template)
- User Guide (template)
- System Impact Assessment (template)
- ReadMe (template)
- Retirement Process and Plan template
[Timeline, 2012 to 2015. Milestones: SDLC, Remote, A&C 1, DB Audit, DSLP 1, DSLP 2, Templates, Dictionary, UseCase, A&C 2, Types, DOIs, File Audit.]
Slide 13
Data Management System Business Processes (BPs)
Note: Green = defined processes. Orange = processes being actively worked.
- Overarching Dataset Lifecycle Policy
- Use Case / User Story Process
- DOI Process
- Handling Remote datasets
- Data Dictionary Management
- Data Types (definition and flow)
- Database Auditing
- Assessment & Characterization
- File Auditing
- Standard Documentation Process
- Memorandum of Understanding (template)
- Dataset Gap Analysis and Prioritization
- Integrated Schedule Development
- Data Management Plan (template)
- Interface Control Document (template)
- Operational Readiness Checklist (template)
- User Guide (template)
- System Impact Assessment (template)
- ReadMe (template)
- Retirement Process and Plan template
[Timeline, 2012 to 2015. Milestones: SDLC, Remote, A&C 1, DB Audit, DSLP 1, DSLP 2, Templates, Dictionary, UseCase, A&C 2, Types, DOIs, File Audit.]
Slide 14
BP: Dataset Lifecycle Policy
- Dataset Lifecycle Policy is written and active
- Worked via iterative discussions
- Document driven
- Consistent approach + best practices
- Follow the template, follow the policy.
- A living document
[Diagram labels: Best Practices, Documents, Templates; relationships: "assigned to", "controlled by".]
Slide 15
Lifecycle Phases and Documents
Phases: Dataset Identification, Prepare the System, Integration, Operations, Retirement
Documents: DGAP, IS, SIA, MOU, ICD, DMP, ORC, UG, RP
[Other diagram labels: On New Version, On Deprecation, Dataset Approval, Draft, 4, 7, 8, 6.]
Slide 16
BP: Data Types
- Next BP we're tackling is data types
- Came up in discussions of DSLP
- Clear we have a disconnect in both definition and flow
Slide 17
Data Types

Main Types* | Definition | Searchable? | Access
Preview | Give the public a sneak preview. Datasets should not be in this category beyond 18 months. WARNING file should be posted. | Yes | Anonymous
Open | Fully supported datasets with unrestricted community access. | Yes | Anonymous
Retired | Datasets that have been superseded, are obsolete or have an error. Per UWG recommendation, these data sets are retrievable but come with a health warning. | Varies by dataset | Anonymous
Reduced | Reduced or value-added data sets created by a PO.DAAC tool. | Varies by dataset | Anonymous
Shared | Temporary space for sharing non-operational datasets between science team members (e.g., vetting datasets). | No | Password
Controlled | ITAR or mission sensitive data sets. | Yes | Password

*Other miscellaneous types exist, for example: Dormant and Simulated
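
As an illustration only, the type table above could be encoded as a small lookup, with a check for the 18-month Preview limit. This is a minimal sketch, not PO.DAAC's implementation: the names `DatasetType`, `DATASET_TYPES`, `add_months` and `preview_overdue` are hypothetical, and "Varies by dataset" is represented as `None` by assumption.

```python
import calendar
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class DatasetType:
    name: str
    searchable: Optional[bool]  # None stands for "Varies by dataset" (assumption)
    access: str                 # "anonymous" or "password"


# Rows copied from the slide's table; the structure itself is hypothetical.
DATASET_TYPES = {
    t.name: t
    for t in (
        DatasetType("Preview", True, "anonymous"),
        DatasetType("Open", True, "anonymous"),
        DatasetType("Retired", None, "anonymous"),
        DatasetType("Reduced", None, "anonymous"),
        DatasetType("Shared", False, "password"),
        DatasetType("Controlled", True, "password"),
    )
}

PREVIEW_LIMIT_MONTHS = 18  # "should not be in this category beyond 18 months"


def add_months(d: date, months: int) -> date:
    """Return d shifted by whole calendar months, clamping the day if needed."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def preview_overdue(entered_preview: date, today: date) -> bool:
    """True if a dataset has stayed in Preview past the 18-month limit."""
    return today > add_months(entered_preview, PREVIEW_LIMIT_MONTHS)
```

For example, `preview_overdue(date(2012, 9, 12), date(2014, 3, 13))` flags a dataset that entered Preview on 12 September 2012 and is still there one day after the limit.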
Slide 18
Current Type Progression
- Mission Datasets: Shared, Open, Retired, Controlled
- Community Datasets: Preview, Open, Retired
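
A progression like the one on this slide can be sketched as an ordered list per dataset track. The arrows of the original diagram are not recoverable from the text, so this sketch assumes a simple left-to-right order and leaves out Controlled, whose place in the flow is unclear. `PROGRESSIONS` and `next_type` are hypothetical names.

```python
from typing import Optional

# Assumed linear order per track; the slide's actual arrows are not available.
# "Controlled" is omitted because its position in the flow is unclear.
PROGRESSIONS = {
    "mission": ["Shared", "Open", "Retired"],
    "community": ["Preview", "Open", "Retired"],
}


def next_type(track: str, current: str) -> Optional[str]:
    """Return the type that follows `current` on `track`, or None at the end."""
    steps = PROGRESSIONS[track]
    i = steps.index(current)  # raises ValueError for a type not on this track
    return steps[i + 1] if i + 1 < len(steps) else None
```

Keeping the order in data rather than in code would make it easy to revise once the types and the Dataset Lifecycle Policy are aligned.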
Slide 19
- Types and Lifecycle Policy not aligned
- Some types set visibility, others indicate progress
- Flow / progression doesn't align
- Recognize there's a problem, so working it
- Beginning with Remote Data Types