Implementation Review 1
Archive Ingest Redesign
March 14, 2003
Implementation Review 2
Archive Ingest Redesign high-level requirements
  Port Ingest system from Open VMS to Unix
    Ingest will be the last remaining back-end function on Open VMS.
    Ingest will run under Solaris on the 15k
  Make Ingest scalable for future increase in data volume post-SM4
  Improve throughput and reliability
  Decouple Ingest from Distribution software for ease of operation and maintenance
  Improve system maintainability
    Facilitate Ingest changes that are driven by changes in data structure during science instrument lifetimes.
Implementation Review 3
Current OPUS data processing / DADS Ingest interface
  Historically, data processing and archive systems have developed independently.
    Data processing system went from PODPS to OPUS.
    Archive system went from DMF to ST-DADS.
    In the past, these systems have not even operated within the same security environment.
  This paradigm does not work with the current archive philosophy.
    On-the-Fly Reprocessing (OTFR) requires integration of data processing and archive distribution functionality.
    Enhanced data processing, particularly database catalogs, requires closer coupling of data processing and archive systems.
  To address this change, software maintenance for data processing and archive systems is now in one branch.
Implementation Review 4
Current OPUS data processing / DADS Ingest interface (cont.)
[Diagram: current data flow from OPUS to DADS]
  Components: PACOR-A; HST Science Data Receipt; Science Data Processing; Calibration; Ingest; Archive Catalog; NSA; MO disk
  Data items: pod file; uncalibrated FITS dataset; calibrated FITS dataset; metadata
  System labels: OPUS; DADS
Implementation Review 5
Ingest Functionality
  Extract metadata from data header keyword values and populate archive science catalog
  Write data files to archive storage media
  Catalog location and properties of files in archive database
  Validate integrity of data files
  Set proprietary status of data files
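
The functions listed above can be sketched in a few lines. This is only an illustration, not the Ingest design: it assumes standard 80-character FITS header cards, uses the standard FITS keywords INSTRUME and DATE-OBS, and invents the catalog field names, the SHA-256 integrity check, and the 365-day proprietary period for the sake of the example.

```python
import hashlib
from datetime import date, timedelta

CARD_LEN = 80  # a FITS header card is 80 characters long


def parse_header(header: str) -> dict:
    """Collect keyword = value pairs from a FITS-style header string.

    Simplified: ignores continuation cards and doubled quotes in strings.
    """
    values = {}
    for i in range(0, len(header), CARD_LEN):
        card = header[i:i + CARD_LEN]
        keyword = card[:8].strip()
        if keyword == "END":
            break
        if card[8:10] != "= ":
            continue  # COMMENT, HISTORY, or blank card: no value to extract
        raw = card[10:]
        if raw.lstrip().startswith("'"):
            start = raw.index("'") + 1
            end = raw.index("'", start)
            values[keyword] = raw[start:end].rstrip()
        else:
            # Drop the trailing "/ comment" part of the card, if any.
            values[keyword] = raw.split("/", 1)[0].strip()
    return values


def catalog_row(file_name: str, data: bytes, header: str,
                proprietary_days: int = 365) -> dict:
    """Build one hypothetical catalog record for a data file."""
    meta = parse_header(header)
    obs_date = date.fromisoformat(meta["DATE-OBS"])
    return {
        "file_name": file_name,                      # location of the file
        "size_bytes": len(data),                     # file property
        "sha256": hashlib.sha256(data).hexdigest(),  # integrity check value
        "instrument": meta.get("INSTRUME"),          # metadata from header
        "date_obs": obs_date.isoformat(),
        # Proprietary status expressed as a release date (assumed policy).
        "release_date": (obs_date + timedelta(days=proprietary_days)).isoformat(),
    }
```

A caller would pass the header text and file bytes of one dataset and store the returned record in whatever catalog table the real system defines.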
Implementation Review 6
Goals of Ingest Redesign project
  Make Ingest more compatible with current science instrument design
    It is almost impossible to enhance the fragile Open VMS DADS system for new science instruments without breaking existing functionality.
  Bring Ingest requirements up to date
    No longer support GEIS format in archive
    Create final archive for HST first generation science instruments
    No ingest of raw engineering data or subset engineering data
      CCS is now HST engineering data archive
  Improve operator control of the system
Implementation Review 7
Status of Ingest Redesign project
Ingest Ops Concept complete and distributed on February 20, 2003
Requirement definition in progress
Implementation Review 8
Highlights of Ingest Ops Concept
  Represents a significant simplification in the data system architecture
    Deploy Ingest as a natural extension of data processing pipelines.
  Build Ingest on OPUS architecture
    OPUS software system has over 7 years of operational experience on HST
    Risk mitigated by using a proven architecture
    Time to deployment will be reduced
  Consistent with JWST concept for data processing and archive systems
    Same software will be used for both HST and JWST
Implementation Review 9
Highlights of Ingest Ops Concept (cont.)
[Diagram: proposed data flow with Ingest pipelines built on OPUS]
  Components: PACOR-A; HST Science Data Receipt; Core SDP; Calibration; Ingest pipeline (two instances); archive science catalog population; Archive Science Catalog; House-keeping Catalog; NSA; Data depot on EMC; MO disk
  Data items: pod file; uncalibrated FITS dataset; calibrated FITS dataset; metadata
  System labels: OPUS; DADS
Implementation Review 10
Highlights of Ingest Ops Concept (cont.)
  Reduces amount of data shuffling and conversions between different software systems
    E.g., current WFPC2 science data processing pipeline
[Diagram: current WFPC2 science data processing pipeline]
  Platforms: Solaris; Open VMS; Tru64 Unix
  Steps: OPUS Generic Conversion; CALWP2; FITS2GEIS; wFITS; stwfits; Pass files from OPUS to DADS; DADS Ingest
  Formats and media: Standard FITS; GEIS; VMS GEIS (files not readable on Tru64); MO disk
Implementation Review 11
Highlights of Ingest Ops Concept (cont.)
  Reduces amount of data shuffling and conversions between different software systems (cont.)
    Future WFPC2 science data processing pipeline
[Diagram: future WFPC2 science data processing pipeline]
  Platform: Solaris
  Steps: OPUS Generic Conversion; CALWP2; Ingest
  Formats and media: Standard FITS; Data depot on EMC
Implementation Review 12
Benefits of Ops Concept
  All operations on data handled in a single data flow.
    Create FITS file, populate header keyword values, extract metadata from keyword values, populate science component of archive catalog
  No duplication of development effort or functionality
  Consistent development, testing, and operations help ensure quality of archive catalog
  Facilitates easier delivery of header changes
    Keyword changes can be built, tested, and deployed within a single subsystem
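
One way to picture why a single subsystem eases keyword changes is a shared keyword definition table that drives both header population and catalog extraction. The sketch below is purely hypothetical: the `KeywordDef` class, the catalog column names, and the choice of keywords are assumptions made for illustration and do not come from the Ops Concept.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class KeywordDef:
    keyword: str                         # FITS header keyword
    catalog_column: str                  # hypothetical catalog column name
    convert: Callable[[object], object]  # converts the raw value for storage


# Single place where a keyword is added or changed (illustrative entries).
KEYWORDS = [
    KeywordDef("INSTRUME", "instrument", str),
    KeywordDef("EXPTIME", "exposure_time", float),
]


def build_header(raw: dict) -> dict:
    """Header population step: keep and convert only the defined keywords."""
    return {d.keyword: d.convert(raw[d.keyword]) for d in KEYWORDS}


def extract_catalog_row(header: dict) -> dict:
    """Catalog population step: map header keywords to catalog columns."""
    return {d.catalog_column: header[d.keyword] for d in KEYWORDS}
```

Under this arrangement, adding a keyword means adding one `KeywordDef` entry, and both steps pick it up in the same build.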
Implementation Review 13
Benefits of Ops Concept (cont.)
Decouples Ingest and Distribution Software
  Although both will utilize much of the same hardware, such as the Data depot, 15k, and database
Provides opportunity for consolidation of OPUS and DADS based operator tools
Provides opportunity to automate data validation
Implementation Review 14
Ingest Redesign Schedule
  Ingest Operational Concept complete and distributed on February 20, 2003.
  Requirement specification in progress
    To be completed by April 15, 2003
  The remainder of the schedule is very preliminary pending requirement scoping and build planning
    Design review: June 2003
    Phased development in OPUS builds between June 2003 and March 2004
    System tests: March – April 2004
    Deploy system: May 2004
Implementation Review 15
Summary of Data Systems software ports to Solaris
  Over the last few years, HST data processing systems have been ported from Open VMS to Solaris:
    OPUS infrastructure
      Ported to Unix for FUSE – February 1998
      Current version tested under Solaris
    HST Science Instrument pipeline applications
      Ported to Tru64 Unix – October 1999
      Testing on Solaris in progress, minor changes anticipated
    HST Engineering Data Processing pipelines
      Ported to Solaris – February 2003
Implementation Review 16
Summary of Data Systems software ports to Solaris (cont.)
  HST archive systems port from Open VMS to Solaris in progress:
    Data Distribution system
      completion expected in summer 2003
    Archive Ingest system
      completion expected in spring 2004
With completion of Archive Ingest System redesign project, all data systems will be running under Solaris.
No other major system enhancement projects expected through end of HST mission.