NGAS – The Next Generation Archive System

Jens Knudstrup

NGAS: The Next Generation Archive System

Motivation

Motivation for NGAS:

- Handle huge data streams in real time.

- Reduce operational costs (man-power).

- Decrease expenses in general.

- Provide online and offline processing capabilities.

- Ease integration of archive facility with external clients/applications.

- Provide a common concept for the online archive and the long-term storage facilities (NGAS ≈ OLAS + ASTO + Jukebox SW + more). Note, no plan to replace OLAS for now.

- Simplify and unify the overall infrastructure of the archive system.

- Increase data security.

Main Objectives

Main Objectives of NGAS:

Provide an archive facility with services for handling all stages in the life-time of data files:

- Archiving files (+ on-the-fly checking and processing).

- Retrieving & on-the-fly processing of files.

- Ensuring data consistency.

- Providing services for managing data.

- (Executing complex, parallel data processing - TBD)

In addition, to provide a system:

- Which is adaptable to specific contexts.

- With high performance and scalability.

NGAS: History

History of NGAS:

- April 2001: Project started.

- Mid June 2001: First operational prototype.

- June 2001: Review + approval of design/concept.

- Beginning July 2001: Installation/commissioning at La Silla (2.2m/WFI).

- Mid July 2001: Entered operation at La Silla.

- August 2001: Started operation of Garching NGAS Cluster.

- February 2001: Upgrade from Suse to RedHat Linux.

- August 2003: Installation/commissioning at Paranal (VLTI).

- January 2004: Installation of second archive system for 3.6m/LS.

- March 2004: First integration of NGAS on new HW (SATA).

- September 2004: First tests using NGAS together with RAID5 Arrays.

- September 2004: Archiving of HARPS pipeline products.

- December 2004: Archiving of WFCAM frames from Cambridge/UK.

NGAS: Components

Main Components of the NGAS Project:

1. NGAS SW – NG/AMS (Next Generation Archive Management System).

2. NGAS WEB Interfaces.

3. HW – (low cost) PCs with removable ATA disks.

4. NGAS OS (Linux).

5. NGAS Utilities.

6. NGAS Installation and Configuration Tools.

NG/AMS: Basic Concepts

Basic Concepts of the NGAS SW (NG/AMS):

• NG/AMS is a platform/framework providing basic services.

• No information is hard-coded to support specific types of data – NG/AMS ‘does not know’ what e.g. a FITS file is.

• No information is hard-coded to support specific HW configurations.

• The specific behavior and the specific knowledge have to be added to the NGAS system – customizable.

• Based on standard protocols and formats wherever possible – can be used as a building block.

• Simple: advanced features can be added in front-end applications, which give clients a different view of the data and provide specific services.

NG/AMS: Main Features/1

Main Features of NG/AMS (1):

• Multi-threaded server.

• Standard communication protocol (HTTP) + HTTP Authentication.

• Data file archiving via Push and Pull Techniques.

• Subscription Service including filter mechanism.

• DB synchronization (DB Snapshot Feature).

• Easy adaptation to different kinds of DBMSs (ANSI SQL Engine/DB Driver).

• Flexible/adaptable due to usage of 10 different kinds of plug-ins.

• Many configurable parameters.

• XML information exchange.

• Email Notification Service.

NG/AMS: Main Features/2

Main Features of NG/AMS (2):

• Advanced logging service (Verbose, Local Log File, Syslog).

• Background Data Consistency Checking.

• Operation in Cluster Mode.

• Transparent data retrieval & on-the-fly processing.

• APIs in ANSI-C and Python + two client applications based on these.

• Archive Client for secure and simple, remote data file archiving.

• Many commands to interact with and control the system.

• Portable.

• Unit/Functional Tests.

NG/AMS: Server

[Figure: The NG/AMS Server runs on the NGAS Host with a Main Disks Array, a Replication Disk Array and the NG/AMS Configuration; it logs to the UNIX Syslog, a Log File and Stdout, and connects to the NGAS DB on the DBMS Host. Data Provider Hosts archive data via Archive Push Requests (HTTP POST Request) or Archive Pull Requests. Data Requestor Hosts and Info Requestor Hosts query the server. Data is delivered via HTTP POST Request to Subscriber Clients on Data Subscriber Hosts and to NG/AMS Servers on NGAS Subscriber Hosts.]
NG/AMS: Storage Media Infrastructure

Basic Infrastructure of Storage Media:

NG/AMS: XML Information Exchange

Interprocess Data Exchange:

- Most information exchanged between NG/AMS Servers, and between the NG/AMS Server and clients, is based on XML.

- Example, NgasDiskInfo Document (NG/AMS Status XML Document):

<?xml version="1.0" ?>
<NgamsStatus>
  <Status Date="2003-01-02T08:40:23.350"
          HostId="acngast1"
          Message="Disk status file"
          Version="v2.0-Beta2/2002-12-04T09:22:53"/>
  <DiskStatus Archive="ESO-ARCHIVE"
              AvailableMb="32300"
              BytesStored="8709834319"
              Checksum=""
              Completed="0"
              CompletionDate=""
              DiskId="IC35L040AVER07-0-SXPTX093675"
              InstallationDate="2002-11-25T09:48:25.000"
              LogicalName="FITS-M-000001"
              Manufacturer="IBM"
              NumberOfFiles="163"
              TotalDiskWriteTime="905.324898006"
              Type="MAGNETIC DISK/ATA"/>
</NgamsStatus>
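Because the status document is plain XML, a client can read it with any standard parser. The following is a minimal sketch in Python using the standard library; the function name summarize_disk and the reading of Completed="1" as "disk completed" are assumptions made for illustration, not part of the NG/AMS API.

```python
import xml.etree.ElementTree as ET

# The status document from the slide, with plain ASCII quotes.
STATUS_XML = """<?xml version="1.0" ?>
<NgamsStatus>
  <Status Date="2003-01-02T08:40:23.350" HostId="acngast1"
          Message="Disk status file"
          Version="v2.0-Beta2/2002-12-04T09:22:53"/>
  <DiskStatus Archive="ESO-ARCHIVE" AvailableMb="32300"
              BytesStored="8709834319" Checksum="" Completed="0"
              CompletionDate=""
              DiskId="IC35L040AVER07-0-SXPTX093675"
              InstallationDate="2002-11-25T09:48:25.000"
              LogicalName="FITS-M-000001" Manufacturer="IBM"
              NumberOfFiles="163"
              TotalDiskWriteTime="905.324898006"
              Type="MAGNETIC DISK/ATA"/>
</NgamsStatus>"""


def summarize_disk(xml_text):
    """Return a small dict describing the first DiskStatus element."""
    root = ET.fromstring(xml_text)
    disk = root.find("DiskStatus")
    return {
        "host": root.find("Status").get("HostId"),
        "disk_id": disk.get("DiskId"),
        "logical_name": disk.get("LogicalName"),
        "files": int(disk.get("NumberOfFiles")),
        "bytes_stored": int(disk.get("BytesStored")),
        # Assumption: "1" marks a completed (full) disk, "0" an open one.
        "completed": disk.get("Completed") == "1",
    }
```

Calling summarize_disk(STATUS_XML) yields the host ID, disk ID, logical name, file count and byte count as native Python values.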

NG/AMS: HTTP Command Interface

ngasmgr@acngast1:/opsw/NGAS/ngams> telnet acngast1 7777
Trying 134.171.21.30...
Connected to acngast1.
Escape character is '^]'.
GET STATUS
HTTP/1.0 200 OK
<?xml version="1.0" ?>
<!DOCTYPE NgamsStatus SYSTEM "http://acngast1.hq.eso.org:7777/RETRIEVE?internal=ngamsStatus.dtd">
<NgamsStatus>
  <Status Date="2002-12-23T14:59:42.724"
          HostId="acngast1"
          Message="Successfully handled command STATUS"
          State="ONLINE"
          Status="SUCCESS"
          SubState="IDLE"
          Version="v2.0-Beta2/2002-12-04T09:22:53"/>
</NgamsStatus>
Connection closed by foreign host.

NG/AMS: DB Synchronization

DB Synchronization:

• NGAS DBs replicated from Paranal/La Silla to Garching (Unidirectional).

• Synchronization between DBs of the various NGAS sites also carried out by NGAS.

• NG/AMS maintains a snapshot (DBM) on each disk with info about the files stored on it.

• Local DB synchronized with this info when the disk reappears on a site.

• DB Snapshot can be used as a table of contents for the disk.
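The snapshot idea can be sketched as follows. This is an illustration only: it uses Python's standard dbm.dumb module and JSON records as stand-ins, since the slides do not specify the DBM flavour or record layout NG/AMS uses, and the local DB is modelled as a plain dict. The function names are hypothetical.

```python
import dbm.dumb
import json


def write_snapshot(snapshot_path, file_records):
    """Store one JSON record per file in a DBM file kept on the disk itself."""
    with dbm.dumb.open(snapshot_path, "c") as snap:
        for file_id, record in file_records.items():
            snap[file_id] = json.dumps(record)


def sync_from_snapshot(snapshot_path, local_db):
    """Copy records missing from the local DB (a dict here); return their IDs."""
    added = []
    with dbm.dumb.open(snapshot_path, "r") as snap:
        for key in snap.keys():
            file_id = key.decode("utf-8")
            if file_id not in local_db:
                local_db[file_id] = json.loads(snap[key])
                added.append(file_id)
    return sorted(added)
```

When a disk reappears at a site, calling sync_from_snapshot on its snapshot brings the local DB up to date; listing the snapshot keys gives the table of contents of the disk.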

[Figure: The NGAS DBs at La Silla (LS) and Paranal (PAR) are replicated to the NGAS DB in Garching through DBMS Synchronization (Sybase); DB Snapshots on the disks are used for the NG/AMS DB Synchronization.]

NG/AMS: Plug-Ins

NG/AMS Plug-Ins:

- Ten different kinds of plug-ins provided. These make it possible to adapt the system to different kinds of hardware and different types of data – nothing is hard-coded:

1. Online Plug-In.

2. Offline Plug-In.

3. Data Archiving Plug-In.

4. Checksum Plug-In.

5. Data Processing Plug-In.

6. Registration Plug-In.

7. Label Printer Plug-In.

8. Filter Plug-In.

9. Suspension Plug-In.

10. Wake-Up Plug-In.

- Standard plug-ins are delivered with the system. It is possible to replace these or add new plug-ins when needed.

- The plug-ins delivered with a distribution of NGAS should be viewed as belonging to the core of the system when it comes to testing.

- A normal user does not need to know about the plug-ins used.

NG/AMS: Plug-Ins

Data Archiving Plug-In – Basic Functioning:

[Figure: Data flow between the Data File, the NG/AMS Server, the Data Archiving Plug-In (DAPI), the NGAS DB and the Target Storage Set. The Target Storage Set consists of a Main Disk and a Replication Disk; the diagram shows Storage Areas, a Staging Area, a Bad Files Area and NgasDiskInfo files on these disks.]

1. Archive Request.

2. Reception in Staging Area.

3. DAPI Invocation.

4. Data Checking/Processing, Parameter Extraction.

5. DAPI Return Status.

6. Storage of Main File in Final Location.

7. DB Update, Main File.

8. Replication of File.

9. DB Update, Replication File.
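The nine steps can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the NG/AMS implementation: the plug-in signature, the use of CRC-32 as checksum, the dict standing in for the NGAS DB and the directory layout passed in dirs are all hypothetical.

```python
import os
import shutil
import zlib


def example_dapi(staging_path):
    """Hypothetical Data Archiving Plug-In: check the file, extract parameters."""
    with open(staging_path, "rb") as fh:
        data = fh.read()
    if not data:
        raise ValueError("empty file")            # 4. data checking failed
    return {"file_id": os.path.basename(staging_path),
            "size": len(data),
            "checksum": zlib.crc32(data)}         # 4. parameter extraction


def handle_archive_request(data, name, dirs, db, dapi=example_dapi):
    """Handle one Archive Request (step 1); dirs maps area names to directories."""
    staging_path = os.path.join(dirs["staging"], name)
    with open(staging_path, "wb") as fh:          # 2. reception in Staging Area
        fh.write(data)
    try:
        info = dapi(staging_path)                 # 3-5. DAPI invocation + status
    except ValueError:
        # Rejected files end up in the Bad Files Area.
        shutil.move(staging_path, os.path.join(dirs["bad"], name))
        return None
    main_path = os.path.join(dirs["main"], name)
    shutil.move(staging_path, main_path)          # 6. main file, final location
    db[("main", info["file_id"])] = info          # 7. DB update, main file
    shutil.copy2(main_path, os.path.join(dirs["replication"], name))  # 8.
    db[("replication", info["file_id"])] = info   # 9. DB update, replication file
    return info
```

Since the server only calls the plug-in and acts on its return status, support for a new data type amounts to supplying a different dapi function.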

NG/AMS: XML Configuration

NG/AMS Configuration (1):

• About 110 different configurable parameters.

• Configuration can be loaded from an XML document, from the DB, or from a combination of these.

• Possible to re-use DB based parameters to compose specific configurations (easier to handle many, slightly different installations).

• Main groups of configurable parameters (1):

- Basic Parameters: Port number, simulation mode, proxy mode, root mount point, …

- Plug-Ins: The various plug-ins the system should use e.g. to handle data of a specific type.

- DB Connection: The DB connection parameters.

- Permissions: Archive, Retrieve, Processing, Remove Requests allowed.

- Archive Handling Parameters: Parameters for handling Archive Requests.

- Accepted Data Types: Types of data (mime-types) the system can handle.

NG/AMS: XML Configuration

NG/AMS Configuration (2):

• Main groups of configurable parameters (2):

- Storage Sets: The disk configuration.

- Streams: Defines how the different kinds of data should be streamed onto the Storage Sets.

- Available Processing Capabilities: Defines the types of data that can be processed and which Data Processing Plug-Ins to use.

- Data Check/Janitor Thread Configuration: Parameters to tune the Data Checking and Janitor Threads.

- Logging Parameters: E.g. name of log files + intensity to apply when logging.

- Email Notification Parameters: Recipients of the various types of Email Notification Messages.

- Host Suspension Parameters: Parameters for suspending a host + for waking up suspended hosts.

- Subscription Parameters: Parameters to define if a server should subscribe for data.

- Authorization Parameters: Defines the known users and their access code.
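Loading a configuration from an XML document combined with DB based parameters could look roughly like the sketch below. The element and attribute names are illustrative and do not reproduce the real NG/AMS configuration schema; the precedence (XML values override DB values) is likewise an assumption, as the slides do not state it.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration document; names are for illustration only.
CONFIG_XML = """<Config>
  <Server PortNo="7777" Simulation="0"/>
  <Log LocalLogFile="/tmp/ngas.log"/>
</Config>"""


def load_config(xml_text, db_params=None):
    """Flatten the XML into 'Element.Attribute' keys, layered over DB parameters."""
    config = dict(db_params or {})        # DB based parameters form the base
    for element in ET.fromstring(xml_text):
        for attr, value in element.attrib.items():
            # Assumption: a value given in the XML document wins over the DB.
            config[f"{element.tag}.{attr}"] = value
    return config
```

With such a scheme, many slightly different installations can share one set of DB parameters and override only the few values that differ.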

NG/AMS: Data Consistency Checking

Data Consistency Checking:

- It is necessary to constantly monitor the condition of the data in the archive.

- Data Consistency Checking – Thread running in background.

- Possible to tune the amount of resources occupied by the service.

- A check run can be scheduled to run periodically via the configuration.

- Checksum check, file availability, unregistered files on storage media.

- A check sub-thread is started per disk (max. number configurable).

- Info about files on the system dumped once in a DBM, retrieved file by file during checking.

- Possible to resume a check from where the previous one was interrupted.

- Email Notification sent to subscribers in case problems are found, e.g.:

Subject: NGAS-arcus2-7778: DATA INCONSISTENCY(IES) FOUND
Date: Fri, 25 Jan 2002 01:06:26 +0100 (MET)
From: [email protected]

Message:
DATA INCONSISTENY(IES) FOUND IN DATA HOLDING:
Date: 2002-02-12T15:32:05.424
NGAS Host: arcus2
Inconsistencies: 1

Problem Description                   File ID                        Version
----------------------------------------------------------------------------------
ERROR: Inconsistent checksum found    TEST.2001-05-08T15:25:00.123   3
----------------------------------------------------------------------------------
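The three checks named above (checksum, file availability, unregistered files) can be illustrated for a single disk as follows. This is a sketch under assumptions: CRC-32 stands in for whatever the configured Checksum Plug-In computes, and the registered files are passed in as a dict rather than read from the NGAS DB or a DBM.

```python
import os
import zlib


def file_crc32(path, chunk_size=1 << 20):
    """CRC-32 of a file, read in chunks to keep memory use bounded."""
    crc = 0
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return crc


def check_disk(disk_dir, registered):
    """Compare one disk with `registered` ({file name: expected CRC-32})."""
    problems = []
    on_disk = set(os.listdir(disk_dir))
    for name, expected in sorted(registered.items()):
        if name not in on_disk:
            problems.append((name, "file not available"))
        elif file_crc32(os.path.join(disk_dir, name)) != expected:
            problems.append((name, "inconsistent checksum"))
    for name in sorted(on_disk - set(registered)):
        problems.append((name, "unregistered file"))
    return problems
```

Running one such check per disk in its own thread, with a configurable maximum number of threads, corresponds to the sub-thread scheme described in the list.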

NG/AMS: Operation in Cluster Mode/1

Example:

[Figure: NGAS cluster with an NGAS Super Node (Proxy Mode) and NGAS Main Nodes 1 to 3. Each Main Node is connected through Network Switches to a group of NGAS Sub-Nodes on a Private Network (10.X.X.X), and the Main Nodes are linked by the Cluster Back-Bone Network. Numbered arrows (1 to 6) trace a Retrieve Request through the cluster.]

NG/AMS: Operation in Cluster Mode/2

Example:

[Figure: NGAS cluster in which an NGAS Main Node and a set of NGAS Nodes are connected through Network Switches. Numbered arrows (1 to 4) trace a Retrieve Request through the cluster.]

Garching NGAS Cluster

NGAS Cluster

NG/AMS: Data Processing

Data Processing at Retrieval:

• Simple processing supported when retrieving files.

• Possible to request the system to apply a Processing Plug-In on the data and to send back the result of the plug-in rather than the data itself.

• Processing performed on the sub-node hosting the data.

• Possible for clients to use the NGAS Cluster as a ‘number cruncher’ to carry out parallel data processing in a simple manner.

• Reduces the amount of data to be transferred to the client. I.e., a floating point number may be returned rather than the entire data file.

• Can be extended by providing new Data Processing Plug-Ins for specific contexts.

• Could be used to integrate NGAS with the AVO or other archive services.
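The idea of returning a small result instead of the data file can be sketched as below. Everything here is a stand-in: the cluster is a dict of node names to in-memory files, threads play the role of sub-nodes working in parallel, and mean_plugin is a hypothetical Data Processing Plug-In.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def mean_plugin(data):
    """Hypothetical Data Processing Plug-In: reduce a file to one float."""
    values = [float(token) for token in data.decode("ascii").split()]
    return mean(values)


def retrieve_processed(cluster, file_ids, plugin=mean_plugin):
    """Apply `plugin` on the node hosting each file; return only the results.

    `cluster` maps a node name to {file ID: file content as bytes}.
    """
    def run(file_id):
        # Locate the sub-node that hosts the file and process the data there.
        node = next(name for name, files in cluster.items() if file_id in files)
        return file_id, (node, plugin(cluster[node][file_id]))

    with ThreadPoolExecutor() as pool:    # threads stand in for sub-nodes
        return dict(pool.map(run, file_ids))
```

Only the node name and one floating point number per file travel back to the client, which is the data reduction the slide refers to.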

NG/AMS: APIs

NG/AMS APIs + Clients:

• Two APIs provided, implemented in C (C library) and Python (class).

• Facilitates implementation of client applications communicating with NGAS, e.g. to retrieve data files.

• Two command line utilities are provided, based on the C and Python API, which can be used to interact with an NG/AMS Server.

• A standalone Archive Client is provided, based on the C-API:

- Independent of any DBMS.

- Can be used to archive files from any remote host which can access the NGAS Archive via HTTP.

- An attempt to archive a file is retried until success is returned or the file is classified as bad by the remote NGAS system.

- Files are not cleaned up before cross-checking that they are really in the remote NGAS Archive (CHECKFILE Command).

- First applications: Archiving of HARPS pipeline products and WFCAM files from Cambridge/UK.
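The retry and cross-check behaviour of the Archive Client can be sketched as follows. The callables send, check_file and remove are injected stand-ins for the HTTP requests and file handling of a real client, the status strings are hypothetical, and the bound max_attempts is added only to keep the example finite.

```python
import time


def archive_with_retry(path, send, check_file, remove, max_attempts=5, delay=0.0):
    """Archive `path`; delete the local copy only after the remote check passes.

    send(path) returns "SUCCESS", "BAD" or "FAILURE"; check_file(path) returns
    True when the remote archive confirms that it holds the file.
    """
    for _ in range(max_attempts):
        status = send(path)
        if status == "BAD":
            return "bad"                  # remote system classified the file as bad
        if status == "SUCCESS":
            if check_file(path):          # cross-check before cleaning up
                remove(path)
                return "archived"
            return "unverified"           # keep the local copy
        time.sleep(delay)                 # transient failure: try again
    return "pending"
```

The local file is removed in exactly one case: the archive request succeeded and the subsequent check confirmed the file in the remote archive.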

NG/AMS Client Applications

NG/AMS Archive Client

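[Figure: On the Data Provider Host, the NG/AMS Archive Client takes files from the Archive Queue and sends Archive Requests + Commands to the NG/AMS Server of the Remote NGAS System, which has its own NGAS DB. The client also maintains an Archived Files Area, a Bad Files Area (files classified as BAD) and a Log Files Area (Log Info, Log Rotation Control).]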

NG/AMS: Server Commands

NG/AMS Server Commands (HTTP Protocol):

- Commands issued as URLs: http://<Host>:<Port>/<Command>[?<Par=Val>[&<Par=Val>]]

- Commands:

• ARCHIVE: Archive data with Archive Push or Archive Pull Technique.

• CHECKFILE: Execute an explicit file check of the given file.

• CLONE: Clone an entire disk or individual files.

• CONFIG: Configure an online system.

• DISCARD: Force removal of file from disk and/or DB independent of number of copies.

• EXIT: Make the NG/AMS Server exit.

• INIT: Re-initialize the NG/AMS Server.

• LABEL: Print out disk labels.

• OFFLINE: Bring server to Offline State.

• ONLINE: Bring server Online.

• REGISTER: Register a file or a set of files already stored on an ‘NGAS Disk’.

• REMDISK: Remove a disk from the archive (only allowed if at least 3 copies of each file are available).

• REMFILE: Remove a file from the archive.

• RETRIEVE: Retrieve a file, transparently, from the archive.

• STATUS: Query status about the server or another component in the NGAS system/cluster.

• SUBSCRIBE: Subscribe to new data or a set of data.

• UNSUBSCRIBE: Unsubscribe a previously created subscription.
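Since commands are plain URLs, a client needs little more than a URL builder. The helper below is a minimal sketch following the pattern given above; the function name command_url is hypothetical, and the only parameter used in the example (internal) is the one visible in the DTD reference of the STATUS response shown earlier.

```python
from urllib.parse import urlencode


def command_url(host, port, command, **params):
    """Build http://<Host>:<Port>/<Command>[?<Par=Val>[&<Par=Val>]]."""
    url = f"http://{host}:{port}/{command.upper()}"
    if params:
        url += "?" + urlencode(params)    # URL-encodes and joins Par=Val pairs
    return url


status_url = command_url("acngast1", 7777, "STATUS")
dtd_url = command_url("acngast1", 7777, "RETRIEVE", internal="ngamsStatus.dtd")
```

Passing such a URL to urllib.request.urlopen would issue the command against a running server; that step is left out here because it needs a live NG/AMS instance.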

Unit/Functional Tests - Features

Unit/Functional Tests:

- Extensive set of automatic tests provided, consisting of:

- 30 Test Suites.

- ~130 Test Cases.

- Tests portable (platform/HW independent).

- Testing the business logic of the system and correct functioning (simulation mode).

- Need to add more Test Cases for testing correct and consistent behavior under abnormal conditions and stress tests.

- Needs to be enhanced with ~200 Test Cases before next release.

- Possible to generate Test Plan from test code (next slide - overhaul ongoing).

Unit/Functional Tests - Test Plan

Example:

NGAS WEB Interfaces

NGAS WEB Interfaces:

• WEB Interfaces provided to assist operators in querying the status of the system and to search for various components (data files, disks, machines).

• Used at all sites by the operators (Garching, Paranal, La Silla).

• Based on Zope, a WEB management system providing editing via a WEB browser (http://www.zope.org).

• Local Zope WEB Servers available on each site.

• Tools provided to list disks, find specific files, and get an overview of the nodes and their status.

• Also the so-called Operator’s Log Book is provided. The operators use this to log all actions carried out.

• Used by the operators at Paranal/La Silla to monitor the online archiving activities.

• Services missing for interacting with the system. Only possible to control the disk label printing for now.

• An enhancement is planned in the near future.

NGAS System/OS

NGAS OS Distribution:

- Started on a Suse Linux distribution and migrated to RedHat Linux (ESO standardization).

- OS distribution prepared/managed by OTS-SOS.

- Support for single-processor and multi-processor configurations.

- Support for old HW (PATA) and new HW (SATA).

- Limited installation, many packages removed to reduce the size of system.

- Special packages needed by NGAS: Python, Sybase interface, Zope, … - installed by the NGAS Installation Tool.

- Special driver SW needed for the 3ware controller.

- Zope WEB server running on some nodes (optional).

- 3ware disk controller WEB server running on every host.

- Possibility to back-up/restore complete system by means of the Mondo/Mindi tool kit (from a single CDROM) in 10 minutes.

- From July 2004 NGAS OS platform installed with kickstart installation script.

NGAS HW

NGAS HW (1):

- Started with 8 slot parallel ATA systems.

- 8 x 80 GB storage capacity per node (640 GB/node, ~1.2 TB compressed).

- Since March 2004 a 24 slot serial ATA system in operation (up to 24 * 400 GB = 9.6 TB/node, 19.2 TB compressed).

- Reduces price per GB.

- More robust HW, among other things due to serial ATA (cleaner cabling).

- Disk handling easier, more robust disk frames.

- Overall HW stability (hopefully) better and less intervention needed (TBC).

- Amount of data/CPU should be balanced to be able to process the data in a limited time.

- TBD when to use new HW in operation at observatory sites.

- Investigating usage of RAID5 rather than JBOD disks.
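The capacity figures above can be checked with a few lines of arithmetic. The sketch uses decimal units (1 TB = 1000 GB), as the slide does; the interpretation of the "compressed" figures as uncompressed data volume that fits on a node is an assumption.

```python
def node_capacity_gb(slots, disk_gb):
    """Raw capacity of one node in GB (decimal units, as on the slide)."""
    return slots * disk_gb


pata_gb = node_capacity_gb(8, 80)       # parallel ATA node: 640 GB
sata_gb = node_capacity_gb(24, 400)     # serial ATA node: 9600 GB = 9.6 TB

# Ratio between the quoted "compressed" figure and the raw capacity.
pata_factor = 1200 / pata_gb            # ~1.2 TB quoted for 640 GB
sata_factor = 19200 / sata_gb           # 19.2 TB quoted for 9.6 TB
```

Both generations thus assume a compression factor of about 2, and one serial ATA node holds 15 times the raw capacity of a parallel ATA node.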

NGAS HW

NGAS HW (2):

NGAS Utilities

NGAS Operator’s Utilities/Installation Utilities:

- Small module provided (NGAS Utilities) with utilities for the daily work of the operators:

- Limited time invested in this so far, however essential tools for the operation provided (e.g. Clone Verification Tool, Check File List Tool, Clone File List Tool, …).

- The function of many of these tools should be taken over by the NGAS WEB Interfaces when these have been enhanced.

- The module NGAS Installation Tools provides some utilities to install and check the system:

- Tool provided to build ‘NGAS layer’ on top of the ‘basic’ NGAS Linux distribution.

- Functionality still to be implemented.

NGAS Infrastructure

Present ESO NGAS Infrastructure:

[Figure: Overview of the ESO NGAS installations, with the labels LS, PAR, GAR and INS. The diagram shows NGAS DBs, an Archive Unit, a Buffering Unit, an Archive Handling Unit, a Cluster Unit, Disk Sets, a Replication Archive and External Archive Clients.]

NGAS: Future Plans

(Near) Future Plans for NGAS:

• Received detailed requirements from archive operations.

• Enhance NGAS WEB Management Interfaces.

• Enhancement of services for operation in cluster (extended proxy mode).

• Enhancement of installation utilities.

• Enhancement of unit tests (simulation of archive cluster operation).

• Implement load balancing/archive cluster operation for high availability/high data rates (VST/ΩCam: up to 300 GB/night, VISTA/VistaCAM up to 1 TB/night - TBC).

• Support for advanced data processing, utilizing an NGAS Cluster as a parallel processing engine (specify complex recipes, which execute parallel data processing) – will be analyzed in the near future.

• Support for the Astrophysical Virtual Observatory/GRID?

Status - December 2004

Status of NGAS Project December 2004:

- In operation since July 2001.

- Used heavily on a daily basis by archive operators in Garching.

- Data archived daily at La Silla, Paranal and at ESO HQ.

- Data archived directly into NGAS Archive in Garching from Paranal and Cambridge/WFCAM.

- Some statistics:

- Total number of nodes: ~25.

- Total number of disks in use: ~260.

- Total number of files in NGAS Archive: ~1,500,000.

- Amount of compressed data in NGAS Archive: ~27 TB.

- Amount of uncompressed data in NGAS Archive: ~45 TB.

- Maximum throughput per node (archiving): ~400 GB/24 hours (including compression).

- Major Issues to Address:

- Need to invest more resources in implementing automatic tests, in particular for testing robustness and handling of abnormal conditions.

- Need to invest resources in implementing an enhanced user interface; the current one is not very user-friendly.

- Need to update the design document to reflect the present status of the system (not updated since it was written in spring 2001).

- Should investigate improved ways of ensuring data consistency and means for recovering lost data.
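A few derived figures follow directly from the statistics above. The sketch below only restates the quoted numbers in other units, using decimal prefixes (1 TB = 1,000,000 MB); the helper names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def throughput_mb_per_s(gb_per_day):
    """Convert GB per 24 hours into MB/s (decimal units)."""
    return gb_per_day * 1000 / SECONDS_PER_DAY


def compression_ratio(uncompressed_tb, compressed_tb):
    """Uncompressed volume divided by compressed volume."""
    return uncompressed_tb / compressed_tb


def mean_file_size_mb(total_tb, n_files):
    """Average file size in MB for a total volume given in TB."""
    return total_tb * 1_000_000 / n_files


node_rate = throughput_mb_per_s(400)            # sustained rate per node
ratio = compression_ratio(45, 27)               # archive-wide compression
avg_size = mean_file_size_mb(45, 1_500_000)     # uncompressed, per file
```

With the quoted values this gives roughly 4.6 MB/s per node for archiving including compression, an overall compression ratio of about 1.7, and an average uncompressed file size of about 30 MB.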