
Collaboration on Storage Services, June 29th, 2007

National Data Storage (NDS)

in the PIONIER*) network

Maciej Brzeźniak, Norbert Meyer,

Rafał Mikołajczak, Maciej Stroiński

*) PIONIER - Polish Optical Internet


• Outline:

• Project partners and status

• Goals of the project

• Components, infrastructure (including the PIONIER network) used for building the NDS system

• Main NDS features + Added values of NDS

• Overall NDS architecture

• Example NDS use cases + replication options in NDS

• Potential end-users and applications

• A few words about other storage-related projects


NDS Project Partners

4 academic computing centres + 4 universities in Poland:

Academic Computing Center CYFRONET AGH, Cracow

Academic Computing Center in Gdańsk

Częstochowa University of Technology

Marie Curie-Skłodowska University in Lublin

Poznań Supercomputing and Networking Center

Technical University of Białystok

Technical University of Łódź

Wrocław Supercomputing and Networking Center


National Data Storage – Goals

• Data storage system that:
• is distributed, no centralisation!

• has national ‘coverage’,

• is reliable and secure,

• provides broad-band access.

• Services:

• Backup/Archive services

• Application-level data storage:
• logical filesystem:

• single logical name space (visible from multiple access points)

• separate logical name spaces

• accessible through:

• SCP, (s)FTP, HTTP(S) protocols (see the sketch after this list)

• and other techniques
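As a concrete illustration of the access methods listed above, below is a minimal sketch of storing and retrieving a file through an SFTP-style access point. It is not the project's actual client: the host name, account and paths are hypothetical, and the standard paramiko SFTP library stands in for whatever client software NDS actually provides.

```python
# Minimal sketch: storing/retrieving a file via an SFTP-style NDS access point.
# Host name, account and remote paths are hypothetical placeholders.
import paramiko

ACCESS_POINT = "nds-access.example.pl"   # hypothetical NDS access point
USERNAME = "demo-user"                   # hypothetical account (key-based auth assumed)

def store_and_fetch(local_path: str, remote_path: str) -> None:
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(ACCESS_POINT, port=22, username=USERNAME)
    try:
        sftp = client.open_sftp()
        sftp.put(local_path, remote_path)             # store data into the logical file system
        sftp.get(remote_path, local_path + ".copy")   # retrieve it back
        sftp.close()
    finally:
        client.close()

if __name__ == "__main__":
    store_and_fetch("report.dat", "/nds/projects/demo/report.dat")
```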


National Data Storage components: existing and new

• Existing components:

• Network

• Storage Hardware

• Storage Management Software

• New components:

• NDS System Management Software



NDS existing components – PIONIER network physical links

Map legend: installed fibers, leased fibers, PIONIER nodes, links planned for 2007.


NDS existing components – PIONIER network logical links

Map legend: GEANT2 (10+10 Gb/s educational traffic, 5 Gb/s Internet), 2x10 Gb/s links (2 lambdas), CBDF 2x10 Gb/s (2 lambdas), 1 Gb/s links, Metropolitan Area Networks, Metropolitan Area Networks + Supercomputing Centres.


NDS components – Storage Hardware and Software

• Hardware:

• disk arrays, tape libraries

• starting from 1.2-2 PB (disks + tapes):
• 4x 50-200 TB of disks

• 4x 200-400 TB of tapes

• more in future

• Storage Area Networks

• file servers, application servers

• Software:

• Storage Management Systems

• Hierarchical Storage Management (HSM) systems,

• Backup/Archive systems

National Data Storage – main features

Target infrastructure:

- 4 main storage nodes
- 4 application nodes
- embedded in the PIONIER network

Storage nodes:
- Provide data storage services
- Compose the system core:
  - manage the data objects, file space, user accounts…
  - control network/hardware/software components

Application nodes:
- Provide additional services on top of the core services, e.g.:
  - searching based on meta-data (see the sketch after this list)
  - versioning
  - custom interfaces to data
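To make the storage-node/application-node split above more tangible, here is a rough sketch of the kind of meta-data record an application node might keep and a trivial search over it. All field names and the in-memory index are illustrative assumptions, not the actual NDS data model.

```python
# Illustrative sketch of meta-data kept by an application node and a simple
# search over it. Field names and the in-memory index are assumptions.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class DataObjectRecord:
    logical_path: str              # position in the single logical name space
    owner: str                     # user account that stored the object
    size_bytes: int
    checksum_sha1: str             # integrity digest recorded at store time
    replica_sites: List[str]       # storage nodes holding replicas
    tags: Dict[str, str] = field(default_factory=dict)  # user-supplied meta-data

class MetadataIndex:
    def __init__(self) -> None:
        self._records: List[DataObjectRecord] = []

    def add(self, record: DataObjectRecord) -> None:
        self._records.append(record)

    def search(self, **criteria: str) -> List[DataObjectRecord]:
        """Return records whose tags match all given key=value criteria."""
        return [r for r in self._records
                if all(r.tags.get(k) == v for k, v in criteria.items())]

# Usage example with made-up values
index = MetadataIndex()
index.add(DataObjectRecord("/nds/physics/run42.dat", "alice", 10**9,
                           "3f786850e387550fdab836ed7e6dc881de23001b",
                           ["poznan", "gdansk"],
                           {"project": "physics", "year": "2007"}))
print([r.logical_path for r in index.search(project="physics")])
```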


National Data Store – Added Value

• High level of dependability:

– Data & services availability:

• Geographical replication - replicas stored in multiple, distant sites

• Hardware/software components redundancy

• + High-end, by-design redundant components

• Backbone network links redundancy

• Fault-tolerance features in the NDS management software

– Decentralisation vs coherency of data and meta-data:

• Coherency kept by NDS management software

• Of course challenging; this is the ‘core’ of the research work, while the rest is mainly deployment work


National Data Store – Added Value

• High level of dependability (continued)

– Data confidentiality and integrity:

• Encryption:

– Where:

» On the way from the client to the system

» Optionally, before storing the client data into NDS

» Architecture to support both approaches

– How?

» Certified cryptographic solutions (software- and/or hardware-based) used for clients that require them

• Data integrity:

– Ensured by careful system design and security audits

– Evaluated e.g. by digest mechanisms: MD5, SHA1… (see the sketch below)
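A minimal sketch of the client-side variant described above: encrypting data before it leaves the client and recording digests that can later be used to check integrity. The third-party cryptography package (Fernet) and the file name are assumptions made for illustration; the certified software/hardware solutions mentioned on the slide would take this role in practice.

```python
# Sketch: encrypt client data before storing it into NDS and compute digests
# (MD5/SHA1) that can later be used to evaluate integrity. The 'cryptography'
# package and the file names are illustrative assumptions.
import hashlib
from cryptography.fernet import Fernet

def prepare_for_upload(path: str, key: bytes) -> dict:
    with open(path, "rb") as f:
        plaintext = f.read()

    ciphertext = Fernet(key).encrypt(plaintext)   # encryption before storing into NDS

    return {
        "payload": ciphertext,
        "md5": hashlib.md5(ciphertext).hexdigest(),    # digests recorded with the object
        "sha1": hashlib.sha1(ciphertext).hexdigest(),
    }

def verify_after_download(payload: bytes, expected_sha1: str) -> bool:
    # Integrity check on retrieval: recompute the digest and compare.
    return hashlib.sha1(payload).hexdigest() == expected_sha1

if __name__ == "__main__":
    key = Fernet.generate_key()          # in practice managed on the client side
    record = prepare_for_upload("report.dat", key)
    assert verify_after_download(record["payload"], record["sha1"])
```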


User interfaces

• Both ‘standard’ and ‘custom’ interfaces

– standard:

• B/A service,

• Application-level storage: (s)FTP, SCP…

– custom:

• B/A service with encryption + integrity checks

• application-level storage with encryption + integrity checks

• HTTP/HTTPS interface with meta-data support (see the sketch after this list);

– meta-data can be used later, e.g. for searching files

• Why various interfaces? – in order to:

– allow different users to exploit different features

– meet contradictory requirements… e.g. security vs simplicity
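To make the custom HTTP/HTTPS-with-meta-data interface above concrete, here is a hedged sketch of a client uploading a file together with meta-data that could later be used for searching. The URL, the X-NDS-Meta-* header convention and the token are hypothetical placeholders; the real NDS interface may look different.

```python
# Sketch of an HTTP(S) upload carrying meta-data for later searching.
# URL, header names and the token are hypothetical placeholders.
import requests

NDS_URL = "https://nds-app.example.pl/objects/projects/demo/report.dat"

def upload_with_metadata(local_path: str, metadata: dict, token: str) -> None:
    headers = {"Authorization": f"Bearer {token}"}
    # Hypothetical convention: meta-data passed as X-NDS-Meta-* headers.
    headers.update({f"X-NDS-Meta-{k}": v for k, v in metadata.items()})

    with open(local_path, "rb") as f:
        response = requests.put(NDS_URL, data=f, headers=headers, timeout=60)
    response.raise_for_status()

if __name__ == "__main__":
    upload_with_metadata("report.dat",
                         {"project": "physics", "year": "2007"},
                         token="example-token")
```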



NDS – overall architecture

Replication options

(0) No replication at all

– Compliant with standards (e.g. industry-accepted B/A clients)

– Data redundancy in the confines of a given node (RAIDs, redundant tape pools)

(1) ‘Off-line’

– Data originally stored into one site, then replicated to another site

– Suitable for standard access methods

– Issues:

• users get only meta-data information about the replicas created, e.g. by email or on the web-site

• Replication is not atomic with ‘store’ operation

(2) ‘On-line’

– Data replicas are created by the access point in parallel with the data storage process

– The assumed number of replicas is created atomically with the ‘store’ operation (see the sketch after this list)

– Limitations:

• Suitable for ‘custom’ access methods, incompatible with ‘standard’ ones

• Hard to implement, possible performance delays
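A simplified sketch of the ‘on-line’ replication idea described above: the access point writes the assumed number of replicas to several storage sites in parallel and reports the store operation as successful only if all of them confirm. The site names and the store_replica() call are placeholders; a real implementation would also handle time-outs and roll back partially written replicas.

```python
# Sketch of 'on-line' replication: the store operation succeeds only if the
# assumed number of replicas is written. Site names and store_replica() are
# illustrative placeholders, not the actual NDS interface.
from concurrent.futures import ThreadPoolExecutor

SITES = ["poznan", "gdansk", "krakow", "wroclaw"]   # hypothetical storage nodes
REPLICA_COUNT = 2                                    # assumed number of replicas

def store_replica(site: str, object_id: str, data: bytes) -> bool:
    """Placeholder for the real per-site store call."""
    print(f"storing {object_id} at {site} ({len(data)} bytes)")
    return True

def store_with_online_replication(object_id: str, data: bytes) -> None:
    targets = SITES[:REPLICA_COUNT]
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        results = list(pool.map(lambda s: store_replica(s, object_id, data), targets))

    if not all(results):
        # A real system would roll back partially written replicas here.
        raise RuntimeError(f"store of {object_id} failed: not all replicas confirmed")

if __name__ == "__main__":
    store_with_online_replication("demo/report.dat", b"example payload")
```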



Example use case – standard B/A client, off-line replication

Store/retrieve data to/from NDS

Features:

- no system-side data replication
- load balancing on a per-session basis possible
- BUT compliant with the standard
- NOTE that replication can be done on the client side (manually or automatically)



Example use case – advanced B/A client, on-line replication

Store/retrieve data to/from NDS

Features:

- on-line data replication!
- dynamic load-balancing possible
- BUT not compatible with standard B/A clients


Potential end-users of NDS

• Educational institutions and projects:

– Backup/Archive services for universities

– Cross-centre backup copies/recovery for academic computing centres

– Storage space / file sharing facilities for:

• scientific/educational projects

• national and EU R&D projects

• Government offices and agencies:

– Backup/Archive for government agencies and organisations

• E.g. police cameras, metropolitan CCTV systems, customs (Zoll) agencies, etc.

– Secure storage/archival of financial, medical … data

• Such data are confidential ‘by definition’

• System certification for this kind of data would be required; it is out of the scope of the project, but planned for the future

• Other end-users:

– Museums, digital libraries…

– Digitalisation (scanning of old books, paintings…)


Summary – National Data Store

• User point of view

• Reliable, Secure and Efficient (high performance, broadband access)

• Flexible – many possible interfaces, some other options to choose

• Can be an extra functionality added on top of the network links

• Service Provider point of view

• scalable system
• (cost-)efficient solution, thanks to:

– the ‘effect of scale’:
» per-TB costs are lower for large-scale systems than for small ones

– using our own network links:
» no need to pay anyone else for the network

– optimal usage of resources:

» HSM systems (i.e. disks + tapes + mgmt) used when possible instead of pure disk-based storage, allowing the use of economical media types

» on-demand reservation of network channels (instead of persistent links)


A bit off-topic slide – other storage-related projects in PSNC

• Currently running storage-related projects:

• CoreGrid (NoE project):

– WP2 (CoreGRID Institute on Knowledge and Data Management),

– Task 2.1: Distributed Storage Management

– Partners: FORTH, Crete, Greece (prof. Angelos Bilas group) and SZTAKI, Hungary, UCY Cyprus (Zsolt Nemeth)

• Already finished projects:

• Secure data storage for Digital Signature System (National R&D project)

– Data acquired from Oracle Database and encrypted BEFORE going into the backup system (on the client side)

– Hardware-based appliance secures the transmission/storage

– Encrypted data put to a regular Backup/Archive system

• Evaluation of the performance of iSCSI and iFCP protocols (published at a TERENA conference)

• Automated Backup System – used internally in PSNC

• Planned projects:

– Evaluation of the cluster-based storage approach (e.g. in the NDS environment)
– Perhaps a common EU project with FORTH…



Thank YOU!

Contact:

Maciej Brzeźniak, [email protected]
Norbert Meyer, [email protected]


‘Backup’ slides


End user example – Police Department in Poznań: Backup/Archive service for the City Video Monitoring System (CCTV)

Cameras in Poznań:
2004 – 70 cameras
2005 – 85 cameras
2006 – 165 cameras
2007 – 200 cameras
…

2 TB/day, 60 TB/month

Data must be stored for at least a month for security purposes and are retrieved for investigations when a crime happens.

Tape media are ideal for long-term storage, so we provide a B/A service to the police department using our B/A system and tape libraries (see the capacity sketch below).
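A back-of-the-envelope check of the numbers on this slide: roughly 2 TB of recordings per day with a one-month retention gives the quoted ~60 TB that must be kept on tape at any time. The per-camera figure derived below is only an implied average, not a number stated on the slide.

```python
# Back-of-the-envelope capacity check for the CCTV backup/archive service.
# Only the 2 TB/day, one-month retention and 200-camera figures come from the
# slide; the per-camera average is derived and purely illustrative.
DAILY_VOLUME_TB = 2.0       # stated on the slide
RETENTION_DAYS = 30         # "stored at least for a month"
CAMERAS_2007 = 200          # stated on the slide

retained_tb = DAILY_VOLUME_TB * RETENTION_DAYS
per_camera_gb_per_day = DAILY_VOLUME_TB * 1024 / CAMERAS_2007

print(f"Retained volume: {retained_tb:.0f} TB (slide says 60 TB/month)")
print(f"Implied average per camera: {per_camera_gb_per_day:.1f} GB/day")
```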


Next step – usage of NDS to provide the B/A service for CCTV at the national scale: police sites in Łódź, Częstochowa and Białystok keep temporary storage only, while the NDS storage node in Poznań provides long-term storage (archiving) and backup copies.