The Efficient Use of Cyberinfrastructure to Enable Data Analysis Collaboration

Dave Fellinger, CTO

Summit ’09, October 15, 2009

© 2008-2009 DataDirect Networks, Inc. All Rights Reserved.


Company Data at a Glance

• Fast Growing Data Infrastructure Provider for Companies who Demand Extreme Performance for their Large Content Files and Unstructured Data

• Integrated Portfolio of Extreme Storage Platforms, Intuitive Storage Management Software and Consulting Services

• Over 10 Years of Stability and Experience

• Over $130M in Annual Revenue

• Growing, Profitable and Hiring

• Over 200 Petabytes Installed Worldwide

• Clients include XBOX LIVE, Slide & Saudi Aramco

• Global Partners include IBM, Sony and Dell

• Expanding Globally, with Established Offices

– Europe, India, Australia, Asia Pacific and Japan

“DDN could find itself a market leading provider for the Internet computing era in the same way that EMC did in the transactional era and NetApp did in the distributed era”


Storage Designed for the Most Extreme Environments in the World

• Sites That Have Zero Tolerance For Data Latency

• Sites That Demand Sub-Millisecond Application Response Times

• Sites that Stream Extreme HD Content on a Massive Scale

• Sites with Our Systems Delivering Over 100 Gigabytes Per Second of Consistent Throughput

• Sites that Research International Scientific Data

• The Most Sophisticated and Demanding High Performance Computing Environments

– Powering 8 of the Top 10 and 50 of the Top 100 Supercomputer Sites


Data Distribution Evolution

• The Problem

– A large number of sites need simultaneous access to local data for detailed analysis.

• The Solution

– Utilize a simplified object-based file system to provide simultaneous synchronous instances of data based on policy.


Data Distribution Today

• The “Library Method”

– Large data warehouses are queried for data in an iterative manner.

• MODIS as an example:

– All MODIS image data is stored at NASA Goddard on both disk and tape.

– Inquiries generally spawn additional inquiries, requiring time, study, and network bandwidth.

DataDirect Networks Confidential: Internal Use Only

How is it Done Today?

[Diagram: three building blocks, each showing Disk, RAID, Local File Systems and NFS/HTTP/CIFS access; other labels: LAN, Database]

• Hundreds or thousands of building blocks like this

• No common management framework

• Huge investment in custom engineering to cobble together a global namespace

• Heavy reliance on expensive CDNs

• Internally developed replication schemes

• URL explosion in objects

• Map Objects to individual file systems

• Within file system files and folders are named 1, 2, 3, 4…. for hashing and rapid index lookups

• Must manage # of files per folder, # of folders in each file system

• Multiple IOPS to get a file limits # of file reads per second and increases latency
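As a rough illustration of the numbered-folder scheme described on this slide, the sketch below hashes an object key to pick a file system and a path of numbered folders. The function name `path_for`, the fan-out of 256, the two folder levels and the mount points are all assumptions made for the example; they are not taken from any particular site's implementation.

```python
import hashlib

FANOUT = 256   # assumed number of folders per level
LEVELS = 2     # assumed folder depth

def path_for(key, filesystems):
    """Map an object key to a file system and a numbered folder path."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # The first hash byte chooses the file system; the next bytes choose folders.
    fs = filesystems[digest[0] % len(filesystems)]
    folders = [str(digest[i + 1] % FANOUT) for i in range(LEVELS)]
    return "/".join([fs, *folders, key])

# Hypothetical mount points standing in for the individual file systems.
mounts = ["/mnt/fs1", "/mnt/fs2", "/mnt/fs3"]
print(path_for("photo-0001.jpg", mounts))
```

Each component of the resulting path typically needs its own directory lookup, which is one way a single file read can turn into several I/O operations.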


Multi-site Content Delivery

[Diagram: Sites A through E. Labels: Replication Software (x10); Administrator (x5); Storage/File System(s); RAID/LUNs/Fibre Channel (x10 – Thousands); Database Tracking File Locations]


What’s within each site?

Complexity!

[Diagram labels: Hundreds of filers; Multiple File Servers; Multiple RAIDs; Fibre Channel Switching, Cables, SFPs; Dozens – Hundreds of File Systems!]

• Lots of things to manage

• No automation or coordination

• Must maintain a path to find every file

• Provisioning more capacity increases complexity


DDN’s Cloud Storage Initiative

• Take a clean look at the issues of massively scalable file storage & distribution

• Understand best practices from leading-edge customers

• Develop a tailored solution

– Work with key users every step of the way

– Eliminate the need for complex multi-vendor integration

– Minimize customers having to write custom code

– Focus on file reads per second rather than IOPS


Scalable File Delivery with WOS Clouds

[Diagram labels: Administrator; WOS Nodes; Database Storing WOS Object IDs; Network Connection]

• Single Management Point

• Single Namespace for Billions of Files

• Fully Automated, Load Balancing & Self Healing

• Automated Best-Path File Retrieval

• Multi-site Policy-Based Replication

• Add Capacity Online in Seconds

• Easy!

WOS Puts & Gets

[Diagram, repeated on each of the 11 slides in this sequence: Users; Customer Application Servers, each running WOS-LIB; Gigabit Ethernet Network; WOS Cluster]

• User uploads a file to application server.

• Application makes a call to the WOS-LIB to store (PUT) a new object.

• WOS-LIB stores the object on a node. Subsequent objects are automatically load balanced across the cluster.

• If the storage policy specifies replication, the object is replicated to another node.

• WOS-LIB returns a unique Object ID (OID = 5718a36143521602 in the example) which the application stores in lieu of a file path.

• A User needs the stored file.

• Application makes a call to WOS-LIB to read (GET) the object. The unique Object ID is passed to WOS-LIB.

• WOS-LIB automatically determines what node(s) have the requested object, retrieves the object, and returns it to the application.

• Application returns file to user.
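The PUT and GET sequence above can be sketched with a toy in-memory store. This is not the WOS-LIB API, which the slides do not show: `ToyObjectStore`, `put` and `get` are hypothetical names, the least-loaded-node placement is a stand-in for whatever load balancing the product uses, and the random 16-hex-digit OID is an assumption modeled on the example OID in the slides.

```python
import secrets

class ToyObjectStore:
    """In-memory stand-in for a cluster of storage nodes (illustrative only)."""

    def __init__(self, node_names, replicas=1):
        self.nodes = {name: {} for name in node_names}  # node -> {oid: bytes}
        self.replicas = replicas

    def put(self, data):
        """Store data on the least-loaded nodes and return a new object ID."""
        oid = secrets.token_hex(8)  # 16 hex digits; assumed OID format
        # Simple load balancing: prefer the nodes holding the fewest objects.
        targets = sorted(self.nodes, key=lambda n: len(self.nodes[n]))
        for name in targets[: self.replicas]:
            self.nodes[name][oid] = data
        return oid

    def get(self, oid):
        """Find any node holding the object and return its contents."""
        for objects in self.nodes.values():
            if oid in objects:
                return objects[oid]
        raise KeyError(oid)

store = ToyObjectStore(["node1", "node2", "node3"], replicas=2)
oid = store.put(b"uploaded file contents")
# The application keeps only the OID, not a file path.
assert store.get(oid) == b"uploaded file contents"
```

The point of the sketch is the contract: the application never chooses a node or a path, and the only thing it has to persist is the returned OID.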

WOS Zones and Replication: Global Content Distribution

[Diagram: Zone “West” and Zone “East”, each labeled “Data Protection via Replication” and joined by a High Speed WAN Connection; later slides add Zone “DR”, also reached over High Speed WAN links]

User Defined Object Storage Policies:

• Wild: West=2, East=0

• Wallstreet: East=2, West=0

• Safe: West=1, East=1

Zones are logical groups of WOS nodes in which you scope replication, or data protection, policies. In this example, we have two zones which also map to geographic data centers. These data centers both serve the public internet.

Objects stored with the “Wild” policy are placed on two distinct nodes within the “West” zone. The server performing the store action can be anywhere on the network, including the “East” zone.

Objects stored with the “Safe” policy have replicas on one node in the “West” zone and one node in the “East” zone. Again, the server performing the store action via the WOS-LIB can be anywhere on the network.

The “West” and “East” zones utilize replication on high performance WOS nodes for the most demanding service environments. What if you wanted an extra object copy for Disaster Recovery?

• CYA: West=1, East=1, DR=1

The “DR” zone utilizes high capacity WOS nodes. Objects stored with the “CYA” storage policy have replicas on both coasts as well as a copy in the “DR” zone.

Regardless of replication policy, any object (OID) can be accessed from any zone, whether the object resides in that zone or not!
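One way to read the storage policies above is as a mapping from zone name to replica count. The sketch below encodes the four policies from the slides that way and picks distinct nodes per zone. The node names, the `ZONES` inventory and the `place` function are hypothetical, and random selection is only a placeholder for the product's actual placement logic, which the slides do not describe.

```python
import random

# Policies from the slides: replica count per zone.
POLICIES = {
    "Wild": {"West": 2, "East": 0},
    "Wallstreet": {"East": 2, "West": 0},
    "Safe": {"West": 1, "East": 1},
    "CYA": {"West": 1, "East": 1, "DR": 1},
}

# Hypothetical node inventory for each zone.
ZONES = {
    "West": ["w1", "w2", "w3"],
    "East": ["e1", "e2", "e3"],
    "DR": ["d1", "d2"],
}

def place(policy_name, rng=random):
    """Choose distinct nodes in each zone to satisfy the named policy."""
    placement = []
    for zone, count in POLICIES[policy_name].items():
        # sample() returns `count` distinct nodes, so replicas never share a node.
        placement.extend(rng.sample(ZONES[zone], count))
    return placement

print(place("Wild"))  # two distinct nodes in the West zone
print(place("CYA"))   # one node each in West, East and DR
```

Note that the policy says nothing about where the storing server sits, which matches the slides' point that a PUT can originate from any zone.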

Data Protection: Drive and Node Failure Handling

• Replication requires at least two copies of each object to be stored for a given OID.

• With replication, for maximum performance, individual objects are stored within 1 disk unit.

Disk drive failure

• Upon disk drive failure, all objects stored on the failed drive are noted to be out of policy compliance and recovery begins.

• Affected objects are copied in parallel to bring the cluster back into full policy compliance.

• Policy restoration occurs on a per object basis, NOT per disk drive, hence only used object space will be replicated.

• When the failed disk drive is replaced, the replacement simply becomes additional capacity.

Node failure

• Upon node failure, recovery of all objects stored on the failed node begins.

• Affected objects are copied in parallel and distributed to surviving nodes to bring the cluster back into full policy compliance.

• Policy restoration occurs on a per object basis, NOT per node, hence only objects that resided on the failed node will be replicated.

• When the failed node is replaced or returns online, it simply becomes additional cluster capacity.
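The per-object recovery described above can be sketched as follows, assuming a simple table that maps each OID to the set of nodes holding a copy. The `recover` function, the node names and the "fewest objects" target choice are hypothetical. The loop also runs sequentially for clarity, whereas the slides describe copies being made in parallel.

```python
def recover(placement, failed, nodes, required=2):
    """Re-replicate only the objects that lost a copy on the failed node.

    placement maps each OID to the set of nodes holding a copy.
    Returns the list of (oid, new_node) copies that were made.
    """
    survivors = [n for n in nodes if n != failed]
    copies = []
    for oid, holders in placement.items():
        holders.discard(failed)
        if not holders:
            continue  # no surviving copy left to read from
        # Objects that never lived on the failed node are already compliant.
        while len(holders) < required:
            candidates = [n for n in survivors if n not in holders]
            if not candidates:
                break  # too few surviving nodes to restore the policy
            # Spread new copies: pick the survivor currently holding the fewest objects.
            target = min(candidates,
                         key=lambda n: sum(n in h for h in placement.values()))
            holders.add(target)
            copies.append((oid, target))
    return copies

placement = {
    "oid-a": {"n1", "n2"},
    "oid-b": {"n2", "n3"},
    "oid-c": {"n3", "n4"},
}
made = recover(placement, failed="n2", nodes=["n1", "n2", "n3", "n4"])
print(made)
```

Only the two objects that had a copy on the failed node are touched; `oid-c` is left alone, which is the difference between per-object restoration and rebuilding a whole drive or node.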

Data Distribution Evolution

• The “Push Method”

– Data is automatically replicated based on policy to scientists with specific requirements.

• ATLAS as an example:

– The LHC at CERN produces more than 1 TB per day.

– Relevant data is replicated to over 300 sites in the US and Canada for analysis.

– Replication is based on specific interest and is fully automated.
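A minimal sketch of interest-based push replication, under stated assumptions: each site declares a set of tags it cares about, and every new dataset is sent to the sites whose interests overlap its tags. The site names, tags, dataset name and the `sites_for` and `push` functions are invented for illustration and do not describe how ATLAS data distribution is actually implemented.

```python
# Hypothetical interest table: site -> set of data tags it wants pushed to it.
SUBSCRIPTIONS = {
    "site-1": {"muon", "calibration"},
    "site-2": {"jets"},
    "site-3": {"muon", "jets"},
}

def sites_for(tags):
    """Return the sites whose declared interests overlap the dataset's tags."""
    return sorted(site for site, wanted in SUBSCRIPTIONS.items()
                  if wanted & set(tags))

def push(dataset_name, tags, send):
    """Replicate a new dataset to every interested site without being asked."""
    targets = sites_for(tags)
    for site in targets:
        send(site, dataset_name)
    return targets

sent = []
push("run-0001", {"muon"}, send=lambda site, name: sent.append((site, name)))
print(sent)
```

In contrast to the library method, no site issues a query here: the interest table plays the role of the replication policy, and data arrives locally before analysis starts.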


Conclusion

• To enable multi-point scientific study, data must have attributes of location (site) as well as traditional ACLs.

• File systems distributing data must have redundant automation to enable replication across geographies.

• Data distribution methods must be simplified to reduce latency and maximize network efficiency, allowing improved processing efficiency.


Thank You

Dave Fellinger

[email protected]