Introduction To GlusterFS

Introduction to GlusterFS Webinar - September 2011

DESCRIPTION

Looking for a high performance, scale-out NAS file system? Or are you a new user of GlusterFS and want to learn more? This educational monthly webinar provides an introduction and review of the GlusterFS architecture and key functionalities. Learn how GlusterFS is deployed in the datacenter, in the cloud, or between the two.

Page 1: Introduction to GlusterFS Webinar - September 2011

Introduction To GlusterFS

Page 2: Introduction to GlusterFS Webinar - September 2011

A Better Way To Do Storage 2

How To Ask a Question?

Some Housekeeping Items…

Ask a question at any time

Questions will be answered at the end of the webinar

Slides will be available after the webinar

The webinar is being recorded

Page 3: Introduction to GlusterFS Webinar - September 2011

Today’s Speakers

Heather Wellington, Marketing Manager, Gluster, Inc.

Tom Trainer, Director, Product Marketing, Gluster, Inc.

Page 4: Introduction to GlusterFS Webinar - September 2011

Poll Question

Are you using GlusterFS today?

– Yes, in a test environment

– Yes, it’s deployed in a production environment

– No, however we are considering it

– Just researching

Page 5: Introduction to GlusterFS Webinar - September 2011

History of Gluster

How it all started

– Backgrounds in high-performance, clustered computing

– Working at Lawrence Livermore National Labs

• AB Periasamy & Hitesh Chellani design “Thunder”

• One of the world's fastest supercomputers

• On Intel commodity hardware

• Solved filesystem scalability and performance limitations

– Large customer in oil & gas persuaded them to focus on storage

– Gluster founded by Hitesh & AB to bring technology to market

Result: award-winning technology

Page 6: Introduction to GlusterFS Webinar - September 2011

What is the Gluster File System?

Scale-out file storage software for

– Network Attached Storage (NAS)

– Object

– Big Data / Analytics

GlusterFS provides

– Flexibility to deploy in ANY environment

– High availability

– Unified files and objects

– File system for Apache Hadoop

– Scalability to Petabytes & beyond

– Linearly scalable performance

– Superior storage economics

August 23rd

Page 7: Introduction to GlusterFS Webinar - September 2011

GlusterFS Architecture Design Goals

Innovation

– Eliminate metadata

– Dramatically improve time

– Unify files and objects

Elasticity

– Flexibly adapt to growth/reduction

– Add, delete volumes & users

– Without disruption

Scale linearly

– Multiple dimensions

• Performance

• Capacity

– Aggregated resources

Simplicity

– Ease of management

– No complex kernel patches

– Run in user space

[Figure: axes labeled Capacity and Performance]

Page 8: Introduction to GlusterFS Webinar - September 2011

Key Differentiators

Software only

Open source

Modular, stackable storage OS architecture

Data stored in native formats

No metadata

– Elastic hashing

Unified files and objects

Compatible replacement for the Apache Hadoop file system

Filesystem runs in user space

Virtual Machine (VM) virtual motion enabler

Page 9: Introduction to GlusterFS Webinar - September 2011

Software Only - Future Proofing Storage

Hardware agnostic

Superior storage economics & flexibility

– Data center / private cloud: use commodity hardware

– Public cloud (e.g. AWS, Rackspace, GoGrid, Nimbula): pay for only what you need

No lock-in

– Hardware vendors, at purchase time or in the future

– Any cloud: public, private, and hybrid

– Performance, capacity, or availability levels

– GlusterFS is not proprietary; files are stored in native formats (e.g. ext4)

Page 10: Introduction to GlusterFS Webinar - September 2011

Open Source

200,000+ downloads

– ~16,000 per month

550+ registered deployments

– 45 countries

2,500+ registered users

– Mailing lists, forums, etc.

Active community

– Diverse testing environments

– Bug identification and fixes

– Code contributions

Member of broader ecosystem

– OpenStack, Linux Foundation, Open Virtualization Alliance

Global Adoption

Page 11: Introduction to GlusterFS Webinar - September 2011

Elastic Hashing

No metadata server

An algorithmic approach

– Unique hash tag for each file stored

– Tags stored within the file system

– Rapid file read – low latency

Traditional Central Metadata Server

Traditional Distributed Metadata Server

Innovative Elastic Approach
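The idea of locating a file by computation rather than by asking a metadata server can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `pick_brick`, the brick names, the SHA-256 hash, and the simple modulo placement are all hypothetical and are not GlusterFS's actual elastic hashing algorithm.

```python
import hashlib


def pick_brick(path: str, bricks: list[str]) -> str:
    """Choose a storage brick for a path by hashing the path itself.

    No lookup table or metadata server is consulted: any client that knows
    the brick list computes the same answer independently.
    """
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and map it onto the brick list.
    index = int.from_bytes(digest[:8], "big") % len(bricks)
    return bricks[index]


# Hypothetical brick names, for illustration only.
bricks = ["server1:/export/brick", "server2:/export/brick", "server3:/export/brick"]

for name in ["/videos/a.mp4", "/videos/b.mp4", "/logs/app.log"]:
    print(name, "->", pick_brick(name, bricks))
```

A plain modulo scheme like this one would move most files whenever a brick is added or removed, so a production system needs a placement scheme that tolerates growth; the sketch shows only the "compute, don't look up" principle.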

Page 12: Introduction to GlusterFS Webinar - September 2011

A Standard Gluster Deployment

[Diagram labels: Gluster Global Namespace (HTTP, NFS, CIFS, Gluster Native); Application Data; Objects; Clients/Apps; IP Network; VMs (VMDK); virtual storage pool; Apache™ Hadoop™]

Standard clients running standard apps

Over any standard IP network

Access to application data, as files and folders and/or objects, in a global namespace, using a variety of standard protocols

Stored in a commoditized, virtualized, scale-out, centrally managed pool of DAS, SAN, NAS

Page 13: Introduction to GlusterFS Webinar - September 2011

Unified On-premise, Private and Public Cloud Storage

[Diagram labels: Client/Apps; Data Center / Private Cloud; Public Cloud; Replication; Gluster Global Namespace; IP Network]

Page 14: Introduction to GlusterFS Webinar - September 2011

Deployment Scenarios

Common Solutions Built on GlusterFS

Media serving (CDN)

Large scale file storage

Tier 2 & 3 archive

File sharing

High Performance Computing (HPC) storage

IaaS storage layer

Disaster recovery

Backup & restore

Private cloud

Page 15: Introduction to GlusterFS Webinar - September 2011

Introducing GlusterFS 3.3 Beta 1

Next generation file and object storage

– The first system for data storage that enables you to store and access data as an object and as a file

– Flexible and powerful, it simplifies access and management of data

– Eases migration of legacy, file-based applications to object storage for use in the cloud

Public beta availability: July 20, 2011

– Broad community testing and participation

– Selected enterprise customer engagements

Page 16: Introduction to GlusterFS Webinar - September 2011

The “Traditional” Unified Approach

Proprietary

Bolt-on hardware approach

– Combined hardware raises costs

– Higher TCO

– Paying for what you may not need

Increased risk

– Common hardware elements can fail

• Power supplies

• Fans

• Cabling…lots of cabling

[Diagram labels: Files; Objects; DBs; Carved Up Storage Pool; NAS; Object; Block; Traditional Monolithic Hardware; Bolt-on Approach (e.g. EMC VNX)]

“VNX reminds me of my old VHS, DVD and cable box… one thing fails and I’m blown out of the water.”

Beta Customer, 2011

Page 17: Introduction to GlusterFS Webinar - September 2011

Software Approach to File and Object

Network Attached Storage (NAS)

– NFS / CIFS / GlusterFS

– POSIX compliant

– Access files within objects

Windows Access

– Improves Windows performance

– Uses HTTP, not slower CIFS

– We will still support SAMBA

Object Storage

– API

– Internet Protocol (IP)

– RESTful

– Get/Put

– Buckets

– Objects seen as files

Standards based

– Compatible with the Amazon S3 RESTful interface

– Access data as objects, with a NAS interface to access files (NFS, CIFS, GlusterFS)

High performance storage across heterogeneous server environments
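The "buckets", "Get/Put", and "objects seen as files" bullets can be made concrete with a toy model in which each bucket is a directory and each object is an ordinary file. This is a minimal sketch under stated assumptions: the class `FileBackedBuckets` and its methods are hypothetical names invented for illustration, not the GlusterFS or Amazon S3 API, and the sketch omits access control, key validation, and HTTP entirely.

```python
import tempfile
from pathlib import Path


class FileBackedBuckets:
    """Toy object store: a bucket is a directory, an object is a plain file."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def put(self, bucket: str, key: str, data: bytes) -> None:
        """Store an object by writing it as a file under the bucket directory."""
        target = self.root / bucket / key
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)

    def get(self, bucket: str, key: str) -> bytes:
        """Fetch an object by reading the corresponding file."""
        return (self.root / bucket / key).read_bytes()


with tempfile.TemporaryDirectory() as tmp:
    store = FileBackedBuckets(Path(tmp))
    store.put("songs", "track01.txt", b"hello")
    # The object written through put() is also visible as a regular file.
    print((Path(tmp) / "songs" / "track01.txt").read_bytes())
    # And the file is readable back through the object interface.
    print(store.get("songs", "track01.txt"))
```

Because both access paths reach the same bytes on disk, a file-based application and an object-based application could share data without a copy step, which is the point the slide makes.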

Page 18: Introduction to GlusterFS Webinar - September 2011

GlusterFS 3.3 Unified File & Object Storage

Widely deployable and extremely flexible

– On-premise, virtualized and in public and private clouds

– Deep unification of file and object data storage

• Not just unified at the management layer

• Not a bolt-together hardware product

– Access data within objects as files

– Compatible with Amazon Web Services

• S3

• Create S3 on EC2 and EBS

– Back up objects from the data center to AWS

– Enable S3 functionality in the data center

– Built to run on commodity hardware

Page 19: Introduction to GlusterFS Webinar - September 2011

GlusterFS 3.3 Easing & Accelerating Legacy App Migration

Enables cloudification of applications

– Removes remaining storage hurdles related to file-only access

– Allows for gradual app migration to the cloud

– Enables moves to both private and public cloud infrastructures

[Diagram labels: GlusterFS 3.3 Object Storage; Data Center, Virtual, Public, Private Cloud; Enterprise Apps / Enterprise Data Center; Object Info]

Unified file & object storage accelerates legacy app migration to the cloud

Page 20: Introduction to GlusterFS Webinar - September 2011

Gluster 3.3 Unified File & Object – Use Cases

Data Center

– Take control of cloud services

– Reduce AWS S3 costs

– Deliver S3-like global services in-house

– Legacy application migration

IaaS

– Deliver File and S3 Services

– Unified file and object

• Competitive differentiator

– Drastically reduce storage costs

– Increase offerings, revenues and margins

Traditional Data Center

Private Cloud

IaaS

S3

Data Center: S3 in-house and/or integrated with S3

IaaS: Deliver S3 to clients

Page 21: Introduction to GlusterFS Webinar - September 2011

Introducing GlusterFS Compatibility for Apache Hadoop

Enhancement to GlusterFS providing a new file system option for Apache Hadoop

– Proven scale-out storage solution provides simultaneous file and object access within Hadoop

– Introduces a 4th storage option for Hadoop (alongside HDFS, local disk, Kosmos)

Included in GlusterFS 3.3 beta 2, available August 23, 2011

– Community beta for testing and participation

– Select enterprise customer engagements

Requirements driven by community and customer requests

– “Eliminate the 64MB fixed block size imposed by HDFS”

– “Eliminate the centralized metadata server”

– “Give us NAS” (via POSIX compliance)

Benefits

– Out of the box compatibility with MapReduce applications, no rewrite required

– Enables organizations to unify data storage

– Flexible and powerful, it simplifies access and management of data

Page 22: Introduction to GlusterFS Webinar - September 2011

Seamless Integration for Hadoop Deployments

GlusterFS can co-exist with HDFS

Does NOT use the NameNode metadata server

Built using the Hadoop file system API

– Requires simple configuration file changes

– C library Gluster client enables direct Gluster access

– Java Client

• JNI interface

– gluster_hadoop.jar

• Provides Java binding for Hadoop compatibility

[Diagram labels: MapReduce; HDFS; GlusterFS; NameNode; Metadata Server]

Page 23: Introduction to GlusterFS Webinar - September 2011

Seamless Integration for Hadoop Deployments

Co-exist with or replace HDFS

Ultimately eliminate the need for the NameNode

Faster access times – faster filesystem

All the features and benefits of GlusterFS

[Diagram labels: Metadata Server / NameNode; MapReduce; GlusterFS; HDFS]

Page 24: Introduction to GlusterFS Webinar - September 2011

Seamless Integration for Hadoop Deployments

NameNode metadata server eliminated

Faster access times – faster filesystem

All the features and benefits of GlusterFS

[Diagram labels: Metadata Server / NameNode; MapReduce; GlusterFS]

Page 25: Introduction to GlusterFS Webinar - September 2011

Why It’s Different?

No metadata server

– No single point of failure, automated self-heal and failover

– No performance bottleneck on data lookups for fast file access

Built-in replication

– Synchronous for inter-node replication

– Asynchronous for geo-replication

No block size restrictions

– Ideal for small and large files

POSIX compliant file system

– Out of the box NFS, CIFS and Gluster native access

Expanded data access options

– File and object access to data

– Access files from your object interface and access data within objects as files

– File based applications can access data without modification

Reduces requirement for replicated files from 3 to 2

– 33% capacity savings
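The 33% figure follows from simple arithmetic: going from three stored copies to two removes one third of the raw capacity needed for the same usable data. The sketch below works through it, assuming the "3" refers to three-way replication and using a hypothetical 100 TB data set; the function names are illustrative only.

```python
def raw_capacity_tb(usable_tb: float, replicas: int) -> float:
    """Raw disk needed to hold `usable_tb` of data with `replicas` full copies."""
    return usable_tb * replicas


def savings_fraction(old_replicas: int, new_replicas: int) -> float:
    """Fraction of raw capacity saved when reducing the replica count."""
    return (old_replicas - new_replicas) / old_replicas


usable = 100.0  # hypothetical data set size in TB

print(raw_capacity_tb(usable, 3))            # 300.0
print(raw_capacity_tb(usable, 2))            # 200.0
print(round(savings_fraction(3, 2) * 100))   # 33
```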

Page 26: Introduction to GlusterFS Webinar - September 2011

Potential Uses for GlusterFS and Hadoop

Simplify and unify storage deployments

– Centralized data store providing access to more applications

Provide users with file-level access to data

– Users can easily browse data using off-the-shelf tools

Enable legacy applications to access data via NFS

– Analytic apps can access data without modification

Enable object-based access to data

– Modern applications can use object-based access to data

Page 27: Introduction to GlusterFS Webinar - September 2011

Filesystem Runs in User Space

[Diagram labels: User Space; GlusterFS; Server (CPU/Mem); Kernel; 1 TB drives]

Not tied to kernel

No reassemblies

Independence

Page 28: Introduction to GlusterFS Webinar - September 2011

The Gluster Connector for OpenStack – July 27, 2011

SWIFT

Enables GlusterFS to be the underlying file system

Connects GlusterFS to Xen and KVM hypervisors

– Unified File and Object storage

– Highly-available, scale-out NAS

– Alternative to SWIFT

[Diagram labels: OpenStack Imaging Services; Unified File & Object Storage; Compute; API Layer; Mobile Apps, Web Clients, Enterprise Software Ecosystem; OpenStack Prior to Gluster; OpenStack with Gluster]

Page 29: Introduction to GlusterFS Webinar - September 2011

The Gluster Connector for OpenStack – July 27, 2011

Connector enables GlusterFS to be chosen as the filesystem

– Provides:

• Unified File and Object storage

• Highly scalable NAS

• High Availability – synchronous and asynchronous replication

• Preferred, scalable alternative to SWIFT

• Virtual motion of virtual machines (a.k.a. vmotion)

[Diagram labels: GlusterFS; Server (CPU/Mem); Hypervisor; VM; Virtual storage pool]

Page 30: Introduction to GlusterFS Webinar - September 2011

Pandora Internet Radio

Problem

• Explosive user & title growth

• As many as 12 file formats for each song

• ‘Hot’ content and long tail

Solution

• Three data centers, each with a six-node GlusterFS cluster

• Replication for high availability

• 250+ TB total capacity

Benefits

• Easily scale capacity

• Centralized management; one administrator to manage day-to-day operations

• No changes to application

• Higher reliability

• 1.2 PB of audio served per week

• 13 million files

• Over 50 GB/sec peak traffic

Page 31: Introduction to GlusterFS Webinar - September 2011

Brightcove

Problem

• Cloud-based online video platform

• Explosive customer & title growth

• Massive video in multiple locations

• Costs rising, esp. with HD formats

Solution

• Complete scale-out based on commodity DAS/JBOD

• Replication for high availability

• 1 PB total capacity

Benefits

• Easily scale capacity

• Centralized management; one administrator to manage day-to-day operations

• Higher reliability

• Path to multi-site

• Over 1 PB currently in Gluster

• Separate 4 PB project in the works

Page 32: Introduction to GlusterFS Webinar - September 2011

Cincinnati Bell Technology Solutions

Problem

• Host a dedicated enterprise cloud solution

• Large scale VMware environment

• Need high availability

Solution

• Gluster for VM storage, NFS to clients

• SAS drives on back-end

• Replication for high availability

Benefits

• Storage provisioning from 6 weeks to 15 minutes

• Vendor agnostic storage

• Low cost of service delivery

• Elastic growth

• Large scale VM storage

• Low cost service delivery for enterprise customer

• Drastic reduction in provisioning time

Page 33: Introduction to GlusterFS Webinar - September 2011

Partners Healthcare

Private Cloud: Centralized Storage as a Service

Problem

• Capacity growth from 144 TB to 1+ PB

• Multiple distributed users/departments

• Multi OS access - Windows, Linux and Unix

Solution

• GlusterFS Cluster

• Solaris/ZFS/x4500 w/ InfiniBand

• Native CIFS/NFS access

Benefits

• Capacity on demand / pay as you grow

• Centralized management

• Higher reliability

• OPEX decreased by 10X

• Over 500 TB

• 9 Sun “Thumper” systems in cluster

Page 34: Introduction to GlusterFS Webinar - September 2011

Gluster Enterprise Deployment Options

Storage Software Appliance

– Deploy on bare metal

– Any hardware on the Red Hat Hardware Compatibility List (HCL)

Virtual Machines

– Deployable on the leading virtual machines

Amazon Web Services (AWS)

– Runs within Amazon Machine Image (AMI)

RightScale Cloud Management

– GlusterFS managed via a RightScale ServerTemplate

– Deployable via the RightScale Cloud Management Dashboard

GoGrid Cloud

– Gluster Server Image (GSI) for scale-out NAS on GoGrid cloud

On-premise/datacenter

Virtualization/private cloud

Public cloud

Page 35: Introduction to GlusterFS Webinar - September 2011

Many Enterprises Rely on Gluster Now

Page 36: Introduction to GlusterFS Webinar - September 2011

Summary

GlusterFS is scale-out storage

– NAS

– Object

– Big Data / Analytics

Flexibility, scalability, superior economics

OpenStack cloud

– Unified file and object storage

– Virtual machine (VM) virtual motion

Innovative architecture provides a better way to do storage

Page 37: Introduction to GlusterFS Webinar - September 2011

Your turn - ask our experts

Questions and Answers

Try Gluster for free here: http://www.gluster.com/trybuy/

Additional resources here: http://www.gluster.com/products/resources/

Join the community: http://www.gluster.org/

Follow on Twitter: @gluster

Read our blog: http://blog.gluster.com/

Contact us at: [email protected] or 1-800-805-5215