Delivering Big Data Workloads as a Service to your Organisation
Charles Sevior, CTO | Ryan Tassotti, NAS SE

© Copyright 2014 EMC Corporation. All rights reserved.


Why Hadoop?

• Oil Exploration
• Medical Imaging
• Video Surveillance
• Mobile Sensors
• Smart Grids
• Social Media
• Internet of Things
• Dark Data

Fast and Cheap Way For Exploiting Massive Amounts of New Data Sources


Unstructured Data Growth

Total Capacity Shipped, Worldwide (with Unstructured Data share)

Year    Capacity    Unstructured Data
2013    37 EB       67%
2015    71 EB       74%
2017    133 EB      80%

Source: IDC
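As a rough illustration of the growth this chart implies, the sketch below computes a compound annual growth rate. It assumes the figures on the slide pair up in ascending order (37 EB in 2013, 71 EB in 2015, 133 EB in 2017); the function name `cagr` is ours, not IDC's.

```python
# Capacity figures as read from the slide (exabytes shipped per year).
# The year-to-value pairing is an assumption based on ascending order.
capacity_eb = {2013: 37, 2015: 71, 2017: 133}


def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values over `years` years."""
    return (end_value / start_value) ** (1 / years) - 1


growth_2013_2017 = cagr(capacity_eb[2013], capacity_eb[2017], 2017 - 2013)
print(f"Implied annual growth 2013-2017: {growth_2013_2017:.1%}")
```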


Why a Data Lake?

• Eliminate inefficient islands of storage

• Simplify management and reduce costs

• Enable better information sharing

• Increase data protection and security

• Accelerate data analytics to gain new insight

• Support data-driven decision making

Bring Compute to Data – Efficiently!


[Diagram: separate storage silos (NAS, SAN, DAS, Tape, Object, Cloud) alongside traditional workloads (2nd Platform) and next-gen workloads (3rd Platform): HPC/EDW, Backup/Archive, Analytics, Mobile, File Shares, Cloud Apps]


Isilon Scale-Out Data Lake

[Diagram: the Isilon Scale-Out Data Lake shown with the NAS, SAN, DAS, Tape, Object and Cloud silos, alongside traditional workloads (2nd Platform) and next-gen workloads (3rd Platform): Backup/Archive, Analytics, Mobile, File Shares, Cloud Apps, HPC/EDW]


Next-Gen Access Methods

[Diagram: file access for the workloads Backup/Archive, Analytics, Mobile, File Shares, Cloud Apps, HPC/EDW]


Enterprise-Grade Features

• Data Protection
• Data Security
• Performance Management
• Data Management
• IOPS, MBPS, $/GB


Isilon OneFS: Scale-Out Architecture

• Single Volume / File System
• Unmatched Efficiency
• Simplicity & Ease of Use
• Linear Scalability
• Easy Growth
• High Performance


EMC Isilon Scale-Out NAS Architecture

• Client/Application Layer: Clients and Applications
• Ethernet Front-End: GigE and 10 GigE network
• Protocols (Multi-Protocol File & Object): SMB, NFS, FTP, HTTP, HDFS for Hadoop, REST for Object
• RESTful API: GET, PUT, POST, DELETE
• Storage Nodes: Isilon OneFS
• InfiniBand Back-End
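To make the four REST verbs on this slide concrete, here is a minimal sketch that builds (but never sends) one HTTP request per verb using Python's standard library. The host name, path, and object name are hypothetical placeholders, not the actual Isilon endpoint layout.

```python
from urllib.request import Request

# Hypothetical endpoint, for illustration only; not a documented Isilon URL.
BASE_URL = "https://storage.example.com/objects"


def build_request(method, name, body=None):
    """Build (but do not send) an HTTP request for one object operation."""
    return Request(f"{BASE_URL}/{name}", data=body, method=method)


operations = [
    build_request("PUT", "report.csv", b"id,value\n1,42\n"),  # store an object
    build_request("GET", "report.csv"),                       # read it back
    build_request("POST", "report.csv", b"id,value\n2,7\n"),  # submit data to it
    build_request("DELETE", "report.csv"),                    # remove it
]

for op in operations:
    print(op.get_method(), op.full_url)
```

Nothing here touches the network; a real client would also need authentication and TLS settings appropriate to the cluster.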


Isilon Product Releases

Platforms
• S210: IOPS Performance
• X410: Throughput Performance

SmartFlash
• Flash as Cache
• Accelerated Performance
• Up to 1PB Globally Coherent Cache

Access Methods
• SMB Multichannel
• HDFS 2.3*
• OpenStack SWIFT*
* Available by EOY

Solutions
• VCE Converged Infrastructure
• Hadoop Big Data Analytics


The Third Platform Journey

I. LOB Siloed Data Sets: LOB1 Data, LOB2 Data, LOB3 Data
II. Removing Silos of Data: Data Lake (Hadoop) with LOB1, LOB2, LOB3 ... LOBn
III. Early-Stage Predictive Analytics: Data Lake (Hadoop); DS1, DS2, DS3 ... DSn; Big Data Analytics
IV. Predictive Enterprise: Data Lake (Hadoop); Big Data Analytics; DD/A1, DD/A2, DD/A3 ... DD/An
V. Business Analytics as a Service: Data Lake (Hadoop); Big Data Analytics; DD/A1, DD/A2, DD/A3 ... DD/An; Hybrid Cloud


Large Telco Reduces Response Time for Regulatory Reports from 1 Week to 1 Day
ISILON AND PIVOTAL HD

Challenge
• Distributed and heterogeneous data infrastructure made it difficult to respond to regulatory report requests
• Data volumes prevented analysis of information across broad timescales

Solution
• Reports that required more than one week to create can now be turned around the same day
• 1PB of system storage capacity allows analysis of all data
• A platform combining PHD and Isilon allows storage infrastructure to scale without having to worry about compute


Hadoop Overview

• Hadoop is an open-source framework from Apache that allows for parallel batch processing of very large data sets
• MapReduce is the Hadoop process that divides the workload so multiple devices can process it
• HDFS is the file system for the data. It provides data protection and locality with multiple mirrors (usually 3 times)
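The MapReduce idea can be sketched in a few lines of plain Python. This is a single-process toy, not Hadoop code: the function names and the two sample chunks are made up for illustration, and a real cluster would run the map and reduce steps on different machines.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of input."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


# Each chunk stands in for a block of data handled by a different worker.
chunks = ["big data big lake", "data lake data"]
mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
word_counts = reduce_phase(shuffle(mapped))
print(word_counts)
```

Because each chunk is mapped independently, the work can be divided across as many devices as there are chunks, which is the property the slide describes.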


Hadoop: Concerns for the Enterprise

• I want to use my existing infrastructure, not buy new hardware
• I want to leverage the tools I already have
• I want a low-risk way of trying Hadoop
• My data is in shared storage; do I have to move it?


VMware Big Data Extensions

Themes: Operational Simplicity, Maximise Resource Utilisation, Architect Scalable Platform

• Rapid deployment
• Self service tools
• Performance
• True multi-tenancy
• Elastic scaling
• Avoid dedicated hardware
• VM-based isolation
• Increase resource utilisation
• Deployment choice
• Maintain management flexibility at scale
• Control costs
• Leverage toolsets
• Security


BDE: Deploy Hadoop Clusters in Minutes

From a manual process … To fully automated, using the GUI


Elastic, Multi-Tenant Virtualised Hadoop

[Diagram: Hadoop nodes running as VMs over compute and storage layers, with separate compute clusters for Tenant 1 and Tenant 2]

Combined Compute and Storage
• Unmodified Hadoop node in a VM
• VM lifecycle determined by Datanode
• Limited elasticity

Separate Compute from Storage
• Separate compute from data
• Stateless compute
• Elastic compute

Separate Virtual Compute Clusters per Tenant
• Separate virtual compute
• Compute cluster per tenant
• Stronger VM-grade security and resource isolation


Why Shared Storage For Hadoop?


Hadoop Bare Metal Deployment
Hadoop DAS Environment

1. Dedicated Storage Infrastructure – One-off for Hadoop only
2. Lacking Enterprise Data Protection – No snapshots, replication, backup
3. Poor Storage Efficiency – 3X mirroring
4. Fixed Scalability – Rigid compute to storage ratio
5. Manual Import/Export – No protocol support

[Diagram: NameNode and data nodes, with blocks labelled 1x, 2x and 3x illustrating 3X mirroring]


Hadoop On EMC Isilon Scale Out NAS

1. Scale-Out Storage Platform – Multiple applications & workflows
2. End-to-End Data Protection – SnapshotIQ, SyncIQ, NDMP Backup
3. Industry-Leading Storage Efficiency – >80% Storage Utilisation
4. Independent Scalability – Add compute & storage separately
5. Multi-Protocol – Industry standard protocols: NFS, CIFS, FTP, HTTP, HDFS
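The storage-efficiency contrast between 3X mirroring on DAS and the ">80% Storage Utilisation" quoted here can be worked through with simple arithmetic. The sketch below is a simplification: the 100 TB data set is a hypothetical figure, 80% is taken as the lower bound from the slide, and overheads such as temporary or scratch space are ignored.

```python
def raw_capacity_needed(usable_tb, efficiency):
    """Raw capacity required to hold `usable_tb` at a given storage efficiency."""
    return usable_tb / efficiency


USABLE_TB = 100            # hypothetical data set size
das_efficiency = 1 / 3     # three full copies of every block (3X mirroring)
shared_efficiency = 0.80   # lower bound quoted on the slide (">80%")

das_raw = raw_capacity_needed(USABLE_TB, das_efficiency)
shared_raw = raw_capacity_needed(USABLE_TB, shared_efficiency)

print(f"3X mirroring:    {das_raw:.0f} TB raw for {USABLE_TB} TB of data")
print(f"80% utilisation: {shared_raw:.0f} TB raw for {USABLE_TB} TB of data")
```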


EMC Hadoop Starter Kit

• Support for major Hadoop distributions

• Quickly deploy, manage, and scale Hadoop clusters

• GUI simplifies management tasks

• Elastic scaling optimizes cluster performance and resource utilisation

Consolidate and Virtualize Hadoop with EMC Isilon and VMware

[Diagram: Apache Hadoop HDFS, showing name node and data node roles]

https://community.emc.com/docs/DOC-26892


www.emc.com/getisilon


Example Deployment With Pivotal HD

• Pre-requisites

– Isilon OneFS version 6.5.5 or higher

– VMware vSphere 5.0 (or later) Enterprise or Enterprise Plus

• Download VMware Big Data Extensions (Free)

• Configure Isilon cluster for HDFS (Free license)

• Configure Big Data Extensions to use Pivotal HD

• Deploy Hadoop Cluster

• Run a simple program to test
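Before starting, the two version pre-requisites can be checked mechanically. The sketch below compares dotted version strings against the minimums listed on the slide; the `installed` values are hypothetical, and real version strings may carry suffixes that this simple parser does not handle.

```python
def parse_version(text):
    """Turn a dotted version string such as '6.5.5' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed_version, minimum_version):
    """True if the installed version is at least the minimum version."""
    return parse_version(installed_version) >= parse_version(minimum_version)


# Minimum versions from the slide.
MINIMUMS = {"OneFS": "6.5.5", "vSphere": "5.0"}

# Hypothetical versions found in the environment being checked.
installed = {"OneFS": "7.1.0", "vSphere": "5.5"}

results = {name: meets_minimum(installed[name], MINIMUMS[name]) for name in MINIMUMS}
for name, ok in results.items():
    print(f"{name} {installed[name]} (needs {MINIMUMS[name]}+): {'OK' if ok else 'too old'}")
```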


Data Lake Hadoop Bundle For New Customers

FEATURES
• Pre-tested and configured Big Data analytics solution
• Isilon X410 cluster with native HDFS
• Free Pivotal HD licenses for 20 compute nodes
• HAWQ parallel SQL licenses for 20 compute nodes

BENEFITS
• Gain powerful analytics capabilities quickly and easily
• Reduce costs with highly efficient scale-out storage platform
• Ability to leverage expert training and consulting services
• Global 24x7 service and support

Free Hadoop For Existing Customers

FEATURES
• Free HDFS license
• Free community trial editions of Pivotal HD or Cloudera CDH
• Free step-by-step Hadoop Starter Kit with simple directions
• Free personalized TCO tool for Hadoop on Isilon vs. DAS

BENEFITS
• Simple, easy way to unlock value of unstructured data
• Jump start Hadoop analytics initiatives quickly
• Highly informative and easy-to-use tools
• Understand TCO between alternative infrastructure strategies

[Graphic labels: HDFS; Big Data Analytics Solution]


Splunk Enterprise

“Collect and index any machine-generated data from virtually any source or location in real time. Just point Splunk Enterprise at your data and it will immediately start collecting and indexing--so you can start searching and analysing”

www.splunk.com


EMC Splunk Scale Out Simply

Splunk Servers – Scale Out
• Clustered or Distributed Design
• Scale Out by Blade
• Scale Out Based on CPU Requirements

XtremIO – HOT/WARM Buckets
• Up to 20TB Xbrick
• Scale Out by Xbrick
• Scale Out Based on Ingestion Rates

Isilon – COLD Buckets
• Up to 3 x 144TB Nodes
• Scale Out by Node
• Scale Out Based on Long Term Retention


http://www.emc.com/campaign/isilon-hadoop/index.htm