Big Data Architecture and Deployment

Robert Feng

TSA

Agenda

• Big Data Overview
• Meet New Big Data Requirement - Cisco Common Platform Architecture
• Big Data Applications Co-exist with Enterprise Applications
• Agile Big Data Application Integration with Programmable UCS
• Hadoop Cluster Deployment Automation
• Big Data Performance Enhancement with Cisco ACI
• Summary

Big Data Overview

Big Data? So What Has Changed?

The Explosion of Unstructured Data

• More than 90% is unstructured data
• Approx. 500 quadrillion files
• Quantity doubles every 2 years
• Most unstructured data is neither stored nor analysed!
• 1.8 trillion gigabytes of data was created in 2011…

[Chart: structured vs. unstructured data, GB of data (in billions, axis 0 to 10,000), 2005 to 2015]

Source: Cloudera
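The doubling claim can be turned into a quick back-of-the-envelope projection. The sketch below is only an illustration: it assumes the 2011 figure (1.8 trillion GB, i.e. 1.8 ZB in decimal units) as a baseline and a constant two-year doubling period, and the function name `projected_zettabytes` is made up for this example.

```python
def projected_zettabytes(year, base_year=2011, base_zb=1.8, doubling_years=2.0):
    """Extrapolate yearly data volume, assuming it doubles every `doubling_years`.

    base_zb=1.8 is the slide's "1.8 trillion gigabytes" for 2011
    (1.8e12 GB = 1.8e21 bytes = 1.8 ZB, decimal units).
    """
    return base_zb * 2 ** ((year - base_year) / doubling_years)


if __name__ == "__main__":
    for year in (2011, 2013, 2015):
        print(year, round(projected_zettabytes(year), 1), "ZB")
```

Under these assumptions the volume quadruples between 2011 and 2015; real growth will not follow a clean exponential.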

Hadoop Server Hardware Evolving in the Enterprise

Typical 2009 Hadoop node

• 1RU server

• 4 x 1TB 3.5” spindles

• 2 x 4-core CPU

• 1 x GE

• 24 GB RAM

• Single PSU

• Running Apache

• $

Economics favor “fat” nodes

• 6x-9x more data/node

• 3x-6x more IOPS/node

• Saturated gigabit, 10GE on the rise

• Fewer total nodes lowers licensing/support costs

• Increased significance of node and switch failure

Typical 2015 Hadoop node

• 2RU server

• 12 x 4TB 3.5” or 24 x 1TB 2.5” spindles

• 2 x 6-12 core CPU

• 2 x 10GE

• 128-256 GB RAM

• Dual PSU

• Running commercial/licensed distribution

• $$$

Hadoop Server Trends

Source: HadoopWorld session, Cloudera speaker

• Fat, dense nodes

• New features and applications (Impala, Drill, HBase, etc.) drive RAM demands

• Argues for somewhat higher-end CPA-like configs to provide a cushion for growth

Meet New Big Data Requirement - Cisco Common Platform Architecture (CPA)

Cisco UCS Common Platform Architecture (CPA) Building Blocks for Big Data

UCS 6200 Series Fabric Interconnects

Nexus 2232 Fabric Extenders (optional)

UCS Manager

UCS C220/C240 M4 Servers

LAN, SAN, Management

New UCS Reference Configurations for Big Data

Quarter-Rack UCS Solution for MPP, NoSQL – High Performance
• 2 x UCS 6248
• 8 x C220 M4 (SFF)
• 2 x E5-2680v3
• 256GB
• 6 x 400-GB SAS SSD

Full Rack UCS Solution for Hadoop – Capacity-Optimised
• 2 x UCS 6296
• 16 x C240 M4 (LFF)
• 2 x E5-2620v3
• 128GB
• 12 x 4TB 7.2K SATA

Full Rack UCS Solution for Hadoop, NoSQL – Balanced
• 2 x UCS 6296
• 16 x C240 M4 (SFF)
• 2 x E5-2680v3
• 256GB
• 24 x 1.2TB 10K SAS
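To compare the three reference configurations, a rough capacity estimate helps. The sketch below assumes that the drive counts listed are per server, that every listed drive holds data, and that usable capacity is raw capacity divided by the HDFS default replication factor of 3. It ignores OS, temp space and compression, and the replication figure applies only loosely to the MPP/NoSQL configuration. The class and field names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RackConfig:
    name: str
    servers: int
    drives_per_server: int  # assumption: the slide's drive count is per server
    drive_tb: float

    def raw_tb(self) -> float:
        """Raw disk capacity across all servers, in TB."""
        return self.servers * self.drives_per_server * self.drive_tb

    def usable_tb(self, replication: int = 3) -> float:
        """Capacity left after N-way replication (3 is the HDFS default)."""
        return self.raw_tb() / replication


CONFIGS = [
    RackConfig("High Performance (quarter rack)", 8, 6, 0.4),
    RackConfig("Capacity-Optimised (full rack)", 16, 12, 4.0),
    RackConfig("Balanced (full rack)", 16, 24, 1.2),
]

if __name__ == "__main__":
    for cfg in CONFIGS:
        print(f"{cfg.name}: raw {cfg.raw_tb():.1f} TB, "
              f"usable at 3x replication {cfg.usable_tb():.1f} TB")
```

On these assumptions the capacity-optimised rack holds 768 TB raw, and the balanced rack trades roughly 40% of that for more spindles and faster drives.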

UCS C3160 Dense Storage Rack Server

• Up to 360TB in 4RU
• Server node: 2 x E5-2600 v2 CPUs, 128/256GB RAM, 1GB/4GB RAID cache
• HDD: 4 rows of hot-swappable 4TB/6TB HDDs; total top load: 56 drives
• Optional disk expansion: 4 x hot-swappable, rear-load LFF 4TB/6TB HDDs
• Two 120GB SSDs (OS/boot)

Big Data Applications Co-exist with Enterprise Applications

Enterprise Data Management with Big Data

[Diagram labels: Machine, Web, Operational (OLTP), ETL, Big Data (Hadoop, etc.), MPP, EDW, BI/Reports, Dashboards]

Big Data Applications co-exist with Enterprise Applications

[Diagram labels: Data Center Applications (Cisco UCS B-Series Blade Servers, SAN Array, FlexPod, Vblock); Big Data Applications (Cisco Big Data Common Platform Architecture using C-Series Rack-Mount Servers; Hadoop, NoSQL, MPP DB); Unified Fabric; Unified Management; Integrated Data Management; Data Integration Using Connectors; Data Feeds]

Ability to manage and monitor enterprise applications running on blades with SAN storage and big data applications running on rack-mount servers from a single pane of glass


Cisco UCS: Physical Architecture: Rack-Mount Server as a Form Factor Extension of Blades

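[Diagram: two clustered UCS 6200 fabric switches (Fabric A and Fabric B) with uplink ports to SAN A, SAN B, ETH 1 and ETH 2, plus out-of-band management (MGMT) ports. Server ports connect to the fabric extenders (FEX A, FEX B) of Chassis 1, which holds half- or full-width B200 compute blades with virtualized adapters (VIC). A rack-mount C240 with a VIC attaches to the same fabrics; FEX A and FEX B are optional here, for scalability.]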

Cisco Virtual Interface Card (VIC)

• Converged Network Adapter (PCIe x16, 10GbE/FCoE)
• FCoE in hardware
• Bare metal and VM deployments
• Virtualize in hardware
• PCIe compliant
• vNIC Fabric Failover
• Up to 256 distinct PCIe devices
• Ethernet vNIC and FC vHBA
• QoS
  – 8 queues
  – vNIC bandwidth guarantees

[Diagram: user-definable vNICs numbered 0 to 256, a mix of Eth and FC devices]

© 2012 Cisco and/or its affiliates. All rights reserved. 18

Cisco UCS Combines Enterprise and Big Data Platform into One – Direct SAN Access

[Diagram labels: UCS Manager (Deploy, Manage, Monitor); UCS Rack-Mount Servers; UCS Blade Servers; Cisco Tidal Enterprise Scheduler; Hadoop Connectors; Big Data Ecosystem; Enterprise Applications; SAN Arrays (Availability, Backup, Snapshot); five Hadoop nodes and five SQL nodes]

Extendable to Multidata Center Implementations for Disaster Recovery and Business Continuity

http://blog.cloudera.com/blog/2015/01/how-to-deploy-apache-hadoop-clusters-like-a-boss/

UCS Fabric Failover

• Fabric provides NIC failover capabilities chosen when defining a service profile

• Avoids traditional NIC bonding in the OS

• Provides failover for both unicast and multicast traffic

• Ideal for bare metal OS deployments

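[Diagram: on a Cisco VIC 1225, vNIC 1 (seen by the OS / hypervisor / VM) maps to vEth 1 on both 6200-A and 6200-B over 10GE links through the FEXes; the two fabric interconnects are joined by L1/L2 links. Legend: physical adapter, virtual adapter, physical cable, virtual cable.]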


UCS Management Administrative Parity for Blades and Rack Servers

• A major market transformation in unified server management
• No management barriers between blades and rack optimized servers
• Extending fabric computing to rack optimized servers
• Add capacity without complexity

[Diagram labels: Cisco UCS Fabric Interconnect; Cisco Fabric Extender; B-Series Blade Servers; C-Series Rack Optimized Servers; Unified Management; A Single Unified System]

UCS Stateless Computing, Benefits

• Server identity no longer has to be tied to physical server hardware
  – Profiles provide identity
  – Seamless server mobility
  – Stateless components
• Boot over network (LAN or SAN)
  – Boot order and boot devices are part of the pre-defined logical server profile
  – On-board disks can be used for temp, swap, etc.
• LAN and SAN connectivity
  – # of NICs
  – # of HBAs

[Diagram: the service profile below moves from Chassis-1/Blade-1 to Chassis-9/Blade-5, both attached to the same SAN and LAN]

Server Name: Bob
UUID: 56 4d cd 3f 59 5b 61…
MAC: 08:00:69:02:01:FC, 08:00:69:02:01:FD, 08:00:69:02:01:FE, 08:00:69:02:01:FF
WWN: 5080020000075740
Boot Order: SAN, LAN

No infrastructure changes needed when moving a Service Profile
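The idea that identity lives in the profile rather than in the hardware can be sketched in a few lines. This is a toy model, not UCS Manager's implementation: `ServiceProfile` and `Domain` are hypothetical names, and the values are the ones from the slide (the UUID is left truncated as shown there).

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class ServiceProfile:
    """Logical server identity: everything the SAN/LAN sees about 'Bob'."""
    name: str
    uuid: str
    macs: Tuple[str, ...]
    wwn: str
    boot_order: Tuple[str, ...]


class Domain:
    """Hypothetical model mapping physical slots to associated profiles."""

    def __init__(self, slots):
        self.slots: Dict[str, Optional[ServiceProfile]] = {s: None for s in slots}

    def where_is(self, profile: ServiceProfile) -> Optional[str]:
        for slot, current in self.slots.items():
            if current == profile:
                return slot
        return None

    def associate(self, profile: ServiceProfile, slot: str) -> None:
        """Bind the profile to a slot, releasing any slot it held before."""
        if self.slots[slot] is not None:
            raise ValueError(f"{slot} already has a profile")
        old = self.where_is(profile)
        if old is not None:
            self.slots[old] = None
        self.slots[slot] = profile


bob = ServiceProfile(
    name="Bob",
    uuid="56 4d cd 3f 59 5b 61…",  # truncated in the source slide
    macs=("08:00:69:02:01:FC",),
    wwn="5080020000075740",
    boot_order=("SAN", "LAN"),
)

domain = Domain(["Chassis-1/Blade-1", "Chassis-9/Blade-5"])
domain.associate(bob, "Chassis-1/Blade-1")
domain.associate(bob, "Chassis-9/Blade-5")  # move: identity is unchanged
```

Because the MAC, WWN and boot order travel with the profile, SAN zoning and LAN configuration keyed on those values need no change after the move.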


Cisco UCS Combines Enterprise and Big Data Platform into One – Compute Resource Pooling

[Diagram labels: UCS Manager (Deploy, Manage, Monitor); UCS Rack-Mount Servers; UCS Blade Servers; Cisco Tidal Enterprise Scheduler; Hadoop Connectors; Big Data Ecosystem; Enterprise Applications; SAN Arrays (Availability, Backup, Snapshot); five Hadoop nodes and seven SQL nodes]

Extendable to Multidata Center Implementations for Disaster Recovery and Business Continuity

Agile Big Data Application Integration with Programmable UCS

UCS Manager Integration with Big Data Applications

UCS Manager Integration with Cloudera Manager
– Server Infrastructure Manager – UCS Manager
– Big Data Application Manager – Cloudera Manager

Programmatic Infrastructure

Comprehensive XML API
• Single point of management – access to all domain knowledge
• Broad 3rd party integration support
• Faster custom integration for customer use cases
• Consistent data and views across ALL interfaces

[Diagram: the XML API exposes system status, physical inventory and logical inventory to the UCS CLI, the UCS GUI, direct API clients, and 3rd party and customer tools (self-serve portals, management tools, auditing tools)]
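As a rough illustration of what "XML API" means in practice, the sketch below builds two request documents with the standard library only. It assumes the `aaaLogin` and `configResolveClass` methods and the `outCookie` response attribute as described in the UCS Manager XML API programmer's guide; requests are normally POSTed to the manager's `/nuova` endpoint, which this sketch does not do. The helper names and the sample response are hand-written for illustration, not captured from a real system.

```python
import xml.etree.ElementTree as ET


def login_request(user: str, password: str) -> str:
    """Build the XML body for an aaaLogin call."""
    elem = ET.Element("aaaLogin", inName=user, inPassword=password)
    return ET.tostring(elem, encoding="unicode")


def extract_cookie(login_response: str) -> str:
    """Pull the session cookie out of an aaaLogin response."""
    cookie = ET.fromstring(login_response).get("outCookie")
    if not cookie:
        raise ValueError("login response carries no outCookie")
    return cookie


def resolve_class_request(cookie: str, class_id: str) -> str:
    """Build a configResolveClass query for all objects of one class."""
    elem = ET.Element(
        "configResolveClass",
        cookie=cookie,
        inHierarchical="false",
        classId=class_id,
    )
    return ET.tostring(elem, encoding="unicode")


if __name__ == "__main__":
    print(login_request("admin", "secret"))
    # Hand-written sample response, for illustration only.
    sample = '<aaaLogin cookie="" response="yes" outCookie="abc123" />'
    print(resolve_class_request(extract_cookie(sample), "computeBlade"))
```

The CLI, the GUI and third-party tools all go through the same API, which is why the slide can claim consistent data and views across all interfaces.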

Cisco Developer Network (DevNet)

Developer Community: developer.cisco.com/web/unifiedcomputing/home
• Cisco UCS Platform Emulator (UCSPE)
• Cisco UCS PowerTool PowerShell Library
• Demo: Cisco UCS PowerTool

Downloads
• UCS Platform Emulator (UCSPE)
• goUCS Automation Tool
• Cisco UCS PowerTool (PowerShell Module)
• XML API, PowerShell code examples
• Microsoft SCOM Management Pack for Cisco UCS
• Microsoft SCVMM UI Extension for Cisco UCS
• Microsoft SCO Integration Pack for Cisco UCS

Documentation
• Developer Guides
• Whitepapers
• Reference Guides

Collaboration
• Blogs, videos and access to subject matter experts
• Peer to peer forums

UCS Platform Emulator (UCSPE)

Hardware Independent Integration
– Downloadable virtual machine
– Full-featured emulator for UCS Manager
– Complete support for XML API calls
– Object Browser to navigate the UCSM MIT
– Import and replicate the physical inventory of a UCS Manager domain
– Share physical inventories among UCS Platform Emulators
– Drag-and-drop hardware builder to create custom physical inventory

UCS Python SDK

Hardware Independent Integration

The Cisco UCS Python SDK is a Python module that helps automate all aspects of Cisco UCS management, including server, network, storage and hypervisor management.

The bulk of the Cisco UCS Python SDK works on the UCS Manager's Management Information Tree (MIT), performing create, modify or delete actions on the Managed Objects (MOs) in the tree.

All the physical and logical components that comprise Cisco UCS are represented in a hierarchical Management Information Model (MIM), referred to as the Management Information Tree (MIT). Each node in the tree represents a Managed Object (MO), uniquely identified by its Distinguished Name (DN).
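The tree-and-DN idea can be sketched without the SDK. The model below is a plain-Python illustration, not the SDK's own classes: `ManagedObject`, `find` and `delete` are hypothetical names, and the DN `sys/chassis-1/blade-1` is used only as an example of the slash-separated naming style.

```python
class ManagedObject:
    """A node in a toy Management Information Tree."""

    def __init__(self, rn, parent=None, **props):
        self.rn = rn            # relative name, unique among siblings
        self.parent = parent
        self.props = props      # arbitrary attributes of the object
        self.children = {}
        if parent is not None:
            parent.children[rn] = self   # "create" = attach under a parent

    @property
    def dn(self):
        """Distinguished Name: the path of relative names from the root."""
        if self.parent is None:
            return self.rn
        return f"{self.parent.dn}/{self.rn}"


def find(root, dn):
    """Resolve a DN to its managed object, or None if it does not exist."""
    parts = dn.split("/")
    if parts[0] != root.rn:
        return None
    node = root
    for rn in parts[1:]:
        node = node.children.get(rn)
        if node is None:
            return None
    return node


def delete(mo):
    """Detach a managed object (and its subtree) from the tree."""
    if mo.parent is not None:
        del mo.parent.children[mo.rn]
        mo.parent = None


sys_root = ManagedObject("sys")
chassis = ManagedObject("chassis-1", parent=sys_root)
blade = ManagedObject("blade-1", parent=chassis, model="B200")

found = find(sys_root, "sys/chassis-1/blade-1")
found.props["usr_lbl"] = "hadoop-node-01"   # "modify" = change a property
```

Create, modify and delete on this tree mirror the three kinds of action the SDK performs on managed objects; the DN is what lets a script address one object unambiguously.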

Hadoop Cluster Deployment Automation

UCS Director Express for Big Data

Unified Management Platform for UCS Hadoop Cluster
– Wire once, deploy Hadoop anytime
– Zero Touch Deployment of UCS and Hadoop Infrastructure
– Integrated Topology view of Hadoop Nodes and underlying Compute/Network/Storage
– Simplified and Integrated Management with reduced TCO
– Enables advanced diagnostics and monitoring

[Diagram: UCSD Express for Big Data provides unified management across Hadoop Manager and UCS Manager]

Unified Management with UCSD Express for Big Data: Programmability, Scalability and Automation

[Diagram labels: UCSD Express (OS Profile, Cisco UCS Template, Hadoop); UCS Manager; UCS 6200 Series Fabric Interconnect; UCS C240 M4 Series Rack Server; UCS C3160 Rack Server]

Big Data Performance Enhancement with Cisco ACI

Application centric infrastructure (ACI)

ACI - a Holistic Architecture Enabling Rapid Deployment of Applications onto Networks with Scale, Security and Full Visibility

[Diagram labels: ACI; Application Centric Policy Controller; Nexus 9000 Fabric]

ACI Innovation Driving Application Performance

Case Study – Big Data Analytics
• Based on common network load and link failure scenarios
• Network innovations: Congestion Management, Dynamic Load Balancing, Dynamic Packet Prioritization
• 30% reduction in application completion time

[Chart: application completion time in seconds (axis 100 to 300), ACI vs. traditional network, with network utilization labels of 60%, 60% and 90%]

Summary

• New Big Data challenges require rethinking enterprise-scale infrastructure

• Leverage UCS and Nexus 9000/ACI to integrate big data into your data center operations

Thank you
