Non-Stop Hadoop Enterprise Ready Hadoop Presentation for Big Data Meetup October 8, 2014

SD Big Data Monthly Meetup #4 - Session 2 - WANDisco

DESCRIPTION

How To Achieve Non-Stop Hadoop

Page 1: SD Big Data Monthly Meetup #4 - Session 2 - WANDisco

Non-Stop Hadoop
Enterprise Ready Hadoop
Presentation for Big Data Meetup, October 8, 2014

Page 2: SD Big Data Monthly Meetup #4 - Session 2 - WANDisco

2   WWW.WANDISCO.COM   REALIZING THE POSSIBILITIES OF BIG DATA

WANdisco Background

•  WANdisco: Wide Area Network Distributed Computing
   –  Enterprise ready, high availability software solutions that enable globally distributed organizations to meet today’s data challenges of secure storage, scalability and availability
•  Leader in tools for software engineers – Subversion
   –  Apache Software Foundation sponsor
•  Highly successful IPO, London Stock Exchange, June 2012 (LSE:WAND)
•  US patent for active-active replication technology granted, November 2012
•  Global locations
   –  San Ramon (CA)
   –  Chengdu (China)
   –  Tokyo (Japan)
   –  Boston (MA)
   –  Sheffield (UK)
   –  Belfast (UK)

Page 3: SD Big Data Monthly Meetup #4 - Session 2 - WANDisco

Customers

Page 4: SD Big Data Monthly Meetup #4 - Session 2 - WANDisco

Non-Stop Hadoop

Non-Intrusive Plugin

Provides Continuous Availability In the LAN / Across the WAN

Active/Active

Page 5: SD Big Data Monthly Meetup #4 - Session 2 - WANDisco

3 Key Problems For Multi Cluster Hadoop LAN / WAN

Page 6: SD Big Data Monthly Meetup #4 - Session 2 - WANDisco

Enterprise Ready Hadoop: Characteristics of Mission Critical Applications

•  Require 100% Uptime of Hadoop
   –  SLAs, Regulatory Compliance
•  Require HDFS to be Deployed Globally
   –  Share Data Between Data Centers
   –  Data is Consistent, Not Eventually Consistent
•  Ease Administrative Burden
   –  Reduce Operational Complexity
   –  Simplify Disaster Recovery
   –  Lower RTO/RPO
•  Allow Maximum Utilization of Resources
   –  Within the Data Center
   –  Across Data Centers

Page 7: SD Big Data Monthly Meetup #4 - Session 2 - WANDisco

Breaking Away from Active/Passive: What’s in a NameNode

Single Standby
•  Inefficient utilization of resource
   –  Journal Nodes
   –  ZooKeeper Nodes
   –  Standby Node
•  Performance Bottleneck
•  Still tied to the beeper
•  Limited to LAN scope

Active / Active
•  All resources utilized
   –  Only NameNode configuration
   –  Scale as the cluster grows
   –  All NameNodes active
•  Load balancing
•  Set resiliency (# of active NN)
•  Global Consistency

Page 8: SD Big Data Monthly Meetup #4 - Session 2 - WANDisco

Breaking Away from Active/Passive: What’s in a Data Center

Standby Datacenter
•  Idle Resource
   –  Single Data Center Ingest
   –  Disaster Recovery Only
•  One way synchronization
   –  DistCp
•  Error Prone
   –  Clusters can diverge over time
•  Difficult to scale > 2 Data Centers
   –  Complexity of sharing data increases

Active / Active
•  DR Resource Available
   –  Ingest at all Data Centers
   –  Run Jobs in both Data Centers
•  Replication is Multi-Directional
   –  active/active
•  Absolute Consistency
   –  Single HDFS spans locations
•  ‘N’ Data Center support
   –  Global HDFS allows appropriate data to be shared

Page 9: SD Big Data Monthly Meetup #4 - Session 2 - WANDisco

One Cluster Approach

•  Example Applications
   –  HBASE
   –  RT Query
   –  Map Reduce
•  Poor Resource Management
   –  Data Locality Issues
   –  Network Use
   –  Complex

Multiple Clusters

Page 10: SD Big Data Monthly Meetup #4 - Session 2 - WANDisco

Creating Multiple Clusters

•  Example Applications
   –  HBASE
   –  RT Query
   –  Map Reduce
•  Need to share data between clusters
   –  DistCp / Stale Data
   –  Inefficient use of storage and/or network
   –  Some clusters may not be available

Multiple Clusters

Page 11: SD Big Data Monthly Meetup #4 - Session 2 - WANDisco

Cluster Zones Zoning for Optimal Efficiency

[Figure: cluster zones diagram; label: 100% HDFS Consistency]

Page 12: SD Big Data Monthly Meetup #4 - Session 2 - WANDisco

Multi Datacenter Hadoop Disaster Recovery

WAN REPLICATION

Absolute Consistency
Maximum Resource Use
Lower Recovery Time/Point
Replicate Only What You Want
Better Utilization of Power/Cooling
Lower TCO
LAN Speed Performance

Page 13: SD Big Data Monthly Meetup #4 - Session 2 - WANDisco

Technical Overview Hadoop Powered by WANdisco

Page 14: SD Big Data Monthly Meetup #4 - Session 2 - WANDisco

Multi Data Center Hadoop Today: What's wrong with the status quo

Periodic Synchronization: DistCp

Parallel Data Ingest: Load Balancer, Streaming

Page 15: SD Big Data Monthly Meetup #4 - Session 2 - WANDisco

Multi Data Center Hadoop Today: Hacks currently in use

Periodic Synchronization: DistCp

•  Runs as a MapReduce job
•  DR Data Center is read only
•  Over time, Hadoop clusters become inconsistent
•  Manual and labor intensive process to reconcile differences
•  Inefficient use of the network
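The staleness that comes with periodic synchronization can be illustrated with a small calculation. This is only a sketch under simplifying assumptions: copy jobs are assumed to start at fixed multiples of a hypothetical `sync_interval` and to complete instantly, and the function name and time units are illustrative, not part of DistCp.

```python
def periodic_sync_staleness(write_times, sync_interval):
    """Return, for each write on the primary cluster, how long it waits
    before the next scheduled one-way copy could carry it to the DR site.

    Assumption: copies start at 0, sync_interval, 2 * sync_interval, ...
    and finish instantly (a real copy job takes additional time).
    All values are integers in the same unit, e.g. minutes.
    """
    if sync_interval <= 0:
        raise ValueError("sync_interval must be positive")
    delays = []
    for t in write_times:
        # Round t up to the next multiple of sync_interval (integer ceiling).
        next_sync = -(-t // sync_interval) * sync_interval
        delays.append(next_sync - t)
    return delays
```

With an hourly schedule, a write made one minute after a run waits 59 minutes before it can reach the DR site, which is the window in which the DR copy is stale.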

Page 16: SD Big Data Monthly Meetup #4 - Session 2 - WANDisco

Multi Data Center Hadoop Today: Hacks currently in use

Parallel Data Ingest: Load Balancer, Flume

•  Hiccups in either Hadoop cluster cause the two file systems to diverge
•  Potential to run out of buffer when WAN is down
•  Requires constant attention and sys-admin hours to keep running
•  Data created on the cluster is not replicated
•  Streaming technologies (like Flume) used for data redirection only cover streaming data
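The buffer concern in the list above comes down to simple arithmetic: while the WAN is down, incoming data accumulates at the ingest rate until the buffer is full. A rough sketch follows; the function name and the example figures are illustrative assumptions, not measurements from the talk.

```python
def seconds_until_buffer_full(buffer_bytes, ingest_bytes_per_sec):
    """Estimate how long a WAN outage can last before a fixed-size
    ingest buffer overflows, assuming a constant ingest rate and that
    nothing drains from the buffer during the outage."""
    if ingest_bytes_per_sec <= 0:
        raise ValueError("ingest rate must be positive")
    return buffer_bytes / ingest_bytes_per_sec


# Hypothetical figures: a 10 GiB buffer filling at 50 MiB/s.
outage_budget = seconds_until_buffer_full(10 * 1024**3, 50 * 1024**2)
```

Under these assumed figures the buffer absorbs a little over three minutes of outage, so any longer WAN interruption would need operator intervention or would lose data.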

Page 17: SD Big Data Monthly Meetup #4 - Session 2 - WANDisco

DConE: Distributed Coordination Engine

•  WANdisco’s patented WAN capable Paxos implementation
   –  Mathematically proven
   –  Provides distributed co-ordination of file system metadata
      •  Active/Active (All locations)
      •  Create, Modify, Delete
      •  Shared nothing (No Leader)
•  No restrictions on distance between datacenters
   –  US Patent granted for time independent implementation of Paxos
•  Not based on SAN block device synchronization such as EMC SRDF
   –  SAN block replication has distance limits resulting from the inability of file systems such as NTFS and ext4 to tolerate long RTTs to block storage
   –  Possible distribution of corrupted blocks

PAXOS

Paxos is a family of protocols for solving consensus in a network of unreliable processors.

Consensus is the process of agreeing on one result among a group of participants.

This problem becomes difficult when the participants or their communication medium may experience failures.

Page 18: SD Big Data Monthly Meetup #4 - Session 2 - WANDisco

How DConE Works: WANdisco Active/Active Replication

•  Majority Quorum
   –  A fixed number of participants
   –  The majority must agree for change
•  Failure
   –  Failed nodes are unavailable
   –  Normal operation continues on nodes with quorum
•  Recovery / Self Healing
   –  Nodes that rejoin stay in safe mode until they are caught up
•  Disaster Recovery
   –  A complete loss can be brought back from another replica
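The majority-quorum rule on this slide can be written out as arithmetic. The sketch below is generic majority-quorum math with illustrative function names; it is not taken from DConE itself.

```python
def quorum_size(participants):
    """Smallest number of nodes that forms a strict majority."""
    if participants < 1:
        raise ValueError("need at least one participant")
    return participants // 2 + 1


def has_quorum(participants, reachable):
    """True if the reachable nodes can still agree on changes."""
    return reachable >= quorum_size(participants)


def tolerated_failures(participants):
    """How many nodes may fail while the rest keep operating."""
    return participants - quorum_size(participants)
```

For example, with five participants a majority is three, so two nodes can be lost before the remaining nodes stop accepting changes. An even count adds no extra tolerance: four participants still tolerate only one failure.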

[Diagram: three nodes, A, B and C, each with the same transaction sequence (TX id: 168 through 173); proposals 170 through 173 originate at different nodes and are answered with Agree messages from the others.]
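The identical transaction sequences shown for each node can be illustrated with a toy replica that applies agreed transactions strictly in transaction-id order. This is a sketch only: it assumes agreement on each id has already been reached (the Paxos exchange itself is omitted), and the `Replica` class and its methods are hypothetical names, not WANdisco's API.

```python
class Replica:
    """Toy replica that applies agreed transactions in id order."""

    def __init__(self, next_tx):
        self.next_tx = next_tx   # id of the next transaction to apply
        self.pending = {}        # agreed transactions waiting on a gap
        self.applied = []        # (tx_id, operation) in applied order

    def deliver(self, tx_id, operation):
        """Accept an agreed transaction; apply it once all earlier ids
        have been applied, otherwise hold it back."""
        self.pending[tx_id] = operation
        while self.next_tx in self.pending:
            op = self.pending.pop(self.next_tx)
            self.applied.append((self.next_tx, op))
            self.next_tx += 1


# The same agreed transactions reach two replicas in different orders.
agreed = {170: "create /a", 171: "modify /a", 172: "create /b", 173: "delete /a"}
node_a, node_b = Replica(170), Replica(170)
for tx in (170, 171, 172, 173):
    node_a.deliver(tx, agreed[tx])
for tx in (172, 170, 173, 171):
    node_b.deliver(tx, agreed[tx])
```

Both replicas end with the same applied sequence even though the messages arrived in different orders, which is the property the diagram depicts.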

Page 19: SD Big Data Monthly Meetup #4 - Session 2 - WANDisco

Architecture of a Non-Stop Hadoop

Page 20: SD Big Data Monthly Meetup #4 - Session 2 - WANDisco

Use Cases

•  Eliminate The Performance Bottleneck of a Single Active NameNode
•  Multi Data-Center Ingest
   –  Information doesn't need to be sent to one DC and then copied back to the other using DistCp
   –  Parallel ingest methods don’t require redirected data streams
   –  Ingest data at, or close to, the source
   –  Global Analysis (Logs, Click Streams, etc.)
•  Cluster Zones
   –  Efficient use of resource based on application profile
   –  HBASE, IMPALA, Storm, Map Reduce, SPARK, etc.
   –  Heterogeneous Clusters Supported
•  Maximize Data Center Resource Utilization
   –  All datacenters can be used to run different jobs concurrently
•  Disaster Recovery
   –  Data is as current as possible (no periodic syncs)
   –  Virtually zero downtime to recover from regional data center failure
   –  Regulatory compliance

Page 21: SD Big Data Monthly Meetup #4 - Session 2 - WANDisco

Use Case: Heterogeneous Hardware

•  Optimized hardware profiles for job specific tasks
   –  Batch
   –  Real-time
   –  NoSQL (HBASE)
•  Set replication factors per sub-cluster
•  Use at LAN or WAN scope
•  Resilient to NameNode failures

Page 22: SD Big Data Monthly Meetup #4 - Session 2 - WANDisco

Use Case: Sub-Clusters

•  Maximize Resource Utilization
   –  No idle standby
•  Isolate Dev and Test Clusters
   –  Share data not resource
•  Carve off hardware for a specific group
   –  Prevents a bad map/reduce job from bringing down the cluster
•  Guarantee Consistency and availability of data
   –  Data is instantly available

Page 23: SD Big Data Monthly Meetup #4 - Session 2 - WANDisco

Non-Stop Hadoop Demonstration

Page 24: SD Big Data Monthly Meetup #4 - Session 2 - WANDisco

Q & A

Question and Answer Feel free to submit your questions

Page 25: SD Big Data Monthly Meetup #4 - Session 2 - WANDisco

Thank you