Extreme Performance: Fast Virtualized Hadoop and Spark on All-Flash Disks

VIRT1445BU

Dave Jaffe, Performance Engineering, VMware
Justin Murray, Technical Marketing, VMware

#VMworld #VIRT1445BU

VMworld 2017 Content: Not for publication or distribution


Disclaimer

• This presentation may contain product features that are currently under development.

• This overview of new technology represents no commitment from VMware to deliver these features in any generally available product.

• Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.

• Technical feasibility and market demand will affect final delivery.

• Pricing and packaging for any new technologies or features discussed or presented have not been determined.


Agenda


1 Speaker Introductions

2 Review of Big Data Architecture

3 Introduction to the Performance Area

4 Test Configurations

5 Workloads

6 Performance Results

7 Best Practices

8 Tuning

9 Overview of Machine Learning

10 Conclusions


Our Roles

• Dave is an engineer on the performance team at VMware, focusing on Big Data.

• Justin is in the Technical Marketing area at VMware, where he provides technical information to partners and customers who are deploying big data systems on vSphere.


Why the Customer Interest in Big Data?

• Want to get off existing costly data platforms

• Older data warehouse technology is not serving our needs

• Want to do queries and analytics against many different forms of data (structured, unstructured, streaming)

• Provide data access to our customers

• Integrate systems that have been islands till now

– Single source of truth for the enterprise

• Exploit new application architectures for developer productivity

• Want to do data science, machine learning, deep learning


The Existing Hadoop Architecture

[Diagram: a Client submits a job to the ResourceManager (master scheduler); the NameNode holds the master file system index. Worker Nodes 1, 2 and 3 each run a NodeManager and a DataNode, store HDFS Blocks 1, 2 and 3, and host AppMaster - 1, Container - 2 and Container - 3.]


High Level View of Apache Spark


The Spark Architecture – Standalone

[Diagram: a Driver runs a Job on Executors spread across Worker Nodes 1, 2 and 3; six Executors are shown, each in its own JVM.]


Spark – Implemented on YARN

[Diagram: a Job is submitted to the ResourceManager; the NameNode manages HDFS Blocks 1, 2 and 3. Worker Nodes 1, 2 and 3 each run a NodeManager and a DataNode. The Driver is shown with AppMaster - 1, and the Executors with Container - 2 and Container - 3.]


Introduction

• Previous VMware tests running MapReduce v1 apps show virtualized Hadoop performance at parity or faster than native

• Last year: saw same conclusion using newer Spark and MapReduce v2 applications running on YARN, in a highly available cluster typical of real world customer configurations


Introduction

• The tests to be described in this talk updated the previous studies with

– Better hardware

• 13 servers with faster processors, more cores, larger memory

– All flash disks

– New Spark Machine Learning Library applications

– Additional virtualized configurations

• 1, 2 and 4 VMs per host

• New white paper available: https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/techpaper/performance/bigdata-vsphere65-perf.pdf


Test Configurations


Big Data Cluster

[Diagram: 13 Hewlett Packard Enterprise DL380 Gen 9 servers connected to a 1 GbE Ethernet switch and a 10 GbE Ethernet switch.]

Each server: 2x Intel Xeon E5-2683 v4 CPUs @ 2.10 GHz, 16 cores each; 512 GB memory; 2x 1.2 TB HDD; 4x 800 GB NVMe; 12x 800 GB SSD


Server Configuration

Component | Quantity/Type
Server | HPE DL380 Gen 9
Processor | 2x Intel Xeon CPU E5-2683 v4 @ 2.10 GHz w/16 cores each
Logical Processors (incl. hyperthreads) | 64
Memory | 512 GiB (16x 32 GiB DIMMs)
NICs | 2x 1 GbE ports + 4x 10 GbE ports
Hard Disk Drives | 2x 1.2 TB 12G SAS 10K 2.5in HDD – RAID 1 for OS
Non-Volatile Memory Express storage | 4x 800 GB NVMe PCIe – NodeManager traffic
Solid State Disks | 12x 800 GB 12G SAS SSD – DataNode traffic
RAID Controller | HPE Smart Array P840ar/2G Controller
Remote Access | HPE iLO Advanced


Flash Disks

• Non-Volatile Memory Express storage

– Low latency solid state disk storage

– Attaches directly to PCI bus

– 4 per server

– Used for NodeManager traffic (high R/W I/O)

• Solid State Disks

– Low latency storage

– Controlled by HPE Smart Array RAID controllers

– 12 per server

– Used for DataNode traffic (large sequential R/W)


Virtualized Cluster


Bare Metal Cluster


Worker Node Configuration – Bare Metal vs. Virtualized

Component | Bare Metal | 1 VM Per Host | 2 VMs Per Host | 4 VMs Per Host
Virtual CPUs | 64 | 64 | 32 | 16
Memory | 512 GiB | 480 GiB | 240 GiB | 120 GiB
Container Memory | 448 GiB | 432 GiB | 208 GiB | 104 GiB
Container vcores | 64 | 64 | 32 | 16
NodeManager Drives | 4x 740 GB NVMe | 4x 740 GB NVMe | 2x 740 GB NVMe | 1x 740 GB NVMe
DataNode Drives | 12x 741 GB SSD | 12x 741 GB SSD | 6x 741 GB SSD | 3x 741 GB SSD


Total Cluster YARN Resources

Component | Bare Metal | 1 VM Per Host | 2 VMs Per Host | 4 VMs Per Host
YARN container memory per VM or bare metal server | 448 GiB | 432 GiB | 208 GiB | 104 GiB
YARN container vcores per VM or bare metal server | 64 | 64 | 32 | 16
Number of VMs or servers per cluster | 10 | 10 | 20 | 40
YARN container memory per cluster | 4480 GiB | 4320 GiB | 4160 GiB | 4160 GiB
YARN container vcores per cluster | 640 | 640 | 640 | 640
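The per-cluster rows follow from multiplying the per-node values by the number of worker nodes. A minimal Python sketch of that arithmetic, using only the numbers in the table (the dictionary layout and the function name `cluster_totals` are illustrative, not from the talk):

```python
# Per-node YARN settings copied from the table (memory in GiB).
CONFIGS = {
    "Bare Metal":     {"nodes": 10, "container_mem_gib": 448, "vcores": 64},
    "1 VM Per Host":  {"nodes": 10, "container_mem_gib": 432, "vcores": 64},
    "2 VMs Per Host": {"nodes": 20, "container_mem_gib": 208, "vcores": 32},
    "4 VMs Per Host": {"nodes": 40, "container_mem_gib": 104, "vcores": 16},
}


def cluster_totals(cfg):
    """Return (total container memory in GiB, total vcores) for one configuration."""
    return cfg["nodes"] * cfg["container_mem_gib"], cfg["nodes"] * cfg["vcores"]


if __name__ == "__main__":
    for name, cfg in CONFIGS.items():
        mem_gib, vcores = cluster_totals(cfg)
        print(f"{name}: {mem_gib} GiB container memory, {vcores} vcores")
```

Every configuration exposes the same 640 vcores; only the total container memory differs, with bare metal having the most.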


Hadoop/Spark Role Assignments

Node | Roles
Gateway | Cloudera Manager, ZooKeeper Server, HDFS JournalNode, HDFS gateway, YARN gateway, Hive gateway, Spark gateway
Master1 | HDFS NameNode (Active), YARN ResourceManager (Standby), ZooKeeper Server, HDFS JournalNode, HDFS Balancer, HDFS FailoverController, HDFS HttpFS, HDFS NFS gateway
Master2 | HDFS NameNode (Standby), YARN ResourceManager (Active), ZooKeeper Server, HDFS JournalNode, HDFS FailoverController, YARN JobHistory Server, Hive Metastore Server, Hive
Workers | HDFS DataNode, YARN NodeManager, Spark Executor


Software Components Used in Test

Component | Version
vSphere | 6.5.0, 4564106
Guest Operating System | CentOS 7.3
Cloudera Distribution of Hadoop | 5.10.0
Cloudera Manager | 5.10.0
Hadoop, HDFS, YARN, MapReduce2 | 2.6.0+cdh5.10.0+2102
Spark | 1.6.0+cdh5.10.0+457
Hive | 1.1.0+cdh5.10.0+859
ZooKeeper | 3.4.5+cdh5.10.0+104
Java | Oracle 1.8.0_111-b14
MySQL | 5.6.35 Community Server


Workloads


Workloads – MapReduce

• TeraSort Suite

– Most popular Hadoop test, supplied with distribution, exercises CPU, memory, disk, network

– TeraGen – generates specified number of 100 byte records – 1, 3, and 10 TB used in tests

– TeraSort – sorts TeraGen output

– TeraValidate – validates TeraSort output is in sorted order

– NOTE: TeraSort in MapReduce2 has changed; results not directly comparable to MapReduce1

• TestDFSIO

– Hadoop Distributed File System (HDFS) stress tool, supplied with distribution

– Generates specified number of files of a specified size

– In these tests, 1000 files of 1 GB, 3 GB, or 10 GB each were created, for total sizes of 1, 3, and 10 TB


Workloads – Spark

• Three standard analytic programs from the Spark MLLib (Machine Learning Library) were driven using spark-perf from Databricks, Inc. (https://github.com/databricks/spark-perf)

– K-means Clustering

• Groups input into a specified number, k, of clusters in a multi-dimensional space

• Used for analytic tasks such as customer segmentation for purposes of ad placement or product recommendations

• Training datasets from 1 to 3 TB tested

– Logistic Regression Classification

• Binary classifier – given an input with, say, 20 features, determine if the input falls in a class or not

• Used in spam filters, credit card fraud detectors

• Training datasets from 1 to 3 TB tested

– Random Forest Decision Trees

• Automates any kind of decision making or classification algorithm

• Runs an ensemble of decision trees in order to reduce the risk of overfitting the training data

• Training datasets from 1 to 3 TB tested
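To make the K-means description concrete, here is a tiny single-machine sketch of the basic algorithm (Lloyd's iteration) in plain Python. It is not the Spark MLlib implementation used in the tests, which is distributed and uses a different initialization; the naive "first k points" seeding and the function name `kmeans` are assumptions made for brevity.

```python
def kmeans(points, k, iterations=10):
    """Lloyd's algorithm on a list of equal-length tuples; returns (centroids, labels)."""
    centroids = [tuple(p) for p in points[:k]]  # naive init: first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid (squared distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels


if __name__ == "__main__":
    data = [(0, 0), (10, 10), (0, 1), (1, 0), (10, 11), (11, 10)]
    centers, assignment = kmeans(data, k=2)
    print(centers, assignment)
```

With the sample data, the points near the origin end up in one cluster and the points near (10, 10) in the other.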


Performance Results


TeraSort Suite Performance - 1, 3 and 10 TB


Results – TeraSort

• Virtualized TeraGen faster than bare metal due to smaller number of disks per DataNode

• Virtualized TeraSort (4 VMs per host) faster than bare metal due to benefits of NUMA (non-uniform memory access) locality, except for 10TB case, where extra memory in bare metal prevails

• Virtualized TeraValidate about same as bare metal (mainly reads)

• Within virtualized platforms 4 VMs per host is fastest, followed by 2, then 1 due to optimum number of disks per DataNode

• Excellent (linear) scaling from 1 to 3 to 10TB


TestDFSIO Performance – 1, 3 and 10 TB


Results – TestDFSIO

• Virtualized TestDFSIO (4 VMs per host) significantly faster than bare metal due to benefits of NUMA locality, smaller number of disks per DataNode

– 47.5 GiB/s maximum cluster disk I/O vs. 28.3 GiB/s for bare metal

• Excellent (linear) scaling from 1 to 3 to 10TB

• Within virtualized platforms 4 VMs per host is fastest, followed by 2, then 1, due to optimum number of disks per DataNode


Spark K-means Performance


Spark Logistic Regression Performance


Spark Random Forest Performance


Results – Spark

• Datasets ran in memory, Spark code was NUMA-aware

• Thus virtualized advantage was minimized but 4 VMs per host was still faster due to faster transfer of data within host than through network

• All workloads showed linear scaling as dataset size increased


Best Practices


Best Practices – Hardware Selection

• Memory, CPU increasingly critical for newer technologies like Spark

– CPU: larger core count is as important as faster clock speed

• Use flash disks appropriately

• Networking – 10GbE crucial, starting to see 25 GbE

• Number of servers determined by size of workload, number of concurrent users


Best Practices – Software Selection

• Hadoop Distribution

– Open source Apache Hadoop is available but most production Hadoop users employ a distribution such as Cloudera, Hortonworks or MapR which provides deployment and management tools, performance monitoring, and support

• Operating System

– Each distribution supports a range of Linux operating systems including RedHat/CentOS 6 and 7, SUSE Linux Enterprise Server 11 and 12, and Ubuntu 12 and 14.

• Java JDK

– 1.7 and 1.8

• Database (for management and Hive Metastore)

– MySQL, PostgreSQL, Oracle

• Check distribution for details


Best Practices – vSphere NUMA Configuration

• NUMA (non-uniform memory access): A processor’s access to its local memory is faster than to memory on other processors

[Diagram: two processors, each with its own cache and local memory.]


Best Practices – vSphere NUMA Configuration

• Create 2 or more VMs on a 2-processor server to optimize NUMA locality

[Diagram: two processors, each with its own cache and local memory, with one VM placed on each processor.]


Best Practices – vSphere Configuration

• Reserve about 5-6% of total server memory for ESXi, use remainder for VMs

• Limit number of disks per DataNode to maximize utilization of each disk – 4 to 6 is a good starting point

• Use "Eager Zeroed Thick" format for virtual machine disks (VMDKs); use ext4 or xfs filesystem in guest OS

• Use VMware paravirtual SCSI (pvscsi) adapter for disk controllers; use all 4 virtual SCSI controllers available in vSphere 6.5

• Use vmxnet3 network driver; configure virtual switches with MTU=9000 for jumbo frames
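The memory guideline can be checked against the tested configurations with a few lines of Python. This is a sketch only: the 32 GiB reservation and the 480/240/120 GiB VM sizes come from elsewhere in this deck, while the function names and the even split of memory across VMs are illustrative assumptions.

```python
def esxi_reservation_gib(server_mem_gib, fraction):
    """Memory to leave for ESXi when reserving the given fraction of server memory."""
    return server_mem_gib * fraction


def vm_memory_gib(server_mem_gib, reserved_gib, vms_per_host):
    """Memory per VM when the remainder is split evenly across the VMs on a host."""
    return (server_mem_gib - reserved_gib) / vms_per_host


if __name__ == "__main__":
    low = esxi_reservation_gib(512, 0.05)
    high = esxi_reservation_gib(512, 0.06)
    print(f"5-6% of 512 GiB is {low:.1f} to {high:.1f} GiB")
    for vms in (1, 2, 4):
        print(f"{vms} VM(s) per host: {vm_memory_gib(512, 32, vms):.0f} GiB each")
```

The tests reserved 32 GiB, slightly above the 5-6% guideline, which yields the 480, 240 and 120 GiB VM sizes shown in the worker node table.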


Tuning


Tuning: Operating System Parameters

• Turn down aggressiveness of memory swapping

– Set vm.swappiness = 0 in /etc/sysctl.conf

• Disable transparent hugepage compaction

– echo never > /sys/kernel/mm/transparent_hugepage/defrag

• Enable jumbo frames on network

– Add MTU=9000 to /etc/sysconfig/network-scripts/ifcfg-e…, configure on physical and virtual switches
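One way to verify the first two settings is to parse the relevant files. The Python sketch below works on file contents passed in as strings, so it can be tried without touching a real system; the helper names are hypothetical, and it assumes the usual "key = value" layout of sysctl.conf and the bracketed-active-value layout of the transparent hugepage sysfs file.

```python
def swappiness_from_sysctl(text):
    """Return the last vm.swappiness value set in sysctl.conf-style text, or None."""
    value = None
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" in line:
            key, val = (part.strip() for part in line.split("=", 1))
            if key == "vm.swappiness":
                value = int(val)
    return value


def thp_defrag_mode(text):
    """Return the active mode from the defrag file, e.g. 'never' from 'always madvise [never]'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


if __name__ == "__main__":
    sysctl_text = "# tuning for Hadoop\nvm.swappiness = 0\n"
    defrag_text = "always madvise [never]\n"
    print("swappiness ok:", swappiness_from_sysctl(sysctl_text) == 0)
    print("THP defrag ok:", thp_defrag_mode(defrag_text) == "never")
```

On a live node the two strings would be read from /etc/sysctl.conf and /sys/kernel/mm/transparent_hugepage/defrag respectively.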


Tuning: YARN Cluster Parameters

• yarn.nodemanager.resource.cpu-vcores and yarn.nodemanager.resource.memory-mb

– Tells YARN how many resources it has for containers for tasks/executors

– A vcore is a YARN virtual core

• Can be set 1x - 4x number of physical cores

• Set to 2x number of physical cores in these tests

– = number of hyperthreads (bare metal) = 64

– = number of vCPUs (virtualized) = 16 (with 4 VMs per host)

– Container memory:

• Server/VM memory minus operating system requirements minus DataNode/NodeManager JVM heap

• Bare Metal: 512 GiB on server => 448 GiB container memory

• Virtualized: 512 GiB on server – 32 GiB for ESXi = 480 GiB

– 4 VMs per host: 480/4 = 120 GiB => 104 GiB container memory
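The sizing rules above can be sketched as a small Python helper that produces the two NodeManager properties. The reserved amounts (64 GiB on bare metal, 16 GiB per VM with 4 VMs per host) are inferred here as the difference between node memory and container memory on the slides, not stated separately in the talk; the helper name and the GiB-to-MiB conversion by 1024 are likewise assumptions.

```python
def nodemanager_settings(physical_cores, node_mem_gib, reserved_gib, vcore_multiplier=2):
    """Derive YARN NodeManager resource settings for one bare metal server or VM.

    reserved_gib covers the operating system plus DataNode/NodeManager JVM heaps.
    """
    return {
        "yarn.nodemanager.resource.cpu-vcores": physical_cores * vcore_multiplier,
        "yarn.nodemanager.resource.memory-mb": (node_mem_gib - reserved_gib) * 1024,
    }


if __name__ == "__main__":
    # Bare metal: 2 sockets x 16 cores, 512 GiB, 64 GiB held back (inferred).
    print(nodemanager_settings(32, 512, 64))
    # 4 VMs per host: 8 physical cores' worth per VM, 120 GiB, 16 GiB held back (inferred).
    print(nodemanager_settings(8, 120, 16))
```

With these inputs the helper reproduces 64 vcores and 448 GiB for bare metal, and 16 vcores and 104 GiB per VM for the 4 VMs per host case.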


Tuning: Hadoop Job Parameters

• dfs.blocksize – tradeoff between size and number of tasks – 256 MB good initial choice for most workloads

– Set mapreduce.task.io.sort.mb larger than dfs.blocksize to minimize spills to disk – e.g., 400 MB

• dfs.replication – 3 typical for availability

• mapreduce.{map|reduce}.memory.mb and mapreduce.{map|reduce}.cpu.vcores

– Memory and vcores to be allocated by YARN for containers to run map and reduce tasks

– Can specify, otherwise YARN will allocate based on other YARN parameters

• mapreduce.job.{maps|reduces}

– Set as needed to override YARN calculation of number of tasks

– Remember that map and reduce tasks normally overlap for part of a job
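The block size tradeoff can be illustrated by estimating how many map tasks a job gets. The Python sketch below assumes one map task per HDFS block of input, which is the common case for large files but depends on the input format; it also assumes 1 TB = 10^12 bytes and 256 MB = 256 x 2^20 bytes. The function names are illustrative.

```python
def num_map_tasks(dataset_bytes, block_bytes=256 * 1024**2):
    """Estimated map task count, assuming one task per HDFS block (rounded up)."""
    return -(-dataset_bytes // block_bytes)  # integer ceiling division


def sort_buffer_covers_block(io_sort_mb, block_mb):
    """True if mapreduce.task.io.sort.mb is larger than dfs.blocksize (both in MB)."""
    return io_sort_mb > block_mb


if __name__ == "__main__":
    for tb in (1, 3, 10):
        print(f"{tb} TB input -> about {num_map_tasks(tb * 10**12)} map tasks")
    print("400 MB sort buffer vs 256 MB block:", sort_buffer_covers_block(400, 256))
```

Halving the block size doubles the estimated task count, which is the size versus number-of-tasks tradeoff named above.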


Tuning: Spark on YARN


• spark.executor.cores, spark.executor.memory

– Play the same role for Spark executors as map/reduce task memory and vcore assignments do for MapReduce

• spark.yarn.executor.memoryOverhead

– Set if default (10% of spark.executor.memory) is insufficient
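A rough Python sketch of how these settings determine the YARN container size per executor and the number of executors that fit on a node. The 10% default comes from the slide; the 384 MB floor on the overhead is how Spark 1.x behaves to the best of our knowledge and should be treated as an assumption, as should the example executor size (20 GiB, 4 cores), which is hypothetical and not the value used in the tests. Rounding to YARN's allocation increment is ignored.

```python
def executor_container_mb(executor_memory_mb, overhead_mb=None,
                          overhead_fraction=0.10, min_overhead_mb=384):
    """YARN container size for one executor: heap plus memory overhead."""
    if overhead_mb is None:  # spark.yarn.executor.memoryOverhead not set explicitly
        overhead_mb = max(min_overhead_mb, int(executor_memory_mb * overhead_fraction))
    return executor_memory_mb + overhead_mb


def executors_per_node(node_container_mb, node_vcores,
                       executor_memory_mb, executor_cores, overhead_mb=None):
    """How many executors fit on a node, limited by both memory and vcores."""
    by_memory = node_container_mb // executor_container_mb(executor_memory_mb, overhead_mb)
    by_cpu = node_vcores // executor_cores
    return min(by_memory, by_cpu)


if __name__ == "__main__":
    # Node sized like one VM in the 4 VMs per host case: 104 GiB, 16 vcores.
    print(executor_container_mb(20 * 1024))
    print(executors_per_node(104 * 1024, 16, 20 * 1024, 4))
```

Raising the overhead explicitly shrinks the number of executors that fit, so it is worth rechecking the fit after any change.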


Machine Learning – An Overview


What is Machine Learning?

• Machine Learning algorithms try to make predictions based on training data that is given to a mathematical model (e.g. a linear regression algorithm)

• Find the minimum difference between the model’s prediction and the already known outcomes in the labels (i.e. minimize the “loss function”)

• Spark is a foundational technology for this type of application

[Diagram: big training data (samples from history with labels) is used to train and test a mathematical model; the trained model then takes a new sample of transaction data and produces a classification or prediction.]


Example: A Linear Classifier

f(x_i, W, b) = W x_i + b

x_i: example data

W: weights

b: bias

Source: Stanford University class cs231n
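The score function can be written out in a few lines of plain Python. This is an illustrative sketch with made-up weights, not code from the talk or from cs231n: W is a list of rows (one per class), x_i a feature vector, b one bias per class, and the predicted class is the one with the highest score.

```python
def linear_scores(W, x, b):
    """Compute f(x, W, b) = W x + b; returns one score per class (row of W)."""
    return [sum(w_j * x_j for w_j, x_j in zip(row, x)) + b_k
            for row, b_k in zip(W, b)]


def predict(W, x, b):
    """Index of the class with the highest score."""
    scores = linear_scores(W, x, b)
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    W = [[1.0, -1.0],   # weights for class 0 (hypothetical values)
         [-1.0, 1.0]]   # weights for class 1
    b = [0.0, 0.5]
    x = [2.0, 1.0]
    print(linear_scores(W, x, b), predict(W, x, b))
```

Training consists of adjusting W and b so that these scores agree with the known labels, that is, minimizing the loss function.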


What Have We Seen so Far?

• Performance results show that virtualized Spark and Hadoop is 10% better than native

• Even better results with All Flash storage than with traditional disks seen last year

• Four virtual machines per server is the sweet spot

• Contemporary workloads such as Machine Learning perform very well on vSphere


Summary

• Each aspect of the stack should be examined using our guidelines for tuning opportunities

• Powerful new technologies like YARN, Spark and Machine Learning apps yield excellent performance on vSphere when tuned properly

– Correctly configured virtualized Hadoop clusters on vSphere outperformed bare metal on all Spark workloads

– Production requirements can be met without sacrificing performance on virtualized environments

• Big Data on vSphere is ready for production environments

• For details see https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/techpaper/performance/bigdata-vsphere65-perf.pdf


Introducing vSphere Scale-Out for Big Data and HPC Workloads

New package that provides all the core features required for scale-out workloads at an attractive price point

Features

• Hypervisor, vMotion, vShield Endpoint, Storage vMotion, Storage APIs, Distributed Switch, I/O Controls & SR-IOV, Host Profiles / Auto Deploy and more

Packaging

• Sold in packs of 8 CPUs at a cost-effective price point

Licensing

• EULA enforced for use with Big Data/HPC workloads only


References

1. Big Data Performance on vSphere 6 https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/techpaper/bigdata-perf-vsphere6.pdf

2. Virtualized Hadoop Performance with VMware vSphere 6 on High Performance Servers http://www.vmware.com/resources/techresources/10452

3. Virtualized Hadoop Performance with VMware vSphere 5.1 http://www.vmware.com/resources/techresources/10360

4. Benchmarking Case Study of Virtualized Hadoop Performance on vSphere 5 http://vmware.com/files/pdf/VMW-Hadoop-Performance-vSphere5.pdf

5. Hadoop Virtualization Extensions (HVE) http://www.vmware.com/files/pdf/Hadoop-Virtualization-Extensions-on-VMware-vSphere-5.pdf


Extreme Performance Series – Las Vegas


• SER2724BU Performance Best Practices

• SER2723BU Benchmarking 101

• SER2343BU vSphere Compute & Memory Schedulers

• SER1504BU vCenter Performance Deep Dive

• SER2734BU Byte Addressable Non-Volatile Memory in vSphere

• SER2849BU Predictive DRS – Performance & Best Practices

• SER1494BU Encrypted vMotion Architecture, Performance, & Futures

• STO1515BU vSAN Performance Troubleshooting

• VIRT1445BU Fast Virtualized Hadoop and Spark on All-Flash Disks

• VIRT1397BU Optimize & Increase Performance Using VMware NSX

• VIRT2550BU Reducing Latency in Enterprise Applications with VMware NSX

• VIRT1052BU Monster VM Database Performance

• VIRT1983BU Cycle Stealing from the VDI Estate for Financial Modeling

• VIRT1997BU Machine Learning and Deep Learning on VMware vSphere

• FUT2020BU Wringing Max Perf from vSphere for Extremely Demanding Workloads

• FUT2761BU Sharing High Performance Interconnects across Multiple VMs


Extreme Performance Series – Barcelona


• SER2724BE Performance Best Practices

• SER2343BE vSphere Compute & Memory Schedulers

• SER1504BE vCenter Performance Deep Dive

• SER2849BE Predictive DRS – Performance & Best Practices

• VIRT1445BE Fast Virtualized Hadoop and Spark on All-Flash Disks

• VIRT1397BE Optimize & Increase Performance Using VMware NSX

• VIRT1052BE Monster VM Database Performance

• FUT2020BE Wringing Max Perf from vSphere for Extremely Demanding Workloads


Extreme Performance Series – Hands-on Labs

• Don’t miss these popular Extreme Performance labs:

• HOL-1804-01-SDC: vSphere 6.5 Performance Diagnostics & Benchmarking

– Each module dives deep into vSphere performance best practices, diagnostics, and optimizations using various interfaces and benchmarking tools.

• HOL-1804-02-CHG: vSphere Challenge Lab

– Each module places you in a different fictional scenario to fix common vSphere operational and performance problems.


Performance Survey

The VMware Performance Engineering team is always looking for feedback about your experience with the performance of our products, our various tools, interfaces and where we can improve.

Scan this QR code to access a short survey and provide us direct feedback.

Alternatively: www.vmware.com/go/perf

Thank you!
