A Performance Comparison of Container-based Virtualization Systems for MapReduce Clusters
Miguel G. Xavier, Marcelo V. Neves, Cesar A. F. De Rose [email protected]
Faculty of Informatics, PUCRS, Porto Alegre, Brazil
February 13, 2014
Outline
• Introduction
• Container-based Virtualization
• MapReduce
• Evaluation
• Conclusion
Introduction
• Virtualization
  • Allows resources to be shared
  • Hardware independence, availability, isolation and security
  • Better manageability
  • Widely used in datacenters/cloud computing
• MapReduce clusters and virtualization
  • Usage scenarios
    • Better resource sharing
    • Cloud computing
• However, hypervisor-based technologies have traditionally been avoided in MapReduce environments
Container-based Virtualization
• A group of processes on a Linux box, put together in an isolated environment
• A lightweight virtualization layer
• Non-virtualized drivers
• Shared operating system
[Figure: side-by-side stack diagrams. Container-based virtualization: Hardware → Host OS → Virtualization Layer → Guest Processes. Hypervisor-based virtualization: Hardware → Host OS → Virtualization Layer → Guest OS (per VM) → Guest Processes.]
Container-based Virtualization
• Each container has:
  • Its own network interface (and IP address)
    • Bridged, routed, …
  • Its own filesystem
• Isolation (security)
  • Containers A and B can't see each other
• Isolation (resource usage)
  • RAM, CPU, I/O
• Current systems
  • Linux-VServer, OpenVZ, LXC
Container-based Virtualization
• Implements Linux namespaces (a minimal sketch follows below)
  • Mount – mounting/unmounting file systems
  • UTS – hostname, domain name
  • IPC – SysV message queues, semaphores, shared memory segments
  • Network – IPv4/IPv6 stacks, routing, firewall, /proc/net, sockets
  • PID – its own set of PIDs
  • Chroot acts as the filesystem namespace
• Current systems
  • Linux-VServer, OpenVZ, LXC
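To make the namespace idea concrete, here is a minimal sketch (an addition to the slides, not from them). It assumes Linux, root privileges, and Python 3.12+, where os.unshare() is exposed directly:

    # Minimal namespace sketch: give this process its own UTS namespace,
    # so hostname changes stay invisible to the rest of the system.
    # Assumes Linux, root privileges, and Python 3.12+ (os.unshare).
    import os
    import socket

    os.unshare(os.CLONE_NEWUTS)        # detach into a fresh UTS namespace
    socket.sethostname("container-a")  # visible only inside this namespace
    print(os.uname().nodename)         # prints "container-a"; host unchanged

Container systems combine several such namespaces (mount, PID, network, …) to build the isolated environment described above.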
Container-based Systems
• Linux-VServer
  • Implements its own features in the Linux kernel
  • Limits the scope of the file system for different processes through the traditional chroot
• OpenVZ
  • A kernel patch set providing containers with its own resource controllers (user beancounters)
• Linux Containers (LXC)
  • Based on cgroups (a hedged sketch follows below)
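As a rough illustration of the cgroup mechanism LXC builds on (added here, not from the slides): a manager creates a per-container cgroup and writes limits into its control files. The sketch assumes the cgroup v1 layout in use around 2014, root privileges, and a hypothetical cgroup name:

    # Hedged sketch: apply CPU and memory limits to a container via the
    # cgroup v1 filesystem. Paths and values are illustrative; needs root.
    import os

    CG = "/sys/fs/cgroup"
    name = "demo-container"  # hypothetical container/cgroup name

    for controller, knob, value in [
        ("cpu",    "cpu.shares",            "512"),         # relative CPU weight
        ("memory", "memory.limit_in_bytes", str(1 << 30)),  # 1 GiB cap
    ]:
        path = os.path.join(CG, controller, name)
        os.makedirs(path, exist_ok=True)
        with open(os.path.join(path, knob), "w") as f:
            f.write(value)

    # Move the current process into both cgroups; children inherit them.
    for controller in ("cpu", "memory"):
        with open(os.path.join(CG, controller, name, "tasks"), "w") as f:
            f.write(str(os.getpid()))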
Hypervisor- vs Container-based Systems

  Hypervisor                     Container
  -----------------------------  ------------------------
  Different kernel per guest OS  Single kernel
  Device emulation               Syscalls
  Many FS caches                 Single FS cache
  Limits per machine             Limits per process
  High performance overhead      Low performance overhead
MapReduce
• MapReduce
  • A parallel programming model
  • Simplicity, efficiency and high scalability
  • It has become a de facto standard for large-scale data analysis
• MapReduce has also attracted the attention of the HPC community
  • A simpler approach to the parallelism problem
  • A highly visible case: MapReduce has been used successfully by companies like Google, Yahoo!, Facebook and Amazon (a toy sketch of the model follows below)
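To illustrate the programming model itself (an addition to the slides, not the Hadoop API): a job is a map function that emits key/value pairs plus a reduce function that aggregates all values sharing a key. A toy, single-process word count:

    # Toy, single-process illustration of the MapReduce model (word count).
    # Real frameworks run many map/reduce tasks in parallel and shuffle
    # the intermediate pairs between machines.
    from collections import defaultdict

    def map_fn(text):
        for word in text.split():   # map: emit a (word, 1) pair per token
            yield word, 1

    def reduce_fn(pairs):
        counts = defaultdict(int)
        for word, n in pairs:       # reduce: sum the counts per key
            counts[word] += n
        return dict(counts)

    print(reduce_fn(map_fn("to be or not to be")))
    # -> {'to': 2, 'be': 2, 'or': 1, 'not': 1}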
MapReduce and Containers
• Apache Mesos
  • Shares a cluster among multiple different frameworks
  • Creates another level of resource management
  • Management is taken away from the cluster's RMS
• Apache YARN
  • Hadoop Next Generation
  • Better job scheduling/monitoring
  • Uses virtualization to share a cluster among different applications
Evaluation
• Experimental environment
  • Hadoop cluster composed of 4 nodes
  • Two processors with 8 cores (no hardware threads) per node
  • 16 GB of memory per node
  • 146 GB of disk per node
• Performance analysis
  • Through micro-benchmarks (a hedged invocation sketch follows below)
    • HDFS evaluation (TestDFSIO)
    • NameNode evaluation (NNBench)
    • MapReduce evaluation (MRBench)
  • Through macro-benchmarks (WordCount, TeraSort)
• Isolation analysis
  • Through the IBS benchmark
• At least 50 executions were performed for each experiment
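For reference, the HDFS micro-benchmark is typically launched as below (a hedged sketch, not from the slides; the test jar's file name varies across Hadoop releases):

    # Hedged sketch: run the classic TestDFSIO write test. "-nrFiles" is
    # the number of files and "-fileSize" the size of each file in MB;
    # the jar name is illustrative and release-dependent.
    import subprocess

    subprocess.run(
        ["hadoop", "jar", "hadoop-test.jar", "TestDFSIO",
         "-write", "-nrFiles", "16", "-fileSize", "1000"],
        check=True,
    )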
HDFS Evaluation
• Settings:
  • Replication factor of 3
  • File sizes from 100 MB to 3000 MB
• All container-based systems show performance similar to native
• OpenVZ's results show a loss of about 3 Mbps
  • This is due to the CFQ I/O scheduler
[Figure: HDFS throughput (Mbps) vs. file size; series: LXC, native, OpenVZ, VServer]
HDFS Evaluation
• All container-based systems obtained performance results similar to native
• Linux-VServer uses a physical (non-virtualized) network interface
[Figure: HDFS throughput (Mbps) vs. file size; series: LXC, native, OpenVZ, VServer]
NameNode Evaluation using NNBench
• The NNBench benchmark was chosen to evaluate the NameNode component
  • Generates operations on 1000 files on HDFS
• Linux-VServer reaches the lowest latency, at an average of 48 ms, while LXC obtained the worst result, at an average of 56 ms
• In absolute terms, the differences are not significant
• More importantly, no exception was observed during the high HDFS management stress, and all systems were able to respond as effectively as the native one

                     Native   LXC     OpenVZ   VServer
  Open/Read (ms)     0.51     0.52    0.51     0.49
  Create/Write (ms)  54.65    56.89   51.96    48.90
MapReduce Evaluation using MRBench
• The results obtained from MRBench show that the MapReduce layer suffers no substantial effect when running on the different container-based virtualization systems

                       Native   LXC     OpenVZ   VServer
  Execution time (ms)  14251    13577   14304    13614
Analyzing Performance with WordCount
[Figure: WordCount execution time (seconds) for Native, LXC, OpenVZ and VServer]
• 30 GB of input data
• OpenVZ's peak of performance degradation is explained by its I/O scheduler overhead
Analyzing Performance with TeraSort
[Figure: TeraSort execution time (seconds) for Native, LXC, OpenVZ and VServer]
• Standard map/reduce sort
• Steps:
  • Generate 30 GB of input data
  • Run the sort on that input (a hedged invocation sketch follows below)
• An HDFS block size of 64 MB
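For concreteness (an addition, not from the slides; the examples jar name and HDFS paths are illustrative), the two steps map onto Hadoop's standard TeraGen/TeraSort example programs. TeraGen takes the number of 100-byte rows to generate, so 30 GB is roughly 300 million rows:

    # Hedged sketch of the two TeraSort steps; jar name and paths are
    # illustrative. TeraGen's argument is the number of 100-byte rows,
    # so ~30 GB corresponds to ~300,000,000 rows.
    import subprocess

    subprocess.run(["hadoop", "jar", "hadoop-examples.jar",
                    "teragen", "300000000", "/terasort/input"], check=True)
    subprocess.run(["hadoop", "jar", "hadoop-examples.jar",
                    "terasort", "/terasort/input", "/terasort/output"], check=True)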
Performance Isolation
[Figure: isolation methodology — the execution time of a baseline application in Container A is measured first in isolation, then again while Container B runs a stress test; the two execution times yield the performance degradation (%)]
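The slides do not spell out the metric, but a natural reading of the diagram is the relative slowdown of the baseline application (a hedged reconstruction, not the authors' stated formula):

    def degradation_pct(t_baseline: float, t_under_stress: float) -> float:
        """Performance degradation (%): relative increase in the baseline
        application's execution time when the stress test runs in the
        neighboring container (hedged reconstruction of the metric)."""
        return 100.0 * (t_under_stress - t_baseline) / t_baseline

    # Example: 120 s alone vs. 130 s under stress -> ~8.3% degradation
    print(round(degradation_pct(120.0, 130.0), 1))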
Performance Isolation

        CPU   Memory   I/O    Fork bomb
  LXC   0%    8.3%     5.5%   0%
• We chose LXC as the representative container-based virtualization system to evaluate
• The per-container limits on CPU usage work well
  • No significant impact was noted
• A little performance degradation (memory and I/O) needs to be taken into account
• The fork bomb stress test reveals that LXC has a security subsystem that keeps the attack from affecting the other containers
Conclusions
• We found that all container-based systems reach near-native performance for MapReduce workloads
• The performance isolation results revealed that LXC has improved its ability to restrict resources among containers
• Although some works already take advantage of container-based systems on MR clusters, this work demonstrated the benefits of using container-based systems to support MapReduce clusters
Future Work
• We plan to study performance isolation at the network level
• We plan to study scalability while increasing the number of nodes
• We plan to study aspects of green computing, such as the trade-off between performance and energy consumption
Thank you for your attention!