
Red Hat OpenStack

Performance & Scale

Mark Wagner, Senior Principal Engineer, Red Hat
June 12, 2013

Agenda

● Introduction / High Level Overview

● Test Strategy

● Test Results

● Wrap Up

● Reference

OPENSTACK ARCHITECTURE

● Modular architecture

● Designed to easily scale out

● Based on (growing) set of core services

OPENSTACK CORE PROJECTS

OpenStack Identity (KEYSTONE)

● Identity Service

● Common authorization framework

● Manages users, tenants and roles

● Pluggable backends (SQL, PAM, LDAP, etc)

OPENSTACK CORE PROJECTS

OpenStack Compute (NOVA)

● Core compute service comprised of

● Compute Nodes – hypervisors that run virtual machines
● Supports multiple hypervisors: KVM, Xen, LXC, Hyper-V and ESX

● Distributed controllers that handle scheduling, API calls, etc.
● Native OpenStack API and Amazon EC2-compatible API
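For illustration only (not part of the original deck), a minimal python-novaclient sketch of booting one guest through the native API; the credentials, endpoint and image/flavor names are placeholders, and the client usage reflects the era of this deck.

# Hypothetical sketch: boot a guest through Nova's native API with
# python-novaclient. Credentials, endpoint and names are placeholders.
from novaclient.v1_1 import client

nova = client.Client("demo", "secret", "demo-tenant",
                     "http://keystone.example.com:5000/v2.0/")

flavor = nova.flavors.find(name="m1.small")
image = nova.images.find(name="rhel6")
server = nova.servers.create(name="perf-guest-01", image=image, flavor=flavor)
print(server.id)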

OPENSTACK CORE PROJECTS

OpenStack Image Service (GLANCE)

● Image service

● Stores and retrieves disk images (virtual machine templates)

● Supports Raw, QCOW, VMDK, VHD, ISO, OVF & AMI/AKI

● Backend storage : Filesystem, Swift, Amazon S3
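As a hedged illustration of the image workflow (not from the original deck), the following sketch registers a QCOW2 template with Glance by shelling out to the glance CLI; the image name and file path are placeholders.

# Hypothetical sketch: register a QCOW2 template with Glance via the CLI.
# Image name and file path are placeholders.
import subprocess

subprocess.check_call([
    "glance", "image-create",
    "--name", "rhel6-template",
    "--disk-format", "qcow2",
    "--container-format", "bare",
    "--file", "/tmp/rhel6.qcow2",
])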

OPENSTACK CORE PROJECTS

OpenStack Object Storage (SWIFT)

● Object Storage service

● Modeled after Amazon's S3 service

● Provides simple service for storing and retrieving arbitrary data

● Native API and S3 compatible API

OPENSTACK CORE PROJECTS

OpenStack Networking (formerly QUANTUM)

● Network Service

● Provides framework for Software Defined Network (SDN)

● Plugin architecture

● Allows integration of hardware and software based network solutions

OPENSTACK CORE PROJECTS

OpenStack Block Storage (CINDER)

● Block Storage (Volume) Service

● Provides block storage for virtual machines (persistent disks)

● Similar to Amazon EBS service

● Plugin architecture for vendor extensions

e.g. NetApp driver for Cinder

OPENSTACK CORE PROJECTS

OpenStack Dashboard (HORIZON)

● Dashboard

● Provides simple self service UI for end-users

● Basic cloud administrator functions

● Define users, tenants and quotas
● No infrastructure management

OPENSTACK INCUBATING PROJECTS

OpenStack Orchestration (HEAT)

● Orchestration service

● Template-driven provisioning and management of composite cloud applications (instances, volumes, networks, etc.)

● Compatible with AWS CloudFormation templates

OPENSTACK INCUBATING PROJECTS

OpenStack Monitoring and Metering (CEILOMETER)

● Monitoring and metering service

● Collects usage and performance measurements from across OpenStack services

● Provides a central source of metering data for billing, chargeback and monitoring

RED HAT UPSTREAM FOCUS

● Heavily engaged in community since 2011

● Established leadership position in community

● Both in terms of governance and technology

● Including PTLs on Nova, Keystone, Oslo, Heat and Ceilometer

● Creating and leading stable tree

● 3rd largest contributor to Essex Release

● 2nd largest contributor to Folsom Release

● Largest contributor to Grizzly Release

● Note: These statistics do not include external dependencies

e.g. libvirt, KVM, Linux components

RED HAT UPSTREAM FOCUS

http://bitergia.com/public/reports/openstack/2013_04_grizzly/

Leading Contributor to Grizzly Release

● Leading in commits and line counts across all projects

RED HAT UPSTREAM FOCUS

http://bitergia.com/public/reports/openstack/2013_04_grizzly/

Core Projects All Activity

BUILDING A COMMUNITY

RDO Project

● Community distribution of OpenStack
● Packaged for *EL6 and Fedora
● Freely available without registration
● Vanilla distribution – closely follows upstream
● Upstream release cadence
● 6-month lifecycle – limited updates based on upstream
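As an illustrative sketch only, a typical all-in-one RDO install of that era used Packstack roughly as follows; the release-RPM URL is an assumption and should be checked against the current RDO instructions.

# Hypothetical sketch: stand up an all-in-one RDO node with Packstack.
# The release-RPM URL is illustrative; check the RDO project for the current one.
import subprocess

for cmd in [
    "yum install -y http://rdo.fedorapeople.org/rdo-release.rpm",
    "yum install -y openstack-packstack",
    "packstack --allinone",
]:
    subprocess.check_call(cmd, shell=True)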

RDO vs. RED HAT OPENSTACK

RDO: Latest OpenStack code, major and minor releases
Red Hat OpenStack: Enterprise-hardened OpenStack code, major and minor releases

RDO: Six-month release cadence mirroring the community release cadence
Red Hat OpenStack: Six-month release cadence offset from community releases to allow for hardening and certification testing

RDO: Short lifecycle for bug fixes, patches and features
Red Hat OpenStack: Enterprise lifecycle for long-term production deployments, including bug fixes, patches, and feature backports

RDO: No explicit hardware, software, or services certification
Red Hat OpenStack: Certified hardware, software and services through the Red Hat OpenStack Certified Partner program

RDO: Community support
Red Hat OpenStack: Supported by Red Hat with a support subscription

RDO: Installs on CentOS, Scientific Linux, Fedora and Red Hat Enterprise Linux (unsupported)
Red Hat OpenStack: Installs on Red Hat Enterprise Linux only

Agenda

● Introduction / High Level Overview

● Test Strategy

● Test Results

● Wrap Up

● Reference

OpenStack Performance Testing

Multiple approaches needed

● Infrastructure

● Guest Performance

● Component Integration with other Red Hat products

● Working with the Industry

OpenStack Performance Testing

Evaluating the Infrastructure

● What to test

● How to test

● Identify hot spots / areas of concern

OpenStack Performance Testing

Evaluating the Infrastructure

● First tackle what to test
● Work with developers to get their input
● Look at the architecture and see for ourselves

● The database pops out
● Keystone

● This looks pretty easy, right?

Fairly Simple View

[Diagram: high-level architecture showing HORIZON, NOVA, GLANCE, SWIFT, CINDER, QUANTUM and KEYSTONE and their interactions.]

Test Strategy

The next page has a slightly more involved flow

● Not really legible, so don't strain your eyes
● Demonstrates the complexity

● How would you test this?

OpenStack Performance Testing

Evaluating the Infrastructure

● Next tackle how to test
● What components can be tested “standalone”
● Determine configuration to be used
● How to scale

OpenStack Performance Testing

Evaluating the Infrastructure

● How to test
● What components can be tested “standalone”

● Simplifies testing
● Cuts down on infrastructure costs
● Great for isolating performance issues
● Need to make sure the tests are still valid

OpenStack Performance Testing

Evaluating the Infrastructure

● How to test
● Determine configuration to be used

● Virtual machine
● Bare metal
● Both have their merits

OpenStack Performance Testing

Evaluating the Infrastructure

● How to scale?
● We have a limited number of machines available
● Initial thrust is to go with “virtual hosts”

● Use RHEV-M to create an environment for scale
● Allows us to configure a single guest
● Create a template
● Create pools from the template
● Fire up the guests
● Tell Packstack that they are compute hosts (see the sketch after this list)

● Initial testing is with 100
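A hedged sketch of the “tell Packstack they are hosts” step: list the RHEV-hosted guests as Nova compute nodes in a Packstack answer file. The addresses are placeholders, and the answer-file key reflects the Grizzly-era Packstack layout.

# Hypothetical sketch: register the RHEV-hosted guests as Nova compute nodes
# via a Packstack answer file. Addresses are placeholders.
compute_hosts = ["192.168.100.%d" % i for i in range(1, 101)]  # 100 guests

with open("scale-answers.txt", "w") as f:
    f.write("CONFIG_NOVA_COMPUTE_HOSTS=%s\n" % ",".join(compute_hosts))

# Then apply it with:  packstack --answer-file=scale-answers.txt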

OpenStack Performance Testing

Scale Testing of Infrastructure

● Areas of concern include:
● Messaging

● Good news: qpid is well understood by our team
● Used as one of our “standard” tests for networking

● How many resources does each component need?
● Trending towards running components in separate VMs
● OpenStack on OpenStack

● Database performance
● What / when to tune
● Our team has lots of experience here as well

Agenda

● Introduction / High Level Overview

● Test Strategy

● Test Results

● Wrap Up

● Reference

OpenStack Performance Testing

Let's get real!

● One initial concern was Keystone
● How “chatty” is it?
● Found several issues

● Multiple calls to Keystone for new tokens
● Nothing prunes the database
● Inefficiencies in CLI vs. curl calls

OpenStack Performance Testing

Keystone findings

● Multiple calls to Keystone for new tokens

● The database grows with no cleanup
● As tokens expire they should eventually get removed (see the sketch below)
● Should help with indexing
● For every 250K rows, response time goes up by 0.1 secs

Tokens consumed per operation:

Horizon login: 3 tokens
Horizon Images page: 2 tokens
CLI (nova image-list): 2 tokens
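One hedged way to address the pruning gap described above, assuming your Keystone release ships the keystone-manage token_flush subcommand:

# Hypothetical sketch: prune expired tokens on a schedule, assuming the
# "keystone-manage token_flush" subcommand is available in your release.
import subprocess

def flush_expired_tokens():
    # Removes expired rows from the token table so it stays small and indexable.
    subprocess.check_call(["keystone-manage", "token_flush"])

if __name__ == "__main__":
    flush_expired_tokens()
# e.g. run hourly from cron:  0 * * * * keystone-manage token_flush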

OpenStack Performance Testing

Keystone Findings

● Inefficiencies in CLI vs. curl calls

● nova image-show
● Executes in 2.639s

● curl -H “ “
● Executes in 0.555s

● Tracing the CLI shows that Python is reading the response one byte at a time
● Known httplib issue in the Python standard library
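A hedged sketch of the comparison behind these numbers: time the same image listing made directly over HTTP (as curl does) and through the CLI. The token and endpoint are placeholders; the exact curl arguments used in the tests are not shown on the slide.

# Hypothetical sketch: time a direct HTTP image listing versus the CLI.
# Token and endpoint are placeholders.
import subprocess
import time

token = "REPLACE_WITH_TOKEN"
glance_url = "http://glance.example.com:9292/v1/images"

start = time.time()
subprocess.check_call(["curl", "-s", "-o", "/dev/null",
                       "-H", "X-Auth-Token: %s" % token, glance_url])
print("direct HTTP call: %.3fs" % (time.time() - start))

start = time.time()
subprocess.check_call(["nova", "image-list"])
print("nova CLI:         %.3fs" % (time.time() - start))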

OpenStack Guest Performance

Focused on Nova Compute Nodes

● Really boils down to a RHEL / KVM “issue”
● We have experience here
● Good news: RHEL / KVM has industry-leading performance numbers

SPECvirt2010: RHEL 6 KVM Posts Industry Leading Results

http://www.spec.org/virt_sc2010/results/

[Diagram: SPECvirt workload tiles spanning client hardware and the system under test (SUT), showing disk and network I/O paths; more than 1 SPECvirt tile per core. Key enablers: SR-IOV, huge pages, NUMA node binding.]

[Chart: Best SPECvirt_sc2010 scores by CPU core count (as of May 30, 2013), grouped as 2-socket 12/16/20 cores, 4-socket 40 cores and 8-socket 64/80 cores. Systems compared: VMware ESX 4.1 on HP DL380 G7 (12 cores, 78 VMs); RHEL 6 (KVM) on IBM HS22V (12 cores, 84 VMs); VMware ESXi 5.0 on HP DL385 G7 (16 cores, 102 VMs); RHEV 3.1 on HP DL380p Gen8 (16 cores, 150 VMs); VMware ESXi 4.1 on HP BL620c G7 (20 cores, 120 VMs); RHEL 6 (KVM) on IBM HX5 w/ MAX5 (20 cores, 132 VMs); VMware ESXi 4.1 on HP DL380 G7 (12 cores, 168 VMs); VMware ESXi 4.1 on IBM x3850 X5 (40 cores, 234 VMs); RHEL 6 (KVM) on HP DL580 G7 (40 cores, 288 VMs); RHEL 6 (KVM) on IBM x3850 X5 (64 cores, 336 VMs); RHEL 6 (KVM) on HP DL980 G7 (80 cores, 552 VMs). Scores range from 1,221 to 8,956.]

Comparison based on the best-performing Red Hat and VMware solutions by CPU core count published at www.spec.org as of May 17, 2013. SPEC® and the benchmark name SPECvirt_sc® are registered trademarks of the Standard Performance Evaluation Corporation. For more information about SPECvirt_sc2010, see www.spec.org/virt_sc2010/.

SPECvirt2010: Red Hat Owns Industry Leading Results

OpenStack Guest Performance

Current work focused on Nova Compute Nodes

● Diving into XML

● System configurations

● Storage layouts

● Over commit ratios

OpenStack Guest Performance

● Should expect basically the same out of the box performance as RHEL / KVM

● Added the tuned virtual-host profile to the Nova compute node configuration (see the sketch after this list)

● RHOS generates its own XML file to describe the guest
● About the same as what virt-manager produces

● One big area of focus will be guest placement
● User-specified CPU pinning goes against the cloud philosophy
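A minimal sketch of the tuned step mentioned above, applying the RHEL virtual-host profile on a compute node and confirming it is active:

# Hypothetical sketch: apply the RHEL "virtual-host" tuned profile on a
# Nova compute node, then confirm it is active.
import subprocess

subprocess.check_call(["yum", "install", "-y", "tuned"])
subprocess.check_call(["tuned-adm", "profile", "virtual-host"])
subprocess.check_call(["tuned-adm", "active"])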

Nova Out of the Box Performance

[Chart: RHOS vs. libvirt/KVM (untuned) – Java workload throughput (bops) at 1, 4, 8 and 12 VMs.]

OpenStack Guest Performance

Guest tunings

● One big area of focus will be guest placement
● User-specified CPU pinning goes against the cloud philosophy
● Not currently planned to be implemented

● Desire to leverage NUMA characteristics when feasible

OpenStack Guest Performance

Guest tunings

● One big area of focus will be guest placement
● User-specified CPU pinning goes against the cloud philosophy
● Not currently planned to be implemented

● Hello numad!
● A tool in RHEL that can align guests within NUMA boundaries
● Our team has lots of experience with this in RHEL / KVM
● Currently testing with Nova
● More detail in the Thursday 10:40 AM talk “Performance Analysis & Tuning”, Room 302

Four NUMA node system, fully-connected topology

[Diagram: four NUMA nodes (Node 0–3), each with its own RAM, four cores (Core 0–3), a shared L3 cache, and QPI links / IO; all four nodes are interconnected.]

Sample remote access latencies

4 socket / 4 node: 1.5x

4 socket / 8 node: 2.7x

8 socket / 8 node: 2.8x

32 node system: 5.5x (30/32 inter-node latencies >= 4x)

Node-distance distribution for the 32-node system (distance: share of node pairs): 10 (32/1024, 3.1%), 13 (32/1024, 3.1%), 40 (64/1024, 6.2%), 48 (448/1024, 43.8%), 55 (448/1024, 43.8%)

So, what's the NUMA problem?

● The Linux system scheduler is very good at maintaining responsiveness and optimizing for CPU utilization

● Tries to use idle CPUs, regardless of where process memory is located....

● Using remote memory degrades performance!
● Red Hat is working with the upstream community to increase NUMA awareness of the scheduler and to implement automatic NUMA balancing

● Remote memory latency matters most for long-running, significant processes, e.g., HPTC, VMs, etc.

numad can help improve NUMA performance

● New RHEL 6.4 user-level daemon to automatically improve out-of-the-box NUMA system performance and to balance NUMA usage in dynamic workload environments

● Was tech-preview in RHEL6.3

● Not enabled by default

● See numad(8)
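A minimal sketch of enabling numad on a RHEL 6.4 compute node, since it is not enabled by default:

# Hypothetical sketch: install and enable the numad daemon on RHEL 6.4.
import subprocess

subprocess.check_call(["yum", "install", "-y", "numad"])
subprocess.check_call(["service", "numad", "start"])
subprocess.check_call(["chkconfig", "numad", "on"])  # persist across reboots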

numad aligns process memory and CPU threads within nodes

[Diagram: before numad, processes 19, 29, 37 and 61 have memory and threads spread across Nodes 0–3; after numad, each process is confined to a single node.]

Guest Performance – Nova over commit

Nova has some aggressive over commit ratios

● CPU has an overcommit ratio of 16
● RHEV's is much lower
● Our team is investigating now
● Will need multiple suggestions based on the instance workload
● Memory overcommit is a much lower 1.5

● Again, depends on the workload
● Anything memory-sensitive falls off the cliff if you need to swap
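For reference, a sketch of the nova.conf scheduler options behind the ratios quoted above; the values shown are the defaults under discussion, not a tuning recommendation.

# Sketch of the nova.conf scheduler options behind the ratios quoted above;
# values shown are the defaults under discussion, not a recommendation.
overcommit_defaults = {
    "cpu_allocation_ratio": 16.0,  # virtual CPUs scheduled per physical core
    "ram_allocation_ratio": 1.5,   # virtual RAM scheduled per unit of physical RAM
}

# Rendered as they would appear in the [DEFAULT] section of nova.conf:
for key, value in overcommit_defaults.items():
    print("%s=%s" % (key, value))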

Test Strategy

[Chart: RHOS Java workload VM scaling – throughput (bops) at 1, 4, 8, 12, 16, 20, 32 and 48 VMs.]

Nova Compute Configuration

Look at ephemeral storage configuration

● Help determine guidelines for balancing ephemeral storage performance vs cost / configuration

● How much storage do you want in each compute server?
● Initial cost / configuration
● Rack space: 1U vs. 2U
● Power / cooling

● How does network-based storage perform?
● Need to ensure proper network bandwidth

Nova Compute Configuration

Look at ephemeral storage configuration

● In this test, each instance uses a different image

● Concurrent boots
● Tested boot times using:

● Single system disk
● Seven-disk internal array
● Similar seven-disk array NFS-mounted from another box
● Fibre Channel SSD drives
● Fibre Channel SSD via NFS
● 10 Gbit network

Impact of tuned on instance boot times

The next two slides demonstrate the impact of the tuned virtual-host profile on I/O performance

● You can clearly see the positive impact of tuned
● Stopped testing when boot time became unacceptable

● Don't misinterpret the NFS data
● Single compute node, so not network bound
● NFS uses server-side caching
● Did gather all the data points due to the expected “flat line”

Impact of tuned on Nova Boot Times

[Chart: Nova boot times (multiple images), no tuned profile – average boot time (secs) for 2 to 20 concurrent VMs; storage backends: system disk, array, SSD, NFS (array), NFS (SSD).]

Nova Boot Times

[Chart: Nova boot times (multiple images), virt-host profile – average boot time (secs) at 2 and 16 concurrent VMs; storage backends: system disk, stripe, SSD, NFS (stripe), NFS (SSD).]

Snapshot Performance

● OpenStack snapshot is different from RHEV

● Backing image and qcow overlay => new image

● Via the qemu-img convert mechanism
● Written to a temporary snapshot directory
● This destination is a tunable
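A hedged sketch of the conversion step described above; the paths and destination directory are placeholders, not the exact ones Nova uses.

# Hypothetical sketch of the conversion Nova performs during a snapshot:
# flatten the instance's qcow2 overlay plus its backing image into a new
# image in a temporary snapshot directory. Paths are placeholders.
import subprocess

instance_disk = "/var/lib/nova/instances/<uuid>/disk"  # qcow2 overlay + backing file
snapshot_dir = "/var/lib/nova/instances/snapshots"     # the tunable destination
output_image = "%s/snapshot.qcow2" % snapshot_dir

subprocess.check_call(["qemu-img", "convert", "-O", "qcow2",
                       instance_disk, output_image])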

Impact of Storage Configuration on Snapshots

[Chart: RHOS concurrent snapshot timings (qemu-img convert only) – average snapshot time (secs) for 1, 2, 4 and 6 concurrent snapshots; destinations: system disk, array, NFS (array).]

Swift Performance

● Some of the performance team has been driving the Swift / RHS integration work

● Found issues in both RHS and Swift

● Changes accepted upstream
● Now able to focus on tuning
● One tunable for RHS is to turn off directory listings

Swift Performance

● Work has shown promising results

● Chunk sizes
● In Swift
● In the filesystem

● Disabling accurate_size_in_listing (RHS)

● Introduced new parameter max_clients (Swift)

● Tweaking workers as needed (Swift)
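A sketch (values purely illustrative, not the tuned settings from these tests) of where the Swift knobs above live in the proxy configuration; accurate_size_in_listing is an RHS / gluster-swift setting configured on the storage side and is not shown.

# Sketch (values purely illustrative) of the /etc/swift/proxy-server.conf knobs
# named above; merge into your own configuration rather than using verbatim.
PROXY_CONF_FRAGMENT = """
[DEFAULT]
workers = 8            # tune to the proxy node's CPU count
max_clients = 1024     # per-worker concurrent connections (newer Swift releases)

[app:proxy-server]
use = egg:swift#proxy
object_chunk_size = 65536   # read chunk size from the object servers
client_chunk_size = 65536   # write chunk size back to the client
"""

if __name__ == "__main__":
    print(PROXY_CONF_FRAGMENT)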

More Worker Count, Max Clients, Chunk Size

[Chart: 150 x 30 MB objects transferred sequentially over 10 GbE, concurrent with four other clients (3K, 30K, 300K, 3M object sizes) – seconds per object for each of the 150 objects, 30 MB default vs. 30 MB tuned; markers show where the 3K, 30K and 300K clients finished.]

Working with the Industry

Our team continues to be involved in industry benchmark consortia.

● Benchmark definition and development

● Actively involved with sub committees

● SPEC, TPC, STAC
● Cloud, Virt

● Work with partners on their efforts

Agenda

● Introduction / High Level Overview

● Test Strategy

● Test Results

● Wrap Up

● Reference

Wrap Up

● OpenStack is a rapidly evolving platform

● Out of the box performance is already pretty good

● RHEL / KVM
● One of the main issues is the brain shift needed

● Don't want to hand-tune 1000s of servers
● Continue to improve out-of-the-box performance
● System configuration can play a key part

Questions?

Agenda

● Introduction / High Level Overview

● Test Strategy

● Test Results

● Wrap Up

● Reference

06/12 Sessions

Time Title

10:40 AM – 11:40 AM Introduction to Red Hat OpenStack

2:30 PM - 3:30 PM Introduction & Overview of OpenStack for IaaS Clouds

3:40 PM - 4:40 PM Red Hat IaaS Overview & Roadmap

3:40 PM - 4:40 PM Integration of Storage, OpenStack & Virtualization

06/13 Sessions

Time Title

10:40 AM – 11:40 AM KVM Hypervisor Roadmap & Technology Update

2:30 PM - 3:30 PM Migrating 1,000 VMs from VMware to Red Hat Enterprise Virtualization: A Case Study

3:40 PM - 4:40 PM War Stories from the Cloud: Lessons from US Defense Agencies

4:50 PM - 5:50 PM Red Hat Virtualization Deep Dive

4:50 PM - 5:50 PM Red Hat Enterprise Virtualization Performance

4:50 PM - 5:50 PM Real world perspectives: Gaining Competitive Advantages with Red Hat Solutions

06/14 Sessions

Time Title

11:00 AM - 12:00 PM Network Virtualization & Software-defined Networking

9:45 AM - 10:45 AM Hypervisor Technology Comparison & Migration

Reference Architectures

Two places to get Red Hat reference architectures

● Red Hat resource library www.redhat.com

● Free

● Red Hat customer portal https://access.redhat.com

● Requires user account

● Scripts and configuration files provided