
Page 1: Oreilly solinea-managing-openstack

Accelerating the adoption of Cloud Computing

Beyond Installation: Managing Your OpenStack Cloud

May 6th, 2014

Page 2: Oreilly solinea-managing-openstack

Speakers

•  Ken Pepple is the co-founder and Chief Technology Officer of Solinea
•  Prior to founding Solinea, he led the introduction of Internap's OpenStack-based public cloud services while serving as their Director of Cloud Development
•  Code contributor since the Bexar release of OpenStack
•  Author of O'Reilly Media's "Deploying OpenStack" and several other books

Page 3: Oreilly solinea-managing-openstack

Introduction

•  Installing OpenStack gets all the attention …
•  … but distributions like Red Hat OSP and Cloudscaling are attacking this problem
•  They will beat it. Then what?
•  The reality is that OpenStack management is what we should be focusing on
•  Installation is 2–3 weeks … management is forever


Page 4: Oreilly solinea-managing-openstack

OpenStack Architecture

[Figure: OpenStack architecture diagram (http://www.solinea.com). Each service is shown with its internal components, and the services are linked by their HTTP(S) APIs:
  Identity Service: keystone (service & admin APIs) with identity, token, catalog and policy backends
  Dashboard: Horizon, with memcached
  Compute: nova-api (OS, EC2, Metadata, Admin), nova-scheduler, nova-conductor, nova-compute, nova-consoleauth, nova-console, nova-*proxy (VNC/Spice), nova-cert/objectstore, queue, nova database, hypervisor (libvirt, XenAPI, etc.)
  Image Service: glance-api, glance database
  Object Store: swift-proxy, object/container/account servers, object store, account DB, container DB
  Block Storage: cinder-api, cinder-scheduler, cinder-volume, cinder-backup, queue, cinder database, volume provider
  Network Service: neutron-server, neutron plugin(s), neutron agent(s), queue, neutron database, network provider
  Orchestration: heat-api, cloudwatch-api, heat-engine, queue, heat database
  Database: trove-api, trove-taskmgr, trove-conductor, agent, queue, trove database
  Clients reaching the cloud over the Internet / enterprise network: OpenStack command line tools (nova-client, swift-client, etc.), cloud management tools (Rightscale, Enstratius, etc.), GUI tools (Cyberduck, iPhone client, etc.), and the Amazon Web Services EC2 API]

* Ceilometer omitted for clarity

Page 5: Oreilly solinea-managing-openstack

OpenStack Deployment


Page 6: Oreilly solinea-managing-openstack

OpenStack Management Basics

•  Development and test cluster
    –  Smaller, but representative
    –  Same set and version of services
    –  Reproduce problems, test fixes and practice upgrades
•  Configuration management system
    –  Chef, Puppet, Ansible, SaltStack, etc.
    –  Your OpenStack distribution already uses one
    –  Pick one and stick with it: everything falls under it
•  Skilled and trained staff
    –  Experienced Linux admins with virtualization skills
    –  Network architects who understand cloud
    –  Trained for OpenStack


Page 7: Oreilly solinea-managing-openstack

Developing a Toolkit for Management

•  Troubleshooting tools
    –  Operating system level tools
    –  OpenStack specific tools
•  Administration tools
    –  OpenStack specific tools
•  Monitoring tools
    –  Monitoring platforms
    –  Log management tools
•  Specialized tools
    –  Image creation


Page 8: Oreilly solinea-managing-openstack

Troubleshooting Tools

•  Tools used to investigate or fix problems within your stack
•  Mostly Linux tools, but some are specific to OpenStack
•  These need to span virtualization, networking and normal system administration


Page 9: Oreilly solinea-managing-openstack

Troubleshooting the Hypervisor

•  Tooling varies by hypervisor; each one has its own
•  Map a VM to its hypervisor from the OpenStack CLI with nova show
•  Investigate the hypervisor through the virsh tools
•  The VM's backing store can also be accessed through the hypervisor mount point or the Cinder volume


Page 10: Oreilly solinea-managing-openstack

VM Troubleshooting

    # nova list
    +---------------------+------+--------+------------+-------------+----------------------+
    | ID                  | Name | Status | Task State | Power State | Networks             |
    +---------------------+------+--------+------------+-------------+----------------------+
    | f94b097d-b030-473b- | ken  | ACTIVE | -          | Running     | rdonet=192.168.90.11 |
    +---------------------+------+--------+------------+-------------+----------------------+

    # nova show f94b097d-b030-473b-86a3-d501091c650b
    +--------------------------------------+------------------------------------------------------------+
    | Property                             | Value                                                      |
    +--------------------------------------+------------------------------------------------------------+
    | OS-EXT-AZ:availability_zone          | nova                                                       |
    | OS-EXT-SRV-ATTR:host                 | localhost.localdomain                                      |
    | OS-EXT-SRV-ATTR:hypervisor_hostname  | localhost.localdomain                                      |
    | OS-EXT-SRV-ATTR:instance_name        | instance-0000000e                                          |
    | OS-EXT-STS:power_state               | 1                                                          |
    | OS-EXT-STS:task_state                | -                                                          |
    | OS-EXT-STS:vm_state                  | active                                                     |
    | OS-SRV-USG:launched_at               | 2014-05-06T06:13:01.000000                                 |
    | created                              | 2014-05-06T06:11:55Z                                       |
    | flavor                               | m1.small (2)                                               |
    | hostId                               | 7e31bda83a3586907464e8e68f83a035bf9fa500d9579b2b807fa9f0   |
    | id                                   | f94b097d-b030-473b-86a3-d501091c650b                       |
    | image                                | cirros-0.3.2-x86_64 (f66d54e8-f8bd-4220-930f-86b6b44dfe4d) |
    | rdonet network                       | 192.168.90.11                                              |
    | security_groups                      | default                                                    |
    | status                               | ACTIVE                                                     |
    +--------------------------------------+------------------------------------------------------------+

    # virsh list
     Id    Name                           State
    ----------------------------------------------------
     1     instance-0000000e              running
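The mapping from a Nova instance to its hypervisor host and libvirt domain name can be scripted. The following is a minimal sketch, assuming the two-column table layout shown in the nova show output on this slide; the function names are illustrative and not part of any OpenStack client library.

```python
def parse_cli_table(text):
    """Parse a two-column '| Property | Value |' CLI table into a dict."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the +-----+ border lines and blanks
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 2 or cells[0] == "Property":
            continue  # skip the header row and anything unexpected
        props[cells[0]] = cells[1]
    return props


def locate_instance(nova_show_output):
    """Return where the VM runs and what libvirt calls it."""
    props = parse_cli_table(nova_show_output)
    return {
        "host": props.get("OS-EXT-SRV-ATTR:host"),
        "hypervisor": props.get("OS-EXT-SRV-ATTR:hypervisor_hostname"),
        "libvirt_name": props.get("OS-EXT-SRV-ATTR:instance_name"),
    }


# Rows trimmed from the nova show output on the slide.
SAMPLE = """\
+-------------------------------------+-----------------------+
| Property                            | Value                 |
+-------------------------------------+-----------------------+
| OS-EXT-SRV-ATTR:host                | localhost.localdomain |
| OS-EXT-SRV-ATTR:hypervisor_hostname | localhost.localdomain |
| OS-EXT-SRV-ATTR:instance_name       | instance-0000000e     |
| status                              | ACTIVE                |
+-------------------------------------+-----------------------+
"""

if __name__ == "__main__":
    print(locate_instance(SAMPLE))
```

The libvirt_name value is the name to look for in virsh list on the host named by the hypervisor field.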


Page 11: Oreilly solinea-managing-openstack

Troubleshooting Backing Store (Ephemeral)

    # cd /var/lib/nova/
    # ls
    buckets  CA  images  instances  keys  networks  tmp

    # cd instances/
    # ll
    total 16
    drwxr-xr-x. 2 nova nova 4096 May  6 09:03 13e86b72-7e14-43f5-ab2f-e7abf117213f
    drwxr-xr-x. 2 nova nova 4096 May  2 11:25 _base
    -rw-r--r--. 1 nova nova   45 May  5 23:18 compute_nodes
    drwxr-xr-x. 2 nova nova 4096 Apr 30 19:28 locks

    # cd 13e86b72-7e14-43f5-ab2f-e7abf117213f/
    # ll
    total 208
    -rw-rw----. 1 qemu qemu      0 May  6 09:04 console.log
    -rw-r--r--. 1 qemu qemu 262656 May  6 09:03 disk
    -rw-r--r--. 1 nova nova     79 May  6 09:03 disk.info
    -rw-r--r--. 1 nova nova   1529 May  6 09:04 libvirt.xml

    # file disk
    disk: Qemu Image, Format: Qcow , Version: 2


The ‘disk’ file is our qcow image. The XML file is the KVM template.
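The check that file performs on the disk can be reproduced in a few lines. This is a minimal sketch, assuming the /var/lib/nova/instances layout shown on this slide; the helper names are hypothetical. It relies on the qcow header starting with the magic bytes "QFI\xfb" followed by a big-endian 32-bit version number.

```python
import posixpath
import struct

INSTANCES_DIR = "/var/lib/nova/instances"  # path taken from the slide
QCOW_MAGIC = b"QFI\xfb"                    # first four bytes of a qcow image


def instance_disk_path(instance_uuid, base=INSTANCES_DIR):
    """Build the path of an instance's ephemeral disk file."""
    return posixpath.join(base, instance_uuid, "disk")


def qcow_version(header):
    """Return the qcow version from the first 8 header bytes, or None."""
    if len(header) < 8 or header[:4] != QCOW_MAGIC:
        return None  # not a qcow image (for example, a raw disk)
    (version,) = struct.unpack(">I", header[4:8])
    return version


def inspect_disk(path):
    """Read just enough of the file to identify it."""
    with open(path, "rb") as disk:
        return qcow_version(disk.read(8))
```

On a compute node, inspect_disk(instance_disk_path("13e86b72-7e14-43f5-ab2f-e7abf117213f")) would be expected to return 2 for the disk shown above.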

Page 12: Oreilly solinea-managing-openstack

Troubleshooting Network

•  Combination of Linux, Open vSwitch and OpenStack tools
•  OpenStack tools will show the logical configuration of Neutron's ports, routers and subnets
    –  neutron port-list, net-list and router-list
•  Open vSwitch will map internal and external bridges
    –  ovs-vsctl and ovs-dpctl
•  Linux tools will show you inside VLANs and Linux namespaces
    –  ip netns, iptables and tcpdump


Page 13: Oreilly solinea-managing-openstack

Network Troubleshooting: Router

    # ip netns show
    qdhcp-8a496b23-ef2c-4170-9919-611d9a12180f
    qrouter-e439ff2a-1973-4cda-86a4-20c977eec828
    qdhcp-16b5549e-3a1a-4254-b122-f7507f003597

    # ip netns exec qrouter-e439ff2a-1973-4cda-86a4-20c977eec828 ifconfig
    qg-bbe18331-0c Link encap:Ethernet  HWaddr FA:16:3E:47:63:D0
              inet addr:192.168.57.132  Bcast:192.168.57.255  Mask:255.255.255.0
              inet6 addr: fe80::f816:3eff:fe47:63d0/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:1335 errors:0 dropped:0 overruns:0 frame:0
              TX packets:145 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:197195 (192.5 KiB)  TX bytes:13110 (12.8 KiB)

    qr-c4e2b047-4a Link encap:Ethernet  HWaddr FA:16:3E:FD:4E:A9
              inet addr:192.168.90.1  Bcast:192.168.90.255  Mask:255.255.255.0
              inet6 addr: fe80::f816:3eff:fefd:4ea9/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:364 errors:0 dropped:0 overruns:0 frame:0
              TX packets:309 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:34760 (33.9 KiB)  TX bytes:36569 (35.7 KiB)

    # ip netns exec qrouter-e439ff2a-1973-4cda-86a4-20c977eec828 netstat -nr
    Kernel IP routing table
    Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
    192.168.57.0    0.0.0.0         255.255.255.0   U         0 0          0 qg-bbe18331-0c
    192.168.90.0    0.0.0.0         255.255.255.0   U         0 0          0 qr-c4e2b047-4a
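Neutron names its namespaces after the resource they serve: qdhcp- followed by a network ID, and qrouter- followed by a router ID. The following is a minimal sketch that sorts ip netns output by type and builds the ip netns exec command line used above; the function names are illustrative only.

```python
import re

# Namespace name = prefix + UUID of the Neutron network (qdhcp) or router (qrouter).
NETNS_RE = re.compile(
    r"^(qdhcp|qrouter)-([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})$"
)


def classify_namespaces(ip_netns_output):
    """Group namespace names from 'ip netns' by the resource they serve."""
    found = {"qdhcp": [], "qrouter": []}
    for line in ip_netns_output.splitlines():
        parts = line.split()
        if not parts:
            continue
        # Only the first token is the name; ignore anything printed after it.
        match = NETNS_RE.match(parts[0])
        if match:
            found[match.group(1)].append(match.group(2))
    return found


def netns_exec(namespace, *command):
    """Build the argv for running a command inside a namespace."""
    return ["ip", "netns", "exec", namespace] + list(command)


# Output copied from the slide.
SAMPLE = """\
qdhcp-8a496b23-ef2c-4170-9919-611d9a12180f
qrouter-e439ff2a-1973-4cda-86a4-20c977eec828
qdhcp-16b5549e-3a1a-4254-b122-f7507f003597
"""

if __name__ == "__main__":
    for router_id in classify_namespaces(SAMPLE)["qrouter"]:
        print(" ".join(netns_exec("qrouter-" + router_id, "netstat", "-nr")))
```

The router IDs returned here can be compared against neutron router-list to tie each namespace back to its logical router.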


Page 14: Oreilly solinea-managing-openstack

Troubleshooting OVS Bridges

    # ovs-vsctl show
    06667946-811b-4c7b-97a5-eafc8386e9ff
        Bridge br-int
            Port "qvo246622d1-02"
                tag: 2
                Interface "qvo246622d1-02"
            Port "tap3dfc8b70-ee"
                tag: 1
                Interface "tap3dfc8b70-ee"
            Port "tapecef7610-4f"
                tag: 2
                Interface "tapecef7610-4f"
            Port "tapc4e2b047-4a"
                tag: 2
                Interface "tapc4e2b047-4a"
            Port br-int
                Interface br-int
                    type: internal
        Bridge br-ex
            Port br-ex
                Interface br-ex
                    type: internal
            Port "eth1"
                Interface "eth1"
            Port "tapbbe18331-0c"
                Interface "tapbbe18331-0c"
        ovs_version: "1.11.0"

br-int: Neutron’s integration bridge connecting VMs
br-ex: Neutron’s external bridge
eth1: Physical NIC for internet access
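Ports on br-int that carry the same tag sit on the same local VLAN, which the Open vSwitch agent assigns per Neutron network, so grouping ports by tag shows which devices can reach each other. The following is a minimal sketch that parses the indented ovs-vsctl show layout seen on this slide; the function names are illustrative.

```python
def parse_ovs_show(text):
    """Parse 'ovs-vsctl show' output into {bridge: {port: tag_or_None}}."""
    bridges = {}
    bridge = port = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Bridge "):
            bridge = line.split(None, 1)[1].strip('"')
            bridges[bridge] = {}
            port = None
        elif line.startswith("Port ") and bridge is not None:
            port = line.split(None, 1)[1].strip('"')
            bridges[bridge][port] = None  # untagged until a tag line follows
        elif line.startswith("tag:") and port is not None:
            bridges[bridge][port] = int(line.split(":", 1)[1])
    return bridges


def ports_by_tag(bridges, bridge):
    """Group a bridge's ports by local VLAN tag."""
    grouped = {}
    for port, tag in bridges.get(bridge, {}).items():
        grouped.setdefault(tag, []).append(port)
    return grouped


# Trimmed from the output on the slide.
SAMPLE = """\
06667946-811b-4c7b-97a5-eafc8386e9ff
    Bridge br-int
        Port "qvo246622d1-02"
            tag: 2
            Interface "qvo246622d1-02"
        Port "tap3dfc8b70-ee"
            tag: 1
            Interface "tap3dfc8b70-ee"
        Port "tapc4e2b047-4a"
            tag: 2
            Interface "tapc4e2b047-4a"
    Bridge br-ex
        Port "eth1"
            Interface "eth1"
    ovs_version: "1.11.0"
"""

if __name__ == "__main__":
    print(ports_by_tag(parse_ovs_show(SAMPLE), "br-int"))
```

In the sample, the VM port qvo246622d1-02 and the router port tapc4e2b047-4a share tag 2, which matches the qr-c4e2b047-4a interface seen inside the router namespace.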

Page 15: Oreilly solinea-managing-openstack

Monitoring

•  Metering is not monitoring
    –  Ceilometer isn’t a monitoring solution
•  Horizon doesn’t save history
•  Monitor for FCAPS: fault, configuration, accounting, performance and security
•  Needs to be instrumented at multiple levels
    –  Hardware/Operating System, OpenStack, VM
    –  Although VM monitoring may be left to the user
•  Needs to be used across all elements and processes


Page 16: Oreilly solinea-managing-openstack

Operating System Monitoring

•  Requires the same set of information as any other set of systems
    –  CPU, memory, availability, etc.
•  Process level information
    –  RabbitMQ, database, OpenStack processes, etc.
•  Should rely on each host sending information to the monitoring server (not a ping model)
•  Ideally has APIs and strong discovery to aid automation


Page 17: Oreilly solinea-managing-openstack

Nagios

•  Installed as part of many distributions
•  Open source
•  Easy installation and usage
•  API is an add-on module


Page 18: Oreilly solinea-managing-openstack

Nagios OpenStack Plugin

•  Adds service checks for some OpenStack services
    –  Glance
    –  Keystone
    –  Nova
    –  Swift API and dispersion
•  Available in most Linux distributions
    –  # sudo apt-get install nagios-plugins-openstack
•  More information and checks available at http://openstack.prov12n.com/monitoring-openstack-nagios-3/


Page 19: Oreilly solinea-managing-openstack

Zabbix Console

•  Open source monitoring tool used at several large service provider clouds
•  Strong API and discovery modes
•  Templates can be applied to host groups for monitoring


Page 20: Oreilly solinea-managing-openstack

Zabbix Templates

•  Templates created for each type of server
    –  Compute nodes, controllers, Swift object servers, etc.
•  Each template checks that processes are running and that configuration management is running
    –  Checks should issue commands against the processes rather than rely on the process table, so that hung processes are caught
•  All nodes also get the default OS template
•  Alerting set up through PagerDuty
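A process that is hung still appears in the process table, so an active check has to ask the service to do something and time the answer. The following is a minimal sketch of such a check, assuming the monitoring agent can run an arbitrary command; the function name and the 10-second default are illustrative, not part of Zabbix.

```python
import subprocess
import sys


def active_check(argv, timeout=10):
    """Run a command against a service; fail on error exit or on a hang.

    Returns (ok, detail). A timeout is treated as a failure, which is
    what catches a process that is alive but no longer responding.
    """
    try:
        proc = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, "timed out after %ss" % timeout
    except OSError as exc:
        return False, "could not run: %s" % exc
    if proc.returncode != 0:
        return False, "exit status %d" % proc.returncode
    return True, "ok"


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; on a controller this
    # might be ["rabbitmqctl", "status"] or ["nova", "service-list"].
    print(active_check([sys.executable, "-c", "pass"]))
```

The (ok, detail) pair maps naturally onto a numeric item plus a text item in a monitoring template.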


Page 21: Oreilly solinea-managing-openstack

Log Management

•  More than just for error viewing
•  Primary source of OpenStack data
•  Useful for
    –  Finding OpenStack bugs
    –  Understanding event timings (spinning up a new VM)
    –  Visualizing cluster level statistics (VMs running)
    –  Creating dashboards
•  Can be challenging to store, query and interpret data
    –  Clusters can generate GBs per day
    –  Use dedicated tools and data stores
    –  May be required for legal / audit reasons


Page 22: Oreilly solinea-managing-openstack

Splunk

•  Commercial log management solution
•  Visualization, ad hoc queries, post processing and add-ons
•  Easy to set up dashboards
•  Supported, with a relatively easy installer


Page 23: Oreilly solinea-managing-openstack

Logstash, Kibana and Elasticsearch

•  Open source alternative to Splunk
•  Requires more complicated setup to parse logs correctly
•  Provides ad hoc queries as well as dashboards
•  Active community


Page 24: Oreilly solinea-managing-openstack

Interesting Uses for Log Data

•  VMs
    –  CPUs/Instances by hypervisor (scheduler efficiency)
    –  Total vCPUs/CPUs in cluster available versus used
    –  Spawn successes and failures
    –  Spawn time
    –  Top users of VMs/vCPUs
•  Authentication
    –  Tokens generated versus invalidated
    –  Failed authentications
•  Errors
    –  All error messages / stack traces create an alert
•  Logs that have stopped (zombie processes)
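Two of the uses above, counting errors and spotting logs that have stopped, can be sketched in a few lines. This is a minimal sketch, assuming log lines of the form "date time.millis PID LEVEL module [request] message"; the sample lines, the 10-minute silence threshold and the function names are all hypothetical, and a real deployment would do this inside the log management tool.

```python
import re
from collections import Counter
from datetime import datetime, timedelta

# Assumed line layout: "2014-05-06 09:03:12.123 2214 INFO nova.compute.manager ..."
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+ \d+ (\w+) (\S+) "
)


def summarize(lines):
    """Return (error counts per service, last timestamp seen per service)."""
    errors = Counter()
    last_seen = {}
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # continuation lines such as stack trace bodies
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        level, module = match.group(2), match.group(3)
        service = module.split(".")[0]  # "nova.compute.manager" -> "nova"
        last_seen[service] = max(stamp, last_seen.get(service, stamp))
        if level in ("ERROR", "CRITICAL"):
            errors[service] += 1
    return errors, last_seen


def stalled(last_seen, now, max_silence=timedelta(minutes=10)):
    """Services whose logs have been silent for longer than max_silence."""
    return sorted(s for s, t in last_seen.items() if now - t > max_silence)


# Hypothetical sample lines, not taken from a real cluster.
SAMPLE = [
    "2014-05-06 09:03:12.123 2214 INFO nova.compute.manager [req-1] Starting instance",
    "2014-05-06 09:03:40.456 2214 ERROR nova.compute.manager [req-1] Instance failed to spawn",
    "    Traceback (most recent call last):",
    "2014-05-06 09:20:01.000 1876 INFO neutron.agent.dhcp_agent [-] Synchronizing state",
]

if __name__ == "__main__":
    errors, last_seen = summarize(SAMPLE)
    print(dict(errors), stalled(last_seen, datetime(2014, 5, 6, 9, 25, 0)))
```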


Page 25: Oreilly solinea-managing-openstack

“Canary” Scripts

•  Highest level check for cloud infrastructure: “Can we spin up a new VM?”
    –  Custom-written script that starts a VM, attaches block storage, assigns an IP address, pings the outside world, then terminates the VM
    –  Logs all actions with timings into the log management solution
•  Run every 5 to 15 minutes
•  Can also be run interactively
•  This should be written for your own site
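The shape of a canary script can be sketched independently of any particular site. This is a minimal sketch, assuming each step (boot, attach volume, assign IP, ping) is supplied as a callable that raises on failure; the step names, the JSON record fields and the run_canary function are hypothetical, and the callables would wrap whatever client your site uses.

```python
import json
import time


def _timed(name, action, log, clock):
    """Run one step, log its outcome and duration, return the record."""
    start = clock()
    try:
        action()
        status = "ok"
    except Exception as exc:  # any failure marks the step as failed
        status = "failed: %s" % exc
    record = {
        "canary_step": name,
        "status": status,
        "seconds": round(clock() - start, 3),
    }
    log(json.dumps(record, sort_keys=True))
    return record


def run_canary(steps, cleanup=None, log=print, clock=time.time):
    """Run (name, callable) steps in order, stopping at the first failure.

    cleanup always runs last so a failed canary does not leak a VM.
    Returns (all_ok, list of per-step records).
    """
    results = []
    for name, action in steps:
        results.append(_timed(name, action, log, clock))
        if results[-1]["status"] != "ok":
            break
    if cleanup is not None:
        results.append(_timed("terminate", cleanup, log, clock))
    return all(r["status"] == "ok" for r in results), results


if __name__ == "__main__":
    # Placeholder steps; real ones would call the cloud's APIs.
    steps = [
        ("boot_vm", lambda: None),
        ("attach_volume", lambda: None),
        ("assign_ip", lambda: None),
        ("ping_outside", lambda: None),
    ]
    print(run_canary(steps, cleanup=lambda: None)[0])
```

Emitting one JSON record per step lets the log management solution chart each timing separately and alert on any record whose status is not "ok".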


Page 26: Oreilly solinea-managing-openstack

Specialized Tools

•  Many sites will want to be able to create their own custom images
    –  CI/CD “golden images”
•  Several commercial and open source alternatives
    –  CohesiveFT Server3
    –  Elasticbox (https://www.elasticbox.com/)
    –  Packer (http://www.packer.io/)
•  All provide the ability to create images with specified software pre-installed via the command line


Page 27: Oreilly solinea-managing-openstack

Manageability Improvements in Icehouse

•  Nova live upgrade
•  Swift discovery API


Page 28: Oreilly solinea-managing-openstack

Rolling (“live”) Upgrades

•  Ability to upgrade a running cluster to a new release
•  Upgrade controller(s) first, then individual compute nodes
•  Requires several pre-conditions
    –  Neutron upgraded first
    –  Nova-conductor being used to isolate DB schemas
    –  Set icehouse compatibility mode

/etc/nova/nova.conf

    [upgrade_levels]
    # Set a version cap for messages sent to compute services. If
    # you plan to do a live upgrade from havana to icehouse, you
    # should set this option to "icehouse-compat" before beginning
    # the live upgrade procedure. (string value)
    compute=icehouse-compat
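Before starting a rolling upgrade it is worth confirming that the version cap is actually set on every node. The following is a minimal sketch using Python's standard configparser, assuming the compute option lives in the [upgrade_levels] section of nova.conf (as I understand the Icehouse layout); the function name is illustrative.

```python
import configparser


def compute_version_cap(conf_text):
    """Return the [upgrade_levels] compute cap from nova.conf text, or None."""
    # interpolation=None: nova.conf values may contain literal '%' characters.
    # strict=False: tolerate options that are repeated within a section.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    return parser.get("upgrade_levels", "compute", fallback=None)


# Hypothetical, heavily trimmed nova.conf content.
SAMPLE = """\
[DEFAULT]
verbose = True

[upgrade_levels]
compute = icehouse-compat
"""

if __name__ == "__main__":
    cap = compute_version_cap(SAMPLE)
    print("compute RPC cap:", cap or "not set")
```

Run through the configuration management system, a check like this can gate the upgrade until every node reports the expected cap.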


Page 29: Oreilly solinea-managing-openstack

Swift Discoverability

•  API calls to /info will return information about the cluster
•  Users are now able to take advantage of the unique features available in each cluster
•  Turned on by default but can be disabled

    # swift capabilities
    Core: swift
     Options:
      account_listing_limit: 10000
      container_listing_limit: 10000
      max_account_name_length: 256
      max_container_name_length: 256
      max_file_size: 5368709122
      max_meta_count: 90
      max_meta_name_length: 128
      max_meta_value_length: 256
      max_object_name_length: 1024
      strict_cors_mode: True
      version: 1.13.1
    Additional middleware: keystoneauth
    Additional middleware: staticweb
    Additional middleware: tempurl
     Options:
      methods: ['GET', 'HEAD', 'PUT']
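A client can read the same capabilities directly from the /info endpoint and adapt to the cluster it is talking to. The following is a minimal sketch using only the standard library, assuming /info returns a JSON object keyed by "swift" and by each middleware name; the helper names and the sample dictionary (built from the values on this slide) are illustrative.

```python
import json
from urllib.request import urlopen


def fetch_info(endpoint, timeout=10):
    """GET <endpoint>/info and decode the JSON body.

    endpoint is the proxy's base URL, for example a hypothetical
    "http://swift.example.com:8080".
    """
    with urlopen(endpoint.rstrip("/") + "/info", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def supports(info, middleware):
    """True if the cluster advertises the named middleware."""
    return middleware in info


def max_upload_bytes(info):
    """Largest single object the cluster accepts, if advertised."""
    return info.get("swift", {}).get("max_file_size")


# Shaped after the 'swift capabilities' output on the slide.
SAMPLE_INFO = {
    "swift": {"max_file_size": 5368709122, "version": "1.13.1"},
    "keystoneauth": {},
    "staticweb": {},
    "tempurl": {"methods": ["GET", "HEAD", "PUT"]},
}

if __name__ == "__main__":
    print(supports(SAMPLE_INFO, "tempurl"), max_upload_bytes(SAMPLE_INFO))
```

A client might use supports(info, "tempurl") to decide whether to offer temporary download links, and max_upload_bytes(info) to decide when an upload has to be split into segments.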

Page 30: Oreilly solinea-managing-openstack