25
Julien Danjou jd__@Freenode // @juldanjou [email protected] Ceilometer Heat Alarming Alarming Nick Barcet nijaba@Freenode // @nijaba [email protected] Eoghan Glynn eglynn@Freenode [email protected] ceilometer

Ceilometer + Heat = Alarming

Embed Size (px)

DESCRIPTION

Ceilometer is a tool that collects usage and performance data, while Heat orchestrates complex deployments on top of OpenStack. Heat aims to autoscale its deployments, scaling up when they're running hot and scaling back when idle. Ceilometer can access decisive data and trigger the appropriate actions in Heat. The result of these two OpenStack projects meeting is value creation in the form of an alarming API in Ceilometer and its consumption in Heat. Slides presented at the Fall OpenStack Design Summit in Hong Kong

Citation preview

Page 1: Ceilometer + Heat = Alarming

Julien Danjoujd__@Freenode // @juldanjou [email protected]

Ceilometer Heat Alarming

Alarming

Nick Barcetnijaba@Freenode // @nijaba

[email protected]

Eoghan Glynneglynn@Freenode

[email protected]

ceilometer

Page 2: Ceilometer + Heat = Alarming

Speakers● Nick Barcet co-founded the Ceilometer project at the

Folsom summit and led the project through incubation

● Julien Danjou has been a core Ceilometer contributor from the outset, taking over the PTL reins for Havana

● Eoghan Glynn drove the addition of the Alarming feature to Ceilometer over the Havana cycle

Page 3: Ceilometer + Heat = Alarming

Two seemingly disjoint projects intersect● Heat is a template-driven orchestration engine

○ automates complex deployments via declarative configuration

● Ceilometer is a metering infrastructure○ collects data measuring resource usage and

performance

● Appear on the surface to have minimal commonality ...

Page 4: Ceilometer + Heat = Alarming

Ceilometer Workflow

● Collect from OpenStack components● Transform metering data if necessary● Publish meters to any destination (including

Ceilometer itself)● Store received meters● Aggregate samples via a REST API

Collect Transform Publish Store Aggregate

Page 5: Ceilometer + Heat = Alarming

Heat Workflow { "AWSTemplateFormat" : "2010-09-09", "Parameters": { "VolumeSize" : { … } } "Mappings": { "Flavor2Arch" : { "tiny": {"Arch" : "64" }, ... }, "Resources": { "MyInstance" : { "Type" : "AWS::EC2::Instance", "Properties" : { “Volumes” : […] } } } }, "Outputs": { "DNS" : { "Value" : { … } } } }

my_stack.template

Page 6: Ceilometer + Heat = Alarming

Heat Workflow { "AWSTemplateFormat" : "2010-09-09", "Parameters": { "VolumeSize" : { … } } "Mappings": { "Flavor2Arch" : { "tiny": {"Arch" : "64" }, ... }, "Resources": { "MyInstance" : { "Type" : "AWS::EC2::Instance", "Properties" : { “Volumes” : […] } } } }, "Outputs": { "DNS" : { "Value" : { … } } } }

consumed by

Heat Engine

Page 7: Ceilometer + Heat = Alarming

Heat Workflow consumed by

Heat Engineinteracts with

{ "AWSTemplateFormat" : "2010-09-09", "Parameters": { "VolumeSize" : { … } } "Mappings": { "Flavor2Arch" : { "tiny": {"Arch" : "64" }, ... }, "Resources": { "MyInstance" : { "Type" : "AWS::EC2::Instance", "Properties" : { “Volumes” : […] } } } }, "Outputs": { "DNS" : { "Value" : { … } } } }

Page 8: Ceilometer + Heat = Alarming

Heat Workflow { "AWSTemplateFormat" : "2010-09-09", "Parameters": { "VolumeSize" : { … } } "Mappings": { "Flavor2Arch" : { "tiny": {"Arch" : "64" }, ... }, "Resources": { "MyInstance" : { "Type" : "AWS::EC2::Instance", "Properties" : { “Volumes” : […] } } } }, "Outputs": { "DNS" : { "Value" : { … } } } }

consumed by

Heat Engineinteracts with

spins upInstance Volume

my_stack

Page 9: Ceilometer + Heat = Alarming

Heat Autoscaling v1.0

reports load

push-stats

CW-lite

my_stack

Page 10: Ceilometer + Heat = Alarming

Heat Autoscaling v1.0

Heat Engine

my_stack

reports load scales out

stack

InstanceInstance

Instance

Page 11: Ceilometer + Heat = Alarming

Heat Autoscaling v1.0

Heat Engine

my_stack

reports load scales out

stack

InstanceInstance

InstanceInstance

Page 12: Ceilometer + Heat = Alarming

Ceilometer to the rescue!● compute agent already collects most

relevant stats from outside the instance● API service exposes aggregation over the

evaluation window● define new API exposing alarm lifecycle● provide new service to evaluate alarms

against their defined rules● additional service driving asynchronous

notifications when alarms fire

Page 13: Ceilometer + Heat = Alarming

How it all hangs together{ "AWSTemplateFormat" : "2010-09-09", "Parameters": { "VolumeSize" : { … } } "Mappings": { "Flavor2Arch" : { "tiny": {"Arch" : "64" }, ... }, "Resources": { "MyInstance" : { "Type" : "AWS::EC2::Instance", "Properties" : { “Volumes” : […] } } } }, "Outputs": { "DNS" : { "Value" : { … } } } }

added to template● alarms bounding busy/idleness of

instances● membership of autoscale group

represented via user metadata● alarm actions refer to scale

up/down policies● action URLs are pre-signed● policies define adjustment step size

& cooldown period

Page 14: Ceilometer + Heat = Alarming

How it all hangs together"CPUAlarmHigh": {

"Type": "OS::Metering::Alarm",

"Properties": {

"meter_name": "cpu_util", threshold: "75" "evaluation_periods": "5", "period": "60", "statistic": "avg", "comparison_operator": "gt", "description": "Scale-up if CPU > 75% for 300s",

"alarm_actions":[…"ScaleUpPolicy", "AlarmUrl"…], "matching_metadata": {

"metadata.user_metadata.server_group":

"MyWebServerGroup"}}}

Page 15: Ceilometer + Heat = Alarming

How it all hangs together

Heat Engine

injects user metadata

Instance

my_stack

Page 16: Ceilometer + Heat = Alarming

How it all hangs together

Heat Engine

injects user metadata

Instance

my_stack

API service

creates alarms

Ceilom

eter

Page 17: Ceilometer + Heat = Alarming

How it all hangs together

Heat Engine

injects user metadata

Instance

my_stack

API service

Compute Agent

creates alarms

monitors instances

Ceilom

eter

Page 18: Ceilometer + Heat = Alarming

How it all hangs together

Heat Engine

injects user metadata

Instance

my_stack

API service

Compute Agent

creates alarms

Alarm evaluator

monitors instances

triggers alarm

Ceilom

eter

Page 19: Ceilometer + Heat = Alarming

How it all hangs together

Heat Engine

injects user metadata

my_stack

API

Compute

Alarms

alarming

scales out stack

InstanceInstance

Instance

Ceilom

eter

Page 20: Ceilometer + Heat = Alarming

How it all hangs together

Heat Engine

injects user metadata

my_stack

API

Compute

Alarms

alarming

scales out stack

InstanceInstanceInstance

InstanceInstance

Ceilom

eter

Page 21: Ceilometer + Heat = Alarming

How it all hangs together

Heat Engine

my_stack

Instance

API service

Compute Agent

Alarm evaluatorreports

samples

provides alarm rules

queries statsMeter store

Ceilom

eter

Page 22: Ceilometer + Heat = Alarming

Lessons learnedKeys to successful intra-project interactions:

● buy-in from stakeholders on both sides● early validation and proof-points● protect consuming project from churn during

the development cycle● split deliverables into bite-sized separately

consumable chunks

Page 23: Ceilometer + Heat = Alarming

● expand metering coverage to also capture:○ memory utilization %○ LBaaS statistics○ network & disk I/O rates

● add combination alarm support to Heat templates○ allow thresholds over multiple metrics to be modeled

● exclude low-quality datapoints○ avoid scaling when only outliers have reported metrics

Future directions

Page 24: Ceilometer + Heat = Alarming

● monitor baremetal via IPMI or SNMP○ autoscale groups of hosts managed as Ironic instances

● constrain alarms for time-of-day or day-of-week○ e.g. set the bar higher on weekends, lower on weekdays

● decouple autoscaling usage from Heat templates

● authenticate webhook calls with keystone trusts○ avoid ec2-signer use without keystone EC2 tokens ext

Future directions

Page 25: Ceilometer + Heat = Alarming

Further questions?● Chat on Freenode:

○ #openstack-metering○ #heat

● Mail the dev list:○ [email protected]

● Harangue us via Launchpad:○ https://launchpad.net/ceilometer/+filebug