Upload
nick-nicolas-barcet
View
2.450
Download
0
Embed Size (px)
DESCRIPTION
Ceilometer is a tool that collects usage and performance data, while Heat orchestrates complex deployments on top of OpenStack. Heat aims to autoscale its deployments, scaling up when they're running hot and scaling back when idle. Ceilometer can access decisive data and trigger the appropriate actions in Heat. The result of these two OpenStack projects meeting is value creation in the form of an alarming API in Ceilometer and its consumption in Heat. Slides presented at the Fall OpenStack Design Summit in Hong Kong
Citation preview
Julien Danjoujd__@Freenode // @juldanjou [email protected]
Ceilometer Heat Alarming
Alarming
Nick Barcetnijaba@Freenode // @nijaba
Eoghan Glynneglynn@Freenode
ceilometer
Speakers● Nick Barcet co-founded the Ceilometer project at the
Folsom summit and led the project through incubation
● Julien Danjou has been a core Ceilometer contributor from the outset, taking over the PTL reins for Havana
● Eoghan Glynn drove the addition of the Alarming feature to Ceilometer over the Havana cycle
Two seemingly disjoint projects intersect● Heat is a template-driven orchestration engine
○ automates complex deployments via declarative configuration
● Ceilometer is a metering infrastructure○ collects data measuring resource usage and
performance
● Appear on the surface to have minimal commonality ...
Ceilometer Workflow
● Collect from OpenStack components● Transform metering data if necessary● Publish meters to any destination (including
Ceilometer itself)● Store received meters● Aggregate samples via a REST API
Collect Transform Publish Store Aggregate
Heat Workflow { "AWSTemplateFormat" : "2010-09-09", "Parameters": { "VolumeSize" : { … } } "Mappings": { "Flavor2Arch" : { "tiny": {"Arch" : "64" }, ... }, "Resources": { "MyInstance" : { "Type" : "AWS::EC2::Instance", "Properties" : { “Volumes” : […] } } } }, "Outputs": { "DNS" : { "Value" : { … } } } }
my_stack.template
Heat Workflow { "AWSTemplateFormat" : "2010-09-09", "Parameters": { "VolumeSize" : { … } } "Mappings": { "Flavor2Arch" : { "tiny": {"Arch" : "64" }, ... }, "Resources": { "MyInstance" : { "Type" : "AWS::EC2::Instance", "Properties" : { “Volumes” : […] } } } }, "Outputs": { "DNS" : { "Value" : { … } } } }
consumed by
Heat Engine
Heat Workflow consumed by
Heat Engineinteracts with
{ "AWSTemplateFormat" : "2010-09-09", "Parameters": { "VolumeSize" : { … } } "Mappings": { "Flavor2Arch" : { "tiny": {"Arch" : "64" }, ... }, "Resources": { "MyInstance" : { "Type" : "AWS::EC2::Instance", "Properties" : { “Volumes” : […] } } } }, "Outputs": { "DNS" : { "Value" : { … } } } }
Heat Workflow { "AWSTemplateFormat" : "2010-09-09", "Parameters": { "VolumeSize" : { … } } "Mappings": { "Flavor2Arch" : { "tiny": {"Arch" : "64" }, ... }, "Resources": { "MyInstance" : { "Type" : "AWS::EC2::Instance", "Properties" : { “Volumes” : […] } } } }, "Outputs": { "DNS" : { "Value" : { … } } } }
consumed by
Heat Engineinteracts with
spins upInstance Volume
my_stack
Heat Autoscaling v1.0
reports load
push-stats
CW-lite
my_stack
Heat Autoscaling v1.0
Heat Engine
my_stack
reports load scales out
stack
InstanceInstance
Instance
Heat Autoscaling v1.0
Heat Engine
my_stack
reports load scales out
stack
InstanceInstance
InstanceInstance
Ceilometer to the rescue!● compute agent already collects most
relevant stats from outside the instance● API service exposes aggregation over the
evaluation window● define new API exposing alarm lifecycle● provide new service to evaluate alarms
against their defined rules● additional service driving asynchronous
notifications when alarms fire
How it all hangs together{ "AWSTemplateFormat" : "2010-09-09", "Parameters": { "VolumeSize" : { … } } "Mappings": { "Flavor2Arch" : { "tiny": {"Arch" : "64" }, ... }, "Resources": { "MyInstance" : { "Type" : "AWS::EC2::Instance", "Properties" : { “Volumes” : […] } } } }, "Outputs": { "DNS" : { "Value" : { … } } } }
added to template● alarms bounding busy/idleness of
instances● membership of autoscale group
represented via user metadata● alarm actions refer to scale
up/down policies● action URLs are pre-signed● policies define adjustment step size
& cooldown period
How it all hangs together"CPUAlarmHigh": {
"Type": "OS::Metering::Alarm",
"Properties": {
"meter_name": "cpu_util", threshold: "75" "evaluation_periods": "5", "period": "60", "statistic": "avg", "comparison_operator": "gt", "description": "Scale-up if CPU > 75% for 300s",
"alarm_actions":[…"ScaleUpPolicy", "AlarmUrl"…], "matching_metadata": {
"metadata.user_metadata.server_group":
"MyWebServerGroup"}}}
How it all hangs together
Heat Engine
injects user metadata
Instance
my_stack
How it all hangs together
Heat Engine
injects user metadata
Instance
my_stack
API service
creates alarms
Ceilom
eter
How it all hangs together
Heat Engine
injects user metadata
Instance
my_stack
API service
Compute Agent
creates alarms
monitors instances
Ceilom
eter
How it all hangs together
Heat Engine
injects user metadata
Instance
my_stack
API service
Compute Agent
creates alarms
Alarm evaluator
monitors instances
triggers alarm
Ceilom
eter
How it all hangs together
Heat Engine
injects user metadata
my_stack
API
Compute
Alarms
alarming
scales out stack
InstanceInstance
Instance
Ceilom
eter
How it all hangs together
Heat Engine
injects user metadata
my_stack
API
Compute
Alarms
alarming
scales out stack
InstanceInstanceInstance
InstanceInstance
Ceilom
eter
How it all hangs together
Heat Engine
my_stack
Instance
API service
Compute Agent
Alarm evaluatorreports
samples
provides alarm rules
queries statsMeter store
Ceilom
eter
Lessons learnedKeys to successful intra-project interactions:
● buy-in from stakeholders on both sides● early validation and proof-points● protect consuming project from churn during
the development cycle● split deliverables into bite-sized separately
consumable chunks
● expand metering coverage to also capture:○ memory utilization %○ LBaaS statistics○ network & disk I/O rates
● add combination alarm support to Heat templates○ allow thresholds over multiple metrics to be modeled
● exclude low-quality datapoints○ avoid scaling when only outliers have reported metrics
Future directions
● monitor baremetal via IPMI or SNMP○ autoscale groups of hosts managed as Ironic instances
● constrain alarms for time-of-day or day-of-week○ e.g. set the bar higher on weekends, lower on weekdays
● decouple autoscaling usage from Heat templates
● authenticate webhook calls with keystone trusts○ avoid ec2-signer use without keystone EC2 tokens ext
Future directions
Further questions?● Chat on Freenode:
○ #openstack-metering○ #heat
● Mail the dev list:○ [email protected]
● Harangue us via Launchpad:○ https://launchpad.net/ceilometer/+filebug