04.05.2018, Lars Michelsen, Check_MK Conference #4
Cloud & container monitoring
Some cloud definitions
Stack layers: Networking, Storage, Servers, Virtualization, O/S, Middleware, Runtime, Data, Applications
Infrastructure-as-a-Service (IaaS): ‘Server in the cloud’, e.g. Amazon EC2, Azure VM, Google CE, Amazon S3, Azure Storage
Platform-as-a-Service (PaaS): ‘Runtime environment for applications’, e.g. Azure App Services, AWS Elastic Beanstalk
Software-as-a-Service (SaaS): ‘Applications’, e.g. Office 365, Salesforce, Netflix
Containers – what are they?
Virtual machines:
- Virtualization runs on a ‘hypervisor’
- Enables portability and increases utilization of infrastructure, but adds a resource burden
- Stack: Infrastructure, Hypervisor, VMs (VM 1, VM 2), each with its own Guest OS* and apps
Containers:
- Containers use the node's OS kernel
- Enables portability at high infrastructure efficiency
- Stack: Infrastructure, OS, Containerization, containers (Container 1, Container 2), each with apps and bins|libs
* incl. binaries and libraries
What is Docker on my host?
Node: Docker daemon, images, containers
Docker registry
From one to many, many containers
Kubernetes Master: container orchestration
Nodes A, B and C: each runs a Docker daemon with images, plus pods that group containers
Implications for monitoring
3. Fast-changing environments: Dynamic configuration
2. Containers as additional layer: Plugins
1. Cloud APIs (for PaaS / SaaS): Plugins
4. Single metrics become less relevant: Aggregated metrics
0) Intro
1) Cloud monitoring
2) Container monitoring
3) Dealing with dynamics
4) Metrics
IaaS is standard business
Service model   Public Cloud       Private Cloud
IaaS            Check_MK Agent     Check_MK Agent
PaaS            Planned via APIs   Planned via APIs
SaaS            Planned via APIs   Planned via APIs
Monitoring PaaS & SaaS via APIs
We build what is needed
Special agent and checks:
- Working on Azure (e.g. SQL database)
- Involved in several migrations (e.g. multi-region Azure)
- What do you need? AWS, Azure services, OpenStack, ...?
[Diagram: API, special agent, checks]
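A special agent of this kind can be sketched as a small program that queries a cloud API and prints the result as agent sections for checks to parse. The following is a minimal sketch, not the actual Azure agent: the section name `azure_sql_demo`, the field names, and the hard-coded sample data (standing in for an API response) are assumptions for illustration.

```python
def format_section(name, rows):
    """Render rows as one agent section: a <<<name>>> header, then one line per row."""
    lines = ["<<<%s>>>" % name]
    for row in rows:
        lines.append(" ".join(str(field) for field in row))
    return "\n".join(lines)


# Hypothetical sample data standing in for a cloud API response.
sample_databases = [
    {"name": "db1", "cpu_percent": 12.5, "storage_percent": 40.0},
    {"name": "db2", "cpu_percent": 87.0, "storage_percent": 91.2},
]

rows = [(db["name"], db["cpu_percent"], db["storage_percent"]) for db in sample_databases]
output = format_section("azure_sql_demo", rows)
print(output)
```

Keeping the API access separate from the output formatting means the check on the monitoring side only has to parse plain text lines.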
2) Container monitoring
Basics are already possible
Existing options:
- Checks (Docker on Check_MK Exchange)
- Process checks and other resources of Docker nodes
- Agent in Docker container
... but: need to care about configuration
… but we are currently developing a much broader feature set
Container-specific metrics + Dynamic configuration + Aggregated metrics
Container monitoring future
Phase I: Native container support
Phase II: Container orchestration
Phase III: Mgmt. for container & container orchestration
Version markers on the slide: 1.6, 1.6
Docker native check & inventory plug-ins (Phase I)
Node checks: System status, #images, #containers, disk usage
Node inventory: Version, labels, networks
Image inventory: Time created, labels, size, #containers (state)
Container checks: CPU, memory, disk IO, traffic, uptime, health
Container inventory: node running on, labels, networks
Agent with batteries included
- Docker ext. included in agent
- veth interfaces are ignored entirely
- ps check is aware of namespaces
- Container mounts are ignored entirely
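The interface filtering mentioned above can be illustrated with a short sketch. It assumes that the host-side ends of container network links carry Docker's default `veth` name prefix; the function name and the sample interface list are hypothetical, and the real agent logic may differ.

```python
def filter_node_interfaces(interface_names):
    """Drop veth* interfaces so only the node's own interfaces are monitored."""
    return [name for name in interface_names if not name.startswith("veth")]


# Hypothetical interface list of a Docker node.
sample = ["lo", "eth0", "docker0", "veth3a1f9c2", "vethb77e001"]
print(filter_node_interfaces(sample))
```

Without such a filter, every container start or stop would add or remove a monitored interface on the node.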
How Check_MK gets Docker data
How to get the data?
- Node agent: yes, container agent: no: execute the node's agent, contact via node
- Node agent: yes, container agent: yes: install agent in image, contact via node
- Node agent: no: install agent in image, contact agent via network
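The decision tree on these slides can be written down as a small function. This is a sketch of one reading of the slides, not Check_MK code; the function name and the returned strings are chosen for illustration.

```python
def choose_data_source(node_has_agent, container_has_agent):
    """Return (how the data is produced, how the agent is contacted)."""
    if node_has_agent:
        if container_has_agent:
            # An agent baked into the image is reached through the node.
            return ("agent installed in image", "contact via node")
        # No agent in the container: the node's agent is executed for it.
        return ("execute node's agent", "contact via node")
    # No agent on the node: the container needs its own, reachable agent.
    return ("agent installed in image", "contact agent via network")


for node, container in [(True, False), (True, True), (False, True)]:
    print(node, container, choose_data_source(node, container))
```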
Short demo!
Monitoring of container orchestration tool Kubernetes (Phase II)
Pod deployments: Current vs. available Pods
Node: Resource request vs. limits
Running pods: Per node, per replica set
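A check comparing current and available pods of a deployment could look roughly like the sketch below. It uses the usual monitoring states (0 = OK, 1 = WARN, 2 = CRIT); the function name, the threshold parameters, and their defaults are assumptions, not the planned plug-in.

```python
OK, WARN, CRIT = 0, 1, 2


def check_deployment(current, available, warn_missing=1, crit_missing=2):
    """Compare current vs. available pods and map the gap to a state."""
    missing = max(current - available, 0)
    if missing >= crit_missing:
        state = CRIT
    elif missing >= warn_missing:
        state = WARN
    else:
        state = OK
    return state, "%d of %d pods available" % (available, current)


print(check_deployment(3, 3))
print(check_deployment(3, 1))
```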
Monitoring of management tools for container orchestration (Phase III)
Still at the idea stage; your feedback is welcome
... and then: continuously improve Phase I-III
- Support for further tools, e.g. Docker Swarm
- More plugins
3) Dealing with dynamics
Requires highly dynamic configuration
Why?
- Volatile environment
- Monitoring configuration thus needs to adapt very dynamically
Dynamic configuration (DCD):
- Focus on containers
- Nodes & Kubernetes later
- Enterprise Edition only
Dynamic Configuration Daemon (DCD) architecture
Sources: Check_MK Agent with Docker plug-in, Kubernetes API, Cloud APIs (e.g. AWS)
Central site: DCD with connectors (Docker “unmanaged”, Kubernetes, AWS, ...); Check_MK with WATO, WATO-API, Check_MK Base, Core
Remote sites: Check_MK with WATO, WATO-API, Check_MK Base, Core; ...
Also useful beyond containers
Virtual machines: Ask vCenter for VMs
LDAP: Ask for users or hosts
Network scan: Ask the network for hosts
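Whatever the source (containers, vCenter, LDAP, a network scan), the core step is the same: compare what the source reports with what is configured and derive the changes. The sketch below shows that comparison only; it is not the DCD implementation, and the function name and host names are hypothetical.

```python
def plan_sync(discovered, configured):
    """Work out which hosts to add, remove or keep in the monitoring configuration."""
    discovered, configured = set(discovered), set(configured)
    return {
        "add": sorted(discovered - configured),
        "remove": sorted(configured - discovered),
        "keep": sorted(discovered & configured),
    }


# Hypothetical state: one container vanished, one appeared.
plan = plan_sync(
    discovered=["container-a", "container-c"],
    configured=["container-a", "container-b"],
)
print(plan)
```

In a real connector the resulting plan would then be applied through the configuration API rather than printed.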
4) Metrics
Static containers: Standard metrics
Business as usual
Static containers = “well known” hosts
Dynamic containers: Which metrics are relevant?
Aggregated metrics: averages or total values over all containers (e.g. total CPU load of a dynamic pool of containers)
- Understand performance & trends
Individual metrics: value per container (e.g. individual CPU load within the dynamic pool)
- Understand faults
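The difference between the two kinds of metrics can be shown with a few lines of Python. This is an illustrative sketch: the container names, the CPU values, and the outlier rule (more than twice the pool average) are assumptions.

```python
from statistics import mean


def aggregate_cpu(cpu_by_container):
    """Aggregated metrics: total and average CPU load over the whole pool."""
    values = list(cpu_by_container.values())
    if not values:
        return {"total": 0.0, "average": 0.0}
    return {"total": sum(values), "average": mean(values)}


def find_outliers(cpu_by_container, factor=2.0):
    """Individual metrics: containers whose load exceeds factor * pool average."""
    average = aggregate_cpu(cpu_by_container)["average"]
    return sorted(
        name for name, value in cpu_by_container.items()
        if average > 0 and value > factor * average
    )


# Hypothetical CPU load (percent) per container in a dynamic pool.
pool = {"web-1": 10.0, "web-2": 12.0, "web-3": 8.0, "web-4": 90.0}
summary = aggregate_cpu(pool)
print(summary, find_outliers(pool))
```

The aggregated values stay meaningful as containers come and go, while the per-container values point to the one instance that misbehaves.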
What we will do with these metrics
Aggregated metrics:
- Collect with regular monitoring
- Independent of individual hosts
- Not limited to containers
Container metrics:
- Volatility per container defines resolution and lifetime of metrics
- Make configurable via GUI
Join us! Getting ready for a new world.