StatsCraft Monitoring Conference
website and agenda: http://statscraft.org.il
twitter: @statscraft (#statscraft)
facebook: https://www.facebook.com/
email: [email protected]
Agenda
1. Understand the problem.
2. Understand what monitoring is.
3. Example use-case(s)
4. A different approach
5. Learn methodologies and tools
We monitor because...
We want to satisfy the customer.
(make money?)
Still underrated...
Automated Resource Provisioning
Configuration Management
Automated Code Deployment
Continuous Whatever
Monitoring
PROBLEM!
We're monitoring the wrong things.
_rootCauseAnalysis:
the alternative is harder.
We're considering logs a second class citizen.
_rootCauseAnalysis:
the alternative is harder.
Our data is lacking.
_rootCauseAnalysis:
inertia. that's how it was, that's how it is.
We separate monitoring from application
_rootCauseAnalysis:
we're not used to this. (Ops problem)
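To illustrate what monitoring inside the application (rather than beside it) could look like, here is a minimal sketch. It assumes a plain in-process dictionary as the metrics store. The names `METRICS`, `instrumented`, and `checkout` are hypothetical and do not come from the talk. A real setup would ship these numbers to whatever metrics backend is in use.

```python
import time
from collections import defaultdict
from functools import wraps

# Hypothetical in-process metrics store: one record per instrumented operation.
METRICS = defaultdict(lambda: {"calls": 0, "errors": 0, "total_seconds": 0.0})


def instrumented(name):
    """Record call count, error count, and total duration for a function."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            except Exception:
                METRICS[name]["errors"] += 1
                raise  # the application still sees the original error
            finally:
                METRICS[name]["calls"] += 1
                METRICS[name]["total_seconds"] += time.perf_counter() - start
        return wrapper
    return decorator


@instrumented("checkout")
def checkout(cart_total):
    # Illustrative business function, not taken from the slides.
    if cart_total <= 0:
        raise ValueError("empty cart")
    return f"charged {cart_total}"
```

The point of the sketch is that the measurement lives next to the business logic, so developers own it together with the code it describes.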
We monitor reactively, not proactively
_rootCauseAnalysis:
reaction requires less initial energy than anticipation.
We put uptime above system and product quality
_rootCauseAnalysis:
it's much easier.
We deal with hard limits.
_rootCauseAnalysis:
arbitrary numbers are easier to set.
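One possible alternative to an arbitrary hard limit is a threshold derived from the metric's own recent history. The sketch below is an illustration only. The 500 ms limit, the factor `k = 3.0`, the sample latencies, and both function names are assumptions, not figures or methods from the talk.

```python
from statistics import mean, stdev
from typing import Sequence

HARD_LIMIT_MS = 500.0  # arbitrary fixed number, chosen only for illustration


def breaches_hard_limit(value_ms: float) -> bool:
    """Classic hard limit: alert only above a fixed number."""
    return value_ms > HARD_LIMIT_MS


def breaches_baseline(history_ms: Sequence[float], value_ms: float, k: float = 3.0) -> bool:
    """Alert when a value is more than k standard deviations above the recent mean."""
    if len(history_ms) < 2:
        return False  # not enough data to form a baseline
    return value_ms > mean(history_ms) + k * stdev(history_ms)


# Hypothetical recent latencies, in milliseconds.
history = [100, 110, 95, 105, 102, 98]
print(breaches_hard_limit(300))         # the fixed limit does not fire
print(breaches_baseline(history, 300))  # the baseline check does
```

A response time of 300 ms stays under the fixed limit, yet it is roughly three times what this service normally does, which is exactly the kind of change a hard limit hides.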
Monitoring is non-functional but resource hungry
_rootCauseAnalysis:
we just don't accept it.
Good monitoring requires the right people, not just Ops!
_rootCauseAnalysis:
delegation is natural. others have more important things to do.
Alert fatigue is common.
_rootCauseAnalysis:
solving issues is much easier than solving problems, and apparently, we are addicted to non-actionable alerts.
We're auto-scaling prematurely
_rootCauseAnalysis:
brute force is natural
We're choosing the wrong tools.
_rootCauseAnalysis:
it's easier to choose the tool than to choose what to monitor.
Good monitoring is hard
_rootCauseAnalysis:
systems become complex, so they're harder to monitor.
So, after all, why do we not monitor properly?
1. Simplification
2. Delegation
3. Rationalization
_rootCauseAnalysis:
No fear,
is here!
Let's see how we can make this all better