Improving DevOps through better monitoring

DESCRIPTION

Some developers believe that monitoring is a function of the operations team. Some operations teams firmly believe that monitoring the systems they maintain is sufficient to run the business successfully. Most of them are wrong. The complexity of today’s applications has gone far beyond the capabilities of “traditional” system-level monitoring tools and approaches, and it requires a much broader knowledge of the business and the application as a whole. The goal of DevOps is to connect all aspects of application development and operations, and monitoring provides the visibility and troubleshooting tools to accomplish that goal. This talk provides real-world examples of common gaps in monitoring approaches and explains why holistic instrumentation of business and functionality monitors should be part of any project scope.

Improving DevOps through better monitoring

Leon Fayer

@papa_fire

Who am I?

• 20+ years of development and operations of large systems

• currently Vice President at OmniTI

• can be found online:

• @papa_fire

• http://fayerplay.com

• github:lfayer

So …

what is DevOps?

What is DevOps?

philosophy of collaboration

… and more

to enable business goals

Not DevOps

dev ops

Not DevOps either

devops

DevOps

devops

General consensus

Damon Edwards (http://dev2ops.org)

Missing link

Damon Edwards (http://dev2ops.org)

Finally, monitoring

enter monitoring

What to monitor?

“in God we trust

all others we monitor”

What to monitor specifically?

• systems

• databases

• application

• integration points

• performance

• user behavior

• business processes

Perfect quote

“I don’t give a **** if the datacenter is on fire as long as I am still making money”

- CEO

Example: Twitter

serves over 20 million unique visitors a day

… legendary for downtime

• servers are up and running

• HTTP checks return 200

• tweets lost
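The gap on this slide (every system check is green while the product is losing data) can be illustrated with a small sketch. It is not how Twitter or any real service is monitored. `http_check`, `functional_check`, and the `LossyService` stand-in are hypothetical names invented for this example, and the "service" is an in-memory fake, so the code runs without a network.

```python
import uuid
from typing import Callable, Dict, Optional


def http_check(status_code: int) -> bool:
    """System-level check: passes whenever the endpoint answers 200."""
    return status_code == 200


def functional_check(
    post: Callable[[str], int],
    fetch: Callable[[str], Optional[str]],
) -> bool:
    """Business-level check: write a synthetic message, then confirm
    that the same message can be read back."""
    marker = f"monitor-{uuid.uuid4()}"
    if not http_check(post(marker)):
        return False
    return fetch(marker) == marker


class LossyService:
    """Hypothetical stand-in for a service that always answers 200
    but may silently drop what it was asked to store."""

    def __init__(self, drop_writes: bool) -> None:
        self.drop_writes = drop_writes
        self.store: Dict[str, str] = {}

    def post(self, text: str) -> int:
        if not self.drop_writes:
            self.store[text] = text
        return 200  # the HTTP layer looks healthy either way

    def fetch(self, text: str) -> Optional[str]:
        return self.store.get(text)
```

Against a `LossyService(drop_writes=True)`, `http_check` still passes while `functional_check` fails, which is the situation the slide describes: the check only catches the problem when it exercises what users actually do.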

Why monitor?

• software is never perfect

• systems are more and more complex

• proactive is better than reactive

• external dependencies are a worry

• …

Why really monitor?

things change

… and when things change

changes affect business

And now for a real example

:case study:

Setting the stage

• online marketing company

• major e-commerce component

• 90+ million users

• 1 billion emails/month

• 300,000+ lines of code

• ~ 50 physical devices

• 5600+ metrics collected

It all starts with …

Let the hunt begin

revenue

Direct cause check

revenue + traffic

Going down the stack

revenue + traffic + load time

Still descending

revenue + traffic + load time + db

Got ya!

revenue + traffic + load time + db + email

Keys to monitoring

1. understand business

2. approach top-down

3. correlate data
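The "correlate data" step can be sketched as follows: start from the business metric (revenue in the case study) and rank the lower-level metrics by how closely they move with it. This is a minimal illustration, not the tooling used in the talk. Pearson correlation is an assumed choice of measure, and `pearson` and `rank_by_correlation` are hypothetical helper names.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series.
    Returns 0.0 when either series is constant."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_by_correlation(
    target: Sequence[float],
    candidates: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Rank candidate metrics by the absolute strength of their
    correlation with the target (business) metric, strongest first."""
    scores = [(name, pearson(target, series)) for name, series in candidates.items()]
    return sorted(scores, key=lambda item: abs(item[1]), reverse=True)
```

Called with a revenue series as `target` and series such as traffic, load time, database, and email metrics as `candidates`, the first entries of the result are the ones worth inspecting next. Correlation only narrows the search; it does not by itself prove which metric caused the drop.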

Questions?

For more tips & examples:

http://omniti.com/explains/monitoring-the-big-picture
