What the hell is your software doing at runtime?

Increase business value, measure it!

Roberto “FRANK” Franchini

@robfrankie

Firenze, November 17th 2015

whoami(1)

More than 15 years of experience, proud to be a programmer

Member of the OrientDB team, tech lead for the full-text, spatial, JDBC and Docker images

Wrote software for NLP and opinion mining (@scale)

Played with servers, then bought a sysadmin

JUG-Torino co-lead

Agenda

Quotes

System monitoring

Coding

Application monitoring

All together

Feedback

Sample scenario

Quotes

Business value

Our code generates business value

when it runs, not when we write it.

We need to know what our code does when it runs.

We can’t do this unless we measure it.

(Coda Hale)

SLA driven

Have an SLA for your service

Measure and report performance against the SLA

(Ben Treynor, Google Inc.)

System monitoring

Infrastructure monitoring

Sysadmins have monitored infrastructure

since the beginning of IT

With the right tools, a single BOFH

can handle hundreds of servers

Tools

On premises: collectd, zabbix, zenoss, nagios, cacti, graphite/grafana

Cloud based: datadog, newrelic

Measures

CPU load

Network traffic

Disk I/O

Memory

And many more

Charts

Dashboard

Cool, black dashboard

Code and deploy

Write

TDD

SOLID principles

Design Patterns

Code metrics

Build

unit tests

integration tests

performance tests

test coverage

code quality reports

Deploy

Deployment pipeline

Microservices

Containers

Cloud

Rest

All done, take a rest

Uhm

I don’t think so anymore

Application monitoring

The day after deployment

How do we monitor our service status?

How do we measure it?

How does it behave?

How does it interact with other parts of the system?

Multiply by the number of µ-services

Monitorability

Design software to be monitorable

Expose metrics (JMX)

Expose status (REST API)

Send metrics to monitoring tools
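
A minimal sketch of the "expose metrics (JMX)" point, using only the JDK. The DocumentService class, its attributes and the com.example.docs object name are hypothetical names chosen for illustration; they do not come from the talk. The REST status endpoint is not shown here.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class JmxDemo {

    // Standard MBean convention: interface name = implementation class name + "MBean".
    // The interface must be public for the JMX introspector to accept it.
    public interface DocumentServiceMBean {
        long getProcessedCount();
        String getStatus();
    }

    // Hypothetical service exposing one counter and a status string.
    public static class DocumentService implements DocumentServiceMBean {
        private final AtomicLong processed = new AtomicLong();

        public void process(String document) {
            // business logic would go here
            processed.incrementAndGet();
        }

        @Override public long getProcessedCount() { return processed.get(); }
        @Override public String getStatus() { return "UP"; }
    }

    // Registers the service on the platform MBean server (illustrative object name).
    static ObjectName register(DocumentService service) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("com.example.docs:type=DocumentService");
        if (server.isRegistered(name)) {
            server.unregisterMBean(name); // keep the demo re-runnable in one JVM
        }
        server.registerMBean(service, name);
        return name;
    }

    public static void main(String[] args) throws Exception {
        DocumentService service = new DocumentService();
        ObjectName name = register(service);
        service.process("doc-1");
        // Read the attribute back the way a JMX client would.
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        System.out.println("ProcessedCount = " + server.getAttribute(name, "ProcessedCount"));
        System.out.println("Status = " + server.getAttribute(name, "Status"));
    }
}
```

Once registered, the attributes should be readable from any JMX client attached to the JVM, which is one way for a monitoring tool to poll them.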

We need application monitoring

“Application monitoring? WHAT?”

“OK, let me explain:

What is the app doing right now?

How is the app performing right now?

And then graph it!”

“OK, I got it!”

“Let me see”

5 minutes later

public class PoorManJavaMetrics {

    // Plain fields: updates are not thread-safe
    int called;
    long totalTime;

    public void doThings() {
        final long start = System.currentTimeMillis();
        // heavy business logic
        called++;
        final long end = System.currentTimeMillis();
        final long duration = end - start;
        totalTime += duration;
    }

    public void logStats() {
        System.out.println("---stats---");
        // Here be DRAGONS
    }
}

(Sketch by Luca Franchini)

Use the right tool

Use a library (e.g. Dropwizard Metrics)

Count events, measure duration

Log metric values

Send application metrics

to the same backend as system metrics
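
To make "count events, measure duration" concrete, here is a JDK-only sketch of the two primitives. SimpleTimer is a hypothetical stand-in, not the Dropwizard Metrics API; a real library adds histograms, rates and reporters on top of the same idea.

```java
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

// Hypothetical stand-in for a metrics library's counter + timer pair.
class SimpleTimer {
    private final LongAdder calls = new LongAdder();
    private final LongAdder totalNanos = new LongAdder();

    // Runs the task, recording one call and its elapsed time even if it throws.
    <T> T time(Supplier<T> task) {
        long start = System.nanoTime();
        try {
            return task.get();
        } finally {
            totalNanos.add(System.nanoTime() - start);
            calls.increment();
        }
    }

    long count() {
        return calls.sum();
    }

    // Mean duration in milliseconds; 0 when nothing has been recorded yet.
    double meanMillis() {
        long n = calls.sum();
        return n == 0 ? 0.0 : totalNanos.sum() / 1_000_000.0 / n;
    }
}

class SimpleTimerDemo {
    public static void main(String[] args) {
        SimpleTimer timer = new SimpleTimer();
        for (int i = 0; i < 3; i++) {
            timer.time(() -> "heavy business logic result");
        }
        System.out.printf("calls=%d mean=%.3f ms%n", timer.count(), timer.meanMillis());
    }
}
```

Compared with the hand-rolled version, the measuring code sits in one place, uses a monotonic clock (System.nanoTime) and is safe to call from several threads.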

Don’t forget naming!

A naming pattern

<namespace>.<instrumented section>.<target (noun)>.<action (past tense verb)>

Such as

accounts.authentication.password.failed

Use a prefix

prod, test, dev, local

prod.accounts.authentication.password.failed
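
The naming pattern above can be enforced with a small helper, sketched here. MetricNames and its validation rules (no blank parts, no dots inside a part, lower case) are assumptions made for illustration.

```java
import java.util.Locale;

class MetricNames {

    // Builds <prefix>.<namespace>.<section>.<target>.<action>, lower-cased.
    // Rejects blank parts and parts containing a dot, which would break the hierarchy.
    static String name(String prefix, String namespace, String section, String target, String action) {
        String[] parts = {prefix, namespace, section, target, action};
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            if (part == null || part.trim().isEmpty() || part.contains(".")) {
                throw new IllegalArgumentException("invalid metric name part: " + part);
            }
            if (sb.length() > 0) {
                sb.append('.');
            }
            sb.append(part.trim().toLowerCase(Locale.ROOT));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints prod.accounts.authentication.password.failed
        System.out.println(name("prod", "accounts", "authentication", "password", "failed"));
    }
}
```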

Which metrics?

Rate of documents processed

Latency

Transactions per second (€€€€)

Total number of errors

Mean user interaction time
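
A sketch of how two of these metrics can be computed from raw samples: a rate (events per second) and a latency percentile. MetricMath, the nearest-rank method and the sample values are illustrative assumptions, not something the talk prescribes.

```java
import java.util.Arrays;

class MetricMath {

    // Events per second over an observation window.
    static double ratePerSecond(long events, long windowMillis) {
        if (windowMillis <= 0) {
            throw new IllegalArgumentException("window must be positive");
        }
        return events * 1000.0 / windowMillis;
    }

    // Nearest-rank percentile: the smallest sample with at least p% of samples <= it.
    static long percentile(long[] latenciesMillis, double p) {
        if (latenciesMillis.length == 0 || p <= 0 || p > 100) {
            throw new IllegalArgumentException("need samples and 0 < p <= 100");
        }
        long[] sorted = latenciesMillis.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        // Made-up latencies in milliseconds, with one slow outlier.
        long[] latencies = {12, 15, 11, 240, 14, 13, 16, 12, 18, 17};
        System.out.println("rate/s = " + ratePerSecond(1200, 60_000));
        System.out.println("p50 ms = " + percentile(latencies, 50));
        System.out.println("p95 ms = " + percentile(latencies, 95));
    }
}
```

With these samples the median is 14 ms while the 95th percentile is 240 ms, which is why latency is usually tracked with percentiles and not only with a mean.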

All together now

Code on systems

Don’t cross the streams

Enabling code metrics means

sysadmins and devs in the same room

talking to each other

to improve business value

Send application metrics to the same backend as system metrics
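
When the shared backend is Graphite, one data point is a single line of text in its plaintext protocol: metric path, value, Unix timestamp in seconds. The sketch below formats and sends such a line with the JDK only. GraphiteLine is a hypothetical helper; the host and port are deployment-specific (Carbon's plaintext listener defaults to port 2003).

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class GraphiteLine {

    // Graphite plaintext protocol: "<metric path> <value> <unix timestamp in seconds>\n".
    static String format(String path, long value, long epochSeconds) {
        return path + " " + value + " " + epochSeconds + "\n";
    }

    // Sends one data point over TCP; host and port are assumptions about the deployment.
    static void send(String host, int port, String path, long value) throws IOException {
        long now = System.currentTimeMillis() / 1000L;
        try (Socket socket = new Socket(host, port);
             Writer out = new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8)) {
            out.write(format(path, value, now));
            out.flush();
        }
    }

    public static void main(String[] args) {
        // Only formats a line (fixed timestamp), so the demo needs no running Graphite.
        System.out.print(format("prod.accounts.authentication.password.failed", 3, 1447718400L));
    }
}
```

In practice a metrics library's reporter would do this periodically for every registered metric; the point is that application metrics land next to the ones collectd already sends.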

Correlate application and system metrics

Repeat after me

Correlate application and system metrics

(Cross the streams!)

Single metrics backend

collectd + applications -> graphite -> grafana

To do what?

Discover bottlenecks

Post-mortem analysis

SLA monitoring

I/O impact

Network traffic

Memory utilization

To do what?

Why does it perform better on a dev laptop?

Why does it take 24h on the customer's infrastructure (our old test server takes 1h)?

Mechanical sympathy at large: the new service is fucking up the I/O

Implement THE User Story

Given the application is running

when the manager comes

then I want to show a big green number

The answer

42

Application metrics dashboard

Get feedback

It’s all about feedback

Our code is talking to us

Listen to it

And take decisions

Decisions

Set new SLAs

Refactor bottlenecks

Buy new hardware

Expand the cloud

Drop a product

write code

deploy it

measure it

get feedback

Iterative

10 define some metrics

20 deploy

30 add other metrics

40 goto 10

Are you able to deploy every day?

Sample scenario

Infrastructure

45 bare metal servers

Nginx, Jetty, PostgreSQL

GlusterFS, Queues,

Redis, Jenkins (cron on steroids)

Software

Java shop

deploy with Docker

More than 120 webapps

More than 100 batch jobs

Near-real-time (NRT) stream processing jobs running 24x7

Monitoring

collectd, graphite, grafana for system monitoring

Dropwizard Metrics inside code for application monitoring

Application metrics reported to graphite too

Feedback and decisions

WTF happened last night?

How is it going this morning?

Do you think we can survive the message flood?

Hey boss, it’s time to buy a new server, we are running out of resources.

Wrap up

Shopping list

Define your SLAs/targets

Code and deploy with good practices

Code with monitorability in mind

Monitor your app/service

Correlate system and application metrics

Get feedback

Take decisions

References

https://dropwizard.github.io/metrics/3.1.0/

https://dl.dropboxusercontent.com/u/2744222/2011-04-09-Metrics-Metrics-Everywhere.pdf

http://graphite.wikidot.com/

http://grafana.org/

http://matt.aimonetti.net/posts/2013/06/26/practical-guide-to-graphite-monitoring/

https://www.usenix.org/sites/default/files/conference/protected-files/srecon15_slides_limoncelli.pdf

Credits

Sketches by my sons

Andrea (Andrew) and Luca (Luke) Franchini

Cool dashboards are made with Grafana
