WAYS TO MINIMISE PERFORMANCE RISKS IN CONTINUOUS DELIVERY

Adriaan Thomas

4 June 2013

Page 2: Ways to minimise performance risks in continuous delivery

INTRODUCTION

Page 3: Ways to minimise performance risks in continuous delivery

OBJECTIVE

Put working software into production as quickly as possible, whilst minimising the risk of load-related problems:

• Bad response times

• Lack of capacity

• Availability too low

• Excessive system resource use

Within the context of websites.

Page 4: Ways to minimise performance risks in continuous delivery
Page 5: Ways to minimise performance risks in continuous delivery

TRADITIONAL APPROACH

Load testing through simulation

http://www.flickr.com/photos/danramarch/4423023837

Page 6: Ways to minimise performance risks in continuous delivery

DECIDE WHAT TO TEST

• Focus on the busiest instant

• Model the most-hit functionality

• Extrapolate to the expected load

• Look at production traffic

• Or attempt an educated guess
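The steps above can be sketched in a few lines, assuming production traffic is already available as (timestamp, URL) pairs. The sample data, the function names and the growth factor are all hypothetical; this illustrates the idea and is not a tool from the talk.

```python
from collections import Counter

def busiest_second(requests):
    """Return (timestamp, hits) for the second that saw the most requests."""
    per_second = Counter(timestamp for timestamp, _ in requests)
    return per_second.most_common(1)[0]

def most_hit(requests, n=3):
    """Return the n most requested URLs with their hit counts."""
    per_url = Counter(url for _, url in requests)
    return per_url.most_common(n)

def extrapolate(peak_hits_per_second, growth_factor):
    """Scale the observed peak to the load the test should generate."""
    return peak_hits_per_second * growth_factor

# Hypothetical sample of production traffic: (timestamp, URL).
requests = [
    ("19:37:40", "/search.do"),
    ("19:37:40", "/search.do"),
    ("19:37:40", "/home.do"),
    ("19:37:41", "/login.do"),
    ("19:37:41", "/search.do"),
]

peak_time, peak_hits = busiest_second(requests)
target_load = extrapolate(peak_hits, growth_factor=2.5)  # assumed growth
```

The growth factor is where the guesswork enters: the peak and the URL mix are measured, but the expected load is an assumption.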

Page 7: Ways to minimise performance risks in continuous delivery

DECIDE ON SCOPE

Component test

Chain test

Full environment test

• Test coverage

• Level of certainty

• Number of systems

• Amount of work

Page 8: Ways to minimise performance risks in continuous delivery

SET UP TEST DATA

• Usually starts as a copy from production

• Or an educated guess at what people will enter

• Render anonymous

• Make tests deterministic

• Synchronise between all systems

http://www.flickr.com/photos/22168167@N00/3889737939/

Page 9: Ways to minimise performance risks in continuous delivery

DECIDE ON STRATEGY

One or more of:

• Scalability test

• Stress test

• Endurance test

• Regression test

• Resilience test

http://www.flickr.com/photos/timjoyfamily/5935279962/

Page 10: Ways to minimise performance risks in continuous delivery

DECIDE ON TEST DURATION

(which is tricky)

http://www.flickr.com/photos/wwarby/3297205226

Page 11: Ways to minimise performance risks in continuous delivery

PROVIDE HARDWARE

http://www.flickr.com/photos/s_w_ellis/2681151694/

Copy of production?

Only one copy?

Virtualisation?

Sharing between teams?

Page 12: Ways to minimise performance risks in continuous delivery

INTEGRATE INTO PIPELINE

Unit test: very fast

Functional integration test: fast

Load test: takes longer

Page 13: Ways to minimise performance risks in continuous delivery

INTEGRATE INTO PIPELINE

Unit test

Functional integration test

Load test

Very fast Takes longer

Page 14: Ways to minimise performance risks in continuous delivery

PERMANENT LOAD TESTING

Daytime: constant load, teams inspect impact of changes

Nighttime: Endurance test

Weekends: refresh test data

http://www.flickr.com/photos/renaissancechambara/5106171956/
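A minimal sketch of that schedule as a function a load generator could consult. The slide does not say where daytime ends, so the 08:00 to 18:00 window is an assumption, as is the function name.

```python
from datetime import datetime

# Assumed office hours; the slide only says "daytime" and "nighttime".
DAY_START_HOUR = 8
DAY_END_HOUR = 18

def load_test_mode(now: datetime) -> str:
    """Pick the activity for the permanent load test environment."""
    if now.weekday() >= 5:  # Saturday (5) or Sunday (6)
        return "refresh test data"
    if DAY_START_HOUR <= now.hour < DAY_END_HOUR:
        return "constant load"
    return "endurance test"
```

For example, `load_test_mode(datetime(2013, 6, 4, 10))` falls on a Tuesday morning and selects the constant load that teams inspect during the day.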

Page 15: Ways to minimise performance risks in continuous delivery

RESPONSE TIME

DNS lookup (www.xebia.com)

Time to first byte + loading HTML

Time to render

Time to document complete

Browser CPU use

Bandwidth

# connections to a single host

http://www.webpagetest.org/result/130522_FG_10SC/1/details/

SSL handshake

Parse times

Blocking client code

Page 16: Ways to minimise performance risks in continuous delivery

IMPACT OF THE BROWSER

www.browserscope.org

Page 17: Ways to minimise performance risks in continuous delivery

CLEAR REQUIREMENTS

Response time

Fail: 10 Now: 3.5 Goal: 1

Intention: Users get a response quickly so that they are happy and spend more money.

Stakeholder: Marketing dept.

Scale: 95th percentile of “document complete” response times, in seconds, measured over one minute.

Metric: Page load times as reported by our RUM tool.

Inspired by Tom Gilb, Competitive Engineering
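A requirement written this way can be checked mechanically. The sketch below assumes one minute of "document complete" times in seconds is already collected, uses the nearest-rank method for the percentile, and treats a value at or above the Fail level as a failure. Those three choices and all names are assumptions, not part of the slide.

```python
import math

# Levels from the requirement, in seconds.
FAIL = 10.0
GOAL = 1.0

def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]

def assess(samples):
    """Return the 95th percentile and how it compares to the requirement."""
    p95 = percentile(samples, 95)
    if p95 >= FAIL:
        return p95, "fail"
    if p95 <= GOAL:
        return p95, "goal met"
    return p95, "acceptable, goal not met"

# Hypothetical minute of measurements: 94 fast page loads and 6 slower ones.
samples = [0.8] * 94 + [1.2] * 6
result = assess(samples)
```

Because the scale is a percentile rather than an average, a handful of slow page loads is enough to miss the goal.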

Page 18: Ways to minimise performance risks in continuous delivery

ADJUST REQUIREMENTS DUE TO LACK OF REAL BROWSERS

WebPageTest: first view + repeat view (median of 3)

95th percentile response times from access logs

Page 19: Ways to minimise performance risks in continuous delivery

Playground to test changes

No impact on real users

Less pressure

More work

Guesswork and extrapolation

Can take a significant amount of time

More hardware

Page 20: Ways to minimise performance risks in continuous delivery

THINGS WILL BREAK...

... in spite of your best efforts

http://www.flickr.com/photos/jmarty/1239950166/

Page 21: Ways to minimise performance risks in continuous delivery

SO INSTEAD WE SHOULD FOCUS ON FAST RECOVERY

http://www.flickr.com/photos/19107136@N02/8386567228/

Page 22: Ways to minimise performance risks in continuous delivery

“MTTR is more important than MTBF*”

John Allspaw

* for most types of F
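One way to put numbers on the trade-off is the standard steady-state availability formula, MTBF / (MTBF + MTTR). The figures below are hypothetical and the function name is an assumption. The sketch shows only the arithmetic, not which improvement is cheaper to achieve.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state availability: the fraction of time the system is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

# Hypothetical system: fails every 100 hours, takes 1 hour to recover.
baseline = availability(100, 1)

# Two ways to improve it.
faster_recovery = availability(100, 0.5)  # halve MTTR
fewer_failures = availability(200, 1)     # double MTBF
```

Halving MTTR yields the same availability as doubling MTBF, so the question becomes which of the two is easier to reach in practice.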

Page 23: Ways to minimise performance risks in continuous delivery

MTBF LEADS TO FUD

[Chart: 99th percentile response time (s), axis from 0 to 2.0, plotted against test duration]

Page 24: Ways to minimise performance risks in continuous delivery

TTR

Time →

TTD, find cause (RCA), write & test fix, build, deploy, validate

Other labels in the diagram: compile, deploy & test, Monitoring, Alerts

• Skills • Organisation • Culture • Maintainability • Simple architecture

• Fast workstations • Good tooling • Able to quickly test locally • Automation

• Fast build server • Efficient tests

• Monitoring • Automation • Flexible architecture

Page 25: Ways to minimise performance risks in continuous delivery

DEMING FEEDBACK LOOPS

Plan

Do

Study

Act

Page 26: Ways to minimise performance risks in continuous delivery

OODA LOOPS

Observe

Orient

Decide

Act

Page 27: Ways to minimise performance risks in continuous delivery

AVOID TEST-ONLY MEASUREMENTS

Page 28: Ways to minimise performance risks in continuous delivery

SIMPLE ARCHITECTURE

Page 29: Ways to minimise performance risks in continuous delivery

THE ONLY THING THAT MATTERS IS WHAT HAPPENS IN PRODUCTION

Everything else is an assumption.

Page 30: Ways to minimise performance risks in continuous delivery

DEPLOYING CHANGES

http://www.flickr.com/photos/39463459@N08/5083733600

Page 31: Ways to minimise performance risks in continuous delivery
Page 32: Ways to minimise performance risks in continuous delivery

BLUE-GREEN DEPLOYMENTS

Amazon Route 53

Version n+1: Elastic Load Balancer, Instances

Version n: Elastic Load Balancer, Instances

Page 33: Ways to minimise performance risks in continuous delivery

DARK LAUNCHING

Web page, DB

Page 34: Ways to minimise performance risks in continuous delivery

DARK LAUNCHING

Web page, DB, Weather SP

Page 35: Ways to minimise performance risks in continuous delivery

DARK LAUNCHING

Web page, DB, Weather SP
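A minimal sketch of the idea, assuming these slides depict a page that starts calling a new weather backend in production without showing the result to users. The function and parameter names are hypothetical, and the call is made inline here for brevity; a real page would more likely make it asynchronously so users are not slowed down.

```python
import time

def render_page(load_from_db, fetch_weather, timings):
    """Render the existing page while exercising the new backend in the dark."""
    html = load_from_db()  # existing behaviour, unchanged

    started = time.perf_counter()
    try:
        fetch_weather()  # result deliberately discarded: not shown to users
        outcome = "ok"
    except Exception:
        outcome = "error"  # a failing dark feature must never break the page
    timings.append((outcome, time.perf_counter() - started))

    return html
```

The collected timings show how the new backend behaves under real traffic before the feature becomes visible.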

Page 36: Ways to minimise performance risks in continuous delivery

FEATURE TOGGLES

Page 37: Ways to minimise performance risks in continuous delivery

CANARY RELEASING

0% 100%
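Moving from 0% to 100% needs a stable way to decide which users see the new version. The sketch below hashes a user identifier into one of 100 buckets; the identifier, the function name and the choice of hash are assumptions, and real set-ups often do this at the load balancer instead.

```python
import hashlib

def in_canary(user_id: str, percentage: int) -> bool:
    """Return True if this user should be served by the new version.

    The same user always lands in the same bucket, so raising the
    percentage only ever adds users to the canary group.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # 0..99
    return bucket < percentage
```

Rolling back is then a matter of setting the percentage to 0 again.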

Page 38: Ways to minimise performance risks in continuous delivery

PRODUCTION-IMMUNE SYSTEMS

Page 39: Ways to minimise performance risks in continuous delivery

CONTROLLED LOAD TESTING

Amazon Route 53

Elastic Load Balancer

Instance

Instance

Instance

RDS DB Instance

RDS DB Instance Read Replica

Page 40: Ways to minimise performance risks in continuous delivery

MONITORING

http://www.flickr.com/photos/smieyetracking/5609671098/

Page 41: Ways to minimise performance risks in continuous delivery

MONITORING

Technical metrics

• CPU use

• Memory use

• TPS

• Response times

• etc

Process metrics

• # bugs

• MTTR, MTTD

• Time from idea to live on site

• etc

Business metrics

• Revenue

• # unique visitors

• etc

http://www.flickr.com/photos/smieyetracking/5609671098/
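The process metrics MTTD and MTTR can be derived from a simple incident log. The sketch below assumes each incident records when the failure started, when it was detected and when the fix was validated, and it measures both metrics from the start of the failure. The field names and the sample incidents are hypothetical.

```python
from datetime import datetime
from statistics import mean

def mean_minutes(incidents, start_field, end_field):
    """Mean elapsed time in minutes between two timestamps per incident."""
    return mean(
        (incident[end_field] - incident[start_field]).total_seconds() / 60
        for incident in incidents
    )

# Hypothetical incident log.
incidents = [
    {"started": datetime(2013, 6, 4, 10, 0),
     "detected": datetime(2013, 6, 4, 10, 5),
     "resolved": datetime(2013, 6, 4, 10, 45)},
    {"started": datetime(2013, 6, 5, 14, 0),
     "detected": datetime(2013, 6, 5, 14, 15),
     "resolved": datetime(2013, 6, 5, 15, 15)},
]

mttd = mean_minutes(incidents, "started", "detected")  # mean time to detect
mttr = mean_minutes(incidents, "started", "resolved")  # mean time to recover
```

Some teams measure MTTR from detection rather than from the start of the failure; whichever definition is used should be stated next to the number.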

Page 42: Ways to minimise performance risks in continuous delivery

MEASURE IMPACT OF CHANGES

Page 43: Ways to minimise performance risks in continuous delivery

tail -f access_log | alstat.pl -i10 -n10 -stt

  Hits  Hits%   TPS  AvgTmTk  TTmTk%  AvgRSize  RSize%
2013-06-04 19:37:40 (08)
    14   0.1%   1.4    1.652    5.7%      2691    0.2%  POST  200  /login.do
    14   0.1%   1.4    0.918    3.2%      3739    0.3%  GET   200  /home.do
    14   0.1%   1.4    0.879    3.1%      3185    0.2%  POST  200  /order.do
     7   0.1%   0.7    0.807    1.4%      1974    0.1%  POST  200  /account.do
     4   0.0%   0.4    0.735    0.7%      3228    0.1%  GET   200  /products.do
     5   0.0%   0.5    0.697    0.9%       969    0.0%  POST  200  /settings.do
     9   0.1%   0.9    0.687    1.5%      1827    0.1%  POST  200  /changeorder.do
    27   0.2%   2.7    0.649    4.3%      2997    0.4%  POST  200  /newpasswd.do
    15   0.1%   1.5    0.580    2.2%      2488    0.2%  GET   200  /offer.do
    95   0.9%   9.5    0.520   12.2%      4801    2.3%  GET   200  /search.do
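For readers without that script, a rough equivalent of a few of the columns can be computed as below. This is not alstat.pl; it is a hypothetical sketch that assumes the access log has already been parsed into (method, status, path, time taken in seconds, response size in bytes) records. In the sample output the TPS column equals hits divided by 10, which the sketch takes to be the length of the reporting interval in seconds.

```python
from collections import defaultdict

def summarise(records, interval_seconds):
    """Aggregate parsed access-log records per (method, status, path).

    Returns one row per request type, slowest average response time first.
    """
    groups = defaultdict(lambda: [0, 0.0, 0])  # hits, total time, total bytes
    for method, status, path, time_taken, size in records:
        group = groups[(method, status, path)]
        group[0] += 1
        group[1] += time_taken
        group[2] += size

    rows = []
    for request, (hits, total_time, total_size) in groups.items():
        rows.append({
            "request": request,
            "hits": hits,
            "tps": hits / interval_seconds,
            "avg_time": total_time / hits,
            "avg_size": total_size / hits,
        })
    rows.sort(key=lambda row: row["avg_time"], reverse=True)
    return rows
```

Running this per interval before and after a deployment gives a quick view of which requests became slower.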

Page 44: Ways to minimise performance risks in continuous delivery

MEASURE LATENCY

Avg. response times front end vs backend

Number of calls

Page 45: Ways to minimise performance risks in continuous delivery

SMALL DEPLOYMENTS

http://www.flickr.com/photos/rbulmahn/4925464931/

Page 46: Ways to minimise performance risks in continuous delivery

GO/NO-GO MEETINGS

• What are the biggest fears?

• How can we measure this?

• What can be done if it does happen?

Page 47: Ways to minimise performance risks in continuous delivery

RETROSPECTIVES

How can we prevent a failure from happening again?

How can we detect it earlier?

Was there only one root cause?

http://www.flickr.com/photos/katerha/8380451137

Page 48: Ways to minimise performance risks in continuous delivery

INTRODUCE OUTAGES

Chaos monkey

Game day exercises

http://www.flickr.com/photos/frostnova/440551442/

Page 49: Ways to minimise performance risks in continuous delivery

CULTURE

• Dev and Ops work together on providing information.

• Assumptions are dangerous; try to eliminate as many as possible.

• Small changes are easier to fix than large ones.

• Deploy during office hours so everyone is available in case problems happen.

• All information, including business metrics, should be accessible to everyone.

Page 50: Ways to minimise performance risks in continuous delivery

CLAMS

Culture

Lean

Automation

Measurement

Sharing

Page 51: Ways to minimise performance risks in continuous delivery

SIMPLE, FLEXIBLE ARCHITECTURE

• If the site goes down often, its architecture is probably at fault

• Avoid fragile systems

• Resilience is key

• Scalable (redundancy is not waste)

• Prefer many small systems to a few large ones

• State is a “hot brick”

Page 52: Ways to minimise performance risks in continuous delivery

CHANGES FOR THE BUSINESS

• Accept pushing smaller changes.

• Continuous delivery vs continuous deployment.

• Share data.

Page 53: Ways to minimise performance risks in continuous delivery

CONCLUSION

Work on your ability to respond to failure. Trying to prevent failure can slow you down and make you focus on the wrong things.

Keep assumptions clearly separated from facts. Make your decisions based on evidence.

Measure everything, including the impact of changes to the business.

Look for your own compromise: try permanent load testing first and learn from that.

Page 54: Ways to minimise performance risks in continuous delivery

QUESTIONS?

[email protected]

@a32an

www.xebia.com

blog.xebia.com

(we’re hiring)