Keeping your Cloud Footprint in Check Coburn Watson

Keeping your Cloud Footprint in Check - GOTO Conference · gotocon.com/dl/goto-london-2015/slides/CoburnWatson... · 2015-09-17 · @coburnw • Cloud Performance and Reliability @ Netflix



Keeping your Cloud Footprint in Check

Coburn Watson


@coburnw

• Cloud Performance and Reliability @ Netflix
  – Reduce TTD and TTR
  – Build innovative performance analysis tooling
  – Optimize usage of AWS Cloud
  – Steer global user traffic and support failover
  – Inject Chaos into production environment
  – Drive operational best practice adoption


• 67M+ Subscribers
• > 50 countries
• > 3 billion hours of video streamed monthly
• Huge cloud footprint
• Homegrown CDN
• Strong Originals slate


• Strong focus on open source efforts
• https://netflix.github.io/

Atlas  


Our Priorities


(me)  

Innovation

Reliability  

Efficiency  


Cost of Innovation and Reliability


Maximize Innovation

• Capacity On-Demand
• Commit-to-Cloud in minutes
• Single Production Account (~ 350 µservices)
• Burst into on-demand, cover with reservation purchases


Cost of Reliability

• Red-Black push model
• Over-provision for redundancy in AWS Region
• Global redundancy through failover
• Purchase “Heavy” AWS EC2 reservations to secure capacity


Efficiency  


Efficiency Goals

• Have them and track them!


Monitoring Costs

• ICE: Open Source AWS Cost Monitoring Utility
• Internal Cost Reporting pushed to first-level managers
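As a rough illustration of pushing cost reporting down to the manager level, the sketch below rolls raw cost records up into per-team totals. It is not how ICE works internally; the record layout, the `rollup_costs` name, and the sample figures are all assumptions made for the example.

```python
from collections import defaultdict

def rollup_costs(records):
    """Sum cost per team from (team, service, cost_usd) records.

    The record shape is hypothetical; a real billing export will differ.
    """
    totals = defaultdict(float)
    for team, _service, cost_usd in records:
        totals[team] += cost_usd
    # Largest spenders first, so each manager can see where their team ranks.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

# Hypothetical sample data, not figures from the talk.
sample_records = [
    ("playback", "api", 1200.0),
    ("playback", "cache", 300.0),
    ("encoding", "workers", 2500.0),
]
report = rollup_costs(sample_records)
```

Sorting by total keeps the report short enough to read at a glance, which matters if it is mailed to every first-level manager.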


Maximize Sharing

• Single Production Account
• Fewer/Larger Pools
• Maximize Shared Capacity

> 75% in only 8 EC2 Instance Types
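A concentration figure like the one above can be tracked with a few lines of code. The sketch below computes the share of instance hours that falls in the top N instance types; the function name, the input mapping, and the sample numbers are assumptions for illustration, not data from the talk.

```python
from collections import Counter

def top_type_share(instance_hours, top_n=8):
    """Fraction of total instance hours consumed by the top_n instance types."""
    total = sum(instance_hours.values())
    if total == 0:
        return 0.0
    top = Counter(instance_hours).most_common(top_n)
    return sum(hours for _type, hours in top) / total

# Hypothetical usage per instance type (hours), for illustration only.
sample_hours = {
    "m3.xlarge": 500,
    "r3.2xlarge": 300,
    "c3.4xlarge": 150,
    "i2.8xlarge": 50,
}
share_top2 = top_type_share(sample_hours, top_n=2)
```

With this sample data the two largest types account for 800 of 1000 hours, so `share_top2` is 0.8.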


Encourage Borrowing

• All accounts are linked at a billing level
• Large troughs of unused capacity exist (Autoscaling)
• Interruptible workloads for internal “Spot”
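The troughs mentioned above can be quantified as the gap between reserved capacity and what autoscaled services actually use in each hour. The sketch below is a minimal version of that calculation under simplifying assumptions (one instance type, one pool, hourly samples); the names and numbers are hypothetical.

```python
def borrowable_capacity(reserved, in_use_by_hour):
    """Reserved instances left idle in each hour (never negative)."""
    return [max(reserved - used, 0) for used in in_use_by_hour]

# Hypothetical figures: 100 reserved instances and five hourly usage samples.
reserved_instances = 100
hourly_usage = [90, 60, 40, 70, 110]

idle = borrowable_capacity(reserved_instances, hourly_usage)
total_borrowable_hours = sum(idle)
```

Here `idle` is `[10, 40, 60, 30, 0]`, so 140 instance hours could be lent to interruptible work. The last hour shows why the borrower must be interruptible: once the owning service scales back up, nothing is left to lend.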


Optimization

• Direct Consultation for “Big Fish”
• Tooling for Everyone

New Services or Features
1 • Develop
2 • Deploy • Scale
3 • Optimize (if needed)

Ongoing Service Development
1 • Develop
2 • Canary
3 • Optimize (if needed) • Deploy


Improving Stack Observability

• Too big for commercial tools
• Patch key middleware where necessary

Mixed-Mode JVM CPU Flame Graph
Transaction Tracing with Resource Demand


Monitor Capacity Shortfalls

• Constrain On-Demand charges
• Identify/alert on significant capacity provisioning events
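One simple way to spot a shortfall is to flag hours where usage overflows the reserved count by more than some threshold, since that overflow is billed on-demand. The sketch below assumes hourly usage samples and a fixed reservation count; the function name, threshold, and numbers are illustrative assumptions rather than the tooling described in the talk.

```python
def capacity_shortfalls(reserved, in_use_by_hour, alert_threshold):
    """Return (hour_index, on_demand_count) for hours whose on-demand
    overflow meets or exceeds alert_threshold."""
    alerts = []
    for hour, used in enumerate(in_use_by_hour):
        on_demand = max(used - reserved, 0)  # instances beyond the reservation
        if on_demand >= alert_threshold:
            alerts.append((hour, on_demand))
    return alerts

# Hypothetical samples: only hour 2 overflows by at least 20 instances.
alerts = capacity_shortfalls(
    reserved=100,
    in_use_by_hour=[90, 105, 140, 100],
    alert_threshold=20,
)
```

The threshold keeps small, expected bursts (hour 1 in the sample) from generating noise while still surfacing significant provisioning events.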


Data Points

• Internal Borrowing
• Encoding consumed 135k cross-account EC2 instance hours in June 2015 (> ~$200k in monthly savings)
• Data Platform (Hadoop, etc.) saves > $1MM/year
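As a quick sanity check on the Encoding figure, dividing the stated savings by the stated hours gives the implied value of each borrowed instance hour. This takes the two slide numbers at face value; the slide marks the savings as approximate and a lower bound, so the result is a rough floor, not a quoted price.

```python
# Figures from the slide: 135k borrowed instance hours, about $200k saved.
borrowed_hours = 135_000
monthly_savings_usd = 200_000

# Implied savings per borrowed instance hour, in USD.
implied_rate = monthly_savings_usd / borrowed_hours
```

This works out to roughly $1.48 per instance hour.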


Summary

• Target your Innovation:Efficiency ratio
• Push cost context to the team level
• Embrace the elasticity of the Cloud


Thanks!