Deploy Microservices in the Real World

Deploying Microservices

Russell Perkins
5/4/2017

About Kenzan

Core offerings: Application development, platform as a service, cloud virtualization, platform engineering, consulting services, and business transformation.

Primary clients: Multi-billion-dollar companies and media/content providers such as Thomson Reuters, Charter, and Cablevision.

Locations: Providence (RI), New York (NY), Denver (CO), Los Angeles (CA), and a London presence.

Founded in 2004.

We are a software engineering and digital consulting firm that has been helping clients Make Next Possible for over a decade:

Full Service Consulting Firm: Architecture, front and back end development, business analysis, and DevTest.

Cloud Virtualization Experts and Enablers: AWS, Netflix stack, enterprise architecture, and beyond.

DevOps Leadership: Platform builds, continuous delivery, and scalable resourcing.

Veterans of the Media Industry: Migrations, enterprise-wide solutions, digital experts, and thought leaders.

Employee Focused: Collaboration, communication, and culture are key.

Agenda

● What is CI/CD

● Deployment types

● On prem physical servers

● Calculating health against SLOs and SLAs

● Canary Deployments

● Common causes of outages

Continuous Integration
Continuous Deployments

Of course... But how?

Deployment Pipelines

● Central code repository
● Automated builds
● Self-testing
● Automated deployment

Deployment Pipelines: Simple

Git Push → Unit Tests → Elastic Beanstalk
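A simple pipeline like the one above can be thought of as an ordered list of steps where a failure stops everything after it. The following is a minimal sketch under that assumption; `run_pipeline` and the stage names are hypothetical and not tied to any particular CI tool.

```python
def run_pipeline(stages):
    """Run (name, step) pairs in order and stop at the first failing step.

    Each step is a callable returning True on success.
    Returns (completed_stage_names, failed_stage_name_or_None).
    """
    completed = []
    for name, step in stages:
        if not step():
            # A failed stage blocks every later stage, including deployment.
            return completed, name
        completed.append(name)
    return completed, None


# Hypothetical stages standing in for real test and deploy commands.
stages = [
    ("unit tests", lambda: True),
    ("deploy to Elastic Beanstalk", lambda: True),
]
completed, failed = run_pipeline(stages)
```

In a real pipeline each step would shell out to a test runner or a deployment tool; the point of the sketch is only that deployment never runs unless every earlier stage passed.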

Deployment Pipelines: Complex

[Pipeline diagram. Stages shown: Git Push, Unit Tests, Integration Tests, End to End Tests, Stress Tests, Test AWS account, Stable AWS account, Manual Judgment, Production]

Cattle Not Pets

Pets: Servers or server pairs that are treated as indispensable or unique systems that can never be down. Typically they are manually built, managed, and “hand fed”.

Cattle: Arrays of more than two servers that are built using automated tools and are designed for failure, where no one, two, or even three servers are irreplaceable. Typically, during failure events no human intervention is required, as the array exhibits attributes of “routing around failures” by restarting failed servers or replicating data through strategies like triple replication or erasure coding.

Types of Deployments

Rolling Deployment

Red/Black Deployment (A/Z, Blue/Green)

On-Premise Physical Servers

Cattle Not Pets (again)

Seriously

Kubernetes

Cloud Bursting

Cloud bursting is an application deployment model in which an application runs in a private cloud or data center and bursts into a public cloud when the demand for computing capacity spikes.
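As a rough illustration of the bursting decision, the sketch below keeps traffic on-prem until demand exceeds on-prem capacity and only then asks for cloud instances to cover the overflow. The function name, the requests-per-second capacity model, and all numbers are assumptions made for the example.

```python
import math


def plan_capacity(demand_rps, on_prem_servers, rps_per_server):
    """Return how many cloud instances are needed to absorb overflow demand.

    Assumes every server (on-prem or cloud) handles rps_per_server requests
    per second; this uniform capacity model is a simplification.
    """
    on_prem_capacity = on_prem_servers * rps_per_server
    overflow = max(0, demand_rps - on_prem_capacity)
    # Round up: a partially used instance still has to be started.
    return math.ceil(overflow / rps_per_server)


normal_day = plan_capacity(demand_rps=900, on_prem_servers=10, rps_per_server=100)
traffic_spike = plan_capacity(demand_rps=1250, on_prem_servers=10, rps_per_server=100)
```

With these example numbers, a normal day needs no cloud instances, while the spike needs three to cover the 250 requests per second that exceed on-prem capacity.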

Hybrid Cloud Models

Pilot Light

Design:
● Images (AMIs, containers, etc.) are copied to the cloud
● Auto Scaling Groups update to use the latest image
● Cluster sizes set to 0

Benefits:
● Low overhead costs
● Can be activated fairly quickly

Warm Standby

Design:
● Images (AMIs, containers, etc.) are copied to the cloud
● Auto Scaling Groups update to use the latest image
● Cluster sizes set to a reasonable number, no less than 2

Benefits:
● Can be activated instantly

Multi-Site

Design:
● Images (AMIs, containers, etc.) are copied to the cloud
● Auto Scaling Groups update to use the latest image
● Cluster sizes set to a reasonable number, no less than 2
● Some traffic is always directed to the cloud servers

Benefits:
● Always active
● Far away regions can use the cloud for reduced latency
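The three hybrid models differ mainly in the standing cluster size and in whether the cloud side serves live traffic. The sketch below captures just those two settings; the class, constant, and function names are hypothetical, and the minimum sizes are taken from the slides (0 for Pilot Light, at least 2 for the others).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridModel:
    name: str
    min_cluster_size: int      # instances kept running in the cloud at all times
    serves_live_traffic: bool  # whether some traffic is always routed to the cloud


PILOT_LIGHT = HybridModel("Pilot Light", min_cluster_size=0, serves_live_traffic=False)
WARM_STANDBY = HybridModel("Warm Standby", min_cluster_size=2, serves_live_traffic=False)
MULTI_SITE = HybridModel("Multi-Site", min_cluster_size=2, serves_live_traffic=True)


def instances_to_launch(model, target_size):
    """Instances that still have to start before the cloud side reaches target_size."""
    return max(0, target_size - model.min_cluster_size)
```

For a hypothetical target of six instances, Pilot Light has to start all six on activation, while Warm Standby and Multi-Site only need four more, which is one way to see the trade-off between overhead cost and activation time.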

Uptime with SLOs and SLAs

Service Level Objectives & Service Level Agreements

SLO: SLOs are specific measurable characteristics of the SLA such as availability, throughput, frequency, response time, or quality.

SLA: The SLA is the entire agreement that specifies what service is to be provided, how it is supported, times, locations, costs, performance, and responsibilities of the parties involved.

Uptime and Automation

Uptime Percentage    Acceptable yearly outages
99.9%                8 hrs 45 mins
99.99%               52 mins
99.999%              5 mins
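The outage figures above follow directly from the uptime percentage: the allowed downtime is the unavailable fraction multiplied by the length of the period. A minimal sketch, assuming a 365-day year (the function name is illustrative):

```python
def allowed_downtime_minutes(uptime_percent, period_days=365):
    """Minutes of outage permitted over the period at the given uptime target."""
    period_minutes = period_days * 24 * 60
    return period_minutes * (100 - uptime_percent) / 100


for target in (99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% uptime allows about {minutes:.1f} minutes of outage per year")
```

At 99.9% this works out to roughly 525.6 minutes, or about 8 hours 45 minutes, and each additional nine divides the budget by ten. At five nines there are only about five minutes a year, which leaves little room for manual recovery.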

Traditional Uptime Monitoring

Monitoring via logs

Logging Tools:
● AWS CloudWatch
● GCP Stackdriver Logging
● Graylog
● ELK Stack

Metrics to track:
● HTTP status codes
● CPU usage
● Memory
● Disk space
● Network
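HTTP status codes pulled from logs are one straightforward way to calculate health against an availability SLO. The sketch below treats any 5xx response as a failure and compares the success ratio to a target; that definition of failure, the 99.9% default, and the function names are assumptions for illustration.

```python
def availability(status_codes):
    """Fraction of requests that did not end in a 5xx server error."""
    if not status_codes:
        return 1.0  # no traffic means nothing has failed
    ok = sum(1 for code in status_codes if not 500 <= code <= 599)
    return ok / len(status_codes)


def meets_slo(status_codes, slo_target=0.999):
    """True when measured availability is at or above the SLO target."""
    return availability(status_codes) >= slo_target


# Hypothetical sample: 998 successful requests and 2 server errors.
sample = [200] * 998 + [500] * 2
healthy = meets_slo(sample)
```

Client errors such as 404 are counted as successful here because the service responded correctly; a team may reasonably choose a stricter definition.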

Canary Deployments

Slow is better

We want to make sure our software works in the real world.

But...

● Users are both predictable and unpredictable
● Different regions and devices may behave differently
● Some issues (memory leaks) only appear over time

Canary Watcher

A simple script runs every 10 minutes and monitors health and logs.

Keeps track of the deployment state (10%, 50%, etc.)

Automatically removes the canary from the load balancer (LB) if an SLO is missed.

Can be run ad hoc
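One run of such a watcher could look like the sketch below: it checks the canary's error rate against an SLO threshold, then either promotes the deployment to the next traffic stage or pulls it from the load balancer. The stage list (with 100% added as a final step), the state dictionary, and the threshold are assumptions; a real watcher would read the error rate from logs or metrics and call the load balancer's API.

```python
STAGES = [10, 50, 100]  # percent of traffic sent to the canary (100 is assumed)


def watcher_tick(state, error_rate, max_error_rate=0.001):
    """Evaluate one scheduled (or ad hoc) run and return the new deployment state."""
    if state["status"] != "in_progress":
        return state  # nothing to do once the rollout finished or was pulled
    if error_rate > max_error_rate:
        # SLO missed: take the canary out of the load balancer.
        return {"percent": 0, "status": "removed_from_lb"}
    index = STAGES.index(state["percent"])
    if index == len(STAGES) - 1:
        return {"percent": 100, "status": "complete"}
    return {"percent": STAGES[index + 1], "status": "in_progress"}


state = {"percent": 10, "status": "in_progress"}
state = watcher_tick(state, error_rate=0.0)  # healthy run: promote 10% to 50%
```

Because each run takes the current state and returns the next one, the same function works whether it is triggered by a scheduler every 10 minutes or run by hand.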

Real Time

Applications can be self-aware.

Alerts can trigger removal from an LB or an automatic rollback.

Common Causes of Outages

● Overload
● Retry spikes
● Pets
● Monitoring gaps
● Scaling boundaries
● Bad configuration
● Lengthy startup times

Want to learn more?
Follow us!

@kenzanmedia

www.linkedin.com/company/kenzan-media

techblog.kenzan.com

www.facebook.com/kenzanmedia/
