@pas256 https://cloudnative.io/ Auto Scaling Groups Advanced AWS meetup Peter Sankauskas Founder of CloudNative @pas256

Auto Scaling Groups

Page 1: Auto Scaling Groups

Auto Scaling Groups

Advanced AWS meetup

Peter Sankauskas, Founder of CloudNative

@pas256

Page 2: Auto Scaling Groups

Daily life

More users

Higher costs

More logs

More data

New engineers

More instances

Increased deployment frequency

Reduce costs

Eliminate deployment risks

Boss

Deadline

Page 3: Auto Scaling Groups

Your Goal

Sleep

Reliable

Social life

Sleep

Uptime

Time with family

Sleep

Page 4: Auto Scaling Groups

“Don’t hate the Pager, hate the game”

– PagerDuty

Page 5: Auto Scaling Groups

Old world

[Chart: Instances running (axis 0 to 11) versus Used Capacity; 70% Wasted]

Page 6: Auto Scaling Groups

Auto Scaling Group

• Your assistant in the cloud

• First level support

• Automation

[Chart: Used Capacity (axis 0 to 11)]

Page 7: Auto Scaling Groups

Auto Scaling Group

• Capacity: minimum, maximum, desired

• Access: ELB

• Policies

• Where:

  • Availability Zones

  • VPC Subnets

[Diagram: an ASG with one Launch Config, three Scaling Policies, and three Scheduled Actions]

Page 8: Auto Scaling Groups

{
  "Type" : "AWS::AutoScaling::AutoScalingGroup",
  "Properties" : {
    "AvailabilityZones": [ String, ... ],
    "Cooldown": String,
    "DesiredCapacity": String,
    "HealthCheckGracePeriod": Integer,
    "HealthCheckType": String,
    "LaunchConfigurationName": String,
    "LoadBalancerNames": [ String, ... ],
    "MaxSize": String,
    "MetricsCollection": [ MetricsCollection, ... ],
    "MinSize": String,
    "NotificationConfiguration": NotificationConfiguration,
    "PlacementGroup": String,
    "Tags": [ Auto Scaling Tag, ... ],
    "TerminationPolicies": [ String, ... ],
    "VPCZoneIdentifier": [ String, ... ]
  }
}

Page 9: Auto Scaling Groups

Launch Configuration

• Every ASG needs a Launch Configuration

• Describes what an individual EC2 instance looks like

  • AMI

  • Instance type

  • Security groups

Page 10: Auto Scaling Groups

{
  "Type" : "AWS::AutoScaling::LaunchConfiguration",
  "Properties" : {
    "AssociatePublicIpAddress": Boolean,
    "BlockDeviceMappings": [ BlockDeviceMapping, ... ],
    "EbsOptimized": Boolean,
    "IamInstanceProfile": String,
    "ImageId": String,
    "InstanceMonitoring": Boolean,
    "InstanceType": String,
    "KernelId": String,
    "KeyName": String,
    "RamDiskId": String,
    "SecurityGroups": [ SecurityGroup, ... ],
    "SpotPrice": String,
    "UserData": String
  }
}
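
As a rough sketch of how the two resource types fit together, the standard-library Python below builds a minimal CloudFormation template as a dict. The logical names `LaunchConfig` and `WebASG`, the function name, and the AMI and security group IDs are placeholders chosen for illustration, not values from the talk.

```python
import json


def minimal_asg_template(image_id, instance_type, security_groups,
                         availability_zones, min_size, max_size):
    """Return a template dict with one LaunchConfiguration and one ASG."""
    return {
        "Resources": {
            "LaunchConfig": {
                "Type": "AWS::AutoScaling::LaunchConfiguration",
                "Properties": {
                    "ImageId": image_id,
                    "InstanceType": instance_type,
                    "SecurityGroups": list(security_groups),
                },
            },
            "WebASG": {
                "Type": "AWS::AutoScaling::AutoScalingGroup",
                "Properties": {
                    "AvailabilityZones": list(availability_zones),
                    # The ASG points at its launch configuration by reference.
                    "LaunchConfigurationName": {"Ref": "LaunchConfig"},
                    # The syntax above declares the size properties as strings.
                    "MinSize": str(min_size),
                    "MaxSize": str(max_size),
                },
            },
        }
    }


# Placeholder IDs; substitute real ones before using the output.
template = minimal_asg_template(
    "ami-12345678", "m3.medium", ["sg-12345678"],
    ["us-west-2a", "us-west-2b"], 1, 4)
print(json.dumps(template, indent=2))
```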

Page 11: Auto Scaling Groups

Scaling Plans

1. Fixed

2. Manual

3. Scheduled

4. Dynamic

Page 12: Auto Scaling Groups

Fixed

• Ensure a fixed number of instances is always running

• Set MinSize = MaxSize

• Examples

  • Any “master” service

  • ZooKeeper - 3 nodes across 3 AZs

  • Cassandra

[Chart: Used Capacity (axis 0 to 3)]

Page 13: Auto Scaling Groups

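# One Asgard instance - troposphere example
# Setup assumed here; it is not shown on the slide.
from troposphere import FindInMap, Ref, Template
import troposphere.autoscaling as asg

t = Template()

# asgardInstanceProfile, asgardInstanceSecurityGroup and the "AWSRegion2AMI"
# mapping are defined elsewhere in the template.
launchConfig = t.add_resource(asg.LaunchConfiguration("launchConf",
    AssociatePublicIpAddress=True,
    IamInstanceProfile=Ref(asgardInstanceProfile),
    ImageId=FindInMap("AWSRegion2AMI", Ref("AWS::Region"), "AMI"),
    InstanceType="m3.medium",
    KeyName="admin",
    SecurityGroups=[Ref(asgardInstanceSecurityGroup)],
))

# MinSize == MaxSize keeps exactly one instance running.
asgardASG = t.add_resource(asg.AutoScalingGroup("asgardASG",
    Tags=[asg.Tag("Name", "Asgard", True)],
    Cooldown="120",
    MinSize="1",
    MaxSize="1",
    AvailabilityZones=["us-west-2a", "us-west-2b"],
    VPCZoneIdentifier=["subnet-c46c6982", "subnet-8133f6e4"],
    # Reference the launch configuration variable defined above.
    LaunchConfigurationName=Ref(launchConfig),
))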

Page 14: Auto Scaling Groups

Manual Scaling

• Use API to change capacity on demand

SetDesiredCapacity

• AutoScalingGroupName = my-asg

• DesiredCapacity = 20

[Chart: Used Capacity (axis 0 to 2)]
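
A minimal sketch of the SetDesiredCapacity call, assuming a boto3-style client. To keep the example runnable without AWS credentials, `RecordingClient` is a hypothetical stand-in that only records calls; `scale_to` is an illustrative helper name.

```python
class RecordingClient:
    """Hypothetical stand-in for an Auto Scaling client; records calls only."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Any method not defined on the class is recorded with its keyword args.
        def record(**kwargs):
            self.calls.append((name, kwargs))
            return {}
        return record


def scale_to(client, group_name, desired, honor_cooldown=False):
    """Ask the ASG to run exactly `desired` instances."""
    if desired < 0:
        raise ValueError("desired capacity cannot be negative")
    client.set_desired_capacity(
        AutoScalingGroupName=group_name,
        DesiredCapacity=desired,
        HonorCooldown=honor_cooldown,
    )


client = RecordingClient()
scale_to(client, "my-asg", 20)
print(client.calls)
```

With real credentials, the same helper should work with `boto3.client("autoscaling")` passed in place of the stand-in.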

Page 15: Auto Scaling Groups

Scheduled

• At this time, set capacity to X

• Each ScheduledAction must have a unique start time

• Guaranteed order of execution within same ASG

[Chart: Used Capacity (axis 0 to 11)]

Page 16: Auto Scaling Groups

Specific date and time

PutScheduledUpdateGroupAction

• ScheduledActionName = ScaleOut

• AutoScalingGroupName = my-asg

• DesiredCapacity = 3

• StartTime = "2013-05-12T08:00:00Z"

Page 17: Auto Scaling Groups

Recurring schedule

PutScheduledUpdateGroupAction

• ScheduledActionName = Scaleout-schedule-year

• AutoScalingGroupName = my-asg

• DesiredCapacity = 3

• Recurrence = "30 0 1 1,6,12 0"
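
The two scheduled-action variants can be sketched as follows, again assuming a boto3-style client and using a hypothetical recording stand-in so the code runs offline. `describe_recurrence` is an illustrative helper that labels the five cron-style fields (minute, hour, day of month, month, day of week) of the recurrence string.

```python
from datetime import datetime, timezone

CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_recurrence(expression):
    """Label the five fields of a cron-style recurrence string."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError("expected 5 cron fields, got %d" % len(parts))
    return dict(zip(CRON_FIELDS, parts))


class RecordingClient:
    """Hypothetical stand-in for an Auto Scaling client; records calls only."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        def record(**kwargs):
            self.calls.append((name, kwargs))
            return {}
        return record


def schedule_once(client, group, name, capacity, start_time):
    """One-off action at a specific date and time."""
    client.put_scheduled_update_group_action(
        AutoScalingGroupName=group, ScheduledActionName=name,
        DesiredCapacity=capacity, StartTime=start_time)


def schedule_recurring(client, group, name, capacity, recurrence):
    """Recurring action driven by a cron-style expression."""
    describe_recurrence(recurrence)  # fail early on a malformed expression
    client.put_scheduled_update_group_action(
        AutoScalingGroupName=group, ScheduledActionName=name,
        DesiredCapacity=capacity, Recurrence=recurrence)


client = RecordingClient()
schedule_once(client, "my-asg", "ScaleOut", 3,
              datetime(2013, 5, 12, 8, 0, tzinfo=timezone.utc))
schedule_recurring(client, "my-asg", "Scaleout-schedule-year", 3,
                   "30 0 1 1,6,12 0")
print(describe_recurrence("30 0 1 1,6,12 0"))
```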

Page 18: Auto Scaling Groups

Dynamic Scaling

• Best Utilization

• Lowest Cost

[Chart: Used Capacity (axis 0 to 11)]

Page 19: Auto Scaling Groups

Trigger: CloudWatch Alarm

• Metrics

  • CPU Utilization

  • Network in/out

  • Size of queue (SQS)

  • Anything you put into CloudWatch

• Set the Alarm Action to the ARN of the ScalingPolicy
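
A minimal sketch of wiring an alarm to a scaling policy, assuming boto3-style `put_scaling_policy` and `put_metric_alarm` calls. The two fake clients, the ARN format they return, the 70% threshold, and the policy and alarm names are all illustrative assumptions.

```python
class FakeAutoScaling:
    """Hypothetical stand-in: returns a made-up policy ARN."""

    def __init__(self):
        self.policies = []

    def put_scaling_policy(self, **kwargs):
        self.policies.append(kwargs)
        return {"PolicyARN": "arn:aws:autoscaling:example:policy/" + kwargs["PolicyName"]}


class FakeCloudWatch:
    """Hypothetical stand-in: records alarms instead of creating them."""

    def __init__(self):
        self.alarms = []

    def put_metric_alarm(self, **kwargs):
        self.alarms.append(kwargs)


def scale_out_on_high_cpu(autoscaling, cloudwatch, group, threshold=70.0):
    """Create a scale-out policy, then an alarm whose action is the policy ARN."""
    policy = autoscaling.put_scaling_policy(
        AutoScalingGroupName=group,
        PolicyName=group + "-scale-out",
        AdjustmentType="ChangeInCapacity",
        ScalingAdjustment=2,
    )
    cloudwatch.put_metric_alarm(
        AlarmName=group + "-cpu-high",
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "AutoScalingGroupName", "Value": group}],
        Statistic="Average",
        Period=300,
        EvaluationPeriods=2,
        Threshold=threshold,
        ComparisonOperator="GreaterThanThreshold",
        # The alarm action is the ARN of the scaling policy.
        AlarmActions=[policy["PolicyARN"]],
    )
    return policy["PolicyARN"]


autoscaling, cloudwatch = FakeAutoScaling(), FakeCloudWatch()
arn = scale_out_on_high_cpu(autoscaling, cloudwatch, "my-asg")
print(arn)
```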

Page 20: Auto Scaling Groups

Action: ScalingPolicy

• Adjustment Types

  • Change by number

    • E.g. Scale Out: Add 2 more instances

    • E.g. Scale In: Remove 1 instance

  • Exact

    • E.g. Scale Out: Have exactly 8 instances

  • Percentage

    • E.g. Scale Out: Add 25% more instances
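
The arithmetic behind the three adjustment types can be sketched in plain Python. The type names follow the Auto Scaling API (`ChangeInCapacity`, `ExactCapacity`, `PercentChangeInCapacity`); the rounding of fractional percentage results and the clamping to the group's minimum and maximum reflect the documented behaviour as understood here, so treat those details as assumptions to verify.

```python
import math


def apply_adjustment(current, adjustment_type, value, min_size, max_size):
    """Return the new desired capacity after one scaling adjustment."""
    if adjustment_type == "ChangeInCapacity":
        target = current + value
    elif adjustment_type == "ExactCapacity":
        target = value
    elif adjustment_type == "PercentChangeInCapacity":
        delta = current * value / 100.0
        if 0 < delta < 1:
            delta = 1            # small positive changes still add one instance
        elif -1 < delta < 0:
            delta = -1           # small negative changes still remove one
        else:
            delta = math.trunc(delta)  # otherwise round toward zero
        target = current + delta
    else:
        raise ValueError("unknown adjustment type: %s" % adjustment_type)
    # The result never leaves the group's configured bounds.
    return max(min_size, min(max_size, target))


print(apply_adjustment(4, "ChangeInCapacity", 2, 1, 20))
print(apply_adjustment(4, "ExactCapacity", 8, 1, 20))
print(apply_adjustment(8, "PercentChangeInCapacity", 25, 1, 20))
```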

Page 21: Auto Scaling Groups

Cooldown

• After a ScalingPolicy has been fired, wait X seconds before performing any other actions.

• Manual Scaling: SetDesiredCapacity

• HonorCooldown = True/False
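
A small sketch of the cooldown decision, in plain Python. The function names are illustrative; the 120-second value matches the `Cooldown="120"` used in the Asgard example, and the timestamps are made up.

```python
from datetime import datetime, timedelta


def in_cooldown(last_activity, cooldown_seconds, now):
    """True while the cooldown window after the last scaling activity is open."""
    return now < last_activity + timedelta(seconds=cooldown_seconds)


def should_scale(last_activity, cooldown_seconds, now, honor_cooldown=True):
    """A manual request may skip the wait by setting honor_cooldown to False."""
    if honor_cooldown and in_cooldown(last_activity, cooldown_seconds, now):
        return False
    return True


last = datetime(2014, 1, 1, 12, 0, 0)
print(should_scale(last, 120, datetime(2014, 1, 1, 12, 1, 0)))
print(should_scale(last, 120, datetime(2014, 1, 1, 12, 1, 0), honor_cooldown=False))
print(should_scale(last, 120, datetime(2014, 1, 1, 12, 3, 0)))
```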

Page 22: Auto Scaling Groups

Load Balancing

• Put an ELB in front of the instances in your ASG

• Set when creating the ASG

• Zero effort in adding and removing instances

• Additional health check options

Page 23: Auto Scaling Groups

Health Checks

• By default, ASG uses EC2 Status Checks

• If you have an ELB, you can use the same ELB health checks

• HTTP:80/healthcheck

• HTTP 200 response is the only thing that is considered healthy

• E.g. Return something else while app is loading
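
One way to serve such a health check, sketched with the Python standard library. The `APP_STATE` flag, the 503 status returned while loading, and the `serve` helper are assumptions for illustration; any non-200 response would be treated as unhealthy.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Flipped to True by the application once it has finished loading.
APP_STATE = {"ready": False}


def health_response(ready):
    """Only a 200 counts as healthy; report 503 while still loading."""
    return (200, b"OK") if ready else (503, b"LOADING")


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/healthcheck":
            self.send_error(404)
            return
        status, body = health_response(APP_STATE["ready"])
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep health check polling out of the logs


def serve(port=80):
    """Blocking call; run it in the application's own process or thread."""
    HTTPServer(("", port), HealthHandler).serve_forever()


print(health_response(APP_STATE["ready"]))
```

Calling `serve()` and pointing the ELB health check at `HTTP:80/healthcheck` keeps a new instance out of rotation until the application sets `APP_STATE["ready"] = True`.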

Page 24: Auto Scaling Groups

Termination Policy

• OldestInstance

• NewestInstance

• OldestLaunchConfiguration

• ClosestToNextInstanceHour
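
A simplified sketch of how the first two policies choose a victim, in plain Python. It deliberately ignores everything else the service considers (such as Availability Zone balance), and the instance records are made up for illustration.

```python
from datetime import datetime


def pick_instance_to_terminate(instances, policy):
    """instances: list of dicts with "id" and "launch_time" keys."""
    if not instances:
        raise ValueError("no instances to choose from")
    if policy == "OldestInstance":
        return min(instances, key=lambda i: i["launch_time"])["id"]
    if policy == "NewestInstance":
        return max(instances, key=lambda i: i["launch_time"])["id"]
    raise ValueError("policy not covered by this sketch: %s" % policy)


fleet = [
    {"id": "i-aaa", "launch_time": datetime(2014, 3, 1, 9, 0)},
    {"id": "i-bbb", "launch_time": datetime(2014, 3, 5, 9, 0)},
    {"id": "i-ccc", "launch_time": datetime(2014, 3, 3, 9, 0)},
]
print(pick_instance_to_terminate(fleet, "OldestInstance"))
print(pick_instance_to_terminate(fleet, "NewestInstance"))
```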

Page 25: Auto Scaling Groups

Page 26: Auto Scaling Groups

Requirements for Dynamic Scaling

• Stateless application

• Configuration must be 100% automated

• Tools understand dynamic environments

  • Config management

  • Monitoring

  • Log aggregation

Page 27: Auto Scaling Groups

Page 28: Auto Scaling Groups

Migration

• Create an ASG or LaunchConfiguration from an already running instance

• Put that instance in the ASG
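
The two migration steps might look like this with a boto3-style client. This is a sketch under assumptions: the client is a hypothetical recording stand-in, the group is created empty (capacity 0) so that attaching the running instance brings it to one, and the group name and instance ID are placeholders.

```python
class RecordingClient:
    """Hypothetical stand-in for an Auto Scaling client; records calls only."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        def record(**kwargs):
            self.calls.append((name, kwargs))
            return {}
        return record


def adopt_instance(client, group, instance_id, max_size=1):
    """Create an ASG modelled on a running instance, then attach that instance."""
    # Step 1: the instance ID tells the service what new instances should look like.
    client.create_auto_scaling_group(
        AutoScalingGroupName=group,
        InstanceId=instance_id,
        MinSize=0,
        MaxSize=max_size,
        DesiredCapacity=0,
    )
    # Step 2: put the existing instance under the group's management.
    client.attach_instances(
        AutoScalingGroupName=group,
        InstanceIds=[instance_id],
    )


client = RecordingClient()
adopt_instance(client, "my-asg", "i-12345678")
print([name for name, _ in client.calls])
```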

Page 29: Auto Scaling Groups

{
  "Type" : "AWS::AutoScaling::AutoScalingGroup",
  "Properties" : {
    "AvailabilityZones" : [ String, ... ],
    "Cooldown" : String,
    "DesiredCapacity" : String,
    "HealthCheckGracePeriod" : Integer,
    "HealthCheckType" : String,
    "InstanceId" : String,
    "LaunchConfigurationName" : String,
    "LoadBalancerNames" : [ String, ... ],
    "MaxSize" : String,
    "MetricsCollection" : [ MetricsCollection, ... ],
    "MinSize" : String,
    "NotificationConfiguration" : NotificationConfiguration,
    "PlacementGroup" : String,
    "Tags" : [ Auto Scaling Tag, ... ],
    "TerminationPolicies" : [ String, ... ],
    "VPCZoneIdentifier" : [ String, ... ]
  }
}

Page 30: Auto Scaling Groups

{
  "Type" : "AWS::AutoScaling::LaunchConfiguration",
  "Properties" : {
    "AssociatePublicIpAddress" : Boolean,
    "BlockDeviceMappings" : [ BlockDeviceMapping, ... ],
    "EbsOptimized" : Boolean,
    "IamInstanceProfile" : String,
    "ImageId" : String,
    "InstanceId" : String,
    "InstanceMonitoring" : Boolean,
    "InstanceType" : String,
    "KernelId" : String,
    "KeyName" : String,
    "RamDiskId" : String,
    "SecurityGroups" : [ SecurityGroup, ... ],
    "SpotPrice" : String,
    "UserData" : String
  }
}

Page 31: Auto Scaling Groups

# Instance Configuration - Self healing NAT - troposphere
# Assumes (not shown on the slide): from troposphere import Base64, Join, Ref
# and import troposphere.autoscaling as asg, with t = Template().
# natSecurityGroup and natInstanceProfile are defined elsewhere in the template.
natLaunchConfig = t.add_resource(asg.LaunchConfiguration(
    "natLaunchConfig",
    AssociatePublicIpAddress=True,
    InstanceType="t1.micro",
    ImageId="ami-f032acc0",
    SecurityGroups=[Ref(natSecurityGroup)],
    IamInstanceProfile=Ref(natInstanceProfile),
    UserData=Base64(Join("\n", [
        "#!/bin/bash",
        "yum update -y",
        # Look up this instance's ID, its region, its VPC and the VPC's main route table
        "instanceId=`/opt/aws/bin/ec2-metadata -i | cut -f2 -d' '`",
        "region=`/opt/aws/bin/ec2-metadata -z | cut -f2 -d' ' | sed '$s/.$//'`",
        "vpcId=`aws ec2 describe-instances --instance-ids $instanceId --region $region --query 'Reservations[*].Instances[*].VpcId' --output text`",
        """rtbId=`aws ec2 describe-route-tables --region $region --filters "[{\\"Name\\":\\"vpc-id\\",\\"Values\\":[\\"$vpcId\\"]},{\\"Name\\":\\"association.main\\",\\"Values\\":[\\"true\\"]}]" --query RouteTables[*].RouteTableId --output text`""",
        # Disable the source/destination check so the instance can forward traffic
        """aws ec2 modify-instance-attribute --instance-id $instanceId --source-dest-check '{"Value": false}' --region $region --output table""",
        # Point the default route at this instance; one of the two calls succeeds,
        # depending on whether a default route already exists
        "aws ec2 replace-route --route-table-id $rtbId --destination-cidr-block 0.0.0.0/0 --instance-id $instanceId --region $region --output table",
        "aws ec2 create-route --route-table-id $rtbId --destination-cidr-block 0.0.0.0/0 --instance-id $instanceId --region $region --output table"
    ]))
))

Page 32: Auto Scaling Groups

UserData and cloud-init

• Inside LaunchConfiguration

• Set UserData script to be run by cloud-init

• If you are using Chef, this is what you will do

• More details:

  • Watch Episode #4 on Answers for AWS

Page 33: Auto Scaling Groups

Baking AMIs

• Raw: Do everything on boot

• Fully Baked: Immutable infrastructure

• Half-Baked: Anything in-between

http://answersforaws.com/blog/2013/11/half-baked/

Page 34: Auto Scaling Groups

Deploy Changes

• Option 1: Change AMI or User Data in LaunchConfiguration

• NOTE: This has no immediate outcome

• Only affects newly launched instances

• Revisit TerminationPolicy

• You need to terminate existing instances so that new ones come up with the changes
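
A hedged sketch of that last step: find instances still running an old launch configuration and terminate them a few at a time so the ASG replaces them. It assumes boto3-style `describe_auto_scaling_groups` and `terminate_instance_in_auto_scaling_group` calls; the fake client, the instance IDs and the launch configuration names are invented for the example, and waiting for replacements to become healthy is left out.

```python
class FakeAutoScaling:
    """Hypothetical stand-in that serves canned data and records terminations."""

    def __init__(self, instances):
        self.instances = instances
        self.terminated = []

    def describe_auto_scaling_groups(self, AutoScalingGroupNames):
        return {"AutoScalingGroups": [{"Instances": list(self.instances)}]}

    def terminate_instance_in_auto_scaling_group(self, InstanceId,
                                                 ShouldDecrementDesiredCapacity):
        self.terminated.append(InstanceId)


def stale_instance_ids(group, current_launch_config):
    """IDs of instances that were not launched from the current launch config."""
    return [i["InstanceId"] for i in group["Instances"]
            if i.get("LaunchConfigurationName") != current_launch_config]


def terminate_stale(client, group_name, current_launch_config, limit=1):
    """Terminate up to `limit` stale instances; the ASG launches replacements."""
    response = client.describe_auto_scaling_groups(AutoScalingGroupNames=[group_name])
    group = response["AutoScalingGroups"][0]
    terminated = []
    for instance_id in stale_instance_ids(group, current_launch_config)[:limit]:
        client.terminate_instance_in_auto_scaling_group(
            InstanceId=instance_id,
            # Keep desired capacity unchanged so a replacement is launched.
            ShouldDecrementDesiredCapacity=False,
        )
        terminated.append(instance_id)
    return terminated


client = FakeAutoScaling([
    {"InstanceId": "i-old1", "LaunchConfigurationName": "app-v1"},
    {"InstanceId": "i-new1", "LaunchConfigurationName": "app-v2"},
    {"InstanceId": "i-old2", "LaunchConfigurationName": "app-v1"},
])
terminated = terminate_stale(client, "my-asg", "app-v2")
print(terminated)
```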

Page 35: Auto Scaling Groups

Deploy Changes

• Option 2: Create a completely new stack

• Use CloudFormation (or whatever) to create a new ASG, LaunchConfig, ScalingPolicies, ELB, Security Group, VPC, Subnets, etc.

• Overkill

• If you have high traffic, the new ELB will not be pre-scaled and will not handle the load

• Need to contact your AWS TAM (Technical Account Manager)

Page 36: Auto Scaling Groups

Blue/Green Deployment

Or is it a red/black deployment… or is it an A/B deployment?

• Option 3:

  • Reuse existing infrastructure including the same ELB

  • Create a new ASG and LaunchConfig

  • Switch traffic at the ELB from old ASG to new ASG
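
The traffic switch can be sketched with boto3-style `attach_load_balancers` and `detach_load_balancers` calls. The recording client is a hypothetical stand-in, the group and load balancer names are placeholders, and a real switch would wait for the new instances to pass the ELB health check before detaching the old group.

```python
class RecordingClient:
    """Hypothetical stand-in for an Auto Scaling client; records calls only."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        def record(**kwargs):
            self.calls.append((name, kwargs))
            return {}
        return record


def switch_traffic(client, load_balancer, old_group, new_group):
    """Attach the ELB to the new ASG first, then detach it from the old one."""
    client.attach_load_balancers(
        AutoScalingGroupName=new_group, LoadBalancerNames=[load_balancer])
    client.detach_load_balancers(
        AutoScalingGroupName=old_group, LoadBalancerNames=[load_balancer])


def roll_back(client, load_balancer, old_group, new_group):
    """Rolling back is the same switch in the opposite direction."""
    switch_traffic(client, load_balancer, old_group=new_group, new_group=old_group)


client = RecordingClient()
switch_traffic(client, "my-elb", old_group="app-v1", new_group="app-v2")
roll_back(client, "my-elb", old_group="app-v1", new_group="app-v2")
print(client.calls)
```

Keeping the old ASG around until the new version has proven itself is what makes the rollback a single call.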

Page 37: Auto Scaling Groups

Demo

Page 38: Auto Scaling Groups

“It’s not about how fast you can deploy, it is about how fast you can rollback”

– Peter Sankauskas… just now

Page 39: Auto Scaling Groups

Canary Deployment

• Very similar to blue/green deployment

• New ASG and LaunchConfig

• Add traffic to only 1 instance in the new ASG

• Then 2 instances

• Up to 100%

• Both versions running side by side

• Roll off traffic from old ASG instances
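
The ramp-up above can be expressed as a list of capacity steps. This is only a sketch: doubling the new ASG at each step is an assumption made for the example (the slide only says 1, then 2, up to 100%), and `canary_plan` is an illustrative name.

```python
def canary_plan(total):
    """Return (new_asg_capacity, old_asg_capacity) steps for a canary rollout."""
    if total < 1:
        raise ValueError("total must be at least 1")
    steps = []
    new = 1
    while new < total:
        steps.append((new, total))   # old ASG still at full size
        new *= 2                     # assumed ramp: 1, 2, 4, ...
    steps.append((total, total))     # both versions running side by side
    steps.append((total, 0))         # roll traffic off the old ASG
    return steps


for new_capacity, old_capacity in canary_plan(8):
    print("new ASG: %d, old ASG: %d" % (new_capacity, old_capacity))
```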

Page 40: Auto Scaling Groups

Running multiple versions

• DB schema changes are on a different schedule to code deployments

• mcfunley (Etsy): “We deploy schema changes once per week. The code always works against both versions of the schema. We never take downtime for schema changes. We avoid data loss by doing soft deletes as much as we can.”

• Deploy features dark

• Use Feature Flags
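
A minimal feature-flag sketch in plain Python. The flag name, the in-memory `FLAGS` dict and the `checkout` function are illustrative; a real deployment would usually read flags from configuration or a flag service so they can change without a deploy.

```python
# Code for the new path ships "dark": deployed, but switched off by default.
FLAGS = {"new_checkout": False}


def is_enabled(flag, flags=FLAGS):
    """Unknown flags count as off, so old and new code can run side by side."""
    return bool(flags.get(flag, False))


def checkout(cart, flags=FLAGS):
    if is_enabled("new_checkout", flags):
        return "new checkout: %d items" % len(cart)
    return "old checkout: %d items" % len(cart)


print(checkout(["book", "pen"]))
print(checkout(["book", "pen"], flags={"new_checkout": True}))
```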

Page 41: Auto Scaling Groups

Tools

• Baking AMIs

  • Packer - HashiCorp

  • Aminator - Netflix

  • CloudNative

• Deployment

  • Asgard - Netflix

  • CloudNative

Page 42: Auto Scaling Groups

New World

• Automation expert

• Stateless, independently scalable apps

• Allergic to manual labor

• Embrace your laziness

• Auto Scaling Groups provide:

  • Zero-effort scaling

  • Fault-tolerance

  • Increase reliability & uptime

  • Decrease cost

Page 43: Auto Scaling Groups

Sleep