AWS October Webinar Series - Using Spot Instances to Save up to 90% off Your EC2 Bill

© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Boyd McGeachie

October 28th, 2015

Using Spot Instances to Save up to 90% off Your EC2 Bill

Spare capacity at scale

AWS has more than a million active customers in 190 countries.

Amazon EC2 instance usage has increased 93% YoY, comparing Q4 2014 and Q4 2013, not including Amazon use.

“By using AWS Spot instances, we've been able to save 75% a month simply by changing four lines of code. It makes perfect sense for saving money when you're running continuous integration workloads or pipeline processing.” - Matthew Leventi, Lead Engineer, Lyft

Why use Spot?

39 years of drug research re-processed, using over 80,000 cores, in 9 hours for $4,232

Petabyte-Scale Data Pipelines with Docker, Luigi and Elastic Spot Instances

With Spot the rules are simple

Amazon EC2 Spot – in the wild

1) We make this easy using the Spot bid advisor

2) With deliberate pool selection and bidding, you will keep your Spot instance as long as you need to.

3) And with new features like Spot fleet diversified we do the heavy lifting for you...

C3     1a      1b      1c      OnDemand
8XL    $0.50   $0.27   $0.29   $1.76
4XL    $0.21   $0.30   $0.16   $0.88
2XL    $0.08   $0.07   $0.08   $0.44
XL     $0.04   $0.05   $0.04   $0.22
L      $0.01   $0.01   $0.04   $0.11

Show me the markets!

Each instance family

Each instance size

Each Availability Zone

In every region

Is a separate Spot Market
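As a rough illustration of the point above, every combination of instance family, size, and Availability Zone can be enumerated as its own pool. The family, size, and zone names below are illustrative assumptions, not a list taken from the talk.

```python
from itertools import product

# Hypothetical inputs for illustration only.
families = ["c3", "r3"]
sizes = ["large", "xlarge", "2xlarge", "4xlarge", "8xlarge"]
zones = ["us-east-1a", "us-east-1b", "us-east-1c"]

def spot_pools(families, sizes, zones):
    """Return every (instance_type, zone) pair; each is a separate Spot market."""
    return [(f"{family}.{size}", zone)
            for family, size, zone in product(families, sizes, zones)]

pools = spot_pools(families, sizes, zones)
print(len(pools))  # 2 families x 5 sizes x 3 zones = 30 pools
```

Even this small selection in a single region yields 30 separately priced markets, which is what makes flexibility across pools useful.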

Bid Price vs. Market Price

[Chart: market price compared with 25%, 50%, and 75% bids]

You pay the market price
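The relationship between bid and market price can be sketched with a deliberately simplified model: while the market price is at or below the bid, the instance runs and is charged the market price; once the market price exceeds the bid, the instance is interrupted. The prices below, and the idea of expressing a bid as a fraction of a $1.00 On-Demand price, are hypothetical values chosen for illustration.

```python
def hourly_charge(bid, market_price):
    """Simplified model: return the price paid for one hour, or None if outbid."""
    if market_price > bid:
        return None  # market price exceeds the bid: the instance is interrupted
    return market_price  # you pay the market price, not your bid

on_demand = 1.00  # hypothetical On-Demand price in $/hour
market = 0.30     # hypothetical current Spot market price in $/hour
for fraction in (0.25, 0.50, 0.75):
    bid = fraction * on_demand
    print(f"{int(fraction * 100)}% bid -> {hourly_charge(bid, market)}")
```

In this model the 50% and 75% bidders pay the same amount; a higher bid only changes how far the market can rise before an interruption.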

And now…

Spot fleet helps you

Launch thousands of Spot Instances with one RequestSpotFleet call.

Get Best Price: Find the lowest priced horsepower that works for you.

or

Get Diversified Resources: Diversify your fleet. Grow your availability.

and

Apply Custom Weighting: Create your own capacity unit based on your application needs.
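To make custom weighting concrete, here is a minimal sketch that treats one capacity unit as one vCPU and computes how many instances of a single type would be needed to reach a target. The target of 100 units is a hypothetical value, and the sketch assumes the count is rounded up so capacity never falls below the target; a real fleet may mix instance types.

```python
import math

# Assumed weights: one capacity unit per vCPU for each C3 size.
weights = {"c3.large": 2, "c3.xlarge": 4, "c3.2xlarge": 8,
           "c3.4xlarge": 16, "c3.8xlarge": 32}

def instances_needed(target_capacity, weight):
    """Instances of one type needed to supply at least target_capacity units."""
    return math.ceil(target_capacity / weight)

for instance_type, weight in weights.items():
    print(instance_type, instances_needed(100, weight))
```

Because the bid is expressed per capacity unit, instance types of very different sizes can be compared on the same footing.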

EC2 Spot fleet

It is easy!

aws ec2 request-spot-fleet --spot-fleet-request-config file://config.json

{
  "IamFleetRole": "arn:aws:iam::781603563322:role/fleet-role",
  "TargetCapacity": 100,
  "SpotPrice": "0.03",
  "ValidFrom": "2015-09-15T00:56:19Z",
  "ValidUntil": "2016-09-14T07:00:00Z",
  "TerminateInstancesWithExpiration": true,
  "LaunchSpecifications": [
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.large", "WeightedCapacity": 2, "SubnetId": "subnet-d0dc51fb" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.large", "WeightedCapacity": 2, "SubnetId": "subnet-64531413" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.large", "WeightedCapacity": 2, "SubnetId": "subnet-0b1b8052" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.xlarge", "WeightedCapacity": 4, "SubnetId": "subnet-d0dc51fb" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.xlarge", "WeightedCapacity": 4, "SubnetId": "subnet-64531413" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.xlarge", "WeightedCapacity": 4, "SubnetId": "subnet-0b1b8052" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.4xlarge", "WeightedCapacity": 16, "SubnetId": "subnet-d0dc51fb" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.4xlarge", "WeightedCapacity": 16, "SubnetId": "subnet-64531413" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.4xlarge", "WeightedCapacity": 16, "SubnetId": "subnet-0b1b8052" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.8xlarge", "WeightedCapacity": 32, "SubnetId": "subnet-d0dc51fb" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.8xlarge", "WeightedCapacity": 32, "SubnetId": "subnet-64531413" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.8xlarge", "WeightedCapacity": 32, "SubnetId": "subnet-0b1b8052" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.2xlarge", "WeightedCapacity": 8, "SubnetId": "subnet-d0dc51fb" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.2xlarge", "WeightedCapacity": 8, "SubnetId": "subnet-64531413" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.2xlarge", "WeightedCapacity": 8, "SubnetId": "subnet-0b1b8052" }
  ]
}

An easy to use interface that lets you launch spare EC2 instances in seconds

Helps you select and bid on the EC2 instances that meet your application's requirements

Simple to use dashboard lets you modify and manage your application’s compute capacity

EC2 Spot Console

Using a single additional Parameter

Run continuously for up to 6 hours

Save up to 50% off On-Demand pricing

EC2 Spot Blocks
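A minimal sketch of what the "single additional parameter" might look like in a request, assuming it is `BlockDurationMinutes` on the `RequestSpotInstances` API as Spot Blocks were offered at the time of this talk. The price, AMI, and instance type are placeholder values, and the helper name `spot_block_request` is hypothetical.

```python
def spot_block_request(spot_price, instance_count, block_hours, launch_spec):
    """Build keyword arguments for a Spot request with a fixed duration.

    Assumes the extra parameter is BlockDurationMinutes, given as a
    multiple of 60 up to 360 (6 hours).
    """
    if not 1 <= block_hours <= 6:
        raise ValueError("block_hours must be between 1 and 6")
    return {
        "SpotPrice": spot_price,
        "InstanceCount": instance_count,
        "BlockDurationMinutes": block_hours * 60,
        "LaunchSpecification": launch_spec,
    }

params = spot_block_request(
    "0.10", 1, 6,
    {"ImageId": "ami-0d4cfd66", "InstanceType": "c3.large"},
)
print(params["BlockDurationMinutes"])  # 360
# With boto3 and credentials: boto3.client("ec2").request_spot_instances(**params)
```

The function only builds the request; nothing is sent to AWS unless the commented call is used.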

What’s in 6 hours?

~ 21% less than 1 hour

~ 35% less than 2 hours

~ 40% less than 3 hours

In total roughly 50% of all instances live less than 6 hours

Let's see EC2 Spot in action.

Best Practices

Hadoop

Stateless Applications (e.g. web tiers)

Batch processing

EC2 best practices for Spot

Fault tolerance

Stateless

Multi-AZ

Loosely coupled

Instance flexibility

EC2 Spot – Hadoop

Core nodes

[Diagram: a Hadoop cluster with a master instance group (master node) and a core instance group whose nodes each run an HDFS DataNode]

Core nodes

[Diagram: the same Hadoop cluster, with a master instance group and a core instance group holding HDFS]

Can add core nodes:

More CPU

More memory

More HDFS space

Task nodes – Spot the opportunity

[Diagram: a Hadoop cluster with a master node, a core instance group (HDFS), and a task instance group (no HDFS)]

Provides compute resources:

CPU

Memory

Task Nodes – Multiple Instance Types

[Diagram: a Hadoop cluster with a master node, a core instance group (HDFS), and task nodes labeled "The opportunity"]

Can add and remove task nodes

c3.8xl, r3.8xl, r3.4xl, etc.

Multiple capacity pools

How flexible are you?

1. Within family

2. Across Families

Review and launch

{
  "AllocationStrategy": "diversified",
  "TargetCapacity": 1000,
  "SpotPrice": "0.005",
  "TerminateInstancesWithExpiration": true,
  "LaunchSpecifications": [
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.xlarge", "WeightedCapacity": 4, "SpotPrice": "0.0263", "SubnetId": "subnet-d0dc51fb" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.2xlarge", "WeightedCapacity": 8, "SpotPrice": "0.0263", "SubnetId": "subnet-d0dc51fb" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.4xlarge", "WeightedCapacity": 16, "SpotPrice": "0.0263", "SubnetId": "subnet-d0dc51fb" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.8xlarge", "WeightedCapacity": 32, "SpotPrice": "0.0263", "SubnetId": "subnet-d0dc51fb" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "r3.xlarge", "WeightedCapacity": 4, "SpotPrice": "0.0438", "SubnetId": "subnet-d0dc51fb" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "r3.2xlarge", "WeightedCapacity": 8, "SpotPrice": "0.0438", "SubnetId": "subnet-d0dc51fb" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "r3.4xlarge", "WeightedCapacity": 16, "SpotPrice": "0.0438", "SubnetId": "subnet-d0dc51fb" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "r3.8xlarge", "WeightedCapacity": 32, "SpotPrice": "0.0438", "SubnetId": "subnet-d0dc51fb" }
  ]
}

Results - Hadoop

Requested 1000 vCores over 30 days

Minimum 848 vCores

Mode 1008 vCores

Average 1005 vCores

Average Price of $0.0118 per vCore

Savings of over 81%
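The savings figure is a ratio between the average Spot price and an On-Demand baseline. The slide does not state the baseline, so the sketch below only works backwards from the two numbers given; the resulting baseline is an inference from that arithmetic, not a quoted price.

```python
def savings_fraction(spot_price, on_demand_price):
    """Fraction saved relative to On-Demand, e.g. 0.81 means 81% cheaper."""
    return 1 - spot_price / on_demand_price

def implied_on_demand(spot_price, savings):
    """On-Demand price implied by a Spot price and a savings fraction."""
    return spot_price / (1 - savings)

avg_spot = 0.0118  # average price per vCore from the results above
baseline = implied_on_demand(avg_spot, 0.81)
print(round(baseline, 4))  # roughly 0.0621 per vCore
```

Since the stated savings are "over 81%", the actual baseline would be at least this value.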

But what about HDFS?

[Diagram: a Hadoop cluster with a master node, core nodes (HDFS), and task nodes]

Can add and remove task nodes

cc2.8xl, r3.8xl, d2.4xl, etc.

Spot Blocks? Use EMR/S3?

EMRFS - Amazon S3 as HDFS

• No need to scale HDFS
  – Capacity
  – Replication for durability

• Amazon S3 scales with your data
  – Both in IOPS and data storage
  – Massively parallel

Spot blocks for HDFS

• If HDFS cluster lives for less than 6 hours

Hadoop on EC2 Spot – takeaways

Your Work

Run task nodes separately with EC2 Spot fleet

Consider Spot blocks for core/HDFS nodes

What EC2 Spot fleet does for you

Saves you money

Heterogeneous instance management

Scale on the unit that matters to you

Accelerate results (time is money)

Web Applications with Spot

Stateless Web Application

[Diagram: Elastic Load Balancing in front of stateless web servers (Spot) launched by Spot fleet in Availability Zone A and Availability Zone B, with session state data stored outside the web servers]

Diversification with EC2 Spot fleet

Multiple EC2 Spot instances selected

Multiple Availability Zones selected

Pick the instances with similar performance characteristics e.g. c3.large, m3.large, m4.large, r3.large, c4.large.
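A simplified sketch of the diversification idea: spread the target evenly across the selected pools so that losing any one pool removes only its share. The target of 50 instances is a hypothetical value, and the real service also considers price and availability, so treat this as an illustration of the principle rather than the fleet's actual algorithm.

```python
def diversify(target_capacity, pools):
    """Split target_capacity as evenly as possible across the given pools."""
    base, extra = divmod(target_capacity, len(pools))
    # The first `extra` pools each take one additional unit.
    return {pool: base + (1 if i < extra else 0) for i, pool in enumerate(pools)}

pools = ["c3.large", "c4.large", "m3.large", "m4.large", "r3.large"]
allocation = diversify(50, pools)
print(allocation)  # 10 instances per pool

# If one pool is interrupted, only its share of the fleet is lost.
worst_case_remaining = 50 - max(allocation.values())
print(worst_case_remaining)  # 40
```

Adding Availability Zones multiplies the number of pools, which shrinks the share lost to any single interruption.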

Multiple capacity pools

How flexible are you?

2. Across Families

3. Across Zones

Review and Launch

{
  "AllocationStrategy": "diversified",
  "TargetCapacity": 50,
  "SpotPrice": "0.01",
  "LaunchSpecifications": [
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.large", "SpotPrice": "0.105" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c4.large", "SpotPrice": "0.11" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "m3.large", "SpotPrice": "0.133" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "m4.large", "SpotPrice": "0.126" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "r3.large", "SpotPrice": "0.175" }
  ]
}

Results - Web Application

50 instances requested, over 30 days.

- Never dropped below 45 instances

- 85% discount if you wanted 50 and could withstand dropping to 45

- If you only wanted 45 the discount is still 83%

Some additional considerations

Session state

Elastic Load Balancing

Two minute warning

Where to store the state?

Session state for the web application in DynamoDB.

• Data replicated across Availability Zones.

You can also choose other databases to maintain state in your architecture.

• Amazon RDS using Multi-AZ deployments

• Amazon ElastiCache

Since Spot fleet is configured to span across multiple Availability Zones, we highly recommend enabling cross-zone load balancing for the load balancer.

To allow in-flight requests to complete when de-registering Spot instances that are about to be terminated, connection draining can be enabled on the load balancer with a timeout of 90 seconds.

Elastic Load Balancing
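The two load balancer settings described here can be expressed as a single attributes structure. The sketch below assumes a Classic Load Balancer managed through the boto3 `elb` client; the helper names and the load balancer name `my-load-balancer` are illustrative.

```python
def spot_friendly_elb_attributes(drain_timeout_seconds=90):
    """Attributes enabling cross-zone balancing and connection draining."""
    return {
        "CrossZoneLoadBalancing": {"Enabled": True},
        "ConnectionDraining": {"Enabled": True, "Timeout": drain_timeout_seconds},
    }

def apply_attributes(elb_client, load_balancer_name):
    """Apply the attributes using a boto3 'elb' (Classic Load Balancer) client."""
    return elb_client.modify_load_balancer_attributes(
        LoadBalancerName=load_balancer_name,
        LoadBalancerAttributes=spot_friendly_elb_attributes(),
    )

print(spot_friendly_elb_attributes())
# Usage (requires boto3 and AWS credentials):
#   import boto3
#   apply_attributes(boto3.client("elb"), "my-load-balancer")
```

A 90-second drain fits inside the two-minute termination warning, leaving some margin for the deregistration call itself.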

Capitalizing on two minute warning

When the Spot price exceeds your bid price, the instance will receive a two-minute warning

Every 5 seconds, check for the two-minute Spot instance termination notice using a script invoked at instance launch.

Sample script – two minutes left!

1) Check for 2 minute warning

2) If YES, detach instance from ELB

3) OTHERWISE, do nothing

4) Sleep for 5 seconds

if curl -s http://169.254.169.254/latest/meta-data/spot/termination-time | grep -q '.*T.*Z'; then
  # Termination notice received: look up this instance's ID.
  instance_id=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
  # Detach the instance from the load balancer, then flush session state.
  aws elb deregister-instances-from-load-balancer \
    --load-balancer-name my-load-balancer \
    --instances "$instance_id"
  /env/bin/flushsessiontoDBonterminationscript.sh
fi

For those of you - Using Auto Scaling

Two Auto Scaling groups

• On-Demand + Reserved for base use

• Add an additional Auto Scaling group with Spot

Both Auto Scaling groups behind the same Elastic Load Balancer.

Use the bid advisor to select the right instance type for your application.

Web Application Architecture with Spot

[Diagram: Elastic Load Balancing in front of two Auto Scaling groups spanning Availability Zone A and Availability Zone B: an On-Demand Auto Scaling group of stateless web servers and a Spot Auto Scaling group of stateless web servers (Spot), with session state data stored outside the web servers]

Batch Processing with Amazon EC2 Spot

Batch-oriented applications can use EC2 Spot for on-demand processing and save up to 90% on cost:

Monte Carlo simulation

Molecular modeling

Media processing

High energy simulations

[Architecture diagram: in the AWS cloud, a Region containing a VPC (172.16.0.0/16) with an Internet gateway and four subnets:]

region-1a - 172.16.0.0/20

region-1b - 172.16.16.0/20

region-1c - 172.16.32.0/20

region-1d - 172.16.48.0/20

[Components shown: a Cluster Controller, c3.4x Spot and r3.4x Spot instances in the subnets, Amazon S3, DynamoDB, Amazon SQS, and CloudWatch]
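The address ranges in the diagram can be sanity-checked with Python's standard `ipaddress` module. This sketch confirms each subnet sits inside the VPC range and prints how many addresses remain per subnet after the five that AWS reserves in every subnet, which matters when a fleet launches many instances at once.

```python
import ipaddress

vpc = ipaddress.ip_network("172.16.0.0/16")
subnets = {
    "region-1a": ipaddress.ip_network("172.16.0.0/20"),
    "region-1b": ipaddress.ip_network("172.16.16.0/20"),
    "region-1c": ipaddress.ip_network("172.16.32.0/20"),
    "region-1d": ipaddress.ip_network("172.16.48.0/20"),
}

for name, net in subnets.items():
    # Every subnet must fall inside the VPC range.
    assert net.subnet_of(vpc)
    # AWS reserves 5 addresses in each subnet.
    print(name, net.num_addresses - 5)
```

A /20 leaves room for roughly four thousand instances per zone; smaller subnets can become the limiting factor for a large fleet.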

Grid processing with Amazon EC2 Spot

Common method Batch Processing

Use EC2 Spot fleet to set up a heterogeneous, scalable “grid” of EC2 Spot instances with multiple capacity pools as worker nodes

Scaling to 50,000 cores

EC2 Spot blocks for less flexible jobs that must run continuously.

Multiple capacity pools

How flexible are you?

1. Within family

2. Across Families

3. Across Zones

Review and Launch

{
  "AllocationStrategy": "diversified",
  "TargetCapacity": 1000,
  "SpotPrice": "0.0025",
  "LaunchSpecifications": [
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.4xlarge", "WeightedCapacity": 16, "SpotPrice": "0.0131", "SubnetId": "subnet-d0dc51fb" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.4xlarge", "WeightedCapacity": 16, "SpotPrice": "0.0131", "SubnetId": "subnet-64531413" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.8xlarge", "WeightedCapacity": 32, "SpotPrice": "0.0131", "SubnetId": "subnet-d0dc51fb" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.8xlarge", "WeightedCapacity": 32, "SpotPrice": "0.0131", "SubnetId": "subnet-64531413" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c4.4xlarge", "WeightedCapacity": 16, "SpotPrice": "0.0138", "SubnetId": "subnet-d0dc51fb" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c4.4xlarge", "WeightedCapacity": 16, "SpotPrice": "0.0138", "SubnetId": "subnet-64531413" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c4.8xlarge", "WeightedCapacity": 36, "SpotPrice": "0.0122", "SubnetId": "subnet-d0dc51fb" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c4.8xlarge", "WeightedCapacity": 36, "SpotPrice": "0.0122", "SubnetId": "subnet-64531413" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "cc2.8xlarge", "WeightedCapacity": 32, "SpotPrice": "0.0156", "SubnetId": "subnet-d0dc51fb" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "cc2.8xlarge", "WeightedCapacity": 32, "SpotPrice": "0.0156", "SubnetId": "subnet-64531413" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "d2.4xlarge", "WeightedCapacity": 16, "SpotPrice": "0.0431", "SubnetId": "subnet-d0dc51fb" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "d2.4xlarge", "WeightedCapacity": 16, "SpotPrice": "0.0431", "SubnetId": "subnet-64531413" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "d2.8xlarge", "WeightedCapacity": 36, "SpotPrice": "0.0383", "SubnetId": "subnet-d0dc51fb" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "d2.8xlarge", "WeightedCapacity": 36, "SpotPrice": "0.0383", "SubnetId": "subnet-64531413" }
  ]
}

Results - Grid

Requested 1000 vCores over 30 days

Minimum 960 vCores

Mode 1024 vCores

Average 1012 vCores

Average Price of $0.012 per vCore

Savings of over 80%

Capitalizing on two minute warning

When the Spot price exceeds your bid price, the instance will receive a two-minute warning

Every 5 seconds, check for the two-minute Spot instance termination notice using a script invoked at instance launch.

Sample script – two minutes left!

1) Check for 2 minute warning

2) If YES, run shutdown scripts

3) OTHERWISE, do nothing

4) Then sleep for 5 seconds

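#!/bin/bash
# Poll the instance metadata service for a Spot termination notice.
while true
do
  if curl -s http://169.254.169.254/latest/meta-data/spot/termination-time | grep -q '.*T.*Z'; then
    # Termination notice received: run the shutdown scripts once, then stop polling.
    /env/bin/runterminationscripts.sh
    break
  else
    # Spot instance not yet marked for termination.
    sleep 5
  fi
done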
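The same polling idea can be sketched in Python with a stricter timestamp check and with the fetch and sleep steps passed in as arguments so the loop can be exercised off-instance. It assumes the metadata endpoint is reachable without a session token, as in the shell script; the function names are illustrative.

```python
import re
import time
import urllib.request

TERMINATION_URL = "http://169.254.169.254/latest/meta-data/spot/termination-time"
# Matches a timestamp such as 2015-10-28T12:00:00Z.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z$")

def fetch_termination_time():
    """Return the metadata response body, or '' if no notice is posted."""
    try:
        with urllib.request.urlopen(TERMINATION_URL, timeout=2) as response:
            return response.read().decode().strip()
    except OSError:
        return ""  # 404 or unreachable: not marked for termination

def wait_for_termination(on_terminate, fetch=fetch_termination_time,
                         sleep=time.sleep, interval=5):
    """Poll until a termination timestamp appears, then call on_terminate once."""
    while not TIMESTAMP.match(fetch()):
        sleep(interval)
    on_terminate()
```

On an instance, `wait_for_termination(run_shutdown)` would block until the notice arrives, where `run_shutdown` is whatever cleanup callable the application provides.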

Run continuously for up to 6 hours

Save up to 50% off On-Demand pricing

Don’t forget Blocks!

Using a single additional Parameter

Whodunit?

Core Count

./aws_spot_fleet_request -p reinvent --cpu 8 --ram 64 -m 4.7  -c 1500

Rendering in the Cloud vs. On-Premise

Lower is better

Lessons Learned

• Use as many different instance types as you can. Especially older generations.

• Think about ways to modify your workload

• Use every Availability Zone

• Check your limits, especially your EBS limit and VPC setup (address space)

• Resource-Oriented Bidding

• Diversified Allocation

• Benchmark your workload and set pricing accordingly

• Set ONLY realistic pricing that you will pay for

• Don’t be afraid to ask AWS for help or to pre-plan your run with AWS

Summary

AWS is Spare Capacity at Scale

We Do the Heavy Lifting

You Pocket the Savings!

Getting started

Try the Bid Advisor

Fire up the Spot Console

Block some time

Reference Links

EC2 Spot Documentation:
http://aws.amazon.com/ec2/spot/
http://aws.amazon.com/ec2/spot/bid-advisor/
http://aws.amazon.com/ec2/spot/getting-started/
http://aws.amazon.com/ec2/spot/faqs/
http://aws.amazon.com/ec2/spot/testimonials/

User Guide:
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-spot-instances.html
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-fleet.html

Helpful AWS Blog Posts:
https://aws.amazon.com/blogs/aws/focusing-on-spot-instances-lets-talk-about-best-practices/
https://aws.amazon.com/blogs/aws/building-price-aware-applications-using-ec2-spot-instances/
https://aws.amazon.com/blogs/compute/cost-effective-batch-processing-with-amazon-ec2-spot/
https://aws.amazon.com/blogs/compute/dynamic-scaling-with-ec2-spot-fleet/
