
Windows Azure Compute

Brad Calder
General Manager, Windows Azure

Large-Scale Workload Patterns

“On and Off” (RiskMetrics)
• On and off workloads (e.g. batch job)
• Over-provisioned capacity is wasted

“Unpredictable Bursting” (Twitter)
• Unexpected/unplanned peak in demand
• Sudden spike impacts performance
• Hard/costly to over-provision for extreme cases

“Growing Fast” (Docs.com on Facebook)
• Successful services need to grow/scale
• Keeping up with growth is a big IT challenge
• Can be hard to predict growth

“Predictable Bursting” (Walmart)
• Services with micro seasonality trends
• Peaks due to periodic increased demand
• Wasted capacity off season

[Figure: four charts of compute vs. time, one per pattern, marked with average usage; the “On and Off” chart also marks inactivity periods]

Computing Demand Daily Fluctuation

[Figure: BING SEARCHES – JAPAN VS. UK. Query volume by time of day (12:00 AM to 10:48 PM) for Japan and Great Britain. Source: Microsoft]

Computing Demand Yearly Variability

• ~10x normal load (tax season): turbotax.com, taxcut.com, hrblock.com, taxact.com
• ~4x normal load (holiday shopping): target.com, walmart.com, toysrus.com, barnesandnoble.com

[Figure: two charts of site traffic, Jan 2009 to Jan 2010. Source: Alexa]

What is a “Cloud”?

• Cloud: on-demand, scalable compute and storage resources

[Figure: demand vs. time, comparing self server provisioning (overprovisioned and underprovisioned periods) with cloud provisioning]

What is Under the Covers of a Service

• Business logic
• Service “glue”
• Buy and provision hardware
• Datacenter (power and cooling)
• Respond to hardware failures
• Monitoring and alerting infrastructure
• Reliable/secure storage and computation
• Metering and billing infrastructure
• Live upgrades and OS patches
• Add compute/storage capacity on the fly
• Overprovision for peak traffic
• …

What is Windows Azure?

An operating system for the cloud:

[Figure: Service 1, Service 2, Service 3, … Service N]

Cloud Terminology

• Infrastructure as a Service (IaaS): basic compute and storage resources
  • On-demand servers
  • Amazon EC2, VMWare vCloud, etc.

• Platform as a Service (PaaS): cloud application infrastructure
  • On-demand application-hosting environment
  • Google AppEngine, Salesforce.com, Windows Azure, etc.

• Software as a Service (SaaS): cloud applications
  • On-demand applications
  • Office 365, GMail, etc.

IaaS (Developer/Ops)

1) Choose image, then create VM for DBMS and configure DBMS
2) Choose image, then create and configure VM(s) for application
3) Provision database, then create tables and add data
4) Install application
5) Configure load balancer
6) Manage VMs and DBMS (e.g., deploying new OS images in VMs)

[Figure labels: library of VM images; load balancer; application; data; VMs, each with its own operating system, hosting the web server and the DBMS]

PaaS (Developer/Ops)

1) Provision database, then create tables and add data
2) Deploy application w/ service model
3) Automated service management

[Figure labels: load balancer; application; data; VMs, each with its own operating system, hosting the web server and the DBMS]

Windows Azure

• Windows Azure is an OS for the data center
  • Handles resource management, provisioning, and monitoring
  • Manages application lifecycle
  • Allows developers to concentrate on business logic

• Provides common building blocks for distributed applications
  • Reliable queuing
  • Simple unstructured and structured storage
  • SQL storage
  • Application services like access control, caching, and connectivity

Windows Azure Platform

[Figure: platform stack. Labels, top to bottom as far as the layout can be recovered:]
• Windows Azure Applications
• Windows Azure Compute; Windows Azure Middleware Services; Windows Azure Data Services
• AppFabric Caching; AppFabric Access Control Server; AppFabric Service Bus; SQL Azure; Windows Azure Storage; Windows Azure CDN
• Networking
• Fabric Controller

Fabric Controller

• Owns all the hardware in the data center
• Uses the inventory to host services
• Similar to what a per-machine operating system does with applications
• Provisions the hardware as necessary
• Maintains the health of the hardware
• Deploys applications to free resources
• Maintains the health of those applications

Windows Azure Fabric Controller

[Figure labels: highly-available Fabric Controller; hardware control; software control; WS08 hypervisor; VMs; Fabric Agent; switches; load-balancers]

Scaling with the Fabric Controller Service Model

Scaling

• There are two basic scaling models: scale up and scale out

[Figure: “Scale Up” shows a single larger compute unit; “Scale Out” shows several compute units side by side]

Scaling Lessons

• Use few, well-defined scaling units
• Define scaling boundaries
• Scale out those units as needed

Scale-Out Applications

[Figure labels:]
• Network load balancer
• Stateless front end (scale out)
• Azure Queues
• Stateless “worker” (scale out)
• Already-provided scalable storage:
  • Shared filesystem (Azure Blobs)
  • Partitioned RDBMS (SQL Azure)
  • Key/value datastore (Azure Tables)

The Windows Azure Service Model

• A Windows Azure application is called a “service”
  • Definition information
  • Configuration information
  • At least one “role”

• A role is the scaling boundary within a service
  • Roles are like DLLs in the service “process”
  • Collection of code with an entry point that runs in its own virtual machine

• Virtual machine is the scale unit
  • Role code runs in a virtual machine
  • Role scales by instances of a virtual machine size

[Figure labels: LB; front end; middle tier; durable store]

Multi-Tier Cloud Application

• A cloud application is typically made up of different components
  • Front end: e.g. load-balanced stateless web servers
  • Middle worker tier: e.g. order processing, encoding
  • Backend storage: e.g. Azure Blobs, Azure Tables, SQL Azure
• Multiple instances of each for scalability and availability
  • Requires at least 2 instances of each to achieve the SLA

[Figure labels: HTTP/HTTPS; load balancer; front-end instances; middle-tier instances; Windows Azure Storage, SQL Azure]

Service Model and Role Contents

• Definition:
  • Role name
  • Role type
  • VM size (e.g. small, medium, etc.)
  • Network endpoints

• Configuration:
  • Number of instances
  • Number of update and fault domains

• Code:
  • Web/Worker Role: hosted DLL and other executables
  • VM Role: VHD

Service Model

Role: Front-End
  Definition
    Type: Web
    VM Size: Small
    Endpoints: External-1
  Configuration
    Instances: 2
    Update Domains: 3
    Fault Domains: 2

Role: Middle-Tier
  Definition
    Type: Worker
    VM Size: Large
    Endpoints: Internal-1
  Configuration
    Instances: 3
    Update Domains: 3
    Fault Domains: 2

Service Model Files

• Service definition is in ServiceDefinition.csdef

• Service configuration is in ServiceConfiguration.cscfg

• The CSPack program zips the service binaries and definition into a service package file (service.cspkg)

ServiceDefinition.csdef

<?xml version="1.0" encoding="utf-8"?>
<ServiceDefinition name="Sample"
    xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition"
    upgradeDomainCount="3">
  <WorkerRole name="Middle-Tier" vmsize="Large">
    <Endpoints>
      <InternalEndpoint name="Internal-1" protocol="tcp" />
    </Endpoints>
  </WorkerRole>
  <WebRole name="Front-End" vmsize="Small">
    <Sites>
      <Site name="Web">
        <Bindings>
          <Binding name="Endpoint1" endpointName="External-1" />
        </Bindings>
      </Site>
    </Sites>
    <Endpoints>
      <InputEndpoint name="External-1" protocol="http" port="80" />
    </Endpoints>
    <Imports>
      <Import moduleName="Diagnostics" />
    </Imports>
  </WebRole>
</ServiceDefinition>

ServiceConfiguration.cscfg

<?xml version="1.0" encoding="utf-8"?>
<ServiceConfiguration serviceName="Sample"
    xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration"
    osFamily="2"
    osVersion="*">
  <Role name="Middle-Tier">
    <Instances count="3" />
  </Role>
  <Role name="Front-End">
    <Instances count="2" />
    <ConfigurationSettings>
      <Setting name="Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString"
               value="DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=key" />
    </ConfigurationSettings>
  </Role>
</ServiceConfiguration>
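As a rough illustration of how the configuration file carries the per-role instance counts, the sketch below reads them with Python's standard xml.etree.ElementTree module. The function name instance_counts and the trimmed sample document are assumptions made for this example (the XML declaration and the settings block are left out for brevity); this is not part of any Windows Azure tooling.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the ServiceConfiguration.cscfg sample in the slides.
NS = {"c": "http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration"}

# Trimmed copy of the sample configuration (illustrative only).
SAMPLE_CSCFG = (
    '<ServiceConfiguration serviceName="Sample" '
    'xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration" '
    'osFamily="2" osVersion="*">'
    '<Role name="Middle-Tier"><Instances count="3" /></Role>'
    '<Role name="Front-End"><Instances count="2" /></Role>'
    '</ServiceConfiguration>'
)


def instance_counts(cscfg_text):
    """Return {role name: instance count} for every Role element."""
    root = ET.fromstring(cscfg_text)
    counts = {}
    for role in root.findall("c:Role", NS):
        instances = role.find("c:Instances", NS)
        counts[role.get("name")] = int(instances.get("count"))
    return counts


if __name__ == "__main__":
    for name, count in instance_counts(SAMPLE_CSCFG).items():
        print(name, count)
```

Scaling a role out then amounts to changing a single count value in the configuration; the definition file stays untouched.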

Windows Azure Push-button Deployment

• Step 1: Allocate VMs/nodes, DIPs/VIPs
  • Across fault domains
  • Across update domains

• Step 2: Place role images on nodes

• Step 3: Start roles in VM instances

• Step 4: Configure load-balancers

• Step 5: Maintain desired number of role instances
  • Failed roles automatically restarted
  • Node failure results in new VMs automatically allocated

[Figure labels: allocation across fault and update domains; load-balancers]

FC Automated Management

• Windows Azure FC monitors the health of roles
  • FC detects if a role dies
  • Restarts the role to bring it back to a healthy state

• If a failed node can’t be recovered, FC migrates role instances to a new node
  • A suitable replacement location is found
  • Existing role instances are notified of the configuration change

Availability and Fault/Upgrade Domains

Availability Service Level Agreements (SLA)

• Windows Azure Platform SLAs:
  • Compute External Connectivity: 99.95% (2 or more instances)
  • Storage Availability: 99.9%
  • SQL Azure Availability: 99.9%

Availability              Downtime per year   Downtime per month*   Downtime per week
99% ("two nines")         3.65 days           7.20 hours            1.68 hours
99.9% ("three nines")     8.76 hours          43.2 minutes          10.1 minutes
99.99% ("four nines")     52.56 minutes       4.32 minutes          1.01 minutes
99.999% ("five nines")    5.26 minutes        25.9 seconds          6.05 seconds
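The downtime figures above follow directly from the availability percentage. A minimal sketch of the arithmetic, assuming a 365-day year and a 30-day month (the month length is an assumption that reproduces the table's values; the function and constant names are invented for this example):

```python
# Length of each period in hours; the 30-day month is an assumption.
PERIOD_HOURS = {"year": 365 * 24, "month": 30 * 24, "week": 7 * 24}


def downtime_seconds(availability_pct, period_hours):
    """Allowed downtime in seconds for a given availability over a period."""
    return (1 - availability_pct / 100) * period_hours * 3600


if __name__ == "__main__":
    for pct in (99.0, 99.9, 99.95, 99.99, 99.999):
        parts = [
            f"{downtime_seconds(pct, hours) / 60:.2f} min per {name}"
            for name, hours in PERIOD_HOURS.items()
        ]
        print(f"{pct}%: " + ", ".join(parts))
```

The same function can be applied to the 99.95% compute connectivity SLA, which the table does not list.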

Maintaining Availability: Assume Failure

• Hardware fails
  • 3-5% of servers experience failures annually

• Software fails
  • Inevitable in any evolving, complex system

• Tolerating failure means:
  • Redundancy where possible
  • Need to build in retries and backoff
  • Fast recovery
  • Big red buttons
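One common way to "build in retries and backoff" is exponential backoff with jitter. The sketch below is a generic illustration, not the mechanism used inside Windows Azure; the function name call_with_retries and all default values (five attempts, 0.5 s base delay, 8 s cap) are assumptions chosen for the example.

```python
import random
import time


def call_with_retries(operation, max_attempts=5, base_delay=0.5,
                      max_delay=8.0, sleep=time.sleep):
    """Call operation(); on failure wait with exponential backoff, then retry.

    The last failure is re-raised once max_attempts is exhausted.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise
            # Delay doubles on each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Random jitter (up to +50%) keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```

The sleep parameter is injectable so that the backoff schedule can be exercised in tests without actually waiting.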

Hardware Redundancy

[Figure: datacenter topology. Labels: datacenter routers; aggregation routers and load balancers (Agg, LB); racks; top-of-rack switches (TOR); power distribution units (PDU); nodes]

• Top-of-rack switch is a single point of failure

Maintaining Availability: Fault Domains

• Avoid single points of failure
  • Unit of failure based on data center topology is a rack
  • E.g. top-of-rack switch on a rack of machines

• Windows Azure considers fault domains when allocating service roles
  • At least 2 fault domains per service
  • Will try to spread roles out across more

[Figure: Front-End-1, Front-End-2, Middle Tier-1, Middle Tier-2, and Middle Tier-3 spread across Fault Domain 1, Fault Domain 2, and Fault Domain 3]

Update Domains

• Update domains specify what percentage of your service will be taken offline for an upgrade
  • Specify the number of update domains for your service
  • Default is 5 and max is 20

• Role instances are assigned evenly across update domains

• Used to update only one domain at a time
  • Rolling update

[Figure labels: upgrade domains allocated across fault domains; fault domains]

Service Deployment and Maintenance

Containing Failure: Datacenter Clusters

• Datacenters are divided into “clusters”
  • Approximately 1000 rack-mounted servers (we call them “nodes”)
  • Provides a unit of fault isolation

• Each cluster is managed by a Fabric Controller (FC)

[Figure: Cluster 1, Cluster 2, … Cluster n, each with its own FC, connected by the datacenter network]

The Fabric Controller (FC)

• The “kernel” of the cloud operating system
  • Manages datacenter hardware
  • Manages Windows Azure services

• Four main responsibilities:
  • Datacenter resource allocation
  • Datacenter resource provisioning
  • Service lifecycle management
  • Service health management

• Inputs:
  • Description of the hardware and network resources it will control
  • Service model and binaries for cloud applications

Service Deployment Steps

• Process service model files
  • Determine resource requirements
  • Create role images

• Allocate compute and network resources

• Prepare nodes
  • Place role images on nodes
  • Create virtual machines
  • Start virtual machines and roles

• Configure networking
  • Dynamic IP addresses (DIPs) assigned to nodes
  • Virtual IP addresses (VIPs) + ports allocated and mapped to sets of DIPs
  • Configure packet filter for VM to VM traffic within service
  • Program load balancers to allow traffic to external endpoints

Service Resource Allocation

• Goal: allocate service components to available resources while satisfying all hard constraints
  • Size of VM
  • HW requirements: CPU, memory, storage, network
  • Upgrade domains
  • Fault domains

• Secondary goal: satisfy soft constraints
  • Optimize network proximity: pack different roles into same node
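To make the fault-domain and upgrade-domain constraints concrete, here is a deliberately simplified sketch that spreads role instances round-robin across both kinds of domain. The real Fabric Controller allocator is not described in the slides; the function place_instances, the round-robin rule, and the omission of VM size, hardware requirements, and the network-proximity soft constraint are all assumptions made for illustration.

```python
def place_instances(role, count, fault_domains, update_domains):
    """Assign each instance of a role a fault domain and an update domain.

    Round-robin placement (an assumption, not the FC's actual algorithm)
    ensures that no single domain holds every instance whenever count >= 2
    and there are at least 2 domains of each kind.
    """
    placements = []
    for i in range(count):
        placements.append({
            "instance": f"{role}-{i + 1}",
            "fault_domain": i % fault_domains + 1,
            "update_domain": i % update_domains + 1,
        })
    return placements


if __name__ == "__main__":
    # Counts and domain numbers taken from the sample service model.
    for role, count in (("Front-End", 2), ("Middle-Tier", 3)):
        for p in place_instances(role, count, fault_domains=2, update_domains=3):
            print(p)
```

With these inputs, losing any one fault domain still leaves at least one instance of each role running, which is the property the hard constraints are meant to protect.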

Deploying a Service

• Role A: Front-End Role
  • Count: 2
  • Update Domains: 3
  • Size: Medium

• Role B: Middle-Tier Role
  • Count: 3
  • Update Domains: 3
  • Size: Large

[Figure labels: www.mycloudapp.net; load balancer; 10.100.0.36; 10.100.0.122; fault domain; upgrade domain]

Inside a Deployed Node

[Figure labels:]
• Fabric Controller (primary) and Fabric Controller replicas
• Physical node
  • Host partition with FC host agent
  • Four guest partitions, each with a guest agent and a role instance
  • Trust boundary
• Image repository (OS VHDs, role ZIP files)

Detection: Load Balancer Operation

• FC programs load balancers (LB) to “probe” the guest agent (GA) every 15 seconds
  • If the guest misses two probes, the LB stops forwarding traffic
  • The role can report “busy” status to the GA
  • GA stops responding to probes

• LB keeps an idle connection open for 60s
  • Use keep-alive commands if the connection needs to be open longer
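The probe rule above can be sketched as a small state machine. The class and method names below are invented for this example, and the rule that a single successful probe puts the instance back into rotation is an assumption; the slides only state when forwarding stops.

```python
PROBE_INTERVAL_SECONDS = 15  # from the slide: GA is probed every 15 seconds
MAX_MISSED_PROBES = 2        # from the slide: two misses stop traffic


class ProbedInstance:
    """Tracks probe results for one role instance behind the load balancer."""

    def __init__(self, name):
        self.name = name
        self.missed = 0
        self.in_rotation = True

    def record_probe(self, responded):
        """Update rotation state after one probe; returns in_rotation."""
        if responded:
            # Assumption: one good probe is enough to resume forwarding.
            self.missed = 0
            self.in_rotation = True
        else:
            self.missed += 1
            if self.missed >= MAX_MISSED_PROBES:
                self.in_rotation = False
        return self.in_rotation


if __name__ == "__main__":
    instance = ProbedInstance("Front-End-1")
    for tick, responded in enumerate([True, False, False, True]):
        state = instance.record_probe(responded)
        print(f"t={tick * PROBE_INTERVAL_SECONDS}s responded={responded} in_rotation={state}")
```

Under this model, a role that reports “busy” is taken out of rotation in the same way as a crashed one, because the guest agent simply stops answering probes.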

Recovery: Server and Role Health

• FC maintains service availability by monitoring the software and hardware health
  • Based primarily on heartbeats
  • Automatically “heals” affected roles

Problem: Role instance crashes
  Fabric detection: FC guest agent monitors role termination
  Fabric response: FC restarts role

Problem: Guest VM or agent crashes
  Fabric detection: FC host agent notices missing guest agent heartbeats
  Fabric response: FC restarts VM and hosted role

Problem: Host OS or agent crashes
  Fabric detection: FC notices missing host agent heartbeat
  Fabric response: Tries to recover node; FC reallocates roles to other nodes

Problem: Detected node hardware issue
  Fabric detection: Host agent informs FC
  Fabric response: FC migrates roles to other nodes; marks node “out for repair”

Updating Your Service

• There are two update types:
  • In-place: used for large-scale services and to update services with local state
  • VIP swap: for ease of testing and fail-back for smaller services

• In-place (rolling) update:
  • Role instances updated one update domain at a time
  • Two modes: automatic and manual

In-Place Update

• Purpose: ensure service stays up while updating
  • Used by Windows Azure OS updates

• System considers update domains when upgrading a service
  • 1 / (number of update domains) = fraction of the service that will be offline
  • Default is 5 and max is 20

• The Windows Azure SLA is based on at least two update domains and two role instances of each role

[Figure: Front-End-1, Front-End-2, Middle Tier-1, Middle Tier-2, and Middle Tier-3 spread across Update Domain 1, Update Domain 2, and Update Domain 3]
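The rolling behavior can be sketched as a loop over update domains, taking down only the instances in one domain at each step. This is an illustrative model, not the Fabric Controller's implementation; the function rolling_update, the apply_update callback, and the instance-to-domain assignment in the usage example are assumptions.

```python
def rolling_update(instances, update_domain_count, apply_update):
    """Update instances one update domain at a time.

    instances: list of (instance_name, update_domain) pairs, domains numbered from 1.
    apply_update: callable invoked once per instance (hypothetical hook).
    Returns a list of (domain, updated_names, fraction_offline) steps.
    """
    steps = []
    for domain in range(1, update_domain_count + 1):
        batch = [name for name, ud in instances if ud == domain]
        for name in batch:
            apply_update(name)
        # Only this domain's instances are offline during the step.
        steps.append((domain, batch, len(batch) / len(instances)))
    return steps


if __name__ == "__main__":
    # Hypothetical assignment of five instances to three update domains.
    service = [
        ("Front-End-1", 1), ("Front-End-2", 2),
        ("Middle-Tier-1", 1), ("Middle-Tier-2", 2), ("Middle-Tier-3", 3),
    ]
    for domain, batch, fraction in rolling_update(service, 3, lambda name: None):
        print(f"update domain {domain}: {batch} ({fraction:.0%} of instances offline)")
```

When instances divide evenly across domains, each step takes roughly 1 / (number of update domains) of the service offline, which matches the rule stated on the slide; an uneven split, as in this example, makes some steps larger than others.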

Windows Azure Compute Summary

• Platform as a Service is all about reducing management and operations overhead

• The Windows Azure Fabric Controller is the foundation for Windows Azure’s PaaS
  • Provisions machines
  • Deploys services
  • Configures hardware for services
  • Monitors service and hardware health

