Windows Azure Best Practices - Dmitry Martynov

Windows Azure best practices by Microsoft PSA Dmitry Martynov - on CloudCamp Moscow

Windows Azure Best Practices of Scalability and Availability

Dmitri Martynov

Microsoft Corporation

Powered by Windows Azure Training Kit

Agenda

As the load increases, are you still available?

If the platform fails, are you still available?

Thinking Globally

If the compute is closer to the user, what about the dependencies?

Why do services fail?

Increased workload

Failure: hardware, network, platform service, transient conditions

Human

Upgrades

What do we mean by scalable?

More Resources

Redundancy

Fault-Tolerance

What do we mean by available?

Same functionality

Degraded functionality

Failsafe

As the load increases, are you still available?

It is better to have 50 x 1 GB databases than 1 x 50 GB database

What is wrong with this?

Scale me out too

Everything needs to scale

What about this?

As the load increases, are you still available?

Scale everything OUT:
Partition data (for size AND performance)
Split work (queues, async pattern)

Do less work:
Externalize cache to Azure Cache or memcached
Do not depend on VM instance state (memory, disks)

Feedback:
Enable Windows Azure Diagnostics*
Set up external monitoring

*May increase the problem, so scale that too

SQL Azure Federations, a.k.a. sharding or horizontal partitioning

demo

Asynchronous Design Pattern

Each thread picks up work whenever it is ready

A thread handling one request may handle another before the first one completes

[Diagram: Web App Front End. Two threads interleave Client Request #1 and Client Request #2, "The Work" #1 and #2, and Response #1 and Response #2 back to the clients.]

This approach scales well

Client requests tracked explicitly in app's data structures

Threads never block while there is work to be done

Each thread can handle possibly many concurrent requests

But bookkeeping & synchronization can be difficult…
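A minimal sketch of the interleaving described above, using Python's `asyncio` event loop (a single thread) as a stand-in for the thread in the diagram. The function names and delays are invented for illustration; the point is only that request #2 can finish while request #1 is still waiting on its work.

```python
import asyncio


async def do_the_work(request_id, delay):
    """Stand-in for "The Work": awaiting hands the thread to other requests."""
    await asyncio.sleep(delay)
    return f"response #{request_id}"


async def handle_requests():
    events = []

    async def handle(request_id, delay):
        events.append(f"start #{request_id}")
        response = await do_the_work(request_id, delay)
        events.append(f"finish #{request_id}")
        return response

    # Request #1 is slower than request #2; both share one thread.
    responses = await asyncio.gather(handle(1, 0.2), handle(2, 0.05))
    return responses, events


if __name__ == "__main__":
    responses, events = asyncio.run(handle_requests())
    print(events)     # request #2 finishes before request #1
    print(responses)  # gather still returns results in call order
```

Here the event loop does the bookkeeping that the slide warns about; hand-rolled state machines need the same tracking done explicitly.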

Partitioning & Sharding

[Diagram: hosted compute working against three Windows Azure storage accounts, each with its own partitioned table and queue.]

Trick: Partition & shard at the data tier
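One way to route requests to a shard at the data tier, as a hedged sketch. Hash-based routing is swapped in here for brevity and is not presented as how SQL Azure Federations distributes rows; `SHARD_COUNT` and the `customers_NN` naming scheme are hypothetical.

```python
import hashlib

SHARD_COUNT = 50  # hypothetical, echoing the "50 x 1 GB" example


def shard_for(key, shard_count=SHARD_COUNT):
    """Map a partition key to a shard index with a stable hash.

    hashlib is used instead of the built-in hash(), which is randomised
    per process for strings and would route differently on each instance.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def database_name(key):
    """Hypothetical naming scheme: one database per shard."""
    return f"customers_{shard_for(key):02d}"
```

Plain modulo routing moves most keys when the shard count changes, so a real design would plan for resharding (for example with range maps or consistent hashing).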

SQL Azure single DB limits

150 GB capacity, will expand over time

Overuse of more than one node’s worth of resources may result in throttling

http://social.technet.microsoft.com/wiki/contents/articles/sql-azure-connection-management.aspx

Windows Azure Storage scalability targets

Each account supports 100 TB capacity, 5K transactions/sec, 3 Gbps bandwidth

500 messages/sec per queue

500 entities/sec per table partition (multiple partitions permitted per table)

60 MB/sec per blob
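The targets above turn into simple capacity arithmetic. The sketch below uses only the per-queue and per-account figures quoted on the slide; `transactions_per_message` is an assumption about the workload (for instance put, get, and delete each counting as one transaction), not a figure from the talk.

```python
import math

# Targets quoted on the slide above.
QUEUE_MESSAGES_PER_SEC = 500
ACCOUNT_TRANSACTIONS_PER_SEC = 5000


def queues_needed(target_messages_per_sec):
    """Queues required so that no single queue exceeds its target."""
    return math.ceil(target_messages_per_sec / QUEUE_MESSAGES_PER_SEC)


def accounts_needed(target_transactions_per_sec):
    """Storage accounts required so that no account exceeds its target."""
    return math.ceil(target_transactions_per_sec / ACCOUNT_TRANSACTIONS_PER_SEC)


def plan(target_messages_per_sec, transactions_per_message=1):
    """Return (queues, storage accounts) for a target message rate."""
    queues = queues_needed(target_messages_per_sec)
    accounts = accounts_needed(
        target_messages_per_sec * transactions_per_message
    )
    return queues, accounts
```

For example, `plan(12000, transactions_per_message=3)` gives 24 queues spread over 8 storage accounts.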

Shared Access Signatures

Blob Storage

Also works for write access (e.g. user-generated content)
http://blog.smarx.com/posts/shared-access-signatures-are-easy-these-days

Non-public blob (e.g. paid or ad-funded content): no direct anonymous access

Trick: Shared access signatures provide direct access to ACLed content

Can be time-bound or revoked on demand
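The idea behind a time-bound signature can be sketched with a generic HMAC-signed URL. This is an illustration of the concept only: the query parameter names, the string that gets signed, and `SECRET_KEY` are all invented here and do not match the actual Windows Azure shared access signature format.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

SECRET_KEY = b"demo-account-key"  # hypothetical; never hard-code real keys


def _signature(path, permissions, expires_at):
    message = f"{path}\n{permissions}\n{expires_at}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def sign_url(path, permissions, ttl_seconds, now=None):
    """Return the path plus a query string granting time-bound access."""
    now = time.time() if now is None else now
    expires_at = int(now + ttl_seconds)
    query = urlencode({
        "perm": permissions,
        "exp": expires_at,
        "sig": _signature(path, permissions, expires_at),
    })
    return f"{path}?{query}"


def is_allowed(url, wanted_permission, now=None):
    """Check the signature, the expiry, and the granted permission."""
    now = time.time() if now is None else now
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        permissions = params["perm"][0]
        expires_at = int(params["exp"][0])
        supplied = params["sig"][0]
    except (KeyError, ValueError):
        return False
    expected = _signature(parsed.path, permissions, expires_at)
    if not hmac.compare_digest(supplied, expected):
        return False
    return now < expires_at and wanted_permission in permissions
```

The sketch covers only the time-bound part; revoking on demand needs server-side state that this example leaves out.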

If the platform fails, are you still available?

Basics – what you get for free

Elasticity: Easily deploy compute resources and scale up and down

Automated Service Management: Windows Azure will (automatically) recover bad nodes

Fault Domains: Windows Azure deploys services across fault boundaries

Storage Resilience: 3 copies of storage maintained

Fault Tolerance: When Windows Azure breaks, it fixes itself! Can your service?

Codifying Operations

Upgrade Domains

Configure in ServiceDefinition.csdef:

<ServiceDefinition name="RedDir"
    xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition"
    upgradeDomainCount="3">
  <!-- role definitions omitted -->
</ServiceDefinition>
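What an upgrade domain count of 3 buys can be shown with a small simulation. The round-robin assignment of instances to domains is an assumption made for the example, not a statement of how the platform allocates them, and the instance names are hypothetical.

```python
def assign_upgrade_domains(instance_names, upgrade_domain_count):
    """Spread instances over domains round-robin (an assumed allocation)."""
    domains = [[] for _ in range(upgrade_domain_count)]
    for index, name in enumerate(instance_names):
        domains[index % upgrade_domain_count].append(name)
    return domains


def rolling_upgrade(instance_names, upgrade_domain_count):
    """Yield (offline, online) instance lists for each upgrade step."""
    domains = assign_upgrade_domains(instance_names, upgrade_domain_count)
    for offline in domains:
        online = [n for n in instance_names if n not in offline]
        yield offline, online


if __name__ == "__main__":
    instances = [f"web_{i}" for i in range(6)]
    for offline, online in rolling_upgrade(instances, 3):
        print("upgrading", offline, "serving", online)
```

With six instances and three domains, each step takes two instances offline and leaves four serving traffic.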

Transient Datacenter Conditions

Do you have retry logic?

What do you mean, retry logic?

Transient conditions in the datacenter/network/service

Example: SQL Azure Error 40501: "The service is currently busy. Retry the request after 10 seconds."

Transient Fault Handling Framework
http://windowsazurecat.com/2011/02/transient-fault-handling-framework/

Retry against anything that might be external and have transient conditions:
SQL Azure
Windows Azure Storage
Service Bus
3rd-party services
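Retry logic of this kind can be sketched as a small wrapper with exponential backoff and jitter. This is a generic illustration, not the Transient Fault Handling Framework's API: `TransientError` is a hypothetical marker exception, and the delay values are arbitrary defaults.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for faults worth retrying (e.g. "service busy")."""


def retry(operation, max_attempts=5, base_delay=0.5, max_delay=10.0,
          sleep=time.sleep):
    """Call operation(), retrying transient faults with exponential backoff.

    Non-transient exceptions propagate immediately; the last transient
    error is re-raised once max_attempts is exhausted.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter keeps many instances from retrying in lockstep.
            sleep(delay * random.uniform(0.5, 1.0))
```

Retrying is only safe for operations that are idempotent or that the caller can otherwise repeat without side effects; when a service states how long to wait, as in the error message above, that wait is a sensible floor for the delay.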

How do you upgrade your service?

Upgrade Strategies: VIP Swap

Upgrade Strategies: Upgrade

Upgrade Strategies: New Service & Swap DNS

Thinking Globally

Network latency:
Put compute closer to user.
Put data closer to user.

Global availability:
Datacenter outages.
Synchronizing data.

Network Latency

Serve Blobs from the Edge

24 global locations with 99.95% availability

CDN now works for web apps, not just for public blobs

[Diagram: the user reaches the closest CDN point of presence in few hops, versus possibly many hops or poor links to blob storage.]

Windows Azure Traffic Manager

Direct users to the service in the closest region with the Windows Azure Traffic Manager

[Diagram: Traffic Manager applies policies and monitoring, then answers the lookup for foo.cloudapp.net with a DNS response such as 1.2.3.4.]

demo

SQL Azure Data Sync and Windows Azure Traffic Manager

[Diagram: applications in the US, Europe, and Asia, each paired with its own SQL Azure database, with Sync links between the databases.]

Traffic Manager

Distribute traffic between Azure-hosted applications

DNS-based

Distribution Options:

Performance

Round Robin

Failover
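The three distribution options can be modelled in a few lines. This is a simplified sketch of the policies as named on the slide, not Traffic Manager's implementation; the `Endpoint` type, its `latency_ms` table, and the region names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    name: str
    healthy: bool
    latency_ms: dict  # hypothetical: measured latency per client region


def pick_performance(endpoints, client_region):
    """Performance: healthy endpoint with the lowest latency for the client."""
    healthy = [e for e in endpoints if e.healthy]
    if not healthy:
        raise RuntimeError("no healthy endpoint")
    return min(healthy, key=lambda e: e.latency_ms[client_region]).name


def pick_failover(endpoints):
    """Failover: first healthy endpoint in priority (list) order."""
    for endpoint in endpoints:
        if endpoint.healthy:
            return endpoint.name
    raise RuntimeError("no healthy endpoint")


def round_robin(endpoints):
    """Round robin: rotate over whichever endpoints are currently healthy."""
    position = 0
    while True:
        healthy = [e for e in endpoints if e.healthy]
        if not healthy:
            raise RuntimeError("no healthy endpoint")
        yield healthy[position % len(healthy)].name
        position += 1
```

Since the routing is DNS-based, a real client keeps using a resolved address until its cached DNS record expires, which this model ignores.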

Synchronizing Data

demo

Site Failover

If a site-specific dependency is down, fail over to another site

Easy: Use Traffic Manager

Hard: Code your own

Site Failover

demo

Summary

Windows Azure gives you high availability capabilities for free

Think about scaling out

Handle transient conditions

Codify operations:
Automate redeployments etc.

Use global features for maximum availability & reach:
Windows Azure Traffic Manager
SQL Data Sync

© 2011 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
