Outline
Scalability
• Achieving linear scale
• Scale Up vs. Scale Out in Windows Azure
• Choosing VM Sizes
Caching
• Approaches to caching
• Cache storage
Elasticity
• Scale out, scale back
• Automation of scaling
A Primer on Scale
Scalability is the ability to add capacity to a computing system to allow it to process more work
A Primer On Scalability
Vertical Scale Up
• Add more resources to a single computation unit, i.e. buy a bigger box
• Move a workload to a computation unit with more resources, e.g. Windows Azure Storage moving a partition
Horizontal Scale Out
• Add additional computation units and have them act in concert
• Split the workload across multiple computation units
Vertical vs. Horizontal
For small scenarios, scale up is cheaper
• Code 'just works'
For larger scenarios, scale out is the only solution
• Massive diseconomies of scale: 1 x 64-way server costs far more than 64 x 1-way servers
• Shared resource contention becomes a problem
Scale out offers the promise of linear, infinite scale
[Chart: throughput (y-axis) against computation units (x-axis)]
Roughly linear scale, i.e. the additional throughput achieved by each additional unit remains constant
Non-linear scale, i.e. the additional throughput achieved by each additional unit decreases as more are added
Scalability != Performance
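Often you will sacrifice raw speed for scalability
For example, ASP.NET session state:
• In-process ASP.NET session state: fastest per request, but the state lives on a single instance
• SQL Server ASP.NET session state: slower per request, but any instance can serve any user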
Achieving Linear Scale Out
Reduce or eliminate shared resources
• Minimize reliance on transactions or transactional-type behaviour
• Homogeneous, stateless computation nodes
We can then use simple work distribution methods
• Load balancers, queue distribution
• Less reliance on expensive hardware H/A
Units of Scale
[Diagram: separate roles: Clean Up Role, WCF Role, Web Site Role, Cache Build Role]
Create as many roles as you need 'knobs' to adjust scale
Loss of an instance results in 50% capacity loss in web site.
[Diagram: consolidated roles: Queue Driven Role, Web Driven Role]
Loss of an instance results in just 25% capacity loss in web site.
Consolidation of Roles provides more redundancy for same cost
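The 50% and 25% figures are simple arithmetic: with n identical instances sharing the load evenly, losing one instance removes 1/n of the capacity. A minimal sketch in Python, assuming evenly balanced, identical instances (the function name is illustrative, not part of any Azure API):

```python
def capacity_loss_percent(instance_count: int) -> float:
    """Share of capacity lost when one of `instance_count` identical instances fails."""
    if instance_count < 1:
        raise ValueError("need at least one instance")
    return 100.0 / instance_count


# Two dedicated web instances vs. four consolidated instances that all serve web traffic.
for n in (2, 4):
    print(f"{n} instances: {capacity_loss_percent(n):.0f}% of capacity lost per failed instance")
```

The same instance budget spread over fewer, more general roles leaves more instances behind each workload, which is where the extra redundancy comes from.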
VM Size in Windows Azure
Windows Azure supports various VM sizes
• ~800 Mb/s NIC shared across machine
• Set in Service Definition (*.csdef)
• All instances of a role will be equi-sized
<WorkerRole name="myRole" vmsize="ExtraLarge">
Size         CPU Cores  Network    RAM     Local Storage  Cost
Small        1          Shared     1.7GB   250GB          1 x
Medium       2          Shared     3.5GB   500GB          2 x
Large        4          Shared     7GB     1000GB         4 x
Extra large  8          Dedicated  15GB    2000GB         8 x
Remember:
If it doesn't run faster on multiple cores on your desktop, it's not going to run faster on multiple cores in the cloud!
Choosing Your VM Size
Don't just throw big VMs at every problem
Scale out architectures have natural parallelism
Test various configurations under load
Some scenarios will benefit from more cores
• Where the cost of moving data exceeds the parallel overhead, e.g. video processing
• Stateful services
• Database server requiring full network bandwidth
Caching
Caching can improve both performance and scalability
• Moving data closer to the consumer (Web/Worker) improves performance
• Reducing load on the hard-to-scale data tier
Caching is the easiest way to add performance and scalability to your application
In Windows Azure: Caching Will Save You Money!
Caching Scenario: Website UI Images
Website UI images
• Largely static data
• Included in every page
Goal: A Better UI
• Serve content once
• Avoid round trip unless content changes
• Minimise traffic over the wire
• Fewer storage transactions
• Lower load on web roles
Caching Scenario: RSS Feeds
Regular RSS feed
• Data delivered from database/storage
• Large content payload (>1 MB)
• Data changes irregularly
• Cost determined by client voracity
Goal: A Better RSS Feed
• Minimise traffic over the wire
• Fewer storage transactions
• Fewer hits on database
Client Caching - ETags
ETag == soft caching
• Header added on HTTP response: ETag: "ABCDEFG"
• Client does conditional HTTP GET: If-None-Match: "ABCDEFG"
• Server returns content only if the ETag no longer matches
Implemented natively by Windows Azure Storage
• Supports client side caching
• Also used for optimistic concurrency control
Client Caching - ETags
Benefits
• Prevents client downloading unnecessary data
• Out of the box support for simple 'static content' scenarios
Problems
• Still requires round trip to server
• May require execution of server side code to re-create ETag before checking
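string etag = Request.Headers["If-None-Match"];
// If the client's ETag still matches, skip the body and let it reuse its cached copy.
if (String.Compare(etag, GetLastBlogPostIDAzTable(), StringComparison.Ordinal) == 0) {
    // 304 Not Modified is the correct reply to a matching If-None-Match on a GET;
    // 412 Precondition Failed applies only to methods other than GET/HEAD.
    Response.StatusCode = 304;
    return;
}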
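The whole conditional GET exchange can be sketched without any web framework. The Python below is a simplified illustration, not the ASP.NET or Windows Azure Storage implementation: `conditional_get` is a hypothetical helper, and it ignores `If-None-Match: *`, weak validators and comma-separated ETag lists.

```python
from typing import Dict, Optional, Tuple


def conditional_get(request_headers: Dict[str, str],
                    current_etag: str,
                    body: bytes) -> Tuple[int, Dict[str, str], Optional[bytes]]:
    """Return (status, response headers, body) for a GET that may carry If-None-Match."""
    response_headers = {"ETag": current_etag}
    client_etag = request_headers.get("If-None-Match")
    if client_etag is not None and client_etag == current_etag:
        # Client copy is current: 304 Not Modified, no body on the wire.
        return 304, response_headers, None
    # First request, or the content changed: send the full body with the new ETag.
    return 200, response_headers, body


status, headers, payload = conditional_get({"If-None-Match": '"ABCDEFG"'}, '"ABCDEFG"', b"<rss/>")
print(status, payload)
```

Note that the server still has to compute `current_etag` on every request, which is the round-trip cost listed under Problems.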
Client Caching – Cache-Control
Cache-Control: max-age == hard caching
• Header added on HTTP response: Cache-Control: max-age=2592000
• Client may cache file without further request for 30 days
• Client will not re-check on every request
Very useful for static files, e.g. header_logo.png
Used to determine TTL on CDN edge nodes
Set this on a Blob using x-ms-blob-cache-control
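The max-age value is a lifetime in seconds, so 30 days works out to 30 x 24 x 3600 = 2,592,000. A small Python sketch of that conversion (the helper name is illustrative):

```python
from datetime import timedelta


def cache_control_max_age(ttl: timedelta) -> str:
    """Build a Cache-Control header value from a time-to-live."""
    return f"max-age={int(ttl.total_seconds())}"


print(cache_control_max_age(timedelta(days=30)))  # max-age=2592000
```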
Client Caching – Cache-Control
Benefits
• Prevents unnecessary HTTP requests
• Prevents unnecessary downloads
Problems
• What if files do change in the 30 days?
<img src="http://*.blob.*/Container/header_logo.png?random=<rnd>" />
<img src="http://*.blob.*/Containerv1.0/header_logo.png" />
<img src="http://*.blob.*/Containerv2.0/header_logo.png" />
<img src="http://*.blob.*/Container/header_logo.png?snapshot=<DT1>" />
<img src="http://*.blob.*/Container/header_logo.png?snapshot=<DT2>" />
Windows Azure technique: put static files in Blob storage and use Cache-Control + URL flipping
• Simple randomization == simple but no versioning
• Container level flipping == simple but more expensive
• Snapshot level flipping == more complex but lower cost
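URL flipping means the page points at a new URL whenever the file changes, so the long max-age never serves a stale copy. A minimal Python sketch of container-level and snapshot-level flipping follows; the base URL, version string and snapshot timestamp are made-up values for illustration, and only the `snapshot` query parameter comes from the URLs shown above.

```python
from urllib.parse import urlencode

BASE = "http://example.blob.example.net"  # hypothetical storage endpoint


def container_flip_url(base: str, container: str, version: str, blob_name: str) -> str:
    """Container-level flipping: each release gets its own container (a full copy of the files)."""
    return f"{base}/{container}v{version}/{blob_name}"


def snapshot_flip_url(base: str, container: str, blob_name: str, snapshot: str) -> str:
    """Snapshot-level flipping: one blob, addressed by the timestamp of a specific snapshot."""
    return f"{base}/{container}/{blob_name}?{urlencode({'snapshot': snapshot})}"


print(container_flip_url(BASE, "Container", "2.0", "header_logo.png"))
print(snapshot_flip_url(BASE, "Container", "header_logo.png", "2010-06-01T12:00:00.0000000Z"))
```

Container flipping stores every file again per version, while snapshots share one blob name, which matches the cost trade-off in the list above.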
Static Content Generation
Generate content periodically in a Worker Role
• Can spin up workers just for generation
• Generate as triggered async operation
Content may be
• Full pages
• Resources (CSS sprites, PDF/XPS, images etc.)
• Content fragments
Push static content into Blob storage
• Serve direct out of Blob storage
• May also be able to use persistent local storage
Static Content Generation
Benefits
• Reduce load on web roles
• Potentially reduce load on data tier
• Response times improved
• Can combine with Cache-Control and ETags
Problems
• Need to deal with stale data: manage/refresh it, or ignore it
A Better RSS Feed?
Build standard RSS feed in Web Role
• Generate content dynamically from storage
• Serialize as RSS using Feed Formatters
• Place on obfuscated (hidden) URL
Build a worker role to poll hidden RSS feed
• Retrieve RSS content at certain intervals or on event
• Push content into a Blob if changed
Serve RSS to users from Blob storage
• Take advantage of ETags
• Zero load on database or RSS tables to serve content
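The "push content into a Blob if changed" step can be sketched as a content-hash comparison. In this Python sketch two plain dictionaries stand in for Blob storage and for the worker's record of what it last uploaded; a real worker would call the storage API instead, and every name here is hypothetical.

```python
import hashlib
from typing import Dict


def push_if_changed(blob_store: Dict[str, bytes],
                    last_digests: Dict[str, str],
                    name: str,
                    content: bytes) -> bool:
    """Upload `content` only when it differs from the last upload; return True if uploaded."""
    digest = hashlib.sha256(content).hexdigest()
    if last_digests.get(name) == digest:
        return False  # unchanged: no storage transaction, readers keep their cached copy
    blob_store[name] = content
    last_digests[name] = digest
    return True


store: Dict[str, bytes] = {}
digests: Dict[str, str] = {}
print(push_if_changed(store, digests, "feed.rss", b"<rss>v1</rss>"))  # True
print(push_if_changed(store, digests, "feed.rss", b"<rss>v1</rss>"))  # False
```

Skipping unchanged uploads also keeps the blob's ETag stable, so clients polling with If-None-Match keep getting short "not modified" replies.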
BLOBs vs. Compute Instances
BLOB Storage
• Disk based
• 15c/GB/month
• 1c/10,000 requests
Compute Instances
• RAM and disk based
• 12c/hr, per 1GB RAM, per 250GB disk
Dedicated compute cache roles must serve at least 120,000 cache requests per hour to be cheaper than Windows Azure storage
Outside USA and Europe: use CDN for caching due to much lower bandwidth costs
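The 120,000 requests per hour figure follows from the prices quoted above: an instance at 12c/hr costs the same as 12 x 10,000 storage requests at 1c per 10,000. A short Python sketch of that break-even, which deliberately ignores per-GB storage and bandwidth charges (the function name is illustrative):

```python
def break_even_requests_per_hour(instance_cents_per_hour: float,
                                 storage_cents_per_10k_requests: float) -> float:
    """Requests per hour at which a dedicated cache instance costs the same as storage requests."""
    return instance_cents_per_hour / storage_cents_per_10k_requests * 10_000


print(break_even_requests_per_hour(12, 1))  # 120000.0
```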
Elastic Cloud Workflow Patterns
"On and Off"
• On & off workloads (e.g. batch job)
• Over provisioned capacity is wasted
• Time to market can be cumbersome
[Chart: compute vs. time, showing usage, average and inactivity period]
"Unpredictable Bursting"
• Unexpected/unplanned peak in demand
• Sudden spike impacts performance
• Can't over provision for extreme cases
[Chart: compute vs. time, showing average usage]
"Growing Fast"
• Successful services need to grow/scale
• Keeping up with growth is a big IT challenge
• Cannot provision hardware fast enough
[Chart: compute vs. time, showing average usage]
"Predictable Bursting"
• Services with micro seasonality trends
• Peaks due to periodic increased demand
• IT complexity and wasted capacity
[Chart: compute vs. time, showing average usage]
Dealing with Variable Load
Dealing with variable load takes two forms:
1. Maintaining excess capacity or headroom
• Costs: paying for unused capacity
• Faster availability
• Async work pattern can provide buffer
2. Adding/removing additional capacity
• Takes time to spin up
• Requires management, human or automated
• Pre-emptive or metric driven
Head Room in Windows Azure
Web Roles
• Run additional web roles
• Handle additional load before performance degrades
Worker Roles
• If possible just buffer into queues
• Will be driven by tolerable level of latency
• Start additional roles only if queues not clearing
• Use generic workers to pool resources
Head Room in Windows Azure Services
Windows Azure Storage
• Storage nodes serve many partitions
• Partition served by a single storage node
• Fabric can move a partition to a different storage node
• Opaque to the Windows Azure customer
SQL Azure
• Non-deterministic throttle gives little indication
• Run extra instances, which requires DB sharding
Adding Capacity in Windows Azure
Web Roles/Worker Roles
• Enable more instances (API or *.config)
• Editing instance count in config leaves existing instances running
• Change to using larger VMs: will require redeploy
Windows Azure Storage
• Opaque to user
• Partition aggressively
• Can 'heat up' a partition to encourage scale up
Adding Capacity in SQL Azure
SQL Azure
• Add more databases (more partitions)
• Very difficult to achieve mid-stream
  • Requires moving hot data
  • Maintaining consistency across multiple DBs without DTC
• Will depend on partitioning strategy
Rule Based Scaling
Use Service Management and Diagnostics APIs
On/Off and Predictable Bursting
• Time based rules
Unpredictable demand and Fast Growth
• Monitor metrics and react accordingly
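A time based rule can be as small as a lookup table from time of day to instance count. The Python sketch below is only an illustration: the schedule, the counts and the function name are invented, and applying the result would still go through the Service Management API.

```python
from datetime import time

# Hypothetical schedule: (start, end, instance count) for the predictable busy period.
SCHEDULE = [(time(8, 0), time(18, 0), 6)]
DEFAULT_INSTANCES = 2  # assumed baseline outside the scheduled windows


def target_instances(now: time) -> int:
    """Return the instance count the schedule asks for at wall-clock time `now`."""
    for start, end, count in SCHEDULE:
        if start <= now < end:
            return count
    return DEFAULT_INSTANCES


print(target_instances(time(9, 30)), target_instances(time(23, 0)))
```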
Diagnostics & Management APIs
Monitor inputs
• Historical data
• Transactions
• Perf counters
• Business KPIs
Evaluate biz rules
• Latency too high/low
• How much $ spent
• Are we at limit
• Predicted load
Action
• +/- instance count
• Deploy new service
• Increase queues
• Send notifications
Monitor metrics
Primary metrics (actual work done)
• Requests per second
• Queue messages processed / interval
Secondary metrics
• CPU utilization
• Queue length
• Response time
Derivative metrics
• Rate of change of queue length
• Use 'historical' data to help predict requirements
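A derivative metric such as the rate of change of queue length can be computed from two timestamped samples. A minimal Python sketch, assuming samples are (seconds, queue length) pairs ordered oldest first; the function name and sample format are assumptions, not part of the Diagnostics API:

```python
from typing import List, Tuple


def queue_growth_per_second(samples: List[Tuple[float, int]]) -> float:
    """Average change in queue length per second between the first and last sample."""
    if len(samples) < 2:
        return 0.0  # not enough data to estimate a trend
    (t0, q0), (t1, q1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must be ordered by increasing timestamp")
    return (q1 - q0) / (t1 - t0)


# Queue grew from 100 to 160 messages over one minute: workers are falling behind.
print(queue_growth_per_second([(0.0, 100), (60.0, 160)]))  # 1.0
```

A positive value means work arrives faster than it is processed even when the absolute queue length still looks acceptable.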
Gathering Metrics
Use Microsoft.WindowsAzure.Diagnostics.*
Capture various metrics via Management API
• Diagnostics infrastructure logs
• Event logs
• Performance counters
• IIS logs
May need to smooth/average some measures
Remember the cost of gathering data
• Both performance and financial costs
• Would you use Perf Counters 24/7 on a production system? http://tinyurl.com/perfmon-overhead
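Smoothing a noisy measure such as CPU utilization can be done with a simple moving average over the last few samples, so one spike does not trigger a scaling action. The class below is an illustrative Python sketch; the window size is an assumption to be tuned under load.

```python
from collections import deque


class MovingAverage:
    """Average of the most recent `window` samples."""

    def __init__(self, window: int) -> None:
        if window < 1:
            raise ValueError("window must be at least 1")
        self._values = deque(maxlen=window)  # old samples fall off automatically

    def add(self, value: float) -> float:
        """Record a sample and return the current smoothed value."""
        self._values.append(value)
        return sum(self._values) / len(self._values)


cpu = MovingAverage(window=3)
for sample in (30.0, 90.0, 30.0, 30.0):
    print(cpu.add(sample))
```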
Evaluating Business Rules
Are requests taking too long?
Do I have too many jobs in my queue?
How much money have I spent this month?
• Could write these into code
• Could build some sort of rules engine
• Could use the WF rules engine
Take Action
Add/remove instances
• Use Service Management API
• Don't forget billing window is 1hr
Change role size
• Requires change to *.csdef
• Most suited to Worker Roles
Send notifications
• Email
• IM
Manage momentum
• Be careful not to overshoot
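One way to avoid overshooting is to limit how far and how often the instance count may move. The Python sketch below is one possible policy, not a prescribed one: the cooldown (set here to match the 1hr billing window), the step limit and the minimum instance count are all assumed values, and the function name is hypothetical.

```python
def next_instance_count(current: int,
                        desired: int,
                        seconds_since_last_change: float,
                        cooldown_seconds: float = 3600,
                        max_step: int = 1,
                        min_instances: int = 2) -> int:
    """Move toward `desired` by at most `max_step`, and only after the cooldown has passed."""
    if seconds_since_last_change < cooldown_seconds:
        return current  # let the previous change take effect before acting again
    step = max(-max_step, min(max_step, desired - current))
    return max(min_instances, current + step)


print(next_instance_count(current=4, desired=10, seconds_since_last_change=600))   # 4
print(next_instance_count(current=4, desired=10, seconds_since_last_change=4000))  # 5
```

A shorter cooldown for scaling out than for scaling back is a common variation when demand spikes matter more than cost.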
Summary
Designing for multiple instances provides
• Scale out
• Availability
• Elasticity options
Caching should be a key component of any Windows Azure application
Various options for variable load
• Spare capacity
• Scale out/back
• Automation possible
© 2010 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.