
Page 1: Windows Azure Tables and Queues Deep Dive

Windows Azure Tables and Queues Deep Dive

Jai Haridas
Software Design Engineer
Microsoft Corporation

SVC09

Page 2: Windows Azure Tables and Queues Deep Dive

Agenda

1. Overview of Windows Azure Tables

2. Patterns and Practices for Windows Azure Tables

3. Overview of Windows Azure Queues

4. Patterns and Practices for Windows Azure Queues

5. Q&A


Page 3: Windows Azure Tables and Queues Deep Dive

Fundamental Storage Abstractions

> Tables – Provide structured storage. A Table is a set of entities, which contain a set of properties

> Queues – Provide reliable storage and delivery of messages for an application

> Blobs – Provide a simple interface for storing named files along with metadata for the file

> Drives – Provide durable NTFS volumes for Windows Azure applications to use (new)

Page 4: Windows Azure Tables and Queues Deep Dive

Windows Azure Tables

> Provides Structured Storage
> Massively Scalable Tables
  > Billions of entities (rows) and TBs of data
  > Can use thousands of servers as traffic grows
> Highly Available & Durable
  > Data is replicated several times
> Familiar and Easy to use API
  > ADO.NET Data Services – .NET 3.5 SP1
    > .NET classes and LINQ
  > REST – with any platform or language

Page 5: Windows Azure Tables and Queues Deep Dive

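Table Storage Concepts

(Diagram: Accounts contain Tables, which contain Entities.)

> Account: moviesonline
  > Table: Users
    > Entity: Email = …, Name = …
    > Entity: Email = …, Name = …
  > Table: Movies
    > Entity: Genre = …, Title = …
    > Entity: Genre = …, Title = …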

Page 6: Windows Azure Tables and Queues Deep Dive

Table Data Model

> Table
  > A storage account can create many tables
  > Table name is scoped by account
  > Set of entities (i.e. rows)
> Entity
  > Set of properties (columns)
  > Required properties
    > PartitionKey, RowKey and Timestamp

Page 7: Windows Azure Tables and Queues Deep Dive

Required Entity Properties

> PartitionKey & RowKey
  > Uniquely identifies an entity
  > Defines the sort order
  > Use them to scale your application
> Timestamp
  > Read only
  > Optimistic Concurrency

Page 8: Windows Azure Tables and Queues Deep Dive

PartitionKey And Partitions

> PartitionKey
  > Used to group entities in the table into partitions
> A table partition
  > All entities with same partition key value
  > Unit of scale
  > Control entity locality
  > Row key provides uniqueness within a partition

Page 9: Windows Azure Tables and Queues Deep Dive

Partitions and Partition Ranges

Full table (Server A, Table = Movies):

| PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate |
|-------------------------|--------------------------|-----------|-------------|
| Action                  | Fast & Furious           | …         | 2009        |
| Action                  | The Bourne Ultimatum     | …         | 2007        |
| …                       | …                        | …         | …           |
| Animation               | Open Season 2            | …         | 2009        |
| Animation               | The Ant Bully            | …         | 2006        |
| …                       | …                        | …         | …           |
| Comedy                  | Office Space             | …         | 1999        |
| …                       | …                        | …         | …           |
| SciFi                   | X-Men Origins: Wolverine | …         | 2009        |
| …                       | …                        | …         | …           |
| War                     | Defiance                 | …         | 2008        |

Same table split into partition ranges:

> Server A, Table = Movies, range [Action - Comedy): the Action and Animation rows
> Server B, Table = Movies, range [Comedy - Western): the Comedy, SciFi and War rows
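The partition ranges on this slide can be modelled in a few lines. The sketch below is a plain in-memory illustration, not the storage service API: `group_by_partition`, `server_for` and `RANGES` are hypothetical names, and the half-open ranges are copied from the slide.

```python
from collections import defaultdict

# Hypothetical in-memory rows: (PartitionKey, RowKey, ReleaseDate).
movies = [
    ("Comedy", "Office Space", 1999),
    ("Action", "The Bourne Ultimatum", 2007),
    ("War", "Defiance", 2008),
    ("Animation", "The Ant Bully", 2006),
    ("Action", "Fast & Furious", 2009),
]

def group_by_partition(entities):
    """Group rows by PartitionKey; rows are ordered by (PartitionKey, RowKey)."""
    partitions = defaultdict(list)
    for entity in sorted(entities, key=lambda e: (e[0], e[1])):
        partitions[entity[0]].append(entity)
    return dict(partitions)

# Half-open ranges [low, high) as shown on the slide.
RANGES = {"Server A": ("Action", "Comedy"), "Server B": ("Comedy", "Western")}

def server_for(partition_key, ranges=RANGES):
    """Return the server whose range contains the given PartitionKey."""
    for server, (low, high) in ranges.items():
        if low <= partition_key < high:
            return server
    raise KeyError(partition_key)
```

For example, `server_for("Animation")` falls in Server A's range, while `server_for("Comedy")` falls in Server B's, because the upper bound of a range is exclusive.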

Page 10: Windows Azure Tables and Queues Deep Dive

Table Operations

> Table
  > Create
  > Query
  > Delete
> Entities
  > Insert
  > Update
    > Merge – Partial Update
    > Replace – Update entire entity
  > Delete
  > Query
  > Entity Group Transaction (new)
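The difference between the two update flavours can be shown with plain dictionaries. This is only a sketch of the semantics named on the slide (Merge is a partial update, Replace updates the entire entity); `merge` and `replace` are hypothetical helpers, not SDK calls.

```python
def merge(stored, update):
    """Partial update: properties in `update` overwrite, all others are kept."""
    result = dict(stored)
    result.update(update)
    return result

def replace(stored, update):
    """Full update: only the keys survive; other properties come from `update`."""
    result = {"PartitionKey": stored["PartitionKey"], "RowKey": stored["RowKey"]}
    result.update(update)
    return result
```

With a stored entity that has `ReleaseYear` and `Language`, merging `{"ReleaseYear": 2010}` keeps `Language`, whereas replacing with the same payload drops it.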

Page 11: Windows Azure Tables and Queues Deep Dive

Table Schema

Define the schema as a .NET class

[DataServiceKey("PartitionKey", "RowKey")]
public class Movie
{
    /// <summary>
    /// Category is the partition key
    /// </summary>
    public string PartitionKey { get; set; }

    /// <summary>
    /// Title is the row key
    /// </summary>
    public string RowKey { get; set; }

    public DateTime Timestamp { get; set; }

    public int ReleaseYear { get; set; }
    public string Language { get; set; }
    public string Cast { get; set; }
}

Page 12: Windows Azure Tables and Queues Deep Dive

Table SDK Sample Code

using System.Linq;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

StorageCredentialsAccountAndKey credentials =
    new StorageCredentialsAccountAndKey("myaccount", "myKey");
string baseUri = "http://myaccount.table.core.windows.net";

CloudTableClient tableClient = new CloudTableClient(baseUri, credentials);

tableClient.CreateTable("Movies");

TableServiceContext context = tableClient.GetDataServiceContext();
CloudTableQuery<Movie> q =
    (from movie in context.CreateQuery<Movie>("Movies")
     where movie.PartitionKey == "Action" && movie.RowKey == "The Bourne Ultimatum"
     select movie).AsTableServiceQuery<Movie>();
Movie movieToUpdate = q.FirstOrDefault();

// Update movie (change properties on movieToUpdate before this call)
context.UpdateObject(movieToUpdate);
context.SaveChangesWithRetries();

// Add movie (movieToAdd is the new title, defined elsewhere;
// AddObject needs the table name as its first argument)
context.AddObject("Movies", new Movie { PartitionKey = "Action", RowKey = movieToAdd });
context.SaveChangesWithRetries();

Page 13: Windows Azure Tables and Queues Deep Dive

Agenda

1. Overview of Windows Azure Tables

2. Patterns and Practices for Windows Azure Tables

3. Overview of Windows Azure Queues

4. Patterns and Practices for Windows Azure Queues

5. Q & A


Page 14: Windows Azure Tables and Queues Deep Dive

Key Selection: Things to Consider

> Scalability
  > Distribute load as much as possible
  > Hot partitions can be load balanced
  > PartitionKey is critical for scalability
> Query Efficiency & Speed
  > Avoid frequent large scans
  > Parallelize queries
> Entity group transactions (new)
  > Transactions across a single partition
  > Transaction semantics & Reduce round trips

Page 15: Windows Azure Tables and Queues Deep Dive

Key Selection: Case Study 1

> Table for listing all movies
> Home page lists movies based on chosen category

Page 16: Windows Azure Tables and Queues Deep Dive

Movie Listing – Solution 1

> Why do I need multiple PartitionKeys?
  > Account name as Partition Key
  > Movie title as RowKey since movie names need to be sorted
  > Category as a separate property
> Does this scale?

| PartitionKey (Account name) | RowKey (Title)        | Category  | … |
|-----------------------------|-----------------------|-----------|---|
| moviesonline                | 12 Rounds             | Action    | … |
| moviesonline                | A Bug’s Life          | Animation | … |
| …                           | 100,000,000 more rows | …         | … |
| moviesonline                | Office Space          | Comedy    | … |
| moviesonline                | Platoon               | War       | … |
| …                           | 50,000,000 more rows  | …         | … |
| moviesonline                | WALL-E                | Animation | … |

Page 17: Windows Azure Tables and Queues Deep Dive

Movie Listing – Solution 1

> Single partition - Entire table served by one server
> All requests served by that single server
> Does not scale

(Diagram: every client request goes to Server A, which holds the whole table as the single "moviesonline" partition: 12 Rounds, A Bug’s Life, 100,000,000 more rows, Office Space, Platoon, 50,000,000 more rows, WALL-E.)

Page 18: Windows Azure Tables and Queues Deep Dive

Movie Listing – Solution 2

> All movies partitioned by category
> Allows system to load balance hot partitions
> Load distributed
> Better than single partition

| PartitionKey (Category) | RowKey (Title)                       |
|-------------------------|--------------------------------------|
| Action                  | Fast & Furious                       |
| …                       | 10000 more Action movies             |
| Action                  | The Bourne Ultimatum                 |
| …                       | 100000 more Action & Animation movies |
| Animation               | Open Season 2                        |
| …                       | 100000 more Animation movies         |
| Animation               | The Ant Bully                        |
| Comedy                  | Office Space                         |
| …                       | 1000000 more Comedy & SciFi movies   |
| SciFi                   | Star Trek                            |
| …                       | 100000 more SciFi & War movies       |
| …                       | 100000 more War movies               |
| War                     | Defiance                             |

(Diagram: client requests are spread across Server A and Server B, each serving part of the table.)

Page 19: Windows Azure Tables and Queues Deep Dive

Key Selection: Case Study 2

> Log every transaction into a table for diagnostics
> Scale Write Intensive Scenario
> Logs can be retrieved for a given time range

Page 20: Windows Azure Tables and Queues Deep Dive

Logging - Solution 1

> Timestamp as Partition Key
  > Looks like an obvious choice
  > It is not a single partition as time moves forward
> Append only
  > Requests to single partition range
  > Load balancing does not help
  > Server may throttle

| PartitionKey (Timestamp) | Properties |
|--------------------------|------------|
| 2009-11-15 02:00:01      |            |
| 2009-11-15 02:00:11      |            |
| 100000 more rows         | …          |
| 2009-11-17 05:40:01      |            |
| 2009-11-17 05:40:01      |            |
| 80000 more rows          | …          |
| 2009-11-17 12:30:00      |            |
| 2009-11-17 12:30:01      |            |

(Diagram: application client requests with timestamps 2009-11-17 12:30:01, 12:30:02 and 12:30:03 all go to Server A, not Server B.)

Page 21: Windows Azure Tables and Queues Deep Dive

Logging Solution 2 - Distribute "Append Only"

> Prefix timestamp such that load is distributed
  > Id of the node logging
  > Hash into N buckets
> Write load is now distributed
  > Better throughput
> To query logs in time range
  > Parallelize it across prefix values

| PartitionKey (ID_Timestamp) | Properties |
|-----------------------------|------------|
| 01_2009-10-12 05:10:00      |            |
| …                           | …          |
| 100000 more rows            | …          |
| 09_2009-11-15 12:31:00      |            |
| …                           | …          |
| 20000000 more rows          | …          |
| 10_2009-10-05 05:10:10      |            |
| 5000000 more rows           | …          |
| …                           | …          |
| 900000 more rows            | …          |
| 19_2009-11-17 12:20:02      |            |

(Diagram: application client requests with keys 15_2009-11-17 12:30:01, 09_2009-11-17 12:30:22, 19_2009-11-17 12:30:10 and 01_2009-11-17 12:30:01 are spread across Server A and Server B.)
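A minimal sketch of the prefixing idea, assuming the "hash into N buckets" variant. The bucket count of 20, the CRC32 hash and the function names are illustrative choices, not something the slides prescribe.

```python
import zlib

NUM_BUCKETS = 20  # assumption: N is chosen by the application

def bucket_for(node_id, num_buckets=NUM_BUCKETS):
    """Stable hash of the logging node's id into one of N buckets."""
    return zlib.crc32(node_id.encode("utf-8")) % num_buckets

def partition_key(node_id, timestamp, num_buckets=NUM_BUCKETS):
    """Build '<bucket>_<timestamp>', in the style of '09_2009-11-17 12:30:22'."""
    return f"{bucket_for(node_id, num_buckets):02d}_{timestamp}"

def range_queries(start, end, num_buckets=NUM_BUCKETS):
    """One (low, high) PartitionKey range per prefix; these can run in parallel."""
    return [(f"{b:02d}_{start}", f"{b:02d}_{end}") for b in range(num_buckets)]
```

Writes from different nodes now land in different key ranges, and a time-range read becomes N smaller range queries (one per prefix) whose results the client merges.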

Page 22: Windows Azure Tables and Queues Deep Dive

Key Selection: Query Efficiency & Speed

> Select keys that allow fast retrieval
> Reduce scan range
> Reduce scan frequency

Page 23: Windows Azure Tables and Queues Deep Dive

Single Entity Query

> Where PartitionKey=‘SciFi’ and RowKey = ‘Star Trek’
> Efficient processing
> No continuation tokens

(Diagram: the Movies table, partitioned by Category with Title as RowKey, is spread across Server A and Server B; the client sends one request and receives the result.)

Page 24: Windows Azure Tables and Queues Deep Dive

Table Scan Query

> Select * from Movies where Rating > 4
> Returns Continuation token
  > 1000 movies in result set
  > Partition range boundary
> Serial Processing: Wait for continuation token before proceeding

| PartitionKey (Category) | RowKey (Title)                               | Rating |
|-------------------------|----------------------------------------------|--------|
| Action                  | Fast & Furious                               | 5      |
| …                       | 999 more movies rated > 4                    |        |
| …                       | Action and Anim. movies here with rating < 4 |        |
| Animation               | A Bug’s life                                 | 2      |
| …                       | 100 more movies < 4 here                     |        |
| Animation               | The Ant Bully                                | 3      |
| Comedy                  | Are we there yet?                            | 2      |
| …                       | More movies here                             | …      |
| Comedy                  | Office Space                                 | 5      |
| …                       | 800000 more movies here                      |        |
| Drama                   | A Beautiful Mind                             | 5      |
| …                       | 1200000 more movies here                     |        |
| War                     | Defiance                                     | 4      |

(Diagram: the client's first request returns 1000 movies and a continuation token; the next request, sent with that token, hits the partition range boundary and returns another continuation token; the client keeps sending requests with the latest token.)
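The serial continuation loop described on this slide can be sketched against an in-memory list. This is a simplified model under stated assumptions: the token is just a list index, only the 1000-row limit triggers it, and `query_page` and `query_all` are hypothetical names. The real service can also return a token at a partition range boundary, even with few or no rows in the response.

```python
MAX_ROWS = 1000  # per-response limit stated on the slides

def query_page(rows, predicate, token=None, page_size=MAX_ROWS):
    """Model one request: scan from `token`, return (results, next_token)."""
    start = 0 if token is None else token
    results = []
    for index in range(start, len(rows)):
        if predicate(rows[index]):
            results.append(rows[index])
            if len(results) == page_size:
                next_index = index + 1
                # No token when the scan has reached the end of the table.
                return results, (next_index if next_index < len(rows) else None)
    return results, None

def query_all(rows, predicate, page_size=MAX_ROWS):
    """Serial processing: keep re-issuing the query until no token comes back."""
    results, token = query_page(rows, predicate, None, page_size)
    while token is not None:
        page, token = query_page(rows, predicate, token, page_size)
        results.extend(page)
    return results
```

A caller that stops after the first page silently loses rows, which is why the loop runs until the token is absent.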

Page 25: Windows Azure Tables and Queues Deep Dive

Make Scans Faster

> Split “Select * from Movies where Rating > 4” into
  > Where PartitionKey >= “A” and PartitionKey < “D” and Rating > 4
  > Where PartitionKey >= “D” and PartitionKey < “I” and Rating > 4
  > Etc.
> Execute in parallel
> Each query handles continuation

| PartitionKey (Category) | RowKey (Title)   | Rating |
|-------------------------|------------------|--------|
| Action                  | Fast & Furious   | 5      |
| …                       | More movies here | …      |
| Comedy                  | Office Space     | 5      |
| …                       | More movies here | …      |
| Documentary             | Planet Earth     | 4      |
| …                       | More movies here |        |
| Drama                   | Seven Pounds     | 4      |
| Horror                  | Saw 5            | 3      |
| …                       | More movies here | …      |
| Music                   | 8 Mile           | 2      |
| …                       | More movies here | …      |
| SciFi                   | Star Trek        | 5      |
| …                       | More movies here | …      |

(Diagram: the client sends several requests at once to Server A and Server B; each request follows its own continuation tokens.)
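Splitting one scan into parallel range queries might look like the sketch below. It runs against an in-memory list with a thread pool. The boundary list and function names are illustrative assumptions, and each sub-query here returns everything in one call, whereas a real one would follow its own continuation tokens.

```python
from concurrent.futures import ThreadPoolExecutor

def scan_range(rows, low, high, min_rating):
    """One sub-query: PartitionKey in [low, high) and Rating > min_rating."""
    return [r for r in rows
            if low <= r["PartitionKey"] < high and r["Rating"] > min_rating]

def parallel_scan(rows, boundaries, min_rating):
    """Run one sub-query per adjacent pair of boundaries and merge the results."""
    ranges = list(zip(boundaries, boundaries[1:]))
    with ThreadPoolExecutor(max_workers=len(ranges)) as pool:
        parts = list(pool.map(
            lambda rng: scan_range(rows, rng[0], rng[1], min_rating), ranges))
    return [row for part in parts for row in part]

# Rows taken from the slide's table.
MOVIES = [
    {"PartitionKey": "Action", "RowKey": "Fast & Furious", "Rating": 5},
    {"PartitionKey": "Comedy", "RowKey": "Office Space", "Rating": 5},
    {"PartitionKey": "Documentary", "RowKey": "Planet Earth", "Rating": 4},
    {"PartitionKey": "Drama", "RowKey": "Seven Pounds", "Rating": 4},
    {"PartitionKey": "Horror", "RowKey": "Saw 5", "Rating": 3},
    {"PartitionKey": "Music", "RowKey": "8 Mile", "Rating": 2},
    {"PartitionKey": "SciFi", "RowKey": "Star Trek", "Rating": 5},
]
```

Calling `parallel_scan(MOVIES, ["A", "D", "I", "Z"], 4)` issues three sub-queries. The same fan-out-and-merge shape applies when an “OR” predicate is replaced by one query per alternative.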

Page 26: Windows Azure Tables and Queues Deep Dive

Query Speed

1. Fast
  > Single PartitionKey and RowKey with equality
2. Medium
  > Single partition but a small range for RowKey
  > Entire partition or table that is small
3. Slow
  > Large single scan
  > Large table scan
  > “OR” predicates on keys => no query optimization => results in scan

> Expect continuation token for all except in 1

Page 27: Windows Azure Tables and Queues Deep Dive

Make Queries Faster

> Large Scans
  > Split the range and parallelize queries
  > Create and maintain own views that help queries
> “Or” Predicates
  > Execute individual query in parallel instead of using “OR”
> User Interactive
  > Cache the result to reduce scan frequency

Page 28: Windows Azure Tables and Queues Deep Dive

Expect Continuation Tokens – Seriously!

> Maximum of 1000 rows in a response
> At the end of partition range boundary
> Maximum of 5 seconds to execute the query

Page 29: Windows Azure Tables and Queues Deep Dive

Entity Group Transactions (EGT) (new)

> Atomically perform multiple insert/update/delete over entities in same partition in a single transaction
> Maximum of 100 commands in a single transaction and payload < 4 MB
> ADO.NET Data Services
  > Use SaveChangesOptions.Batch
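The limits listed on this slide can be checked on the client before a batch is submitted. The sketch below is an assumption-laden model: commands are `(operation, entity)` tuples, and the payload size is only approximated by the JSON length of each entity, which is not how the service measures it.

```python
import json

MAX_COMMANDS = 100                   # from the slide
MAX_PAYLOAD_BYTES = 4 * 1024 * 1024  # from the slide: payload < 4 MB

def validate_batch(commands):
    """Raise ValueError if the batch breaks a limit listed on the slide."""
    if not commands:
        raise ValueError("batch is empty")
    if len(commands) > MAX_COMMANDS:
        raise ValueError("more than 100 commands")
    partition_keys = {entity["PartitionKey"] for _, entity in commands}
    if len(partition_keys) != 1:
        raise ValueError("entities span more than one partition")
    # Rough size estimate only; the real wire format differs.
    payload = sum(len(json.dumps(entity).encode("utf-8")) for _, entity in commands)
    if payload >= MAX_PAYLOAD_BYTES:
        raise ValueError("payload is 4 MB or larger")
    return True
```

Failing fast on the client avoids a round trip for a batch that would be rejected anyway.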

Page 30: Windows Azure Tables and Queues Deep Dive

Key Selection: Entity Group Transaction

> Case Study
  > Maintain user account information
    > Account ID, User Name, Address, Number of rentals
  > Maintain information of checked out rentals
    > Account ID, Movie Title, Check out date, Due date
> Solution 1 – Maintain two tables – Users & Rentals
  > Handle Cross table consistency
    > Insert into Rentals table succeeds
    > Update to Users table fails
    > Queue to maintain consistency

Page 31: Windows Azure Tables and Queues Deep Dive

Solution 2

> Store Account Information and Rental details in same table
> Maintain same PartitionKey to enforce transactions
  > Account ID as PartitionKey
  > Update total count and Insert new rentals using Entity Group Transaction
> Prefix RowKey with “Kind” code: A = Account, R = Rental
  > Row key for account info: [Kind Code]_[AccountId]
  > Row Key for rental info: [Kind Code]_[Title]
> Rental Properties not set for Account row and vice versa

| PartitionKey (AccountID) | RowKey (Kind_*) | Kind    | TotalRentals | Name        | Address       | CheckOutOn | Title | DueOn      |
|--------------------------|-----------------|---------|--------------|-------------|---------------|------------|-------|------------|
| …                        | …               | …       | …            | …           | …             | …          | …     | …          |
| Sally                    | A_Sally         | Account | 8            | Sally Field | Ann Arbor, MI |            |       |            |
| Sally                    | R_Jaws          | Rental  |              |             |               | 2009/11/16 | Jaws  | 2009/11/20 |
| Sally                    | R_Taxi          | Rental  |              |             |               | 2009/11/16 | Taxi  | 2009/11/20 |
| …                        | …               | …       | …            | …           | …             | …          | …     | …          |
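The key layout above can be sketched with plain dictionaries. The property names come from the slide's table. The helper names and the `(operation, entity)` command shape are hypothetical, and actually submitting the two commands as one Entity Group Transaction is left out.

```python
def account_row(account_id, name, address, total_rentals):
    """Account row: RowKey is 'A_' + account id; rental properties are not set."""
    return {"PartitionKey": account_id, "RowKey": f"A_{account_id}",
            "Kind": "Account", "TotalRentals": total_rentals,
            "Name": name, "Address": address}

def rental_row(account_id, title, checked_out_on, due_on):
    """Rental row: RowKey is 'R_' + title; account properties are not set."""
    return {"PartitionKey": account_id, "RowKey": f"R_{title}",
            "Kind": "Rental", "Title": title,
            "CheckOutOn": checked_out_on, "DueOn": due_on}

def check_out(account, title, checked_out_on, due_on):
    """Return the two commands that belong in one transaction for a checkout."""
    updated = dict(account, TotalRentals=account["TotalRentals"] + 1)
    rental = rental_row(account["PartitionKey"], title, checked_out_on, due_on)
    return [("update", updated), ("insert", rental)]
```

Because both rows share the account ID as PartitionKey, the counter update and the rental insert are eligible to commit atomically.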

Page 32: Windows Azure Tables and Queues Deep Dive

Best Practices & Summary

> Select PartitionKey and RowKey that help scale
  > Efficient for frequently used queries
  > Supports batch transactions
  > Distributes load
> Distribute “Append only” patterns using prefix to PartitionKey
> Always Handle continuation tokens
> Client can maintain their own cache/views instead of frequent scans
  > Future Feature - Secondary Index
> Execute parallel queries instead of “OR” predicates
> Implement back-off strategy for retries
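One common way to implement the back-off advice is exponential delay with random jitter. The slides do not say which strategy to use, so treat the sketch below as one reasonable option. The base delay, cap and attempt count are arbitrary, and a real client would retry only on throttling or transient errors rather than on every exception.

```python
import random
import time

def backoff_delays(attempts, base=0.5, cap=30.0):
    """Upper bounds for each wait: base, 2*base, 4*base, ... limited to `cap`."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]

def call_with_retries(operation, attempts=5, base=0.5, cap=30.0, sleep=time.sleep):
    """Call `operation`; on failure wait a jittered, growing delay and retry."""
    delays = backoff_delays(attempts, base, cap)
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(random.uniform(0, delays[attempt]))
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real waiting.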

Page 33: Windows Azure Tables and Queues Deep Dive

Agenda

1. Overview of Windows Azure Tables

2. Patterns and Practices for Windows Azure Tables

3. Overview of Windows Azure Queues

4. Patterns and Practices for Windows Azure Queues

5. Q & A


Page 34: Windows Azure Tables and Queues Deep Dive

Windows Azure Queues

> Queues are performance efficient, highly available and provide reliable message delivery
  > Simple, asynchronous work dispatch
  > Programming semantics ensure that a message can be processed at least once
> Access is provided via REST

Page 35: Windows Azure Tables and Queues Deep Dive

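Queue Storage Concepts

(Diagram: Accounts contain Queues, which contain Messages.)

> Account: sally
  > Queue: thumbnailjobs
    > Message: 128 x 128 http://...
    > Message: 256 x 256 http://...
  > Queue: traverselinks
    > Message: http://...
    > Message: http://...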

Page 36: Windows Azure Tables and Queues Deep Dive

Account, Queues and Messages

> An account can create many queues
  > Queue Name is scoped by the account
> A Queue contains messages
  > No limit on number of messages stored in a queue
  > Set a limit for message expiration
> Messages
  > Message size <= 8 KB
  > To store larger data, store data in blob/entity storage, and the blob/entity name in the message
  > Message now has dequeue count
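The "store the data elsewhere and put its name in the message" pattern can be sketched as follows. A dictionary stands in for blob storage, and `make_message` and `read_message` are hypothetical helpers. The 8 KB check is applied to the raw payload for simplicity, which may not match how the service counts an encoded message.

```python
MAX_MESSAGE_BYTES = 8 * 1024  # limit from the slide

def make_message(payload, blob_store, blob_name):
    """Return a small message; large payloads go to the blob store by name."""
    if len(payload) <= MAX_MESSAGE_BYTES:
        return {"inline": True, "body": payload}
    blob_store[blob_name] = payload
    return {"inline": False, "body": blob_name.encode("utf-8")}

def read_message(message, blob_store):
    """Resolve a message back to its payload, following the blob reference."""
    if message["inline"]:
        return message["body"]
    return blob_store[message["body"].decode("utf-8")]
```

The consumer is then responsible for removing the blob once the message has been processed, since nothing ties the two lifetimes together.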

Page 37: Windows Azure Tables and Queues Deep Dive

Queue Operations

> Queue
  > Create Queue
  > Delete Queue
  > List Queues
  > Get/Set Queue Metadata
> Messages
  > Add Message (i.e. Enqueue Message)
  > Get Message(s) (i.e. Dequeue Message)
  > Peek Message(s)
  > Delete Message

Page 38: Windows Azure Tables and Queues Deep Dive

Queue Programming API

CloudQueueClient queueClient = new CloudQueueClient(baseUri, credentials);
CloudQueue queue = queueClient.GetQueueReference("test1");

queue.CreateIfNotExist();

// MessageCount is populated via FetchAttributes
queue.FetchAttributes();

CloudQueueMessage message = new CloudQueueMessage("Some content");
queue.AddMessage(message);

message = queue.GetMessage(TimeSpan.FromMinutes(10) /*visibility timeout*/);

// Process the message here …

queue.DeleteMessage(message);

Page 39: Windows Azure Tables and Queues Deep Dive

Agenda

1. Overview of Windows Azure Tables

2. Patterns and Practices for Windows Azure Tables

3. Overview of Windows Azure Queues

4. Patterns and Practices for Windows Azure Queues

5. Q & A


Page 40: Windows Azure Tables and Queues Deep Dive

Removing Poison Messages

(Diagram: producers P1 and P2 add messages to queue Q; consumers C1 and C2 read from it.)

1. C1: GetMessage(Q, 30 s) returns msg 1
2. C2: GetMessage(Q, 30 s) returns msg 2

Page 41: Windows Azure Tables and Queues Deep Dive

Removing Poison Messages

(Diagram: producers P1 and P2 add messages to queue Q; consumers C1 and C2 read from it.)

1. C1: GetMessage(Q, 30 s) returns msg 1
2. C2: GetMessage(Q, 30 s) returns msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 visible 30 s after Dequeue
7. C2: GetMessage(Q, 30 s) returns msg 1

Page 42: Windows Azure Tables and Queues Deep Dive

Removing Poison Messages

(Diagram: producers P1 and P2 add messages to queue Q; consumers C1 and C2 read from it.)

1. C1: Dequeue(Q, 30 sec) returns msg 1
2. C2: Dequeue(Q, 30 sec) returns msg 2
3. C2 consumed msg 2
4. C2: Delete(Q, msg 2)
5. C1 crashed
6. msg 1 visible 30 s after Dequeue
7. C2: Dequeue(Q, 30 sec) returns msg 1
8. C2 crashed
9. msg 1 visible 30 s after Dequeue
10. C1 restarted
11. C1: Dequeue(Q, 30 sec) returns msg 1
12. DequeueCount > 2
13. C1: Delete(Q, msg 1)
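The sequence above can be replayed with a small in-memory queue model. This is a sketch, not the queue service: time is an integer passed in by the caller, `ModelQueue` and `process_next` are hypothetical names, and the threshold of 2 mirrors step 12.

```python
class ModelQueue:
    """Toy queue with visibility timeout and per-message dequeue count."""

    def __init__(self):
        self.messages = []

    def add(self, body):
        self.messages.append({"body": body, "dequeue_count": 0, "visible_at": 0})

    def get(self, now, visibility_timeout):
        """Return the first visible message and hide it for the timeout."""
        for message in self.messages:
            if message["visible_at"] <= now:
                message["dequeue_count"] += 1
                message["visible_at"] = now + visibility_timeout
                return message
        return None

    def delete(self, message):
        self.messages = [m for m in self.messages if m is not message]


def process_next(queue, now, handler, timeout=30, max_dequeue=2):
    """Dequeue one message; drop it as poison once its count exceeds the limit."""
    message = queue.get(now, timeout)
    if message is None:
        return "empty"
    if message["dequeue_count"] > max_dequeue:
        queue.delete(message)
        return "poison"
    handler(message["body"])  # an exception here models a consumer crash
    queue.delete(message)
    return "done"
```

A handler that raises leaves the message in the queue, so it reappears after the timeout; on the third dequeue the count is 3 and the message is deleted without being processed. A real worker might copy it to a side store first for inspection.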

Page 43: Windows Azure Tables and Queues Deep Dive

Best Practices & Summary

> Make message processing idempotent
  > No need to deal with failures
> Do not rely on order
  > Invisible messages result in out of order
> Use Dequeue count to remove poison messages
  > Enforce threshold on message’s dequeue count
> Use message count to dynamically increase/reduce workers
> Use blob to store message data with reference in message
  > Messages > 8KB
  > Batch messages
  > Garbage collect orphaned blobs
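Two of these practices are easy to sketch: sizing the worker pool from the queue's message count, and making a handler idempotent by remembering which messages were already handled. The thresholds, the helper names and the use of an in-memory set as the "already processed" record are all assumptions. A real system would need a durable record.

```python
import math

def desired_workers(message_count, per_worker=100, min_workers=1, max_workers=20):
    """Scale the worker count with the backlog, within fixed bounds."""
    wanted = math.ceil(message_count / per_worker)
    return max(min_workers, min(max_workers, wanted))

def process_once(message_id, body, done, handler):
    """Run `handler` at most once per message id; repeats become no-ops."""
    if message_id in done:
        return False
    handler(body)
    done.add(message_id)
    return True
```

Because a message can be delivered more than once, the second delivery of the same id is skipped instead of repeating its side effects.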

Page 44: Windows Azure Tables and Queues Deep Dive

Future Features

> Allow workers to extend invisibility time
  > Time to process message unknown at dequeue time
  > Worker can extend the time as needed
> Allow longer invisibility time
  > Long running work items may need more than 2 hours
> Allow messages to not expire
  > Large backlogs will not cause messages to expire

Page 45: Windows Azure Tables and Queues Deep Dive

Takeaways

> Table
  > Scalable & Reliable Structured Storage System
  > Partitioning is critical to scalability
  > Entity Group Transactions (new)
> Queue
  > Scalable & Reliable Messaging System
  > Dequeue count returned with message (new)
> Use back-off strategy on retries
> Official Storage Client Library (new)

Page 46: Windows Azure Tables and Queues Deep Dive

Windows Azure Session Alerts!!

> Storing and Manipulating Blobs and Files with Windows Azure Storage – 11/18 (4:30 PM)
> Patterns for building Reliable & Scalable Applications with Windows Azure – 11/19 (8:30 AM)
> Automating the Application Lifecycle with Windows Azure – 11/19 (10:00 AM)

Page 47: Windows Azure Tables and Queues Deep Dive

Q&A

Page 48: Windows Azure Tables and Queues Deep Dive

Windows Azure PDC Swag

Page 49: Windows Azure Tables and Queues Deep Dive

YOUR FEEDBACK IS IMPORTANT TO US!

Please fill out session evaluation forms online at MicrosoftPDC.com

Page 50: Windows Azure Tables and Queues Deep Dive

Learn More On Channel 9

> Expand your PDC experience through Channel 9
> Explore videos, hands-on labs, sample code and demos through the new Channel 9 training courses

channel9.msdn.com/learn

Built by Developers for Developers….

Page 51: Windows Azure Tables and Queues Deep Dive

© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

Page 52: Windows Azure Tables and Queues Deep Dive