Windows Azure Platform Articles From the Trenches


The Windows Azure Platform: Articles from the Trenches
Volume One
Editor and copy and paste guru: Eric Nelson and 15 authors smarter than him
22nd June 2010 (v0.9)

Cover art by Andrew Fryer

Developers have been exploring the possibilities opened up by the Windows Azure Platform for Cloud Computing. This book pulls together great articles from many of those developers who have been active with the Windows Azure Platform to hopefully help others become successful. There are twenty articles in this first volume covering everything from getting started to implementing best practices for elastic applications.

TABLE OF CONTENTS

INTRODUCTION
From the Editor
Would you like to become an author for a future edition?
Introduction to the Windows Azure Platform
AE Acronyms Explained

CHAPTER 1: GETTING STARTED
5 steps to getting started with Windows Azure
Step 1: Creating an Azure account.
Step 2: Provisioning a SQL Azure database
Step 3: Building a Web Application for Azure
Step 4: Packaging the Web Application for Windows Azure
Step 5: Deploying the Web Application to Azure.
The best tools for working with the Windows Azure Platform
Category: The usual suspects
Category: Windows Azure Storage
Category: Windows Azure diagnostics
Category: SQL Azure
Category: General Development

CHAPTER 2: WINDOWS AZURE PLATFORM
Architecting For Azure Building Highly Scalable Applications
Principles of Azure Architectures
Partition Data
Colocation
Cache
State
Distribute Workloads Effectively
Maximise Resources
Summary
The Windows Azure Platform and Cost-Oriented Architecture
Cost is important
What costs to consider
Conclusion
De-risking Your First Windows Azure Project
Popular Risks
Non-Technical Tactics for Reducing Risk
Technical Tactics for Reducing Risk
Developer Responsibility
Trials & tribulations of working with Azure when there's more than one of you
Development Environment
Test Environment
Certificates
When things go wrong
Summary
Using a Continuous Integration build to achieve an automated deployment of your latest build
Getting the right bits
Packaging for deployment
Deploying
Using Java with the Windows Azure Platform
Accessing Windows Azure Storage from Java
Running Java Code on Windows Azure
AzureRunme

CHAPTER 3: WINDOWS AZURE
Auto-Scaling Windows Azure Compute Instances
Introduction
A Basic Approach
The Scale Agent
Monitoring: Retrieving Diagnostic Information
Rules: Establishing When To Scale
Trust: Authorising For Scale
Scaling The Service Management API
Summary
Building a Content-Based Router Service on Windows Azure
Bing Maps Tile Servers using Azure Blob Storage
Azure Drive
Guest OS
VHD
CloudDrive
Development Environment
Azure Table Service as a NoSQL database
Master-Detail structures
Dynamic schema
Column names as data
Table names as data
Summary
Queries and Azure Tables
CreateQuery()
Contexts
Querying on PartitionKey and RowKey
Continuation
DataServiceQuery
CloudTableQuery
Tricks for storing time and date fields in Table Storage
Using Worker Roles to Implement a Distributed Cache
Configuring the Cache
Using the Distributed Cache
Logging, diagnostics and health monitoring of Windows Azure Applications
Collecting diagnostic data
Persisting diagnostic data
Analysing the diagnostic data
More information
Service Runtime in Windows Azure
Roles and Instances
Endpoints
Service Upgrades
Service Definition and Service Configuration
RoleEntryPoint
Role
RoleEnvironment
RoleInstance
RoleInstanceEndpoint
LocalResource

CHAPTER 4: SQL AZURE
Connecting to SQL Azure in 5 Minutes
Prerequisite Get a SQL Azure account
Working with the SQL Azure Portal
Create a database through the Server Administration
Configuring the firewall
Connecting using SQL Server Management Studio
Application credentials
Keep in mind the target database

CHAPTER 5: WINDOWS AZURE PLATFORM APPFABRIC
Real Time Tracing of Azure Roles from Your Desktop
Custom Trace Listener
Send Message Console Application
Trace Service
Service Host Class
Service
Summary

MEET THE AUTHORS
Eric Nelson
Marcus Tillett
Richard Prodger
Saksham Gautam
Steve Towler
Rob Blackwell
Julin Hanssens
Simon Munro
Sarang Kulkarni
Steven Nagy
Grace Mollison
Jason Nappi
Josh Tucholski
David Gristwood
Neil Mackenzie
Mark Rendle

INTRODUCTION

FROM THE EDITOR

Hello all,

The Windows Azure Platform is changing the way we architect, implement, deploy and manage solutions. In early 2010 it went live and in the first six months we have already seen an impressively diverse range of solutions developed to take advantage of the services offered.

This book pulls together great articles from many of those developers who have been active with the Windows Azure Platform to hopefully help others be successful. There are twenty articles in this first volume covering everything from getting started to implementing best practices for elastic applications.

You are not expected to read it in order from start to finish. Instead I would encourage you to head straight to the chapters or the individual articles that look most relevant or interesting.

The book was put together in May and early June 2010, which means that it pre-dates the 1.2 release of the Windows Azure SDK. The 1.2 release adds some great new features, especially for Visual Studio 2010 and .NET Framework 4.0, in areas such as debugging and IDE integration. Volume Two of this book will cover those new features (and more!).

Once you have had a chance to look at the articles please give us your feedback at http://bit.ly/azureebook1feedback (it should take less than one minute).

Thank you and happy reading.

Eric Nelson
Developer Evangelist, Microsoft UK
Website: http://www.ericnelson.co.uk
Email: [email protected]
Blog: http://geekswithblogs.net/iupdateable
Twitter: http://twitter.com/ericnel

WOULD YOU LIKE TO BECOME AN AUTHOR FOR A FUTURE EDITION?

Developers value the sharing of best practices, knowledge and experiences such as your own. If you have insight into the Windows Azure Platform then you are a great candidate for becoming an author involved in the next volume of this book as the Windows Azure Platform continues to evolve and broaden. Please email me ([email protected]) with your proposed article(s) and if possible a sample of your work such as a link to your blog.

INTRODUCTION TO THE WINDOWS AZURE PLATFORM

The Windows Azure Platform contains three technologies which can be used individually or together to build solutions which run in the cloud. For the first time you are able to run your code and store your data in Microsoft datacenters and let Microsoft take on some of the responsibility for keeping your solution running great and able to respond to the changing demands of business. Solutions can either run entirely on the Windows Azure Platform or as a hybrid, with some of the solution running on-premise or elsewhere on the Internet. The three key technologies are Windows Azure, SQL Azure and Windows Azure Platform AppFabric:

Windows Azure
Windows Azure is the cloud services operating system for the Windows Azure Platform. Windows Azure provides developers with on-demand compute and storage to run your code and store your data. Windows Azure supports a consistent development experience through its integration with Visual Studio 2008 and Visual Studio 2010. Windows Azure is an open platform that supports both Microsoft and non-Microsoft languages and technologies. Windows Azure welcomes third-party tools and technologies such as Eclipse, Ruby, PHP, and Python.

SQL Azure
Microsoft SQL Azure delivers the capabilities of Microsoft SQL Server to Windows Azure applications or applications running outside of the Windows Azure Platform. It can store and retrieve structured, semi-structured, and unstructured data with the advantage of high availability through the storage of multiple copies of your data. It enables relational queries, search, and data synchronization with mobile users, remote offices and business partners.

Windows Azure Platform AppFabric
AppFabric provides secure connectivity as a service to help developers bridge cloud, on-premise, and hosted deployments. AppFabric comprises Service Bus and Access Control. From simple eventing scenarios to complex protocol tunneling, AppFabric Service Bus gives developers the flexibility to choose how their applications communicate, addressing the challenges presented by firewalls, NATs, dynamic IP, and disparate identity systems. AppFabric Access Control enables simple, secure authorization for RESTful web services that federate with a variety of identity providers.

There are many articles, videos and screencasts designed to help you get up to speed with the Windows Azure Platform and a great place to start is http://bit.ly/startazure. We also have a Getting Started chapter within this book.


AE ACRONYMS EXPLAINED

If you are new to the Windows Azure Platform then you may need a little help with some of the acronyms and industry terms used in this book.

REST and RESTful - Representational State Transfer. A style of software architecture to enable clients and servers to interact.
WCF - Windows Communication Foundation. A technology shipped initially in .NET Framework 3.0 to allow communication to take place between code running in different locations.
Cloud Computing - running of code and storage of data off-premise. (Also see the 100+ alternative definitions of Cloud Computing, e.g. http://en.wikipedia.org/wiki/Cloud_computing )
Elastic Computing - as more processing power is needed or as more data needs to be stored, elastic computing (in our case the Windows Azure Platform) promises to rapidly respond to those demands and provision additional compute and storage resources.
PaaS - Platform as a Service is one approach to Cloud Computing that favors abstraction and simplicity over flexibility, e.g. the Windows Azure Platform.
IaaS - Infrastructure as a Service is one approach to Cloud Computing that favors flexibility over abstraction and simplicity, e.g. Amazon Web Services.
Codename Dallas - a 4th member of the Windows Azure Platform, currently in CTP. http://www.microsoft.com/WindowsAzure/dallas/
CTP - Community Technology Preview. In simple terms, not quite as solid as a traditional Beta.

CHAPTER 1: GETTING STARTED

5 STEPS TO GETTING STARTED WITH WINDOWS AZURE

By Jason Nappi

Getting started with a new technology can be daunting, but generally once you get going things become familiar and learning accelerates. Therefore, I'd like to focus on providing a few of the basic steps that I recently went through in the hope that it will both answer some of the basic questions and knock down some of the barriers to accelerated learning. The following are some of the primary design considerations for what I think of as a typical business application, and the implications of building those same types of applications in the Azure cloud.

STEP 1: CREATING AN AZURE ACCOUNT.

The first step, as you might imagine, is to set up an Azure account. Since Windows Azure is a cloud service, you'll need to create an account in the cloud and provision a cloud environment. You can create an Azure account at the Windows Azure Developer portal. This is a pretty straightforward registration process that will require you to create a Windows Live ID if you don't already have one, and will require a credit card. At the conclusion of the registration process you should have access to Windows Azure, SQL Azure and AppFabric. At this point you haven't created any cloud services; you've only created an account under which the services you create can be provisioned and deployed.

STEP 2: PROVISIONING A SQL AZURE DATABASE

This step may not be required by everyone, but most of the applications I've built have been database driven. Given that, whether creating a new application or moving an existing one to the cloud, I think it's going to be a fairly common question to ask where the database lives and how you connect to it. The reasonable answer is that if my application is going to be hosted in the cloud, my database needs to be in the cloud too. The Windows Azure Platform provides Windows Azure Storage as well as SQL Azure for storing data.

SQL Azure is most similar to the relational databases of the typical business application, so while Azure Storage may have scalability and cost advantages, SQL Azure provides the more familiar paradigm. Naturally I'm inclined towards SQL Azure to get started. In order to create my cloud database I'll need to return to the Azure account that I set up in Step 1 and navigate to the SQL Azure section of the portal at https://sql.azure.com. To create a SQL Azure server, you'll need to provide a username and password, and the SQL Azure Developer Portal will create a server using a generated unique name similar to crkvq7vdhu.database.windows.net. With the SQL Azure server created, you can now create the database. There is also an additional requirement that you configure firewall rules to allow access. Again, for the sake of simplicity, you can just grant your local machine's IP address access to the SQL Azure server.
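To make the server name, database and credentials from Step 2 concrete, here is a minimal sketch of how they typically combine into an ADO.NET-style connection string. The helper name `build_sql_azure_connection_string`, the database name `mydb` and the login `appuser` are illustrative assumptions, not values from the article; the `user@server` login form and the `Encrypt=True` setting reflect common SQL Azure guidance, so check the string the portal generates for your own server before relying on this.

```python
def build_sql_azure_connection_string(server, database, user, password):
    """Assemble an ADO.NET-style connection string for a SQL Azure database.

    `server` is the generated short server name (for example "crkvq7vdhu"),
    not the full host name. All argument values are supplied by the caller.
    """
    host = f"{server}.database.windows.net"
    # Assumption: the login is written as user@server, which many client
    # libraries needed in order to route the login to the right server.
    login = f"{user}@{server}"
    parts = [
        f"Server=tcp:{host}",
        f"Database={database}",
        f"User ID={login}",
        f"Password={password}",
        "Trusted_Connection=False",  # SQL authentication, not Windows authentication
        "Encrypt=True",              # request an encrypted connection
    ]
    return ";".join(parts) + ";"


# Hypothetical values for illustration only.
example = build_sql_azure_connection_string(
    "crkvq7vdhu", "mydb", "appuser", "<password>"
)
print(example)
```

Keeping the string in one helper makes it easy to swap the target database later, which matters because each connection is scoped to a single database on the server.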

Lastly, you might be wondering, as I did, whether the newly created SQL Azure database is accessible via the familiar SQL Server Management Studio tools. I was able to successfully connect after downloading SQL Server Management Studio 2008 R2.

STEP 3: BUILDING A WEB APPLICATION FOR AZURE

Having provisioned our cloud database and proven that you can connect to it with the familiar SQL Server Management Studio tools, and assuming you've created the tables required by your application, you're ready to begin building your application. In order to do so you'll need to install the Windows Azure SDK and the Windows Azure Tools for Microsoft Visual Studio 1.1. The good news about both of these is that they support Visual Studio 2008 and Visual Studio 2010. Once you fire up Visual Studio you'll notice a new project template for Windows Azure Cloud Service. After choosing the cloud service template you will be prompted to choose from one of the cloud service roles: Web, Worker and WCF Service Roles. Assuming you've chosen ASP.NET Web Role, a solution containing two projects, a cloud services project and the familiar ASP.NET Web project, will be created. The only real difference between a standard ASP.NET web project and the ASP.NET Web Role project is the existence of a WebRole.cs file. WebRole.cs serves as the entry point for Azure. When you hit F5 your Azure application starts up and runs inside the Development Fabric. The Development Fabric simulates the Windows Azure cloud environment, enabling you to run, test and debug Azure applications on the desktop!

STEP 4: PACKAGING THE WEB APPLICATION FOR WINDOWS AZURE

Packaging up the application for publishing to Azure turns out to be fairly simple. From within Visual Studio you can right click on the Cloud Services project and choose Publish from the context menu. This will package the web application into a .cspkg file, and also create the ServiceConfiguration.cscfg file. These two files are all you need to deploy your application to Windows Azure.

STEP 5: DEPLOYING THE WEB APPLICATION TO AZURE.

Now that you've packaged your ASP.NET Web Role, you'll need to return to the Windows Azure account you created in Step 1 and create your Windows Azure service. Under the Windows Azure tab choose New Service, then Hosted Service, and provide a name and description for your new cloud service. Once the service is created there'll be two hosted service locations, staging and production. Under each will be a Deploy button. Choose Deploy under Staging. This will bring up a screen asking for the two files created in Step 4. Provide both files, and deploy. After deploying the package and the configuration you'll be provided with a unique URL for accessing your application. Now you'll also see that you have the ability to Run the service.

The application won't be accessible via the URL until you Run it, so press Run, and wait for it, wait for it, wait for it... It takes a while to provision the Windows Azure infrastructure for your application, but once you get the green light you should be good to go.

These are just a few of the baby steps I've taken to become familiar with Windows Azure. With these steps I've been able to demonstrate that developing for Windows Azure is largely the same development experience that I'm accustomed to. However, one of the more intriguing considerations when building for Windows Azure is the potential use of Windows Azure Storage as a data store instead of the more conventional relational database provided by SQL Azure.

THE BEST TOOLS FOR WORKING WITH THE WINDOWS AZURE PLATFORM

By Sarang Kulkarni

A platform is known by the tooling available around it! Much clichéd, but it still holds true. Windows Azure, though a fairly nascent cloud platform, is aptly supported by some fantastic tooling which makes development fun and a developer's life easy. Let us get the usual suspects out of the way first to make way for some more interesting kids on the block, many of which I cannot do without.

CATEGORY: THE USUAL SUSPECTS

Microsoft Visual Studio 2010
Visual Studio 2010 (VS2010) is a stable development platform for Windows Azure. Though there are very few changes specific to Azure when compared with VS2008, the overall development experience is definitely superior. Windows Azure VMs support .NET Framework 4.0 from OS Version 1.2 and therefore it makes sense to use VS2010 to take advantage of the new features of .NET 4.0 in the cloud. As always, the Express edition is free.

Microsoft SQL Server Management Studio 2008 R2
The R2 release is recommended for working with SQL Azure. The biggest advantage is the comfort of a SQL IDE we have grown up with. I don't think I need to wax poetic about this one; this is bread and butter. Again, the Express edition is free and recommended as it serves most needs. Download it from: http://www.microsoft.com/downloads/details.aspx?familyid=56AD557C-03E6-4369-9C1DE81B33D8026B&displaylang=en

User Accounts and Local Security Policy Control Panel applets
I know there's nothing specific to Azure here, but it comes in very handy to have a user with permissions as laid out at http://msdn.microsoft.com/en-us/library/dd573355.aspx to avoid any surprises related to user rights while running in the fabric.

CATEGORY: WINDOWS AZURE STORAGE

What: Cerebrata - Cloud Storage Studio
Why: Cerebrata Cloud Storage Studio (CSS) is a WPF-based client for managing Azure Storage, as well as hosted applications. CSS started as a commendable effort by a small firm to provide intuitive visual access to Azure Storage, putting the Storage APIs to good use. It now stands as a one-stop solution to manage everything under Azure Storage, as well as a lot of things in the hosted applications.

Figure 1: Cloud Storage Studio - Connect to Azure Account

You can design a table schema in CSS, perform CRUD operations on existing tables, download/upload table contents to/from the disk and filter table contents. Basic querying support is also provided, using the WCF Data Services (formerly ADO.NET Data Services) query syntax. LINQ query support would have been a welcome add-on.

Blob storage is a forte of CSS and all possible operations on blobs and containers are available. You can create containers, configure access policies, list blobs in a container complete with the folder structure, upload/download page/block blobs, rename, copy and move blobs, create and view blob snapshots (very useful), and create a signed URL for a blob. MIME type configuration support is icing on the already nice cake. My only grudge is the very basic breadcrumb while navigating the container structure.

CSS also features a simple yet effective service management UI. The design closely resembles that of the actual Azure developer portal. The same features are offered plus a few more. The regular service management operations are available: connect to hosted services; view, deploy and delete services; swap deployment slots; manage API certificates; and manage affinity groups. A very useful feature we find here is a nifty little checkbox at the bottom of the create service deployment dialog which reads "Automatically run the deployment after creation", a nice touch.

Figure 2: Cloud Storage Studio - Deploy a Service

It costs a totally worthwhile $60 per license. Notable alternatives are:

Cerebrata's own CSS/e https://onlinedemo.cerebrata.com/cerebrata.cloudstorage/default.aspx which is a Silverlight application providing very basic but useful Storage Service administration.
The open source Azure Storage Explorer http://azurestorageexplorer.codeplex.com/
Finally, the far from perfect yet still useful open source alternative, the Azure MMC Snap-in http://code.msdn.microsoft.com/windowsazuremmc. Azure MMC is in its second version, covers almost all the same bases as Cloud Storage Studio and deserves a mention.

Figure 3: Windows Azure MMC

What: LINQPad http://www.linqpad.net/
Why: It would not be an overstatement to call LINQPad by Joseph Albahari the best querying scratchpad available for LINQ. LINQPad can query a varied set of data sources. Of particular interest to this discussion are SQL Azure, WCF Data Services (think codename Dallas) and Windows Azure Table Storage. Yes, Table storage! LINQPad steps in where Cloud Storage Studio stops being adequate: the querying capabilities are superior and the interface more powerful.

Figure 4: LinqPad - Sample Query on the WADPerformanceCounters table

As usual some of the best tools come free and LINQPad surely fits the definition. There is also a pro version available with some bells and whistles like auto-complete, Visual Studio integration etc.

CATEGORY: WINDOWS AZURE DIAGNOSTICS

What: Cerebrata Azure Diagnostics Manager http://www.cerebrata.com/Products/AzureDiagnosticsManager/Default.aspx
Why: Azure diagnostics has taken some time to reach the final form we see it in today. There are few tools which provide the comfort of an Event Viewer or a comprehensive management dashboard for working with the diagnostic data. Azure Diagnostics Manager (in public beta at the time of writing) attempts to achieve just that. The feature set is fairly comprehensive, covering the following:

You can either connect to an Azure storage account to read the diagnostics information, find the deployments from there and connect to the listed deployments, or choose to connect directly to a subscription and get a list of hosted services to monitor.
The Dashboard provides a bird's eye view of all the diagnostic information collected.
One may choose to view Event Viewer, Trace Logs, Infrastructure Logs, Performance Counters, IIS Logs, IIS Failed Request Logs, Crash Dumps and On Demand Transfer.
If you have only deployed a service and are collecting none of these, fret not. Azure Diagnostics Manager also provides access to the diagnostic monitor inside your roles as well as individual role instances through the Remote Diagnostics API. With this you can enable or disable any of the diagnostic information being collected, or alter the verbosity or frequency.

Figure 5: Azure Diagnostics Manager - Performance Counter Graphs

CATEGORY: SQL AZURE

What: SQL Azure Migration Wizard http://sqlazuremw.codeplex.com/
Why: As most of us working with cloud solutions might have already noticed, the largest chunk of the work coming to the System Integrators is the migration of existing applications to the cloud. One of the key aspects of this is database migration, which the SQL Azure Migration Wizard helps simplify. With the SQL Azure Migration Wizard we can analyze scripts for SQL Azure compliance, generate scripts and migrate database schema and data. Migration is supported from SQL Server to SQL Azure, SQL Azure to SQL Server and SQL Azure to SQL Azure. Even in its 3.2.2 version it still has its share of quirks, but it is vastly improved and great for the mundane tasks in DB migration.

CATEGORY: GENERAL DEVELOPMENT

What: Fiddler http://www.fiddler2.com/fiddler2/
Why: Fiddler is a web debugging proxy. It allows us to inspect all incoming and outgoing HTTP(S) traffic on a machine. This is particularly helpful while working with Azure Storage, the Azure Service Management API, the Remote Diagnostics Manager API and anything REST. Looking at the HTTP traffic gives an insight into how the requests and responses are constructed, what responses are received and a host of other information that every web service developer or consumer will find handy.

Figure 6: Fiddler Statistics

The Fiddler scripting engine can be used to filter requests and/or responses in or out, and also to issue preconfigured responses. Fiddler can also target specific processes to filter traffic only from those processes. Fiddler provides an API which can be used in a .NET application to programmatically track network traffic and use almost all of Fiddler's features. This has enabled some nifty Fiddler extensions like Watcher, a passive security audit tool http://websecuritytool.codeplex.com/ , Chad Sowald's Request to Code http://www.chadsowald.com/software/fiddler-extension-request-to-code which gives the code required to issue captured HTTP requests, and the JSON Viewer http://jsonviewer.codeplex.com/ which visualizes JSON objects.

CHAPTER 2: WINDOWS AZURE PLATFORM

ARCHITECTING FOR AZURE BUILDING HIGHLY SCALABLE APPLICATIONS

By Steven Nagy

Two key reasons organisations move to the cloud are to reduce cost and leverage economies of scale. Unfortunately not every type of application is suited to the cloud, and more often than not, those that are suited for the cloud are not architected for scalability. Further, the Windows Azure Platform has a pricing model that, if not considered during your architecture phase, can negate the cost benefits of moving to the cloud to begin with. This article will address the key things to consider when architecting highly scalable applications that are cost-optimised for the Azure platform.

PRINCIPLES OF AZURE ARCHITECTURES

The Windows Azure Platform already provides elasticity, redundancy, and abstractions from the distributed platform on which it is run. This gives us a flying head start when designing systems for the cloud, but there are still key measures we need to take to ensure our application doesn't become its own worst enemy. Here we define five key tenets to keep in mind throughout the design and implementation phases of your project.

PARTITION DATA

Data partitioning is not a new concept [1]. Traditionally it has helped us break up massive databases into smaller, more manageable pieces, and improve query performance by splitting unrelated data into different partitions. In scalable applications it is important for those same reasons, but it also allows us to scale more effectively; imagine serving 500 requests per minute on a single database versus 50 requests per minute on each of 10 databases.

Furthermore, storage is cheap. Consider SQL Azure pricing versus Azure Table Storage [2] for 1 GB of storage: $10 and $0.15 per month respectively. Both are at least 3 times redundant. However, not only is Azure Table Storage cheaper, it has inbuilt partitioning mechanisms that allow you to allocate every single entity (row) of data to a horizontal partition (or shard [3]) based on the partition key you provide. In Table Storage, each partition is a physically different storage node, which means queries and requests can scale extremely efficiently. If you don't have complex relational queries, this is the ideal choice. Denormalising your data can help immensely by removing those relationships and allowing ease of partitioning. This is essentially the premise of the NoSQL movement [4].

[1] http://msdn.microsoft.com/en-us/library/ms190787%28v=SQL.100%29.aspx
[2] http://www.microsoft.com/windowsazure/pricing/
[3] http://en.wikipedia.org/wiki/Shard_%28database_architecture%29
[4] http://en.wikipedia.org/wiki/Nosql

You should also consider data duplication for further performance increases. Consider a search function for customers by age demographic or by city; by having two copies of the data in different partitions, your query and retrieval time is highly efficient. The flip side to this approach is the added complexity of managing multiple copies of data.

Partitioning support in Azure can be summarised as follows:

Table entities are horizontally partitioned on partition key
Blobs are partitioned based on their container
Queues are partitioned on a per-queue basis
SQL Azure supports no partitioning
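The customer search example can be sketched with a toy in-memory table, to show how the choice of partition key decides which slice of data a query touches and why a second copy keyed differently makes the other query cheap. `PartitionedTable`, `age_band` and the sample customers are hypothetical names and data invented for this illustration; this is not the Azure Table Storage API, only a model of the idea.

```python
from collections import defaultdict


class PartitionedTable:
    """Toy in-memory table that groups entities by a partition key."""

    def __init__(self, partition_key_fn):
        # partition_key_fn maps an entity to the partition it belongs in.
        self._partition_key_fn = partition_key_fn
        self._partitions = defaultdict(dict)

    def insert(self, row_key, entity):
        partition_key = self._partition_key_fn(entity)
        self._partitions[partition_key][row_key] = entity

    def query_partition(self, partition_key):
        # Only the named partition is read; other partitions are untouched.
        return list(self._partitions.get(partition_key, {}).values())

    def partition_count(self):
        return len(self._partitions)


def age_band(customer):
    """Bucket a customer into a ten-year age demographic such as '30-39'."""
    decade = (customer["age"] // 10) * 10
    return f"{decade}-{decade + 9}"


# Hypothetical sample data.
customers = [
    {"id": 1, "name": "Alice", "age": 34, "city": "Brisbane"},
    {"id": 2, "name": "Bob", "age": 37, "city": "Sydney"},
    {"id": 3, "name": "Carol", "age": 52, "city": "Brisbane"},
    {"id": 4, "name": "Dan", "age": 29, "city": "Sydney"},
]

# Two copies of the same data, each partitioned for one kind of search.
by_city = PartitionedTable(lambda c: c["city"])
by_age = PartitionedTable(age_band)
for customer in customers:
    by_city.insert(customer["id"], customer)
    by_age.insert(customer["id"], customer)

print([c["name"] for c in by_city.query_partition("Brisbane")])
print([c["name"] for c in by_age.query_partition("30-39")])
```

The cost of the duplication is visible in the loop: every write has to go to both copies, which is the added management complexity noted above.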

Vertical partitioning is not supported by default; however, it makes sense to store smaller amounts of data together when the additional fields are not needed on the majority of requests.

COLOCATION

SQL Azure, Azure Storage, Azure Compute roles, and the AppFabric all have bandwidth costs for data moving in and out of the data centre. It makes sense to keep this in mind when building our applications. Azure already lets us choose our data centres and, more importantly, we can co-locate components of our system via Affinity Groups such that network traversal is minimal and faster. Luckily this is a deployment consideration and not so important for up-front design.

CACHE

A more important consideration is the various opportunities to utilise caching mechanisms. There are many ways that cache can be harnessed to minimise transactions: for end user HTTP requests, for underlying data stores, or for memoization [5] purposes. When almost everything in the platform is accessible via a REST interface, it pays to invest effort into caching. Some cache concepts to consider are:

Client-side timed cache - content that expires after a certain amount of time; until then the client browser serves a local copy instead of requesting the page
Entity Tags [6] (ETags) - allow you to specify a version in an HTTP header field; the server can indicate the version has not changed, in which case no other data is exchanged, otherwise it can return all the data for that request
ASP.NET page-level cache
Distributed Cache [7] - has multiple nodes that either all share the same content (shared everything) or have unique sections of the cache (shared nothing); shared everything distributed caches work well in Azure because of the throwaway nature of commodity hardware and ease of scale
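The ETag exchange in the list above can be illustrated with a small server-side sketch. The function names `make_etag` and `handle_get`, and the choice to derive the tag from a truncated SHA-256 hash of the body, are assumptions made for this example; a real service might use a version number or timestamp instead. The `If-None-Match` request header and the 304 status code come from the HTTP specification.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a quoted entity tag from the content (one possible scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def handle_get(body: bytes, request_headers: dict):
    """Return (status, headers, payload) for a GET of a resource.

    If the client already holds the current version, answer 304 Not Modified
    with an empty payload so no content crosses the wire.
    """
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


# First request: the client has nothing cached, so the full body is returned.
status, headers, payload = handle_get(b"hello", {})

# Second request: the client sends back the tag it was given.
status_again, _, payload_again = handle_get(
    b"hello", {"If-None-Match": headers["ETag"]}
)

# The resource changed on the server, so the old tag no longer matches.
status_changed, _, _ = handle_get(b"changed", {"If-None-Match": headers["ETag"]})

print(status, status_again, status_changed)
```

Since bandwidth and transactions are both billed, the saving from an empty 304 response applies on every repeat request for unchanged content.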

[5] http://en.wikipedia.org/wiki/Memoization
[6] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html
[7] http://msdn.microsoft.com/en-us/magazine/dd942840.aspx

STATE

State has often been cast as the enemy of concurrent programming, and the same applies at higher levels of abstraction, such as multiple compute instances. Mutable state requires locking and tracking in concurrent environments, which adds overhead and complexity to applications. Reducing, or even removing, state is therefore a goal worth pursuing.

Sometimes state is specific to a single user, such as session state. Load balancers in the Azure data centres are round-robin, so as soon as you have more than one web front end you can no longer store session state in process (the default); if session state is critical to your application, look to move it to SQL Azure or Table Storage instead. However, session state is typically abused and is often not actually required in the situations in which it is used. As an alternative to sessions, consider claims-based security, in which every page request is accompanied by a set of claims. The AppFabric Access Control Service can assist with this.

DISTRIBUTE WORKLOADS EFFECTIVELY
Typically when multiple sources need to access a resource there is a level of contention. Locks and leases need to be taken, and other threads are blocked until the contention is resolved. As with state, this problem exists in all forms of concurrent programming, and is just as important in multi-instance work sharing scenarios. Worker roles need to pick up items for processing, but when there are multiple instances of the same worker role, how do we ensure that each instance does not pick up the same work item?

The Asynchronous Work Queue Pattern is one such solution. By providing a robust, redundant queuing mechanism that guarantees unique distribution of work items, the workers are ignorant of leases and locks and can focus on the job of processing work items. Such a queue will be reusable for many different work types, and the Windows Azure Storage Queue service is an ideal candidate.
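As a rough illustration of how such a queue keeps two workers from taking the same item, here is a toy in-memory model in Python. It assumes a "visibility timeout" style of hand-out: a retrieved message is hidden from other workers for a period and must be deleted by the worker once processed. The class and its methods are hypothetical and are not the Windows Azure Storage Queue API; time is passed in explicitly to keep the sketch deterministic.

```python
import itertools


class WorkQueue:
    """In-memory queue with a visibility timeout (toy model)."""

    def __init__(self, visibility_timeout=30.0):
        self._timeout = visibility_timeout
        self._ids = itertools.count(1)
        self._messages = {}  # id -> [body, time at which it becomes visible]

    def put(self, body, now=0.0):
        msg_id = next(self._ids)
        self._messages[msg_id] = [body, now]
        return msg_id

    def get(self, now):
        """Hand out one visible message and hide it from other workers."""
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + self._timeout
                return msg_id, entry[0]
        return None

    def delete(self, msg_id):
        """Called by the worker once processing has succeeded."""
        self._messages.pop(msg_id, None)


queue = WorkQueue(visibility_timeout=30.0)
queue.put("resize image 1")
queue.put("resize image 2")

first = queue.get(now=0.0)    # worker A
second = queue.get(now=1.0)   # worker B receives a different item
nothing = queue.get(now=2.0)  # both items are hidden for now
queue.delete(first[0])        # worker A finishes its item
retry = queue.get(now=40.0)   # worker B never deleted; its item reappears
```

One consequence of this design is that an item can be handed out again if a worker fails or runs past the timeout, so it is prudent to make the processing of each work item safe to repeat.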
There are other messaging architectures that allow us to decouple our components. AppFabric allows a NetEventRelayBinding for Publish/Subscribe scenarios, for example.

MAXIMISE RESOURCES
One could argue that if your CPU is not at 100% it is being underutilised. In Azure you pay for the core regardless of usage, so it makes sense to get the most bang for your buck. When using worker roles, multi-threaded architectures are often forgotten. Since adding another instance means an additional hourly cost, first ensure you are getting the most out of your current instances. If your worker (or web role for that matter) has lots of IO work, it makes sense to use multiple threads.

Auto-scaling resources is also worth investigating. Typically an IT department will maintain enough servers to cope with its peak periods; consider instead starting at trough capacity, and use auto-scaling functionality to add instances dynamically. When load starts to taper off, start scaling down, cutting costs as you do.
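A scaling rule of the kind described above can be as simple as the following Python sketch, which derives a target instance count from the length of a work backlog. The capacity per instance, the bounds and the sample backlog figures are all assumptions chosen for illustration; actually applying the target would need a call to the platform's management interface, which is not shown.

```python
import math


def target_instances(queue_length, items_per_instance, min_instances=1,
                     max_instances=10):
    """Instances needed to work through the backlog, within fixed bounds."""
    needed = math.ceil(queue_length / items_per_instance)
    return max(min_instances, min(max_instances, needed))


# Hypothetical backlog sampled over a day: quiet, peak, tapering off.
backlog = [0, 40, 400, 2500, 300, 10]
plan = [target_instances(n, items_per_instance=100) for n in backlog]
```

The lower bound represents trough capacity and the upper bound caps the hourly spend, so the plan rises with the peak and falls back as load tapers off.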


Currently you can utilise content delivery network (CDN) services to push blobs out to localised edges. This will help improve latency for your customers. Also consider what could qualify for blob storage; essentially anything static is a contender:
- PDF and Word documents
- Videos
- Website images
- Website CSS and JavaScript libraries
- Any static HTML website pages
- Silverlight files (XAP)

Blob storage currently allows blobs to be stored in the root container. This feature was specifically included so that Silverlight applications running from blob storage could place a cross-domain policy file at the root of the URL namespace (a requirement for cross-domain policy files).

SUMMARY
While not exhaustive, this article has given you a brief overview of some key principles to keep in mind when architecting applications to run on the Windows Azure Platform. By following these guidelines you should be able to achieve the core objectives of scalability and cost recovery in the cloud.


THE WINDOWS AZURE PLATFORM AND COST-ORIENTED ARCHITECTURE
By Marcus Tillett

COST IS IMPORTANT
Cost-oriented development is nothing new. A low cost approach to building an application or product is desirable, but the methodology used to achieve this is not always very sophisticated. When considering a cloud platform such as Azure, the cost implications of the chosen architecture can be significant and require a more sophisticated approach. While a traditional on-premise or hosted architecture may not consider cost as a significant factor, cost is an area that receives significantly more focus for Azure. There is a range of costs that need to be considered; these costs need to be considered in the context of Azure and of the end-to-end development and application lifecycle management processes.

WHAT COSTS TO CONSIDER
The development process can be a significant cost consideration for Azure. There is a continuum of development strategies for Azure: from, at one extreme, using the Azure environment for development to, at the other extreme, developing without any reference to Azure. There are cost implications and significant other pros and cons across this continuum. As an example, consider the use of software factories. With a software factory that uses a strict assembly process, the cost of using the production platform may be prohibitive due to the expense of both the platform and the training required. These concerns would drive a cost-oriented architecture where all Azure-specific components are abstracted from the developer or potentially replaced with non-Azure components. While this may be an extreme example, it does highlight one of several areas to be considered.

Summary
- Cost is much more of an architectural consideration for Azure than for a traditional on-premise or hosted solution.
- Cost implications of the chosen architecture can be significant.
- Costs should be considered in the context of the end-to-end development and application lifecycle management processes.
- Model costs for the chosen architecture but, most importantly, test the model.

Another significant topic is the methodology applied to the migration of an existing application, or the setting up of data required by a new application. Migration and set-up need to include both the application and the data. The time, processes and procedures needed to transfer large volumes of data or complex data, in particular, may be a significant undertaking. With the potential complexity of managing changes to a live data source, the total business cost of the chosen approach can be a critical factor.

The cost implication of the platform itself is, perhaps, the obvious area to necessitate a cost-oriented architecture. It is natural to be drawn to, for instance, the dramatic price difference for data storage between SQL Azure and Windows Azure storage. While this may be critical to some applications, it is better on balance to construct a solid architecture first, as this will provide a better long-term approach than initially focusing on cost. This should be supported by modelling the costs of all the components


of the application. However, it is even more important to test this model for the most cost-critical aspects of the application, thereby providing an understanding of how the application design and the charging mechanisms of Azure impact the cost model. With this information the architecture can be reviewed for significant cost savings. For any aspects that are cost critical, monitoring should be included in the final application and used to tune the system, while ensuring that the evolution of Azure and the application are analysed for significant cost implications. Indeed, monitoring the whole system as a means to verify costs and SLA is another architectural consideration.

As a way to augment the full cost modelling process, there are some scenarios where the cost of the platform suggests a cost-oriented architecture. One of these is multi-tenanting of an application where there are high tenant numbers. A basic on-premise or hosted server model with a pair of servers can enable the creation of a separate IIS web site and SQL Server database for each tenant. This model supports 10s or perhaps 100s of tenants for nearly the same cost as a single tenant. Translating the same architecture to Azure might give a Windows Azure Web Role and a 1GB SQL Azure database for each tenant. This would equate to an approximate monthly cost of US $100 per tenant, but the cost of this Azure architecture scales linearly with tenant numbers. This is not to state that Azure is not suitable for multi-tenanted applications, but that where cost is a critical factor for the application a different architectural approach may be required.

CONCLUSION
Whether the considerations described here could be termed cost-driven [8] or cost-oriented architecture [9], [10], the terminology is less important than the realisation that cost is much more of an architectural consideration for Azure than for a traditional on-premise or hosted solution.
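The multi-tenancy comparison made earlier in this article lends itself to a very small cost model. The Python sketch below contrasts a per-tenant design whose cost grows linearly with a shared design whose cost is roughly flat. The US $100 figure is the approximate per-tenant cost quoted in the text; the shared platform cost is a purely hypothetical number chosen for illustration, and a real model would need current prices and measured usage.

```python
AZURE_COST_PER_TENANT = 100.0  # US$ per month, approximate figure quoted above
SHARED_PLATFORM_COST = 1500.0  # US$ per month, hypothetical pair of hosted servers


def azure_per_tenant_cost(tenants):
    """One web role and one 1GB database per tenant: cost grows linearly."""
    return tenants * AZURE_COST_PER_TENANT


def shared_cost(tenants):
    """Shared servers: roughly flat regardless of tenant count (within capacity)."""
    return SHARED_PLATFORM_COST if tenants > 0 else 0.0


def break_even_tenants():
    """Smallest tenant count at which the per-tenant design costs more."""
    tenants = 1
    while azure_per_tenant_cost(tenants) <= shared_cost(tenants):
        tenants += 1
    return tenants


for n in (1, 10, 100):
    print(n, azure_per_tenant_cost(n), shared_cost(n))
```

Even a model this crude shows where the architecture stops being economical, which is the point at which a different design (for example, tenants sharing roles and databases) deserves consideration.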

[8] Lessons Learned: Building Multi-Tenant Applications with the Windows Azure Platform, http://microsoftpdc.com/Sessions/SVC33
[9] Thinking of... Delivering Solutions on the Windows Azure Platform?, http://www.amazon.co.uk/ThinkingDelivering-Solutions-Platform-Questions/dp/0956155634/
[10] Windows Azure Platform for Enterprises, http://msdn.microsoft.com/en-us/magazine/ee309870.aspx


DE-RISKING YOUR FIRST WINDOWS AZURE PROJECT
By Simon Munro

Developer enthusiasm for building solutions based on Azure is not always shared by business. While it is great (and perhaps obvious to us) that the cloud is the way of the future, some individuals, organizations and vendors are ready for the change while others are not. Not all vendors have technologies for the cloud, and many businesses, products, industries and jobs will go as the cloud wave washes them out to sea. Vendors are scrambling for attention and pushing their biased, marketing-oriented opinions through the biggest dinosaurs of all, the print media, which culturally could not even cope with the changes brought on by the Internet.

Most anti-cloud and vendor-bashing opinion plays on fear and its business cousin, risk, where the urge is to maintain the status quo in our (currently) risk-averse environment. It is unsurprising, then, that the people we need to make decisions about cloud computing in our own organizations are confused, wary and reluctant to make a commitment to our latest idea of running our solution on Azure.

The term "the cloud" has become synonymous with the web and is indistinct from the cloud computing platforms that we are interested in. The unfortunate side effect is that the behaviour of Google, Facebook, Apple and other consumer-facing web properties that willy-nilly change terms of service and sell personal data for profit casts a shadow over business-oriented cloud computing services.

While the dust may settle at some point in future, if we want to build a solution on Azure any time soon, we will have to take responsibility for helping business understand the issues in order to gain their support.
While we may prefer to deal only with technical issues, the current reality is that in most environments we have to proactively discuss the perceived risks and demonstrate that we, as well as Microsoft and the Windows Azure platform, are actively managing and reducing risk.

POPULAR RISKS
Risks to data are by far the most publicised, because once data is in databases that are outside of an organization's locked-down data centre, a degree of control and authority over the data is lost. Unlike students who go and live in co-ed dorms, data does not get drunk and put pictures of itself up on Facebook when it leaves home, but the suspicion still remains that off-premise data is a high risk. While the risk to data may increase, the actual risk, in most cases, is greatly exaggerated and manageable.

Process-related risks are also well known, centred on the involvement of other parties in the operational aspects of the solution. No longer can business dictate service levels, or even have the confidence in an external supplier of services that they may have had in their own internal IT. As with data, there are real issues here that have fairly complex contractual ramifications as customers attempt to reduce vendor lock-in, guarantee service levels and maintain operational, security, performance and other standards.

COVERT RISKS
While mainstream CIO information sources popularise some risks through extensive coverage, there are many risks that are just as real but less well known, often due to their more technical nature.


The most obvious is the lack of skills and experience in creating secure, reliable and performant cloud computing solutions. This is also related to the problem of development engineering costs that could be higher than simply throwing hardware at performance bottlenecks.

Even Microsoft, as our trusted provider of platforms and tools, still has risks embedded within Azure. The lack of on-premise alternatives to cloud technologies such as Azure tables and queues makes the commitment to the platform quite high (a kind of vendor lock-in), and the tooling is still immature and unable to easily support accepted engineering practices such as continuous integration (see "Using a CI build to achieve an automated deployment of your latest build" by Grace Mollison).

NON-TECHNICAL TACTICS FOR REDUCING RISK
While ultimate responsibility for managing risk falls to project managers and other people within the organization, the identification of risk still remains the responsibility of everybody on the team. By downloading this book you have more knowledge of cloud computing than many of your co-workers, so before getting into the technical aspects, you will need to shoulder additional responsibility and deal with some aspects of reducing risk that do not involve code.

CHOOSE THE CORRECT APPLICATION
Choose something simple that is well suited to cloud computing, such as an application that is public facing and may have demand peaks. Build on those successes before tackling applications that contain sensitive data, integrate with a lot of other systems, are a migration of an existing legacy system or contain a lot of traditional database storage and reporting.

ENGAGE EARLY
Even if your project is a low-profile skunk works development, you need to engage with legal, compliance, operations, finance, audit and other parts of the business sooner than usual.
Normally we would not worry about throwing up a new website onto our existing data centre, but if you surprise people with a rogue cloud computing application it may get shot down.

UNDERSTAND THE PRICING AND OPERATIONAL MODEL
As much as it may look simple on the surface, digging deeper into the pricing, billing, SLAs and related aspects of cloud computing platforms can become complicated, with broad-reaching impacts on legal positions, compliance and interdepartmental feuds. You have to at least put the Azure prices in a spreadsheet with your estimated requirements and put an annotated printout of the SLA in your project sponsor's hands.

UNDERSTAND THE IMPLICATIONS
While it may be unnecessary to do a full threat model, you need to understand the possible financial, reputational and other risks if your application is compromised or the data gets lost.


Understanding the effects of loss should influence your approach to what data is stored on the cloud, for how long and whether it is moved to on-premise storage.

FAMILIARISE YOURSELF WITH ON-PREMISE RISKS
Because cloud computing is seen to have security risks, the focus on security often means that the solution is more secure than its on-premise counterparts. Whenever defending the risks of cloud computing, make sure that you compare them to the everyday risks of the existing on-premise platform. Not all solutions, networks and other infrastructure can actually deliver the availability and security that they promise.

UNDERSTAND THE APPETITE FOR RISK
Culturally, startups can absorb cloud computing risks as part of their overall risk exposure, compared with more risk-averse organizations such as banks that are, at least this year, less likely to absorb additional risk. More mature organizations have processes and committees for managing risks and, although it may ultimately be the project sponsor's responsibility, you need to get a feel for the ability of the organization to take on risk before you pitch your big idea.

TECHNICAL TACTICS FOR REDUCING RISK

HOW EXTREME?
Microsoft has made it quite simple to take a good ol' ASP.NET web application with an underlying SQL database and throw it up onto the Azure cloud with minimal changes. On the other hand, building a well-architected solution that has been optimised for a cloud computing environment is more difficult, involved and risky. If your system is being built within a risk-averse environment and does not need to be built for the cloud, forgo Azure storage, worker processes, federated identity management and other cloud-specific technologies and build a simple solution with web roles and a SQL database.
Azure will support you well whichever approach you choose, but you need to figure out how much of the fancy new stuff you really need and make those decisions early.

DEFINE THE APPROACH TO DATA
When it comes to cloud computing risks, data is the most sensitive and active topic, and it needs to be addressed early on in the solution design. Fortunately SQL Azure addresses many of the concerns and risks around the NoSQL-like Azure tables by providing a familiar database platform if such familiarity is required, but ultimately Azure storage, caching and other technologies need to be considered in any good Azure architecture.

Whatever the bias for storage in the Azure cloud, there is still the issue that the data is in the cloud, and this needs to be dealt with in your architecture. There may be a requirement to move or copy data from Azure to an on-premise database for reporting, integration with other systems or even just the feeling that the data is safer.

MANAGE THE ENGINEERING COST


Unless you have built a reasonably sized application on Azure and deployed it in a live environment, there are going to be unforeseen technical challenges that will present themselves. By reading this book you are clearly on the right track and trying to learn from the experiences of others, but you need to do a lot more than just read or learn on the job. You need to install the tools, write code, deploy, put it under load, scale up, scale down, debug, diagnose and try out a lot of unfamiliar patterns and technologies just to reduce the impact of unforeseen quirks.

IMPLEMENT WITH GOOD ENGINEERING PRACTICES
The future of your first Azure application is fairly unsure: cast your mind out two years and you cannot be sure whether your architectural choices were correct, technical components have been added or abandoned, regulations have changed or the attitudes of your organization towards cloud computing have altered. The concerns raised by the software craftsmanship movement, such as maintainability, testability and extensibility, are amplified in such an environment, which is years from settling down. The Azure combination of a well-established platform in the .NET ecosystem with some new technologies, approaches and thinking thrown in means that we have both the need and the frameworks to craft solutions properly to reduce the risk that we are exposed to. Testability, inversion of control, loose coupling and other software craftsmanship techniques are well supported, understood and debated on the .NET platform and are therefore (reasonably) portable onto Azure. You need to hone these skills, as single-layered, monolithic architectures that seem easy at first and are encouraged by Microsoft marketing and tooling will result in an approach with high and unnecessary risk in an already risky space.
DEVELOPER RESPONSIBILITY
While technologists may be excited at the technical opportunities of cloud computing, business and other decision makers are probably more wary of the cloud than of any other (recent) computing technology shift. They are reading conflicting messages from vendor marketers and self-proclaimed cloud experts while their own staff are both protecting existing jobs and whispering discord in the passageways. So while risk management and selling of architectures may not be amongst the most exercised developer skills, cloud computing requires that we take it to the business and take some responsibility for allaying fears.


TRIALS & TRIBULATIONS OF WORKING WITH AZURE WHEN THERE'S MORE THAN ONE OF YOU
By Grace Mollison

I had enormous fun working on an Azure project, See the Difference, that took 7 weeks from start of development to handing over to the client. The technology stack used was: Windows Azure hosting, Windows Azure Storage, SQL Azure, ASP.NET MVC, N2CMS, Spark View Engine, Castle Windsor, xVal and PostSharp.

There was one bugbear, in that the Azure development experience is NOT designed for a team of developers, and I needed to get that sorted out. So where did I start? With a list, of course. Here were the big ticket items:
- The ability to set up three environments: Development, Testing and UAT. Testing and UAT to be accessible by all members of the team
- Shared access to the hosted environment
- Automated deployments to the cloud as part of a CI build. After all, no self-respecting development team doesn't have a continuous integration build, do they?

DEVELOPMENT ENVIRONMENT
For the development environment we stuck to Visual Studio 2008 SP1. Visual Studio 2010 was in beta 2/RC when we undertook the development, but with all the potential unknowns with Azure that was a step too far. The Azure developer tools were installed on each developer workstation and the Azure SDK on the build server. There was an upgrade to the Azure SDK during the development cycle which the development team said was needed, which meant manually updating the various machines that constituted the environment (alas, no WSUS). Fortunately this only happened once during the development cycle. In addition to Visual Studio, we supplemented the development environments with a few extra tools that provided a more complete development experience.

TEST ENVIRONMENT
The Test environment proved to be more challenging. The most pragmatic way to sort it out was to provision another development workstation running the development fabric. But (yes, I know there's always a but) the development fabric runs against the local loopback address. To get round this, an SSH tunnel had to be set up between the target machine and the client machines that needed to access it. Alas, this proved to be slightly less than user friendly, and the fact that the random allocation of ports for the local storage fabric had to be resolved after each new deployment made it basically unworkable. The differences between the development fabric and the Azure fabric were also impacting the team deliverables, as we ended up seeing differences in behavior or could only test certain functionality in the staging environment. We resorted to using Azure Staging as our Test environment.


I was anticipating an easy ride from here on but... yes, it's another of those buts.

CERTIFICATES
The team members needed either to use their own self-signed certificates or to use a certificate I generated, which is then uploaded onto Azure. As the team was small and fluid, the decision went with using one I generated. This turned out to be a good call, as we did have problems with certificate connections apparently timing out after some time for some team members for no obvious reason. Because there was only one certificate to worry about, it was relatively painless to resolve the problems around its use. It is bad practice to share certificates in this way, but pragmatism was the order of the day. For a larger team with a longer development cycle I would advocate each developer using a personal certificate, which can then be easily revoked.

One thing we quickly learnt was that in the early days of development, suspending, then deleting, was the safest approach to deploying a new package. The small team meant it was easy to communicate the change of URL this caused.

WHEN THINGS GO WRONG
It's a fairly nerve-racking experience when things go wrong, as often you can do nothing but wait for Azure to barf and throw a Dr Watson, and there's no real feedback when Azure tries to spin up the roles. Alas, as soon as we got to UAT we had to give up our staging environment and minimise changes to the Staging URL, as both the client and a third party needed to know the URL. The loss of this environment for system testing meant we were forced to press my personal Azure account into service as the Staging environment.

We did get the automated deployment in place, but it's a tale too long to describe in this article.

SUMMARY
The Windows Azure Platform may not be quite ready for team development out of the box, but once you understand what needs to be addressed the barriers to team development are easily overcome.
You can, with a small amount of work up front, treat development for the Windows Azure Platform as you would any other application developed using your familiar team development tools.


USING A CONTINUOUS INTEGRATION BUILD TO ACHIEVE AN AUTOMATED DEPLOYMENT OF YOUR LATEST BUILD
By Grace Mollison

This article assumes familiarity with Team Foundation Build and MSBuild concepts such as tasks and properties.

Setting up a Continuous Integration (CI) build to automatically push a successfully built package directly to Azure cannot be achieved straight out of the box but requires some additional work. This article outlines an approach taken whilst delivering the See the Difference project using the Windows Azure Platform.

GETTING THE RIGHT BITS
The first thing that was done was to collate and configure the components that would be needed to allow the build server to access the target Windows Azure portal via a command line. Doing this requires the Azure Service Management API. Using the API requires an X.509 certificate. I created a self-signed one using the makecert tool, which is part of the Windows SDK. An example of how to do this is shown below:

"c:\Program Files\Microsoft SDKs\Windows\v6.0A\bin\makecert" -r -pe -a sha1 -n "CN=Windows Azure Authentication Certificate" -ss My -len 2048 -sp "Microsoft Enhanced RSA and AES Cryptographic Provider" -sy 24 MySelfSignedCert.cer

The blog post "Creating and using Self Signed Certificates for use with Azure Service Management API" explains in detail how to configure the certificate on the target Azure portal and on the machine that needs to communicate with the portal. I downloaded the Windows Azure Service Management PowerShell Cmdlets and also the Windows Azure Service Management API Tool, which are both handy for remotely accessing the Azure portal via the Service Management API. At this stage I had no idea which one I would be using. I tried them both as part of a build and found that I preferred using the Service Management API tool, csmanage (despite being a big fan of PowerShell). The blog post referred to above illustrates the use of the X.509 certificate, the API and PowerShell to deploy to the Azure staging environment.

PACKAGING FOR DEPLOYMENT
Next I looked at packaging the application ready for deployment. There are two key things when packaging the application from the command line:
1. Obtain the role types and names, as these will be needed to construct the package
2. Make sure the location of the service definition file is known

The ServiceDefinition.csdef file contains the role types and names needed to construct the package using the Windows Azure command line tool cspack. Below is a snippet from a ServiceDefinition.csdef file illustrating a simple example with one web role. The number of instances does not matter to cspack:


If cspack is not run from the correct place the package will not be constructed correctly, which is why the location of the ServiceDefinition.csdef file is so important.

DEPLOYING
At this stage I was able to package the application and deploy to the Azure portal via MSBuild. We had concerns with this approach, in that problems with the actual package could affect the deployment. In particular we were concerned about what to do after handover to the client, when a little more caution would be called for. A change of plan was decided upon. The new plan was to push the package to blob storage, and then the client would be able to carry out the deployment at their convenience.

To push the package to blob storage, a C# console application I called LoadBlob was written that could be called from the MSBuild script. This application pushed the package to a pre-determined container. It was decided that storing the configuration (.cscfg) file in blob storage was also a good idea, as it would reduce the risk of non-production configuration settings being used. During testing I was unable to get the Service Management API to use the stored configuration file. It was only able to use one stored on the local system, but as the end-to-end deployment process we were implementing actually required a pause for breath before the push to Azure Staging or Production, this issue did not affect the implementation of the CI build process.

Finally, after testing all the constituent parts, they were incorporated as part of the CI build. Below is a snippet from a TFSBuild.proj file where I overrode the target AfterDropBuild. The AfterDropBuild target is called after dropping the built binaries, and I used it to insert some commands to allow the build to use cspack (equivalent to zipping the DLLs and configuration files) to create the cloud service package, which is then pushed up to blob storage ready for deploying to Staging or Production.
c:\Program Files\Windows Azure SDK\v1.0\bin\cspack.exe
"$(PathToAzureTools)" SeeTheDifference.Cloud.csx\ServiceDefinition.csdef /role:SeeTheDifference.Web;seeTheDifference.Cloud.csx\roles\SeeTheDifference.Web\approot;SeeTheDifference.Web.dll
c:\TOOLS\AzureDeployment
$(LoadblobPath)\Loadblob.exe
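The /role: argument in the snippet above follows a name;directory;entry-point DLL pattern, and the role name comes from the service definition file. As a rough sketch of how a build script might derive that argument rather than hard-code it, the Python below reads the role names from a service definition and assembles the argument. The sample definition is hypothetical, and the assumption that the entry-point DLL is named after the role simply mirrors the snippet above; neither is taken from the actual project files.

```python
import xml.etree.ElementTree as ET

# Hypothetical service definition with a single web role.
SAMPLE_CSDEF = """<?xml version="1.0"?>
<ServiceDefinition name="SeeTheDifference.Cloud"
    xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition">
  <WebRole name="SeeTheDifference.Web" />
</ServiceDefinition>"""


def find_roles(csdef_text):
    """Return (role_type, role_name) pairs declared in the definition."""
    root = ET.fromstring(csdef_text)
    roles = []
    for element in root:
        tag = element.tag.split("}")[-1]  # drop the XML namespace prefix
        if tag in ("WebRole", "WorkerRole"):
            roles.append((tag, element.get("name")))
    return roles


def role_argument(csx_dir, role_name):
    """Build a /role: argument in the name;directory;entry-point DLL form."""
    approot = f"{csx_dir}\\roles\\{role_name}\\approot"
    return f"/role:{role_name};{approot};{role_name}.dll"


roles = find_roles(SAMPLE_CSDEF)
args = [role_argument("SeeTheDifference.Cloud.csx", name) for _, name in roles]
```

Deriving the argument from the definition file keeps the build script in step with the project if roles are added or renamed.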


The screenshots below show the uploaded cspkg in blob storage:

The deployment could then be completed by using a user friendly tool like Cerebrata Cloud Storage Studio.


USING JAVA WITH THE WINDOWS AZURE PLATFORM
By Rob Blackwell

With a name like Windows Azure, you could be forgiven for thinking that Microsoft's cloud computing offering is a Microsoft-only technology. In fact it has a lot to offer Java developers through its use of open standards and RESTful APIs.

ACCESSING WINDOWS AZURE STORAGE FROM JAVA
WindowsAzure4J is an open source library that can be used to access Windows Azure Storage from Java applications, running on Windows Azure or elsewhere. Download the JAR file from http://www.windowsazure4j.org/. You'll also need to grab some other dependencies:
- commons-collections-3.2.1.jar
- commons-logging-1.1.1.jar
- dom4j-1.6.1.jar
- httpclient-4.0-beta2.jar
- httpcore-4.0.jar
- httpcore-nio-4.0.jar
- httpmime-4.0-beta2.jar
- jaxen-1.1.1.jar
- log4j-1.2.9.jar

To get started, you'll need an account name and account key from the Windows Azure portal. Paste these into the sample code provided with WindowsAzure4J to use Blobs, Queues or Tables.

If you are an Eclipse user, you can also install the Windows Azure Tools for Eclipse, http://www.windowsazure4e.org/



The Windows Azure Storage Explorer running in Eclipse.

RUNNING JAVA CODE ON WINDOWS AZURE

If you want to host a Java application in Windows Azure, there are a number of considerations. The first thing to note is that even if your Java application is a web application, you probably won't want to use an Azure web role. The principal difference between web roles and worker roles is whether Internet Information Services (IIS) is included. Most Java developers will want to use a Java-specific web server or framework, so it's usually best to go with a worker role and include your choice of web server within your deployment package.

You'll also need to bootstrap Java from a small .NET program that will essentially invoke the Java runtime through a Process.Start call.

Both web roles and worker roles are provisioned behind a load balancer, so either is suitable for hosting web applications. In a worker role you just have to do some additional plumbing to connect your web server to the appropriate load-balanced input endpoint. So, for example, the public-facing port 80 of yourapp.cloudapp.net might get mapped to, say, port 5100 in your worker role. The following code allows you to determine this port at runtime:

    RoleEnvironment.CurrentRoleInstance.InstanceEndpoints["Http"].IPEndpoint.Port
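The bootstrapping step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name JavaBootstrapper, the folder layout (a JRE shipped inside the package), the class path and the main class Start are hypothetical placeholders, and the port would come from the RoleEnvironment expression shown above rather than being hard-coded.

```csharp
using System;
using System.Diagnostics;
using System.IO;

// Hypothetical helper: prepares and starts a Java process from .NET code.
public static class JavaBootstrapper
{
    // Builds the launch settings for a JRE that ships inside the deployment package.
    // jreHome, classPath and mainClass are placeholders chosen for this sketch.
    public static ProcessStartInfo BuildStartInfo(string jreHome, string classPath, string mainClass, int port)
    {
        return new ProcessStartInfo
        {
            FileName = Path.Combine(jreHome, "bin", "java.exe"),
            // The port is passed as the first program argument so that the Java
            // web server can listen where the load balancer sends traffic.
            Arguments = string.Format("-cp \"{0}\" {1} {2}", classPath, mainClass, port),
            UseShellExecute = false
        };
    }

    // Starts the Java runtime; the caller decides whether to wait for it to exit.
    public static Process Launch(ProcessStartInfo startInfo)
    {
        return Process.Start(startInfo);
    }
}
```

In a worker role, BuildStartInfo would typically be called from the role's start-up code with the port read from the input endpoint, and the returned process kept alive for as long as the role runs.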


Fortunately both the Tomcat Solution Accelerator and AzureRunMe handle all of these technicalities for you.

The Tomcat Solution Accelerator is a good choice if you have a traditional Java-based web application. It supports Java Servlet and JavaServer Pages applications, possibly packaged as a WAR file. It can be downloaded from http://code.msdn.microsoft.com/winazuretomcat. The accelerator walks you through the process of creating an Azure cloud services package file that contains your application as well as the Tomcat server and Java runtime. It automatically handles the necessary configuration. Just upload the resulting cspkg file to Windows Azure, wait for it to deploy, then bring up your web browser and browse to http://yourapp.cloudapp.net

AZURERUNME

AzureRunMe (http://azurerunme.codeplex.com/) doesn't assume any particular web server or framework. In fact you could just run a straightforward command line application with no visible user interface. That said, I've used it successfully with both Restlet (http://www.restlet.org/) and Jetty (http://jetty.codehaus.org/jetty/).

Imagine that you were going to run your application from a USB drive and that you weren't allowed to install any software onto the machine: you'd have to include the Java Runtime Environment (JRE), all the library JAR files and any data, all in subdirectories of the USB stick. You'd probably create a .BAT file at the top level to run everything, like this:

    cd MyApp
    ..\jre\bin\java -cp MyApp.jar;lib\* Start %1

AzureRunMe takes a similar approach: put all these files together in a single ZIP file and upload it to blob storage. Download the AzureRunMe cspkg file and use this to bootstrap your Java code. Notice that the batch file takes a parameter, %1. This is the port that you should use if you want to bring up a web server; the load balancer will direct all HTTP traffic to your application on this port.

AzureRunMe comes with a trace listener that uses the Service Bus to relay standard output and any log4j messages back to a command window on your desktop machine. It makes it easy to see trace messages, watch your application's progress and see any exception messages.



AzureRunMe Trace Listener showing log messages relayed via the AppFabric Service Bus.

For more information about Interoperability on the Microsoft platform see http://www.interoperabilitybridges.com/


CHAPTER 3: WINDOWS AZURE

AUTO-SCALING WINDOWS AZURE COMPUTE INSTANCES

By Steven Nagy

INTRODUCTION

There are many reasons applications need to scale. Some applications have on/off periods of batch processing (for example overnight render farms), some have predictable peak loads (for example share market applications peak during open and close of the market) and some might have unpredictable peak periods (for example your website gets linked by Slashdot).

In the case of predictable peak loads we can easily log in to the Windows Azure portal and adjust our configuration file to increase the number of instances of our web and worker roles. However, when application load peaks unexpectedly, we want our applications to respond immediately. For applications with global reach, this might be when we least expect it. Without appropriate monitoring techniques we may not even know the extent to which we are failing to serve requests.

On the flip side, we are paying for every CPU core hour we use. Thus we want to be able to scale down instances that are underutilised. We need to know how to auto-scale; our applications need to become smart.

A BASIC APPROACH

There are a number of jigsaw pieces that need to fit together to build the auto-scaling picture. The first piece is monitoring, which lets us pull information from the roles that need to auto-scale. The next piece is about establishing rules and measuring against thresholds to determine when to scale up and scale down. The third piece establishes trust between the service that is doing the monitoring (referred to from here on as the Scale Agent), and the roles that are being monitored. Finally, the Scale Agent needs to instruct the Windows Azure Portal to add or remove instances of those roles as it deems necessary.

[Diagram: the Scale Agent and its four pieces: Monitoring, Rules, Trust, Instruct]

THE SCALE AGENT

The Scale Agent is responsible for monitoring your application, applying rules and instructing the API to scale your roles, and can be hosted in different ways. One option is hosting the agent as another process on your existing Azure roles, but a role can have many identical instances, so which instance would it run on? And the agent will take some CPU resources; could that impact on its ability to assess the other work running on the same role? It makes more sense to move the Scale Agent to a separate location that doesn't interfere with the standard workload, where its own workload won't pollute the statistics.

The agent can be hosted as another worker role, separate from the main work being done by the application. This worker role would never need to scale, and could be geo-located and co-located near the compute instances that it needs to monitor. This removes external bandwidth costs and allows for faster processing and assessment.

You could also host the agent off site completely, perhaps in your own data centre, as a Windows service. This means you have more control over the agent, but the agent will be slightly slower communicating with the instances, getting performance counter logs, and issuing scale commands.

A dedicated worker role is usually the best option, but also the hardest to configure for trust, as we'll see further on.

MONITORING: RETRIEVING DIAGNOSTIC INFORMATION

Before we can make decisions about scaling, we need to know some simple statistics about the services we want to scale. These statistics in turn let us make informed decisions. Diagnostic helper processes will put performance counter information into table and blob storage, so this will require an Azure Storage project. There are lots of counters to choose from, but we usually want to monitor memory usage, CPU usage, and number of requests per second, and if any one of those exceeds an upper threshold then we want to scale up.

The role that needs to auto-scale will be responsible for gathering its own performance information and dumping it into a storage table. This is done via configuration classes, available in the Microsoft.WindowsAzure.Diagnostics namespace:

    var perfConfig = new PerformanceCounterConfiguration();
    perfConfig.CounterSpecifier = @"\Processor(0)\% Processor Time";
    perfConfig.SampleRate = TimeSpan.FromSeconds(5);

We create a configuration item for a performance counter we want to track; in this example we want information about CPU utilisation. The average utilisation will be gathered over 5 second intervals.

    var config = DiagnosticMonitor.GetDefaultInitialConfiguration();
    config.PerformanceCounters.DataSources.Add(perfConfig);
    config.PerformanceCounters.ScheduledTransferPeriod = TimeSpan.FromMinutes(1);

    DiagnosticMonitor.Start("DiagnosticsConnectionString", config);


We then add the performance counter to the list of items we want the DiagnosticMonitor to track for us. The DiagnosticMonitor runs in a separate process on the virtual machine instance, so it won't interfere with our normal application code. Every minute, new performance counter information will be written back to the storage account specified in the DiagnosticsConnectionString, into a table called WADPerformanceCountersTable. We can verify the counter information made it into the table using third-party tools:

You can see that the table has an entity with a property called CounterValue, which contains our CPU utilisation. I won't go into the code required to view an entity in table storage; this is very well documented already [11]. Your Scale Agent will retrieve these values by polling the table occasionally and keeping track of the utilisation, scaling when needed.

RULES: ESTABLISHING WHEN TO SCALE

[11] http://blogs.msdn.com/jnak/archive/2010/01/06/walkthrough-windows-azure-table-storage-nov-2009-and-later.aspx


The Scale Agent now knows what levels your various role instances are at, based on the performance counter information. However, deciding when to scale up or down is difficult and can easily become an exercise in advanced mathematics. Although the rules are different for every application, here are some common issues to consider:

- You usually need a certain amount of head room, in case you get a sudden spike in load before your Scale Agent can spin up more instances.
- Immediately after scaling up, your original instances might still be over the threshold. Prevent your agent from scaling up again immediately, until enough time has passed that you can be positive that more scale is needed.
- Aggregate your usage from all instances. If a single instance is spiking but the rest are under normal load, you don't really need to scale.
- If you do need more instances, scale up based on how many instances you currently have. For example, if you only have 5 instances, you might want to add 2 more (40% increase) before checking again. If you have 50 you may only want to add 10 (20% increase).
- Try to predict load based on patterns of behaviour. For instance, if over the last 15 minutes you've been steadily climbing by 5% utilisation per minute, you can predict that you will probably go over your threshold in X number of minutes. Why wait until you are overloaded and losing connections before scaling? Analysing these kinds of patterns can let you scale up just in time.
- Predictive patterns can get very complicated. If at 4pm every day you seem to have additional load, prepare in advance for scale rather than waiting for auto-scale to kick in.
- Keep in mind that long running requests can provide false signals. If all web threads are used for an instance but all those threads are held up in IO requests, you will still have low CPU utilisation, so consider a range of performance counters specific to your type of application and architecture.
- Hard limits. If your average is 3 instances, would you want your application to be allowed to auto-scale up to 500 instances? That's probably not a credit card bill you want to receive, so consider imposing some hard limits to scale, or provide some reasonable alerting (SMS, email, etc.) so that if your app DOES scale to 500, you can find out immediately and hop online to see why.
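Several of the rules above (aggregating across instances, a cool-down after scaling, proportional steps, hard limits and simple trend prediction) can be sketched in a few lines. This is a minimal illustration rather than a recommended policy: the class name ScaleRules, the cut-off of 10 instances between the 40% and 20% steps, and scaling down one instance at a time are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical rule set for a Scale Agent; all thresholds are supplied by the caller.
public static class ScaleRules
{
    // Returns the instance count the agent should ask for next.
    public static int DecideInstanceCount(
        IEnumerable<double> cpuPerInstance, int currentCount,
        DateTime lastScaleUtc, DateTime nowUtc,
        double upperThreshold, double lowerThreshold,
        TimeSpan coolDown, int minInstances, int maxInstances)
    {
        // Cool-down: do nothing until the previous change has had time to take effect.
        if (nowUtc - lastScaleUtc < coolDown) return currentCount;

        List<double> samples = cpuPerInstance.ToList();
        if (samples.Count == 0) return currentCount;

        // Aggregate: one spiking instance should not trigger a scale-up on its own.
        double average = samples.Average();

        int target = currentCount;
        if (average > upperThreshold)
        {
            // Proportional step: 40% for small deployments, 20% for larger ones
            // (the cut-off of 10 instances is an assumption of this sketch).
            int percent = currentCount < 10 ? 40 : 20;
            int step = Math.Max(1, (currentCount * percent + 99) / 100); // integer ceiling
            target = currentCount + step;
        }
        else if (average < lowerThreshold)
        {
            target = currentCount - 1; // scale down cautiously, one instance at a time
        }

        // Hard limits protect both availability and the credit card bill.
        return Math.Max(minInstances, Math.Min(maxInstances, target));
    }

    // Linear extrapolation over per-minute samples: minutes until the threshold is
    // reached, 0 if already there, or null if utilisation is not rising.
    public static double? MinutesUntilThreshold(IList<double> perMinuteSamples, double threshold)
    {
        if (perMinuteSamples.Count < 2) return null;
        double latest = perMinuteSamples[perMinuteSamples.Count - 1];
        if (latest >= threshold) return 0;
        double slope = (latest - perMinuteSamples[0]) / (perMinuteSamples.Count - 1);
        if (slope <= 0) return null;
        return (threshold - latest) / slope;
    }
}
```

With 5 instances averaging 90% CPU against an 80% threshold, the first method asks for 7 instances, matching the "add 2 more" example above; the second method turns a steady climb into an estimate of how many minutes remain before the threshold is crossed.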

TRUST: AUTHORISING FOR SCALE

There is a rich management API that can be used to control your Windows Azure projects. However, in order to issue commands there needs to be trust between the Scale Agent and the API of the account hosting the roles. This trust is established via X509 certificates. Generating certificates is also well documented. Once created, we need to provide our certificate in 3 places:

- The Windows Azure account, for the Service Management API to check requests against
- The virtual machine issuing commands (in our case, where the Scale Agent is hosted)
- The service configuration and definition for our Scale Agent project



In the Windows Azure portal for the account you wish to manage, there is an Account tab where you can upload DER encoded certificates with a .CER extension:

You must also upload the certificate in the Personal Information Exchange format with a .PFX extension and the matching password to your service project so that the certificate becomes available to any virtual machine instance provisioned from that entire project. This can be found under the Certificates section of your service deployment:

Click on Manage and upload the .PFX version of your certificate. It is important to note that this is not installing the certificate to the role instances under this service. Instead it is making the certificate available to any role that requests it. To make that request we have to complete the third step and tell our Scale Agent role that it will require that certificate.


While it is possible to enter the required XML manually, it is much easier to use the property pages instead. For the role that needs the certificate (i.e. your Scale Agent role), find it in your Cloud Service project, right click and select Properties. In the property pages, find the Certificates tab on the left. Select Add Certificate from the top and enter the details. The important part here is finding your certificate under the right Store Location and Name. This screen presumes the certificate is installed locally, as it uses local machine stores to search for it. If you don't have it installed locally, you can just paste in the thumbprint manually.
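Once the certificate has been installed on the instance, the Scale Agent still has to load it before it can sign requests. The following is a minimal sketch of a lookup by thumbprint, assuming the certificate ends up in the personal ("My") store; the class name ManagementCertificate and the clean-up of pasted thumbprints are illustrative additions, not part of the process described above.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography.X509Certificates;

// Hypothetical helper for locating the management certificate at runtime.
public static class ManagementCertificate
{
    // A pasted thumbprint may carry spaces or other stray characters;
    // keep only the hex digits and upper-case them.
    public static string NormalizeThumbprint(string raw)
    {
        return new string(raw.Where(Uri.IsHexDigit).ToArray()).ToUpperInvariant();
    }

    // Returns the matching certificate, or null if it is not installed.
    // Using StoreName.My is an assumption; it must match the role's certificate settings.
    public static X509Certificate2 Find(string thumbprint, StoreLocation location)
    {
        var store = new X509Store(StoreName.My, location);
        store.Open(OpenFlags.ReadOnly);
        try
        {
            X509Certificate2Collection matches = store.Certificates.Find(
                X509FindType.FindByThumbprint, NormalizeThumbprint(thumbprint), false);
            return matches.Count > 0 ? matches[0] : null;
        }
        finally
        {
            store.Close();
        }
    }
}
```

A null result usually means the thumbprint in the role settings does not match the uploaded .PFX, which is worth checking before suspecting the API itself.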

That wraps up all 3 parts of the certificate process. When your role is deployed to Windows Azure, it will ask for the certificate with that thumbprint to be installed into the virtual machine.

SCALING: THE SERVICE MANAGEMENT API

We know we need to scale and we have established trust; all we need to do is issue the command: scale! All API calls are RESTful, but there is no API that exists solely for scaling up and down. Instead this is done through the service configuration file, which is maintained separately from the service deployment. You can at any time go and change the configuration for your deployment through the portal, and the API is just an extension of this functionality. The steps required are:

1. Request the configuration file for a service deployment
2. Find the XML element for the instance count on the role you are scaling
3. Make the change
4. Post the configuration file back to the service API
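Steps 2 and 3 are plain XML manipulation and can be sketched with LINQ to XML. The sketch below assumes the configuration has already been retrieved as a string and uses the standard ServiceConfiguration schema, where each Role element holds an Instances element with a count attribute; the class and method names are hypothetical, and the REST calls for steps 1 and 4 are deliberately left out.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper that edits the instance count in a service configuration document.
public static class ServiceConfigurationEditor
{
    static readonly XNamespace Ns =
        "http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration";

    // Step 2: locate the <Instances> element of the named role.
    static XElement FindInstances(XDocument doc, string roleName)
    {
        XElement role = doc.Root.Elements(Ns + "Role")
            .FirstOrDefault(r => (string)r.Attribute("name") == roleName);
        if (role == null) throw new ArgumentException("Role not found: " + roleName);

        XElement instances = role.Element(Ns + "Instances");
        if (instances == null) throw new InvalidOperationException("Role has no Instances element.");
        return instances;
    }

    public static int GetInstanceCount(string configXml, string roleName)
    {
        return (int)FindInstances(XDocument.Parse(configXml), roleName).Attribute("count");
    }

    // Step 3: change the count and return the updated document, ready to be posted back.
    public static string SetInstanceCount(string configXml, string roleName, int newCount)
    {
        XDocument doc = XDocument.Parse(configXml);
        FindInstances(doc, roleName).SetAttributeValue("count", newCount);
        return doc.ToString();
    }
}
```

Keeping the edit separate from the HTTP calls makes the rule logic easy to test without touching a live deployment.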

If you don't want to manipulate the REST API manually, Microsoft has posted code samples to assist you, including samples on scale [12] and the Service Management API [13].

[12] http://code.msdn.microsoft.com/azurescale
[13] http://code.msdn.microsoft.com/windowsazuresamples


SUMMARY

This short article provides you with the theory to scale up your applications reactively. Scheduled scale up/down can also be automated with the same technique, so that as well as scaling reactively you can scale proactively. While this article has presented just one way of scaling automatically, there are other derivatives and approaches you could follow. For example, the Scale Agent could pull diagnostic information from the roles via the Diagnostic Manager classes, rather than the roles pushing that information. The open source framework Lokad.Cloud [14] takes another approach by allowing roles to auto-scale themselves. Find the approach that's right for you and capitalise on economies of scale today!

[14] http://code.google.com/p/lokad-cloud/


BUILDING A CONTENT-BASED ROUTER SERVICE ON WINDOWS AZURE

By Josh Tucholski

Some applications, depending on their nature, require priority processing based on request content. It is typical in these scenarios to develop an application layer to route requests from the client to a specific business component for further processing. Implementing this in Windows Azure is not straightforward due to its built-in load balancer. The Windows Azure load balancer only exposes a single external endpoint that clients interact with; therefore it is necessary to know the unique IP address of the instance that will be performing the work. IP addresses are discoverable via the Windows Azure API when endpoints are marked as internal (configured through the role's properties).

While this tutorial may seem more of an exercise in WCF than in Windows Azure, it is important to understand how to perform inter-role communication without the use of queues. In order to filter requests by content, an internal LoadBalancer class is created. This class ensures requests are routed to live endpoints and not dead nodes. The LoadBalancer will need to account for endpoint failure and guarantee graceful recovery by refreshing its routing table and passing requests to other nodes capable of processing. Following is the class definition for the LoadBalancer to detect endpoints and recover from unexpected failures that occur.

    using System;
    using System.Collections.ObjectModel;
    using System.Linq;
    using System.Net;
    using Microsoft.WindowsAzure.ServiceRuntime;

    public class LoadBalancer
    {
        // CurrentRouters (declaration not shown) is the collection of endpoints
        // the LoadBalancer currently knows about.

        public LoadBalancer()
        {
            if (IsRoutingTableOutOfDate())
            {
                RefreshRoutingTable();
            }
        }

        private bool IsRoutingTableOutOfDate()
        {
            // Retrieve all of the instances of the worker role
            var roleInstances = RoleEnvironment.Roles["WorkerName"].Instances;

            // Check the current number of instances and confirm it is in sync
            // with the LoadBalancer's record
            if (roleInstances.Count() != CurrentRouters.Count())
            {
                return true;
            }

            foreach (RoleInstance roleInstance in roleInstances)
            {
                var endpoint = roleInstance.InstanceEndpoints["WorkerEndpoint"];
                var ipAddress = endpoint.IPEndpoint;
                if (!IsEndpointRegistered(ipAddress))
                {
                    return true;
                }
            }
            return false;
        }

        private void RefreshRoutingTable()
        {
            var currentInstances = RoleEnvironment.Roles["WorkerName"].Instances;
            RemoveStaleEndpoints(currentInstances);
            AddMissingEndpoints(currentInstances);
        }

        private void AddMissingEndpoints(ReadOnlyCollection<RoleInstance> currentInstances)
        {
            foreach (var instance in currentInstances)
            {
                if (!IsEndpointRegistered(instance.InstanceEndpoints["WorkerEndpoint"].IPEndpoint))
                {
                    // add to the collection of endpoints the LoadBalancer is aware of
                }
            }
        }

        private void RemoveStaleEndpoints(ReadOnlyCollection<RoleInstance> currentInstances)
        {
            // reverse-loop so we can remove from the collection as we iterate
            for (int index = CurrentRouters.Count() - 1; index >= 0; index--)
            {
                bool found = false;
                foreach (var instance in currentInstances)
                {
                    // determine if the IP address already exists; if so, set found to true
                }
                if (!found)
                {
                    // remove from the collection of endpoints the LoadBalancer is aware of
                }
            }
        }

        private bool IsEndpointRegistered(IPEndPoint ipEndpoint)
        {
            foreach (var routerEndpoint in CurrentRouters)
            {
                if (routerEndpoint.IpAddress == ipEndpoint.ToString())
                {
                    return true;
                }
            }
            return false;
        }

        public string GetWorkerIPAddressForContent(string contentId)
        {
            // Custom logic to determine an IP address from one of the CurrentRouters
            // that the load balancer is aware of
            throw new NotImplementedException();
        }
    }

The LoadBalancer is capable of auto-detecting endpoints, and the remaining work for the router service is WCF. A router, by definition, must be capable of accepting and forwarding any inbound request. The IRouterServiceContract will accept all requests with the base-level Message class and handle and reply to all actions. Its interface is as follows:

    [ServiceContract(Namespace = "http://www.namespace.com/ns/2/2009", Name = "RouterServiceContract")]
    public partial interface IRouterServiceContract
    {
        [OperationContract(Action = "*", ReplyAction = "*")]
        Message ProcessMessage(Message requestMessage);
    }

The implementation of the IRouterServiceContract will use the MessageBuffer class to create a copy of the request message for further inspection (e.g. who the sender is, or whether there is a priority associated with it). GetWorkerIPAddressForContent on the LoadBalancer is invoked and a target endpoint is requested. Once the router has an endpoint, a ChannelFactory is initialized to create a connection to the endpoint and the generic ProcessMessage method is invoked. Ultimately the endpoint that the router forwards requests to will have a detailed service contract capable of completing the message processing.

    public partial class RouterService : IRouterServiceContract
    {
        private readonly LoadBalancer loadBalancer;

        public RouterService()
        {
            loadBalancer = new LoadBalancer();
        }

        public Message ProcessMessage(Message requestMessage)
        {
            // Create a MessageBuffer to obtain a copy of the request message for inspection
            MessageBuffer buffer = requestMessage.CreateBufferedCopy(int.MaxValue);
            Message requestMessageCopy = buffer.CreateMessage();

            string ipAddress = loadBalancer.GetWorkerIPAddressForContent("content");
            string serviceAddress = String.Format("http://{0}/Endpoint.svc/EndpointBasic", ipAddress);

            using (var factory = new ChannelFactory<IRouterServiceContract>(new BasicHttpBinding("binding")))
            {
                IRouterServiceContract proxy = factory.CreateChannel(new EndpointAddress(serviceAddress));
                using (proxy as IDisposable)
                {
                    return proxy.ProcessMessage(requestMessageCopy);
                }
            }
        }
    }

Detecting and ensuring that the endpoints are active is half the battle. The other half is determining which partitioning scheme works effectively when filtering requests to the correct endpoint. You may decide to implement some way of consistently ensuring that a client's requests are processed by the same back-end component, or to route based on message priority. The approach outlined above also attempts to accommodate disaster-related scenarios, so that an uninterrupted experience can be provided to the client. If one of the back-end components happens to shut down due to a hardware failure, the load balancer implementation will ensure that there is another endpoint available for processing.
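As one possible partitioning scheme for the custom logic left open in GetWorkerIPAddressForContent, the sketch below maps a content id to an endpoint with a stable hash, so that the same id keeps reaching the same worker while the set of live endpoints is unchanged. It is an illustration only: the class name ContentRouting, the FNV-1a style hash and the representation of endpoints as strings are assumptions, not the author's implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical routing helper: picks a live endpoint for a given content id.
public static class ContentRouting
{
    // FNV-1a style hash over UTF-16 code units. Unlike string.GetHashCode(),
    // it is guaranteed to give the same value in every process and on every instance.
    public static uint StableHash(string value)
    {
        unchecked
        {
            uint hash = 2166136261;
            foreach (char c in value)
            {
                hash ^= c;
                hash *= 16777619;
            }
            return hash;
        }
    }

    // Returns the endpoint responsible for contentId among the live endpoints.
    public static string PickEndpoint(IEnumerable<string> liveEndpoints, string contentId)
    {
        // Sort so that every router instance sees the endpoints in the same order.
        List<string> ordered = liveEndpoints.OrderBy(e => e, StringComparer.Ordinal).ToList();
        if (ordered.Count == 0) throw new InvalidOperationException("No live endpoints.");

        int index = (int)(StableHash(contentId) % (uint)ordered.Count);
        return ordered[index];
    }
}
```

Note that this is simple modulo hashing: when an endpoint is added or removed, many ids will move to a different worker. If that churn matters, a consistent hashing ring or an explicit priority table would be the next thing to consider.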


BING MAPS TILE SERVERS USING AZURE BLOB STORAGE

By Steve Towler

Back in early 2009, I was assigned to a project where I was required to build an informational mapping solution for a customer's website. This mapping solution served custom tiles of the UK which were specially commissioned for the project. Although the map only covered the UK and we had restricted the zoom levels to between 6 and 11, each set of tiles (and there were twelve sets) had around 4500 tiles and averaged 80 megabytes in size.

Less than 1 gigabyte of tiles may seem like a trivial figure in terms of the vast amounts of storage we have at our disposal nowadays. But what if things had been different? What if the customer wanted to cover Europe, or even more zoom levels? What would be the bandwidth implications and the potential costs associated with huge demand for the map?

With Windows Azure now live, had the same project landed on my desk today I would be looking to serve the map tiles differently, as Blob storage is ideally suited to such a task. Storage is massively scalable and cheap, and its RESTful interface makes requesting the tiles clean and simple. Setting up a Bing Maps tile server using Windows Azure Blob storage is surprisingly easy, and you can have your own tile server up and running in a few small steps.

First things first, you need to crunch your tiles. This is the process whereby you take your custom map images and cut them up into tiles, ready to be used within your mapping application. There are plenty of tutorials on how to do this out on the web, and Microsoft MapCruncher is a preferred tool for carrying this task out.

Now that you have your crunched tiles and you have saved them off to a directory on your local machine, the next step is to get your tiles up into the cloud. For ease I am going to use CloudXplorer, one of the many Windows Azure storage management tools available on the web. Using CloudXplorer, create a public container in blob storage called tiles. Now copy all of your crunched tiles from your local machine up to your newly created container in blob storage.



Once complete, your tiles s