Windows Server AppFabric Architecture Guide Emil Velinov, Michael McKeown September 2010



Copyright

This document is provided “as-is”. Information and views expressed in this document, including

URL and other Internet Web site references, may change without notice. You bear the risk of

using it.

This document does not provide you with any legal rights to any intellectual property in any

Microsoft product. You may copy and use this document for your internal, reference purposes.

© 2010 Microsoft. All rights reserved.

Contents

Windows Server AppFabric Architecture Guide
Introduction
Who Should Read This Guide?
What's in This Guide?
Windows Server AppFabric Key Architectural Components
AppFabric Architectural Components
Service Hosting
AppFabric Windows Services
AppFabric Event Collection Service
AppFabric Workflow Management Service
Data Storage
Security Model
Enterprise-Scale AppFabric Architecture and Deployment Topologies
Service Hosting Tier
Scaling Out to an AppFabric Web Farm
Scale-Out Performance
Hosting Node Configuration
Distributed Transaction Coordinator (DTC) Configuration
AppFabric Windows Services
IIS Application Pools
Service Deployment Options and Considerations
Data Storage Tier
SQL Server Platform Optimizations
AppFabric Data Stores
Multiple Persistence Stores
Multiple Monitoring Stores
Multiple SQL Server Instances on the Same Server or Cluster
Multiple SQL Server Servers or Clusters
Summary

Windows Server AppFabric Architecture Guide

The purpose of this guide is to provide guidance on how to optimize the architecture of a

Windows Server AppFabric system. It does not discuss the topic of AppFabric Caching. This

guide is meant to complement the existing documentation on the features of AppFabric. It will

reference that documentation to provide additional technical depth.

Introduction

AppFabric extends Windows Server to provide enhanced hosting, management, and caching capabilities for web applications and middle-tier services. The AppFabric hosting features add service management extensions to Internet Information Services (IIS), Windows Process Activation service (WAS), and the .NET Framework version 4. These features include Hosting Services and Hosting Administration tools that make it easier to deploy, configure, and manage Windows Communication Foundation (WCF) and Windows Workflow Foundation (WF)-based services.

Services and service-oriented architectures exist in many types of applications. Modern

applications typically have a data-driven transactional component (such as taking orders on a

website) together with highly distributed business logic that manages these transactions across a

middle tier. Developers are increasingly tasked with requirements to deliver highly responsive and

scalable applications. This is true not only for middle-tier services, but also for web, mobile, and

desktop applications. As demands on applications increase (for example, a website becomes

popular, or other groups start consuming your shared service), expensive data access can often

present serious limitations to application performance and scale. A solid architecture is critical to

ensure that you develop a useful distributed AppFabric solution.

Today, “architecture” is a commonly overused term. However, its many definitions share some core concepts. An architecture comprises the following:

A group of key choices about how a software system is composed

The fundamental association of system components through their relationships and interfaces

The principles surrounding the solution creation and ongoing evolution

These concepts apply here as we provide specific recommendations on how to properly design

an AppFabric deployment for enterprise-scale load requirements.  We will start by providing a

high-level understanding of the primary components of AppFabric. These are hosting of .NET

Framework WCF and WF services, the AppFabric Windows services (Event Collection service

and Workflow Management service), and storage for persistence and monitoring data. 

Following the high-level introduction, we will look at specific issues that influence your AppFabric architecture. It is important to understand how to conceptually scale out an AppFabric installation at the service and database tiers. These issues include concepts surrounding configuration of the hosting environment, the AppFabric Windows services, and deployment of .NET Framework WCF and WF services to an AppFabric web farm. The configuration of the persistence and monitoring data plays an important role in ensuring that your architecture is able to scale correctly over time. We will look at optimizations and different server and cluster configurations at the data-tier level.

Who Should Read This Guide?

IT professionals whose job is to administer, configure, and deploy AppFabric servers.

IT architects whose job is to develop solution architectures using AppFabric servers as a part of the design.

What's in This Guide?

Windows Server AppFabric Key Architectural Components

This section provides background about service hosting, the AppFabric Windows services (Event Collection service and Workflow Management service), the data storage tier, and the AppFabric security model.

Enterprise-Scale AppFabric Architecture and Deployment Topologies

This section provides architectural guidance on optimizing, deploying, and configuring both the AppFabric service and the data tiers.

Windows Server AppFabric Key Architectural Components

Windows Server AppFabric is an evolution of the Application Server role in Windows Server. It

works with the .NET Framework version 4 to provide capabilities such as monitoring, persistence,

security, and hosting of Windows Communication Foundation (WCF) and Windows Workflow

Foundation (WF) services using Windows Process Activation Service (WAS) and Internet

Information Services (IIS).


For more information on the architecture of AppFabric, refer to Architecture Diagram.

In this section we will take a high-level view of the AppFabric architecture.  The goal is to give you

a basic understanding of the concepts that we will later discuss in depth with recommendations.  

We will divide an AppFabric architecture into four sections: service hosting, Windows services,

data storage, and security.

AppFabric Architectural Components

Service Hosting

AppFabric Windows Services

Data Storage

Security Model

Service Hosting

The main focus of Windows Server AppFabric is the hosting of .NET Framework 4 WCF and WF services. AppFabric leverages the hosting capabilities of Windows Process Activation service (WAS) to host its configured .NET Framework services. It manages workflow services by using the Workflow Management service, which provides auto-start, durable timers, and command queue functionality. The tooling for services hosted in AppFabric allows you to monitor your applications and to manage security, auto-start activation, performance, and service endpoints. The AppFabric hosting services provide hierarchical management using inheritance of configuration files, so you don’t need to access the files directly. For more information on hosting of services in AppFabric, see Hosting, Hosting Concepts, and Windows Process Activation Service (WAS).

AppFabric Windows Services

Windows Server AppFabric installs two Windows services.

Windows service                        Description
AppFabric Event Collection Service     Collects Event Tracing for Windows (ETW) events raised by WCF and WF services and stores them in the monitoring store.
AppFabric Workflow Management Service  Manages the activation of durable workflow service instances in an instance store, and the execution of user control commands.

For detailed information on these two AppFabric Windows services, refer to Event Collection

Service and Workflow Management Service.

AppFabric Event Collection Service

The first AppFabric Windows service is the Event Collection service. One or more Event Collection service instances can run on a single server. Based upon the configured monitoring level, its role is to gather WCF and WF instrumentation events emitted by the .NET Framework runtime into an Event Tracing for Windows (ETW) session. The Event Collection service then stores these events in the monitoring database. AppFabric uses event data to aggregate information about the overall status of applications to assist in performance monitoring and troubleshooting.

AppFabric Workflow Management Service

The second AppFabric Windows service is the Workflow Management service, which the hosting services use to manage instances of workflows. The Workflow Management service activates a workflow service instance in an instance store when the instance is eligible to be activated. The Workflow Management service also retrieves commands from a command queue that is written to by an instance control provider, executes the commands, and then deletes each command from the queue if its execution is successful.
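The retrieve, execute, and delete-on-success cycle described above can be sketched in a few lines of Python. This is only an illustrative model, not AppFabric code: the `Command` type, the in-memory `deque`, and the `process_queue` function are hypothetical names, and putting a failed command back for a later retry is an assumption of this sketch rather than documented behavior.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List


@dataclass
class Command:
    """Hypothetical control command (for example suspend, resume, cancel)."""
    instance_id: str
    action: str


def process_queue(queue: Deque[Command],
                  execute: Callable[[Command], bool]) -> List[Command]:
    """Run one pass over the queue.

    A command is removed for good only when `execute` reports success.
    Failed commands are put back so a later pass can retry them
    (an assumption of this sketch). Returns the completed commands.
    """
    completed: List[Command] = []
    for _ in range(len(queue)):
        command = queue.popleft()        # retrieve
        if execute(command):             # execute
            completed.append(command)    # success: the command stays deleted
        else:
            queue.append(command)        # failure: keep the command queued
    return completed


if __name__ == "__main__":
    pending = deque([
        Command("wf-1", "suspend"),
        Command("wf-2", "cancel"),
    ])
    # Pretend that cancelling fails on this pass.
    done = process_queue(pending, lambda c: c.action != "cancel")
    print("completed:", [c.instance_id for c in done])
    print("still queued:", [c.instance_id for c in pending])
```

The point of the sketch is the ordering: a command leaves the queue permanently only after it has executed successfully, so a failure cannot silently drop a control request.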


Data Storage

Windows Server AppFabric uses two data storage entities. The first entity is the persistence

store, which allows workflows to persist instance state and metadata information.   The second

entity is the monitoring store, which keeps track of WCF and WF events emitted by the .NET

Framework runtime for use in monitoring applications. Both stores can reside in numerous

implementations, such as databases, flat files, or memory. In this document we will discuss the

default SQL Server database implementation and refer to these stores as the persistence and

monitoring databases.

Security Model

You must protect your .NET Framework version 4 applications managed by Windows Server AppFabric so that users can access only the services and data for which they have authorization. To do this, you must identify users, verify that they are who they say they are, and determine whether they have permission to view the information or perform the task requested. The message exchange between client and server must take place over a secure channel to keep information private in transit. The Microsoft technologies that support AppFabric provide integrated services that enable companies to securely connect to and use your applications. AppFabric administrators do not need to maintain multiple sets of user databases, and all of the services for hundreds of intranet servers can be managed from a single graphical tool. The integration of the Microsoft security technologies and products that support AppFabric enables you to assign users access to all resources needed to run their applications.

The primary goal of the AppFabric security model is to provide a simple, yet effective, mechanism

for the majority of AppFabric users. Because of its integration with existing Windows, .NET

Framework 4, IIS, and SQL Server security concepts, users can leverage existing security

knowledge and skill sets to use the AppFabric security model. Because AppFabric adds only

minor enhancements to an already robust integrated Microsoft security picture, its security model

is familiar to administrators who are knowledgeable about Microsoft security concepts. This

results in a simplified and integrated model, and lower long-term total cost of ownership for

AppFabric customers. If you are already familiar with these products and technologies, you can

easily secure your application by following the guidance in the Security and Protection section of

the AppFabric documentation.

You use three conceptual AppFabric roles when designing your security solution — Application

Server Observers, Application Server Administrators, and Application Server Users. You assign

the appropriate users and permissions as directed in the AppFabric documentation to their

corresponding Windows NT groups and accounts, IIS application pools, and SQL Server logins

and database roles. For more information and important security guidance about using AppFabric

roles, the security permissions they have, and how they map to Windows security groups and

SQL Server database roles, see Windows Security, IIS and .NET Framework Security, and SQL

Server Security.


Enterprise-Scale AppFabric Architecture and Deployment Topologies

When deploying Windows Server AppFabric to support common enterprise-level requirements,

consider the following factors:

Scale-out for handling increased loads and providing high availability

Hosting node configuration and service deployment topologies for best performance and manageability

Optimized data storage configuration for fastest disk I/O to enable high-performance persistence and monitoring in AppFabric

Note: We use the term “scale-out” throughout this document. It refers to the process of adding more nodes to a system, such as adding a new computer to a distributed software application. An example might be scaling out from one AppFabric server to a web farm of two or more AppFabric servers. This is in contrast to the term “scale-up”, which refers to adding resources to a single node in a system, typically involving the addition of memory or processors.

This document provides guidelines for each of these areas across the two primary tiers of the

technology – the Service Hosting Tier and the Data Storage Tier.

Service Hosting Tier

The service hosting tier (or middle tier) consists of all Windows Server AppFabric runtime components responsible for hosting and executing services. The key building blocks of this tier are IIS, the Windows Process Activation service (WAS), and the AppFabric Windows services (the Event Collection service and the Workflow Management service). We will give architectural recommendations for using an AppFabric web farm to scale out processing capabilities for stateless versus stateful .NET Framework workflow services hosted in AppFabric. The Microsoft Distributed Transaction Coordinator (DTC) can be configured on the hosting node to prevent bottlenecks. The AppFabric Windows services can be optimized based upon their respective functionalities. Finally, the way services are physically grouped in AppFabric applications affects numerous quality attributes, such as scalability and reliability.

We will examine all of these issues in the following sections:

Scaling Out to an AppFabric Web Farm

Hosting Node Configuration


Scaling Out to an AppFabric Web Farm

A Windows Server AppFabric web farm has the AppFabric features installed on each of its server

nodes. This enables you to handle increased load by sharing the resources of multiple servers.

We recommend an AppFabric web farm for any production environment that has to meet the

requirements of increased load, high availability, monitoring, and improved manageability.

Conceptually, the general approach to scaling out the AppFabric service hosting tier is the same

as creating a web farm, as shown in the following figure.

The AppFabric Web Farm Guide provides detailed steps for installing AppFabric in a multi-server

web farm environment.

Scale-Out Performance

Stateless WCF and WF services do not persist state of any kind between instantiations. Scale-out performance tests at Microsoft have shown that, for stateless WCF and WF services, AppFabric scales out in a near-linear fashion. This means that doubling the number of nodes in an AppFabric web farm nearly doubles the throughput, while service response times (latency) remain constant. The following charts show the scale-out characteristics.


In the preceding charts, the scale and measures of the Y-axis will vary based on the logic and

implementation of each service.

The Throughput chart shows the near-linear throughput increase under scale-out. At each step of the step-load test, the throughput approximately doubles as the farm grows from one to two, to four, and finally to eight servers. At the same time, the latency remains the same.
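One way to quantify how close a farm comes to linear scale-out is to divide the measured throughput by the ideal value (node count times single-node throughput). The Python sketch below illustrates that arithmetic. The function name `scaling_efficiency` and the sample numbers are hypothetical and are not results from the tests described in this guide.

```python
from typing import Dict


def scaling_efficiency(throughput_by_nodes: Dict[int, float]) -> Dict[int, float]:
    """Return measured throughput divided by ideal linear throughput.

    The ideal for N nodes is N times the single-node throughput, so a
    value of 1.0 means perfectly linear scale-out. The input must
    contain a measurement for one node.
    """
    base = throughput_by_nodes[1]
    return {
        nodes: throughput / (base * nodes)
        for nodes, throughput in sorted(throughput_by_nodes.items())
    }


if __name__ == "__main__":
    # Made-up sample measurements (requests per second), not test results.
    sample = {1: 100.0, 2: 196.0, 4: 380.0, 8: 744.0}
    for nodes, efficiency in scaling_efficiency(sample).items():
        print(f"{nodes} node(s): {efficiency:.2f} of linear")
```

Efficiency values that stay close to 1.0 as nodes are added correspond to the near-linear pattern described for stateless services. A value that drops steadily points to a shared bottleneck.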

However, in the Latency chart for durable WF services storing state by using the AppFabric persistence store, the scale-out is non-linear. The exact characteristics will depend on many factors, such as the number of persistence points, amount and complexity of correlations, and size of the workflow state. This will also vary between different workflow implementations. To best understand the scale-out capabilities of a specific workflow service, we highly recommend that you perform a scale-out lab using the actual workflow service implementation.

The results of performance tests show that a durable WF service running on an eight-node

AppFabric web farm creates contention through disk I/O and database locks in the persistence

database. For stateless, or non-durable, workflows this contention does not occur. Contention

becomes a limiting factor when attempting to further increase throughput by scaling out the

middle tier. The following chart shows the scale-out pattern of a persistent workflow service (red

bars) compared to a functionally equivalent, yet non-persistent, workflow service (blue bars).

The Data Storage Tier section of this document provides guidelines and techniques to alleviate

the persistence impact on throughput.

Hosting Node Configuration

Excluding the shared Windows Server AppFabric databases, each hosting node in an AppFabric

web farm is a fully capable stand-alone hosting server. Ultimately, the configuration and

performance of each individual node determines the overall performance and manageability of an

AppFabric web farm.

Distributed Transaction Coordinator (DTC) Configuration

When using the DTC for transactional support, the default settings for its logging feature may not be adequate to support the required throughput and performance beyond a certain load. By default, DTC records its activities in a 4-MB log file located in the C:\Windows\system32\MSDtc folder. Under heavy load, the size and location of the log file can become a bottleneck. Therefore, as a best practice, we recommend that you:

Adjust the log file size, allowing 1 MB of log space for every 1,000 concurrent DTC transactions. You can estimate the transaction rate and calculate the log file size from that value. Alternatively, you can monitor the throughput by using the Component Services console and adjust the log file size accordingly.

Place the log file on a physical drive that is separate from the drive containing the operating system. The location of the log file is more important than its size.
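The sizing rule above is simple arithmetic, sketched here in Python. The function name `recommended_dtc_log_mb` is hypothetical, and treating the 4 MB default as a lower bound is an assumption of this sketch. The guide itself states only the rate of 1 MB per 1,000 concurrent transactions.

```python
import math

DEFAULT_LOG_MB = 4           # default DTC log size stated in this guide
TRANSACTIONS_PER_MB = 1000   # sizing rule stated in this guide


def recommended_dtc_log_mb(concurrent_transactions: int) -> int:
    """Return a DTC log size in MB for the expected concurrent transactions.

    Rounds up to a whole megabyte and never goes below the default
    size (the floor is an assumption of this sketch).
    """
    if concurrent_transactions < 0:
        raise ValueError("concurrent_transactions must not be negative")
    needed = math.ceil(concurrent_transactions / TRANSACTIONS_PER_MB)
    return max(DEFAULT_LOG_MB, needed)


if __name__ == "__main__":
    for load in (2_500, 12_000, 12_001):
        print(f"{load} concurrent transactions: {recommended_dtc_log_mb(load)} MB")
```

Under these assumptions, 2,500 concurrent transactions still fit the default 4 MB log, while 12,000 call for 12 MB.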

These settings are managed through the Component Services console shown in the following

figure.

For more information on managing the DTC log file size and location, refer to Managing Log Files

for Distributed Transactions.

AppFabric Windows Services

In cases where a large number of tracked events (more than 10,000-15,000 per second) are emitted by each node in the AppFabric web farm, we recommend that you register and configure multiple instances of the Event Collection service. While from an architectural perspective the Workflow Management service deployment is limited to the default configuration of a single instance per AppFabric node, the Event Collection service can be configured to use multiple instances. For a description of the steps for configuring multiple Event Collection service instances, refer to Create Multiple Event Collection Services. In addition, to increase the Event Collection service throughput, multiple service instances should be used together with multiple monitoring stores as explained later in this document.


IIS Application Pools

Proper configuration of IIS application pools can help to optimize AppFabric performance. Because AppFabric leverages WAS for the actual service hosting, the executable process and its runtime settings are determined by the IIS application pool configuration. By default, when you install AppFabric and subsequently deploy web applications and services, the DefaultAppPool will be used for hosting all artifacts. Sharing an application pool across multiple applications means that:

All applications share the same identity.

The limit for queued requests is shared across the applications.

Recycling an application pool for any reason, such as changing the configuration of a service running under that application pool, will impact all web applications hosted by the application pool.

When planning how many application pools to have in your environment, also consider the following:

Memory usage – A newly initialized application pool may take up to 25 MB of RAM when hosting

code-based WCF services, and up to 50-60 MB when hosting WF services. We recommend

that you empirically validate that your hardware can support the intended application pool

structure by deploying your solution to a test environment and monitoring the memory

utilization under load. If memory becomes a bottleneck, you can either add physical memory

or further group web applications and services under shared pools.

Manageability – If the environment has a large number (in the hundreds) of services and web

applications, managing that many dedicated application pools may be overwhelming to the

system administrator. Again, the solution is to reduce the number of dedicated pools by

logically grouping web applications and services under shared pools.
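To get a first impression of the memory trade-off between dedicated and shared pools, the per-pool figures quoted above can be multiplied out. The Python sketch below does this. The function name and the pool counts are hypothetical, the constants are the upper figures from this section, and the result covers only newly initialized pools. It is not a substitute for measuring memory under load in a test environment.

```python
WCF_POOL_MB = 25   # upper figure for a pool hosting code-based WCF services
WF_POOL_MB = 60    # upper end of the 50-60 MB range for pools hosting WF services


def baseline_pool_memory_mb(wcf_pools: int, wf_pools: int) -> int:
    """Estimate the RAM (in MB) taken by newly initialized application pools."""
    if wcf_pools < 0 or wf_pools < 0:
        raise ValueError("pool counts must not be negative")
    return wcf_pools * WCF_POOL_MB + wf_pools * WF_POOL_MB


if __name__ == "__main__":
    # Hypothetical environment: 40 services, each in a dedicated pool.
    dedicated = baseline_pool_memory_mb(wcf_pools=25, wf_pools=15)   # 1525 MB
    # The same services grouped into 10 shared pools.
    grouped = baseline_pool_memory_mb(wcf_pools=6, wf_pools=4)       # 390 MB
    print(f"dedicated pools: {dedicated} MB, grouped pools: {grouped} MB")
```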

In general, in a large environment we recommend that you use a dedicated application pool for

hosting a logically related group of web applications and services. For example, it may be

acceptable that all purchase order processing services are hosted in one application pool, while

the payroll services have their own dedicated application pool. Through grouping of web

applications and services, you can achieve a good balance between manageability, runtime

isolation, and performance. Giving specific prescriptive guidance regarding shared versus

dedicated application pools is impractical due to the many factors and environment-specific

considerations that influence this decision. However, you should consider grouping applications in

shared application pools when possible.

Service Deployment Options and Considerations

Another factor to consider when deploying services is the grouping of services in web applications. For example, if the solution needs 50 services, do you create 50 web applications with one service in each of them, or should several or all services be grouped in one or more web applications? The main implications of a too-granular approach where each service has its own web application are as follows:


Manageability – With too many applications to manage, much time is expended maintaining configuration settings and files. Trying to locate a specific service among many applications in the IIS Manager MMC can be a real challenge.

Performance – Both the Event Collection service and Workflow Management service subscribe to Web.config file change notifications to update the runtime services based upon a change in the configuration. Because AppFabric builds on top of the ASP.NET multi-level inheritance configuration infrastructure, when a file changes, a scan of the whole application node hierarchy is often required. The bigger the hierarchy is, the more noticeable the runtime configuration update impact is to the Event Collection service and Workflow Management service. This also results in higher CPU utilization and memory consumption.

As with application pool planning, we recommend that you logically group multiple services into a web application where it makes sense. AppFabric is designed to handle hundreds of web applications. However, for optimum performance and manageability, the goal should be to keep the number of web applications as small as practically possible. Taking into account the preceding points, along with the considerations from the “IIS Application Pools” section, the following diagram presents a common deployment topology containing multiple services in a web application, and more than one web application sharing a common application pool.

The key points for this topology are:


Services that can be supported by a common monitoring and persistence configuration, and are logically related, are deployed to a single web application. In the preceding diagram, these are the grouped services in web applications 1 and 4, which host multiple instances.

Web applications that can use the same identity and can be restarted together after a configuration change share the same pool. In the preceding diagram, web applications 1 and 2 run under application pool 1. Different pools are provisioned where services require isolation with regard to identity, runtime availability, or both.

The topology takes into account the application pools’ memory overhead by having a mix of shared and dedicated pools, thus reducing the total number of application pools (and memory pressure) in the IIS environment. This also improves manageability by minimizing the total number of high-level web applications to manage.
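The grouping rules above can be modeled as a small data structure, which makes it easy to see which services restart together when a pool is recycled. The Python sketch below is only an illustration. All pool, identity, web application, and service names are hypothetical, and the layout only loosely follows the topology described in this section.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WebApplication:
    name: str
    services: List[str] = field(default_factory=list)


@dataclass
class ApplicationPool:
    name: str
    identity: str                      # account the worker process runs as
    web_apps: List[WebApplication] = field(default_factory=list)

    def affected_services(self) -> List[str]:
        """Services that restart together when this pool is recycled."""
        return [s for app in self.web_apps for s in app.services]


# Hypothetical layout: related services share a web application, and web
# applications that can share an identity and restart together share a pool.
topology = [
    ApplicationPool("AppPool1", "svc-orders", [
        WebApplication("WebApp1", ["OrderEntry", "OrderStatus", "OrderAudit"]),
        WebApplication("WebApp2", ["Invoicing"]),
    ]),
    ApplicationPool("AppPool2", "svc-payroll", [
        WebApplication("WebApp3", ["Payroll"]),
    ]),
]

if __name__ == "__main__":
    for pool in topology:
        print(pool.name, pool.identity, pool.affected_services())
```

Listing the affected services per pool is a quick way to check that a planned grouping does not couple services that need independent availability.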

Data Storage Tier

The data storage tier consists of the storage for the AppFabric monitoring and persistence data. For the purpose of this document we will assume a default, out-of-the-box AppFabric installation, which provides a SQL Server schema and the persistence and monitoring providers. In any large-scale enterprise-level environment, AppFabric should be installed and configured on top of a high-performance, reliable data storage infrastructure. That typically means an active-passive or active-active SQL Server failover cluster consisting of a shared disk and at least two physical servers. SQL Server failover clustering provides hardware redundancy through a configuration in which vital shared resources are automatically transferred from a failing computer to an identically configured server. In an active-active cluster, each node actively runs SQL Server; if one node fails, the surviving node takes over its workload. In an active-passive configuration, one node is active and the other is passive, waiting to be used as a backup; if the active node fails, the passive node becomes the active node. For more information on SQL Server clustering, refer to Getting Started with SQL Server 2008 R2 Failover Clustering.

In this section we discuss:

SQL Server Platform Optimizations

AppFabric Data Stores

Multiple SQL Server Instances on the Same Server or Cluster

Multiple SQL Server Servers or Clusters

SQL Server Platform Optimizations

Although a default SQL Server installation provides a fully functional relational database

management system, post-installation optimizations can greatly increase performance and

alleviate common bottlenecks. In summary, the post-installation optimizations at the SQL Server

platform level are as follows:


TempDB data and log files should be placed on their own dedicated volumes, with temp data separate from temp log files.

TempDB should have as many data files as the number of CPU cores on the server.

Data and log files for all databases should reside on separate dedicated volumes.

Data and log file sizes, and auto-growth settings, should be preconfigured. For example, you might set the initial size of 25 GB for data files, 10 GB for log files, and 5 GB for auto-growth factor.

Trace flag 1118 (the -T1118 startup option) should be enabled to reduce allocation contention in TempDB and achieve maximum concurrency.

Memory allocation to SQL Server should be preconfigured instead of using the default dynamic memory management.
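As an illustration of the TempDB recommendation, the Python sketch below generates the T-SQL `ALTER DATABASE ... ADD FILE` statements needed to reach one data file per CPU core. It rests on several assumptions: the function name, the logical file names, and the `T:\TempDbData` path are hypothetical; the default sizes are borrowed from the sample configuration in this guide (5 GB initial size, 1 GB auto-growth); and TempDB is assumed to start with a single data file. Review the generated statements before running them against any server.

```python
from typing import List


def tempdb_add_file_statements(cpu_cores: int,
                               data_dir: str = r"T:\TempDbData",
                               size_gb: int = 5,
                               growth_gb: int = 1) -> List[str]:
    """Build ALTER DATABASE statements that add TempDB data files.

    Assumes TempDB already has one data file, so cpu_cores - 1 files
    are added to reach one data file per core.
    """
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    statements = []
    for n in range(2, cpu_cores + 1):
        statements.append(
            "ALTER DATABASE tempdb ADD FILE ("
            f"NAME = N'tempdev{n}', "
            f"FILENAME = N'{data_dir}\\tempdev{n}.ndf', "
            f"SIZE = {size_gb}GB, FILEGROWTH = {growth_gb}GB);"
        )
    return statements


if __name__ == "__main__":
    for statement in tempdb_add_file_statements(cpu_cores=4):
        print(statement)
```

Giving every file the same initial size and growth increment keeps the files evenly filled, which is why both values are passed once and applied to all files.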

More detailed step-by-step instructions for these recommendations can be found in the Optimizing Database Performance section of the BizTalk Server Performance Optimization Guide. Although written for BizTalk Server, that section provides comprehensive and mostly generic SQL Server guidance. A sample data storage configuration is outlined in the following table.

Volume name          Files                                LUN# or ML_#  LUN size (GB)  Cluster size  Init size  Auto growth
Data_Sys             MASTER, MODEL and MSDB data files    1             10             64 KB         2 GB       1 GB
Logs_Sys             MASTER, MODEL and MSDB log files     2             10             4 KB          2 GB       1 GB
Data_TempDb          TempDB Data (x Number of CPU cores)  3             20             64 KB         5 GB       1 GB
Logs_TempDb          TempDB Log                           4             20             4 KB          5 GB       1 GB
Data_ASPersistence1  Persistence data files               5             20             64 KB         10 GB      5 GB
Logs_ASPersistence1  Persistence log files                6             20             4 KB          10 GB      5 GB
Data_ASMonitoring1   Monitoring data files                7             100            64 KB         25 GB      10 GB
Logs_ASMonitoring1   Monitoring log files                 8             25             4 KB          25 GB      5 GB
Data_CustomDBs       Custom database data files           9             Custom         64 KB         Custom     5 GB
Logs_CustomDBs       Custom database log files            10            Custom         4 KB          Custom     5 GB

AppFabric Data Stores

The default Windows Server AppFabric installation and configuration program yields one

persistence database and one monitoring database. In a high-throughput environment, both

stores may become a performance bottleneck due to disk I/O, SQL locks, and so on. AppFabric

offers great flexibility for the data topology by using multiple persistence and monitoring

databases. If used properly, this flexibility can help to ensure good performance.

Multiple Persistence Stores

It is important to understand the reasons why the persistence store can become a bottleneck in

the AppFabric environment. Durable WF services store their state in the AppFabric persistence

store. Most of the details for each persisted instance are stored in the Instances table and the

Keys table. In high-throughput scenarios many workflow instances are created (database inserts),

persisted (database updates), correlated (database reads/inserts/deletes), and completed

(database deletes) in a short period of time. These tables become a source of disk I/O contention

and database locks and latches. This bottleneck is depicted in the following diagram.

With the preceding information in mind, in scenarios where the persistence database can become

a bottleneck, we recommend that you create multiple persistence stores. Dedicate each store to a

particular service or group of logically related services. The process of creating a new persistence

data store is described in Create and Initialize a Database Using Windows Server AppFabric

Cmdlets. Conceptually, it can be depicted as follows.

The physical location of each additional database can be configured as follows, shown in low-

scalability to high-scalability order:

1. The same SQL Server instance and set of drives (data and log file volumes) as the initial AppFabric data stores

2. The same SQL Server instance with a different set of drives from the initial AppFabric data stores

3. Another SQL Server instance running on the same SQL Server cluster

4. Different SQL Server installations and/or clusters

The choice from the preceding options is specific to each environment and available hardware

resources. Performance counters related to CPU, memory, disk I/O, and SQL locks/latches

should be used to understand exactly where the bottleneck exists. These values also help in

deciding which option would best alleviate the problem.

Note: There currently are no performance counters specific to Windows Server AppFabric.

Multiple Monitoring Stores

The AppFabric monitoring data store configuration and topology options are similar to those

presented earlier for the persistence database. Before going further into the discussion of

topology options and making the right choice, we will outline the flow of data through the

monitoring store and point to the main sources of bottlenecks. Data captured by the Event

Collection service goes through the following simplified processing sequence:

1. Captured WCF and WF event data is written to a single staging table in the monitoring database.

2. A SQL Agent job runs frequently to check for new events, parses the event data, and moves it to the normalized WCF and WF event tables. These tables are used by the database views that provide data to the AppFabric Dashboard or any custom query/reporting technologies. The staging logic for WCF events differs from that for WF events because WCF events occur at random points in time, whereas WF events are typically part of a potentially long-running chain of events for the workflow instance. For WF events, event correlation, temporal state, and consistency are therefore also important.

a. For pure WCF service events, the SQL Agent job processes staging records in bulk from the staging table to the normalized WCF Events table. This is a relatively fast batch process operation.

b. Conversely, for WF events, the SQL Agent job may have to execute logic per staging record (event) before it can move the data to the normalized WF Events table. This translates into longer processing times.

3. The data is inserted into the normalized WCF Events and WF Events tables.

This process is shown in the following diagram.

Performance tests have shown that the SQL Agent job is capable of processing between 3,500

and 4,500 staging records (events) per second on a BL680c server with four quad-core CPUs and 32 GB

of RAM, with a disk storage configuration aligned with the sample storage configuration presented

earlier in the document in the SQL Server Platform optimizations section.

As a reference, at the Health Monitoring level, a stateless short-running workflow service with

six activities generates about 13-14 tracking events (and therefore that many staging records). This

means that the rate of incoming staging records breaks even with the SQL Agent staging job

processing (drain) rate at ~285 service calls per second (4,000 events per second drain rate / 14 events

per workflow instance = ~285 workflow instances per second). A higher throughput rate will start building a

backlog in the staging table.
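
The break-even arithmetic above can be reproduced with a short Python sketch. It uses the figures quoted in the text (a drain rate of about 4,000 events per second and about 14 events per call). The function names and the 400 calls/sec load are hypothetical and chosen only for illustration; the model assumes a constant load and a constant drain rate.

```python
def break_even_calls_per_sec(drain_rate_events_per_sec, events_per_call):
    """Call rate at which staging inserts equal the SQL Agent job drain rate."""
    return drain_rate_events_per_sec / events_per_call


def backlog_after(calls_per_sec, events_per_call, drain_rate_events_per_sec, seconds):
    """Staging records left unprocessed after `seconds` of constant load."""
    surplus = calls_per_sec * events_per_call - drain_rate_events_per_sec
    return max(0, surplus) * seconds


# Figures quoted in the text: ~4,000 events/sec drain rate, ~14 events per call.
print(int(break_even_calls_per_sec(4000, 14)))  # 285

# Hypothetical load of 400 calls/sec sustained for one hour.
print(backlog_after(400, 14, 4000, 3600))  # 5760000
```

Under these assumptions, any sustained load above the break-even rate grows the backlog linearly, so even a modest surplus accumulates millions of staging records within an hour.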

Note: For pure code-based WCF services, the AppFabric Health Monitoring level aggregates

the operation call statistics by default prior to sending the tracked information to the

monitoring store. The default sampling/aggregation rate is every five seconds, which

results in a single event being emitted to the monitoring store for each service operation

called during that period.
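
One way to read the aggregation behavior described above is that the event rate for a code-based WCF service depends on how many distinct operations are called in each sampling window, not on the call volume. The following Python sketch expresses that reading. The function name and the example of 20 operations are hypothetical; only the five-second default interval comes from the text.

```python
def aggregated_wcf_events_per_sec(distinct_operations_called, sampling_interval_sec=5):
    """Events emitted per second when each called operation yields one
    aggregated event per sampling interval (an interpretation of the text)."""
    return distinct_operations_called / sampling_interval_sec


# Hypothetical service with 20 distinct operations, all called in every window.
print(aggregated_wcf_events_per_sec(20))  # 4.0
```

If this reading holds, such a service contributes only a few staging records per second regardless of how many calls it handles.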

If a single WF service within the environment has a constant load of 280-300 calls/sec, using a

monitoring level above Errors Only will build a staging records backlog. In that case, the options

are limited to the following:

Turn the monitoring off for this service. This can be done either by using the AppFabric management UI or the Set-ASAppMonitoring Windows PowerShell cmdlet with the MonitoringLevel parameter set to “Off”.

Reduce the monitoring level to Errors Only for this service. This can be done either by using the AppFabric management UI or the Set-ASAppMonitoring Windows PowerShell cmdlet with the MonitoringLevel parameter set to “ErrorsOnly”.

Define a custom tracking profile starting from the built-in health tracking profile and only including the events of interest, instead of capturing events from all activities in the workflow. For information about creating a custom tracking profile, refer to Configure Tracking.

If a number of services jointly contribute to an incoming staging records rate higher than the

backlog threshold, the best option is to provision multiple monitoring stores and configure the

services to capture their tracking data into different monitoring stores, as depicted in the following

diagram.

Again, the physical location of the additional data stores may vary between the same SQL Server

and different disk volumes, a different SQL Server instance on the same server, or a completely

different SQL Server installation. Because each monitoring database will have a corresponding

SQL Server Agent job, the staging job throughput can be increased to meet the throughput

requirements for the services it supports.

With this design, an important consideration to keep in mind is that the AppFabric End-to-End

Activity monitoring level only works for services that use the same monitoring store, so it cannot

track activity across services that log to different stores. A hybrid topology can be adopted to

enable end-to-end activity tracking, where a logical group of related services logs its tracking data

to a single store, while unrelated services from the same environment log their data to additional

monitoring stores.
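
As a planning aid for the hybrid topology, the following Python sketch assigns groups of related services to monitoring stores so that no store receives more events per second than its staging job can drain. It is a simple first-fit-decreasing heuristic, not an AppFabric feature. The group names and event rates are hypothetical, and the capacity of 3,500 events per second is the lower bound of the drain rate quoted earlier.

```python
def assign_groups_to_stores(group_rates, store_capacity):
    """Place each service group (largest first) into the first store with room.

    group_rates: mapping of group name -> staging events per second.
    store_capacity: events per second that one store's staging job can drain.
    Each group stays whole so end-to-end tracking works within the group.
    """
    stores = []  # each entry: {"load": events/sec, "groups": [names]}
    ordered = sorted(group_rates.items(), key=lambda kv: kv[1], reverse=True)
    for name, rate in ordered:
        if rate > store_capacity:
            raise ValueError(f"group {name!r} alone exceeds the store capacity")
        for store in stores:
            if store["load"] + rate <= store_capacity:
                store["groups"].append(name)
                store["load"] += rate
                break
        else:
            # No existing store has room, so provision another one.
            stores.append({"load": rate, "groups": [name]})
    return stores


# Hypothetical service groups and their staging event rates (events/sec).
rates = {"orders": 2500, "billing": 1800, "reporting": 1200, "notifications": 900}
for i, store in enumerate(assign_groups_to_stores(rates, 3500), start=1):
    print(i, store["groups"], store["load"])
```

A group whose own rate exceeds the capacity cannot be placed by this sketch; in that case the options listed earlier (lowering the monitoring level or using a custom tracking profile) would need to be applied first.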

Multiple SQL Server Instances on the Same Server or Cluster

As previously mentioned, depending on available hardware resources, the persistence and

monitoring stores can be deployed to separate SQL Server instances running on the same

physical server, or to an active-passive or an active-active SQL Server cluster. This configuration

is depicted in the following diagram.

The advantages of this configuration are complete process isolation at the SQL Server level, which

improves reliability, and increased security. A typical scenario for such a deployment is where the

enterprise IT department supports multiple departments within the organization, each with dedicated

persistence and/or monitoring stores.

Multiple SQL Server Servers or Clusters

An organization may choose to distribute the AppFabric persistence and monitoring stores onto

different and independent SQL Server environments (servers or clusters), as depicted in the

following diagram.

This may be beneficial if the company chooses to have a dedicated data warehouse-like

repository of the monitoring data, while the persistence store is deployed to the company’s online

transaction processing (OLTP) infrastructure.

Summary

Designing a scalable and well-performing Windows Server AppFabric deployment depends upon

numerous factors. Most of the key points and considerations have been presented in this guide,

and are summarized in priority order as follows:

Optimization of the storage platform installation at the system level — for example, placing data and log files on separate volumes

Choice of dedicated versus shared databases for the persistence and monitoring stores, and the SQL Server installation where they reside

Special attention to the staging data processing limitations in the monitoring store

Choice of dedicated or shared application pools, and web application/service deployment topology

Taking the information provided in this document into account during the planning of an AppFabric

deployment will help to ensure a smooth and scalable deployment, with the most common bottlenecks

already addressed by the design. The remainder comes down to good application logic design

and implementation.
