Everything you wanted to know about
Velocity
(but were afraid to cache)
Scott Colestock
Marcato Partners, LLC
What is it?
Velocity is a distributed in-memory key/value cache that provides .NET developers with a way to increase
performance and scalability when writing data-centric applications.
What is it? (2)
• The combined RAM available to all servers in a
Velocity cluster is presented to Velocity clients
as a unified whole
• Any serializable CLR object can be stored
– Actual location within cluster is transparent
– Client is a simple key/value API at heart
• Run as a service accessed across the network
• Additional servers can be added on demand
What we’ll cover
• What motivates this product/technology
• Terms / Pictures / Concepts
• Deploy / Install Process
• A lap around the API & Admin model
• Demos
• Gotchyas
Motivation
• Data-centric applications have been the norm for a long while
– Relational data
– More recently, “service-obtained” data
• Velocity is about increasing performance by bringing the data physically closer to the consumer
– Reduce pressure on underlying data stores/services
• Velocity can be about storing data in value-added form (logically closer to the consumer)
– Object graphs
– Output caching (not explicit in V1)
– Aggregated data in xml or other transformed formats
Motivation (2)
• Databases are always a point of high contention
as you scale out, and tuning is expensive
– Are your data retrieval sprocs getting harder to
maintain - excessive sql chops required?
• Service calls for reference data (internal/external)
are often slow or intentionally throttled
• Caching has always been considered a solution
for these issues…
Motivation (3)
• Machine-local caching solutions (like Microsoft’s “Enterprise Library Caching Application Block”) can provide a partial answer
– Easy key/value API
– Flexible store (memory, disk-backed, etc.)
– Flexible expiration and eviction policy
• Limitations:
– Limited by the memory available to a single node…
– Application recycles typically mean you lose the cache
– In a load-balanced environment, a large data set means you will frequently “miss” when attempting to load from cache…
Motivation (4)
[Diagram: a Load Balancer in front of machine-local caches holding keys 3,5,23; keys 7,11,47; and keys 12,16,33]
Machine-local caches wind up being sparsely populated when used with a load balancer (if the data set has many keys)
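The sparse-population effect can be shown with a small simulation. This is only a sketch under stated assumptions: the round-robin balancer, the three servers, and the request pattern (ten keys, each requested three times in a row) are all made up for illustration and are not taken from the deck.

```python
def count_misses(requests, server_count, shared):
    """Count cache misses for a request stream behind a round-robin balancer.

    shared=False: every server keeps its own machine-local cache.
    shared=True:  all servers consult one common cache.
    """
    caches = [set()] if shared else [set() for _ in range(server_count)]
    misses = 0
    for i, key in enumerate(requests):
        # Assumed round-robin balancer: request i lands on server i % server_count.
        cache = caches[0] if shared else caches[i % server_count]
        if key not in cache:
            misses += 1          # would trigger a trip to the database/service
            cache.add(key)
    return misses

# Hypothetical workload: keys 0..9, each requested three times in a row.
requests = [key for key in range(10) for _ in range(3)]
local_misses = count_misses(requests, server_count=3, shared=False)
shared_misses = count_misses(requests, server_count=3, shared=True)
```

With this workload every one of the 30 requests misses when caches are machine-local, because each repeat of a key lands on a different server. A single shared cache misses only 10 times, once per key.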
Motivation (5)
• Without a distributed cache, you have no central place to update/delete
• This means you can only cache data that can afford to be stale by some time period
– If the time period is short, you need a low TTL (time-to-live, aka expiration) which means more cache misses
• You can’t cache data that must have changes visible to the system in (near) real time
• With a distributed cache, you have one cache to shoot in the event of an update/delete
– Might be able to live with no expiration
What we’ll cover
• What motivates this product/technology
• Terms / Pictures / Concepts
• Deploy / Install Process
• A lap around the API & Admin model
• Demos
• Gotchyas
Windows Server AppFabric Caching
• History: AppFabric caching was a separate component
– Public debut at TechEd 2008 (earlier?)
– Codename: Velocity
• “Dublin” was a separate effort, focused on providing a hosting and management environment around WCF/WF
• November 2009: Technologies grouped under heading of “Windows Server AppFabric”
Relationship to Windows Azure AppFabric
• Service bus: Handle communication and authentication for accessing applications
– Expose apps through firewalls, NAT gateways, etc.
– Assist cloud-based apps talking to on-premise apps
– Other composite app scenarios; pub/sub
• Access Control Service: Allow you to avoid setting up federated identity agreements just to grant partner/customer access to your cloud-based or on-premise apps.
• Today: Only common marketing/branding with Windows Server AppFabric.
• Later: Common services for both
Cache-Aside Pattern
• In the current version, the out-of-box support
is for the “cache-aside” pattern.
– Check cache
– If miss, retrieve data, then populate the cache
• Lots of other patterns you might contemplate
(and simulate) with what is provided
– Read-through/Write-through
– Refresh-ahead/Write-behind
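The cache-aside steps above can be sketched in a few lines. This is a minimal illustration, not the AppFabric client API: a plain dictionary stands in for the distributed cache, and `load_from_source` is a hypothetical stand-in for the slow database or service call.

```python
calls = []   # records every trip to the (hypothetical) source of truth

def load_from_source(key):
    """Hypothetical slow lookup against a database or service."""
    calls.append(key)
    return f"value-for-{key}"

cache = {}   # stand-in for the distributed cache

def get_with_cache_aside(key):
    value = cache.get(key)             # 1. check cache
    if value is None:                  # 2. miss: go to the source of truth
        value = load_from_source(key)
        cache[key] = value             # 3. populate the cache for the next caller
    return value

first = get_with_cache_aside("product:42")    # miss, loads from source
second = get_with_cache_aside("product:42")   # hit, served from cache
```

The application code owns all three steps here. That is the difference from read-through, where the cache itself would know how to call the loader on a miss.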
Logical Hierarchy
[Diagram: a Cache Cluster of Cache Hosts A, B, and C (on Servers A, B, and C), showing the Named Cache “Product Catalog”, the Default Cache, and regions labeled Sports, Region 1, and Region 3]
• Client apps work with a single logical unit of cache
• Server process is DistributedCacheService.exe
• Caches explicitly created with TTL, expiration, HA policy
• Regions represent a partition of data (subset of key/value pairs). Live on one node. Unit of replication/failover.
• Regions can be implicit or explicit. Use explicit only for bulk gets or searching.
Logical Hierarchy
[Diagram: the Named Cache “Product Catalog” and the Default Cache, with Region “Sports” and Region 1; a region holds items like these:]
ID (Key)    Payload (Value)    Tags/VersionInfo
1           Foo                …
2           Bar                …
3           Baz                …
Physical Layout
[Diagram: a Load Balancer in front of Web Servers A, B, and C (IIS 7.x), alongside a Cache Cluster of Cache Servers A, B, and C, each running a Cache Host]
• Cache servers designed to run in a domain
• Caches can have access control applied…
• Consider the nature of data stored in cache, and secure appropriately (don’t let cache be weakest link)
Combined Deployment
[Diagram: a Load Balancer in front of Web Servers A, B, and C, each running IIS 7.x together with a Cache Host]
Physical Layout
[Diagram: a Load Balancer in front of Web Servers A, B, and C (IIS 7.x), a Cache Cluster of Cache Servers A, B, and C (each running a Cache Host), and a Config Store (file share or Sql Server)]
• Configuration store contains cache policies and global partition map (how keys divide into regions, which servers have which regions)
• If Sql config store, servers will send heartbeat to Sql. Otherwise, heartbeat goes to one or more “lead hosts”
• Partition map used by “Global Partition Manager” (one node in the cluster, but auto failover) to communicate routing information to Velocity clients
Regions as unit of replication/failover
(Global Partition Manager in action)
[Diagram: a Cache Cluster of Cache Hosts A, B, and C (on Servers A, B, and C), showing the Named Cache “Product Catalog”, the Default Cache, Region “Sports”, and Region 1]
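The idea of a partition map (keys divide into regions, regions live on hosts, and a region moves as a unit on failover) can be sketched as follows. Everything specific here is an assumption made for illustration: the region count, the CRC32 hash, the initial placement, and the round-robin reassignment are not how the deck says Velocity does it.

```python
import zlib

REGION_COUNT = 8                                      # assumed number of regions
HOSTS = ["CacheHostA", "CacheHostB", "CacheHostC"]    # host names from the diagrams

# Hypothetical partition map: region index -> host currently owning that region.
partition_map = {region: HOSTS[region % len(HOSTS)] for region in range(REGION_COUNT)}

def region_for(key):
    """Map a key to a region with a stable hash (CRC32 is an assumption)."""
    return zlib.crc32(key.encode("utf-8")) % REGION_COUNT

def host_for(key):
    """What a client does with the routing information: key -> region -> host."""
    return partition_map[region_for(key)]

def fail_over(dead_host, survivors):
    """Move every region owned by dead_host to the survivors, round-robin."""
    orphaned = [region for region, host in partition_map.items() if host == dead_host]
    for i, region in enumerate(orphaned):
        partition_map[region] = survivors[i % len(survivors)]
    return orphaned

before = host_for("product:42")
moved = fail_over("CacheHostB", ["CacheHostA", "CacheHostC"])
after = host_for("product:42")
```

Note that a key never changes region; only the region's owner changes. That is why the region, not the individual key, is the natural unit of failover.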
Regions as unit of replication/failover
(When using Secondaries)
[Diagram: a Cache Cluster of Cache Hosts A, B, and C (on Servers A, B, and C), showing the Named Cache “Product Catalog”, the Default Cache, Region “Sports”, and Region 1, plus secondary copies labeled “Sports secondary” and “Region 1 secondary”]
(Updates done synchronously)
Local Cache
[Diagram: a Load Balancer in front of Web Servers A, B, and C (IIS 7.x), each with its own Local Cache, alongside a Cache Cluster of Cache Servers A, B, and C, each running a Cache Host]
• Local cache is an option that can be enabled when creating the cache client (DataCacheFactory)
• Allows a local cache to be populated that will prevent network hop (and serialization) if request can be satisfied locally
• Best when data set is (relatively) small, changes infrequently, and stale data is acceptable
• Can expire via TTL or notifications (which might be late/lost)
• Can specify max object count before evicting LRU
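The two local cache knobs mentioned above, TTL expiry and a maximum object count with LRU eviction, can be sketched like this. The `LocalCache` class, its parameter names, and the injected clock are hypothetical and exist only for this illustration; they are not the AppFabric local cache implementation, and notification-based invalidation is left out.

```python
from collections import OrderedDict

class LocalCache:
    """Toy in-process cache with TTL expiry and LRU eviction by object count."""

    def __init__(self, max_objects, ttl_seconds, clock):
        self._items = OrderedDict()   # key -> (value, expires_at); least recently used first
        self._max = max_objects
        self._ttl = ttl_seconds
        self._clock = clock           # injected so the example is deterministic

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None               # local miss: caller goes to the cache cluster
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]      # TTL elapsed: treat as a miss
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._items[key] = (value, self._clock() + self._ttl)
        self._items.move_to_end(key)
        if len(self._items) > self._max:
            self._items.popitem(last=False)   # evict the least recently used item

now = [0.0]                                    # fake clock, in seconds
local = LocalCache(max_objects=2, ttl_seconds=30, clock=lambda: now[0])
local.put("a", 1)
local.put("b", 2)
local.get("a")        # touching "a" makes "b" the least recently used
local.put("c", 3)     # over the limit: "b" is evicted
```

Until the TTL elapses, a locally cached object can be stale relative to the cluster, which is why this option suits small, slow-moving data sets.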
Data Types and Caching Considerations
• Reference Data: Product catalogs, “lookup” tables, other slow-moving content
– Safe to cache for a defined period of time because you probably live with staleness already
– “Local” cache option might be desirable for small data sets
• Activity Data: Shopping carts or other transient transaction state
– Accessed for read and write operations, but not shared. Low/No concurrency considerations – exclusive write.
– Safe to cache for reads and keep in cache for writes
• Resource Data: Inventory, Orders, and other core transactional data
– Accessed concurrently for read and write
– Caching will require a concurrency model to be chosen and managed
What we’ll cover
• What motivates this product/technology
• Terms / Pictures / Concepts
• Deploy / Install Process
• A lap around the API & Admin model
• Demos
• Gotchyas
Deploy/Install Considerations
• Windows “Application Server” Role required
• Hotfix required for Vista/Win2k8; not for Win7/Win2k8R2
• You’ll need Powershell 2 (already in Win7/Win2k8R2)
• .NET 3.5 SP1 for cache clients; .NET 4 for servers
• Windows XP cannot be a client…
• “Install” and “Configure” for AppFabric are two distinct steps (much like BizTalk)
Deploy/Install Considerations
• Primary screen of
interest is choosing your
configuration store:
– XML/File share
– Sql-Based
• File share avoids the
need for Sql Server, but
requires that some
nodes in the cache
cluster be special (“Lead
Hosts”)
• Using Sql as the
configuration store is
the better engineering
choice for production –
you may have other
reasons to avoid it.
Deploy/Install Considerations
• As you build out your Velocity Cache Cluster,
you will do “New Cluster” on the first node,
and “Join Cluster” on subsequent nodes
• Ultimately, all of Windows Server AppFabric is
a set of features underneath the Application
Server Role – so standard command line
installations work.
– Setup.exe /i CacheAdmin,CacheService,CacheClient
AppFabric as Application Server
“Role Service”
Deploy/Install Considerations
• Can do a “Cache client” install for clients, or
for internal apps, just incorporate client
assemblies in your own build/deploy process
– Microsoft.ApplicationServer.Caching.Core.dll
– Microsoft.ApplicationServer.Caching.Client.dll
– Microsoft.WindowsFabric.Common.dll
– Microsoft.WindowsFabric.Data.Common.dll
What we’ll cover
• What motivates this product/technology
• Terms / Pictures / Concepts
• Deploy / Install Process
• A lap around the API & Admin model
• Demos
• Gotchyas
Caching Classes
DataCacheFactory
– DataCacheFactory()
– DataCacheFactory(configuration)
– DataCache GetCache(string cache)
– GetDefaultCache()
DataCacheFactoryConfiguration (Can set these via configuration)
– LocalCacheProperties
– NotificationProperties
– SecurityProperties
– DataCacheServerEndpoint[] Servers
DataCache
– Add: Adds a new object to the cache. Exception if the item is already in the cache.
– Put: Adds a new object to the cache. Replaces if already in cache.
– Get: Returns an object from the cache.
– Remove: Removes an object from the cache.
Caching Classes
DataCache with DataCacheItemVersion
• GetCacheItem: returns tags and version info
• GetIfNewer: lets you use that version info!
• Put and Remove have overloads that take
version info
– Allows for an optimistic concurrency model
– Will only succeed if version information matches
what is current for the cached item
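The optimistic concurrency model described above can be sketched with an in-memory stand-in. The `VersionedCache` class, the `expected_version` parameter, and the `VersionMismatch` exception are hypothetical names for this illustration; the real client expresses the same idea with DataCacheItemVersion and its own exception type.

```python
class VersionMismatch(Exception):
    """Raised when the caller's version is no longer the current one."""

class VersionedCache:
    """Toy cache where every write stamps the item with a new version number."""

    def __init__(self):
        self._items = {}     # key -> (value, version)
        self._counter = 0

    def get_item(self, key):
        return self._items.get(key)     # (value, version), or None if absent

    def put(self, key, value, expected_version=None):
        current = self._items.get(key)
        if expected_version is not None:
            # Conditional write: only succeeds if nobody wrote in between.
            if current is None or current[1] != expected_version:
                raise VersionMismatch(key)
        self._counter += 1
        self._items[key] = (value, self._counter)
        return self._counter

cache = VersionedCache()
cache.put("stock:widget", 10)

# Two writers read the same item (and therefore the same version).
_, version_seen_by_alice = cache.get_item("stock:widget")
_, version_seen_by_bob = cache.get_item("stock:widget")

cache.put("stock:widget", 9, expected_version=version_seen_by_alice)   # succeeds
try:
    cache.put("stock:widget", 8, expected_version=version_seen_by_bob)
    bob_succeeded = True
except VersionMismatch:
    bob_succeeded = False    # Bob must re-read and retry with the new version
```

Nobody is ever blocked in this model. The cost is that the losing writer has to re-read and retry, so it fits data where conflicts are rare.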
DataCache and Locking
• GetAndLock: Allows you to lock a cache item
for a specified time period, even if not present
– (Will fail if already locked)
– public Object GetAndLock (string key, TimeSpan timeout,
out DataCacheLockHandle lockHandle, bool forceLock)
• PutAndUnlock: Unlock an item, with given key
and lock handle
• Unlock: Explicitly unlock, optionally extend TTL
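The pessimistic locking flow above (lock with a timeout, fail if already locked, write and release with the lock handle) can be sketched as follows. The `LockingCache` class, its method names, and the `ItemLocked` exception are hypothetical stand-ins, and the rule that an expired lock simply lapses is an assumption of this sketch, not documented AppFabric behavior.

```python
import itertools

class ItemLocked(Exception):
    """Raised when an item is locked by someone else or the handle is wrong."""

class LockingCache:
    """Toy cache with per-key locks that lapse after a timeout."""

    def __init__(self, clock):
        self._values = {}
        self._locks = {}                      # key -> (handle, expires_at)
        self._handles = itertools.count(1)    # source of unique lock handles
        self._clock = clock                   # injected so the example is deterministic

    def get_and_lock(self, key, timeout_seconds):
        lock = self._locks.get(key)
        if lock is not None and self._clock() < lock[1]:
            raise ItemLocked(key)             # someone else holds a live lock
        handle = next(self._handles)
        self._locks[key] = (handle, self._clock() + timeout_seconds)
        return self._values.get(key), handle  # value is None if the key is not present yet

    def put_and_unlock(self, key, value, handle):
        lock = self._locks.get(key)
        if lock is None or lock[0] != handle:
            raise ItemLocked(key)             # not the current lock holder
        self._values[key] = value
        del self._locks[key]

now = [0.0]                                   # fake clock, in seconds
cache = LockingCache(clock=lambda: now[0])

value, handle = cache.get_and_lock("cart:7", timeout_seconds=5)   # key not present yet
try:
    cache.get_and_lock("cart:7", timeout_seconds=5)
    second_locker_blocked = False
except ItemLocked:
    second_locker_blocked = True

cache.put_and_unlock("cart:7", ["book"], handle)
value_after, handle2 = cache.get_and_lock("cart:7", timeout_seconds=5)
```

The timeout matters: if the lock holder crashes, the item becomes lockable again once the timeout passes instead of staying locked forever.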
DataCache and Tags/Regions
• Explicitly created regions live on a single
node…can create a hot spot for both call
volume and memory growth
• But they offer bulk retrieval and flexible tag-based
retrieval
• Instead of regions: can simulate secondary
indexes with your own secondary-to-primary
mapping
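The "secondary-to-primary mapping" idea can be sketched with ordinary key/value operations. This is an illustration only: a dictionary stands in for the cache, and the key naming scheme (`product:<id>`, `index:category:<name>`) and the helper functions are made up for the example.

```python
cache = {}   # stand-in for the distributed cache

def index_key(category):
    """Hypothetical key under which the list of product ids for a category is stored."""
    return f"index:category:{category}"

def put_product(product_id, product):
    cache[f"product:{product_id}"] = product              # primary entry
    ids = cache.get(index_key(product["category"]), [])
    if product_id not in ids:
        # Secondary entry: category -> list of primary keys.
        cache[index_key(product["category"])] = ids + [product_id]

def get_by_category(category):
    ids = cache.get(index_key(category), [])
    products = (cache.get(f"product:{pid}") for pid in ids)
    # Skip ids whose primary entry has since expired or been evicted.
    return [product for product in products if product is not None]

put_product(1, {"name": "Football", "category": "sports"})
put_product(2, {"name": "Kettle", "category": "kitchen"})
put_product(3, {"name": "Racket", "category": "sports"})
sports_names = [product["name"] for product in get_by_category("sports")]
```

Because the index entry and the products are ordinary keys, they can spread across the whole cluster rather than pinning to one node. The trade-off is that updating the index is a read-modify-write, so against a real shared cache it would need the version or lock overloads covered earlier.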
Administrative Model
• Administration for AppFabric Caching done purely through PowerShell
• Can administer entire Cache Cluster from wherever administrative portion of install has been done – all nodes addressable from single command line location
• Use-CacheCluster points the shell at a particular cluster to administer
• Remember: Get-CacheHelp ☺
What we’ll cover
• What motivates this product/technology
• Terms / Pictures / Concepts
• Deploy / Install Process
• A lap around the API & Admin model
• Demos
• Gotchyas
Gotchyas
• Not a gotchya: AppFabric provides a SessionStoreProvider class that plugs into the ASP.NET session storage provider model
• Balance number of nodes in cluster with memory per node.
– Too many nodes = cluster overhead, too much memory per node = GC overhead
• If you don’t use Sql Config Store, you need to manually run Start-CacheHost after reboot
• Sql Config Store requires high Sql privileges right now at point of install
• Currently service runs as network service account
• Consider what you will do when cache is down
– You can go after source of truth
– How do you avoid leaving stale data in the cache?
Thank you -
Questions?