Memcached Talk

A talk on using Memcache in Ruby, given to the Ruby-on-Rails Sydney Group.

Memcache

Rob Sharp

rob@sharp.id.au

Lead Developer, The Sound Alliance

About Memcached

• conceived by Brad Fitzpatrick as a solution to the scaling issues faced by LiveJournal

• “memcached is a high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load”

Who we are

• Sydney-based online media publisher

• Community and Content Sites

• Fasterlouder.com.au

• Inthemix.com.au

• Samesame.com.au

• Thoughtbythem - Marketing Agency

• White Label Gig Ticketing

Who we are

• Inthemix.com.au

• Australia’s busiest music website

• ~ 250,000 pages per day

• Plus two other busy sites!

• Maximum performance for the hardware we have

Current Architecture

• 3 Linux servers

• Apache

• Lighttpd

• Memcache

• 1 MySQL Master

Why do we use memcached?

• CMS written in OO style, using ActiveRecord

• PHP4 and objects don't mix too well

• ActiveRecord gave us fast development but reduced performance

• Call us greedy, but we want both

• Use Rails Memcache!

Our Application

• CMS written from the ground up

• Effectively three sites running on one codebase

• Uses three separate databases, but aiming to consolidate more

• Has data namespacing implemented in most places

• But separation is not quite there yet!

Our Memcache Setup

• We have 3 webservers running memcache

• Each server runs three daemons on separate ports - one for each site (more on this later!)

Memcache Pool

• Each daemon knows about the other 2 daemons and connects to them over TCP

• This allows us to store data once, and access it from any server, whether in the pool or not

• Hashing algorithm means that a given key maps to a single server

• Efficient use of memory

• Efficient for cache clearing
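That key-to-server mapping can be sketched as simple modulo hashing (illustrative only: the algorithm memcache-client actually uses differs in detail, and the server addresses here are made up):

```ruby
require 'zlib'

# Naive key-to-server mapping: hash the key, take it modulo the number
# of servers. Every client computes the same answer, so each key lives
# on exactly one daemon -- stored once, deletable with one request.
SERVERS = ['10.0.0.1:11211', '10.0.0.2:11211', '10.0.0.3:11211']

def server_for(key, servers = SERVERS)
  servers[Zlib.crc32(key) % servers.size]
end
```

The catch is the modulo: change `servers.size` and almost every key maps somewhere new, which is exactly the remove/add problem the next slides describe.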

Memcache Pool

• But what if we lose a server? We can

• Ignore it - we simply get misses for any keys we attempt to retrieve

• Remove it - our hashing algorithm breaks... :(

• We can also add new servers to the pool after data has been stored, but the same hashing problem occurs

Memcache Pool

• Consistent hashing will solve the problem of removing or adding servers once data has been hashed

• Currently in its infancy - not really production ready

• We simply monitor our daemons and restart if required
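To make the consistent-hashing idea concrete, here is a minimal ring sketch (our own illustration, not any production library): each server is hashed onto a ring at many points, and a key belongs to the first server point at or after its own hash. Removing a server then only remaps the keys that lived on it.

```ruby
require 'zlib'

# Minimal consistent-hash ring (sketch). Each server gets `replicas`
# points on the ring; a key maps to the first server point at or after
# the key's hash, wrapping around at the end.
class HashRing
  def initialize(servers, replicas = 100)
    @ring = {}
    servers.each do |server|
      replicas.times { |i| @ring[Zlib.crc32("#{server}:#{i}")] = server }
    end
    @sorted = @ring.keys.sort
  end

  def server_for(key)
    h = Zlib.crc32(key)
    point = @sorted.find { |p| p >= h } || @sorted.first
    @ring[point]
  end
end
```

Dropping a server just means rebuilding the ring without it; keys that hashed to the surviving servers keep their placement, so only roughly 1/N of the cache is lost.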

Installing Memcached

• Available in most Linux distros

• packaged for Fedora, RHEL4/5, Ubuntu, Debian, Gentoo and BSD

• OSX? Use Ports!

• sudo port install memcached

• sudo gem install memcache-client

• sudo gem install cached_model

Memcache and Ruby

• We’ll use the memcache-client gem

• Pure Ruby implementation

• Pretty fast!

Storing Stuff

rsharp$ sudo gem install memcache-client

require 'memcache'

memcache_options = {
  :compression => true,
  :debug => false,
  :namespace => 'my_favourite_artists',
  :readonly => false,
  :urlencode => false
}

Cache = MemCache.new memcache_options
Cache.servers = 'localhost:11211'

Storing Stuff

Cache.set 'favourite_artist', 'Salvador Dali'
artist = Cache.get 'favourite_artist'

Cache.delete 'favourite_artist'

Memcache Namespaces

• Memcache doesn’t have namespaces, so we have to improvise

• Prefix your keys with a namespace by setting the namespace when you connect

• Our solution:

• Run multiple memcache instances on different ports
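The key-prefix improvisation is tiny — memcache-client applies it for you when you pass `:namespace`, but spelled out as a hypothetical helper it is just:

```ruby
# Sketch of the namespace-as-prefix trick. Two sites sharing one daemon
# can't collide on keys as long as their prefixes differ.
def namespaced_key(namespace, key)
  "#{namespace}:#{key}"
end
```

Running one daemon per site (our approach) goes further: it lets you flush or restart one site's cache without touching the others.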

Roll your own?

• Memcache-client provides basic cache methods

• What if we extended ActiveRecord?

• We can, with cached_model

Storing Stuff Part Deux

rsharp$ sudo gem install cached_model

require 'cached_model'

memcache_options = {
  :compression => true,
  :debug => false,
  :namespace => 'hifibuys',
  :readonly => false,
  :urlencode => false
}

CACHE = MemCache.new memcache_options
CACHE.servers = 'localhost:11211'

Storing Stuff Part Deux

class Artist < CachedModel

end

cached_model Performance

• CachedModel is not magic.

• CachedModel only accelerates simple finds for single rows.

• CachedModel won't cache every query you run.

• CachedModel isn't smart enough to determine the dependencies between your queries, so it can't accelerate more complicated ones. If you want to cache more complicated queries, you need to do it by hand.
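"By hand" usually means read-through caching: try the cache, fall back to the expensive query, and store the result with a TTL. A sketch, assuming any MemCache-like store that responds to get and set (the block stands in for your query; the helper name is our own):

```ruby
# Read-through caching sketch: only run the expensive query (the block)
# on a cache miss, storing its result for subsequent requests.
def fetch_cached(store, key, ttl = 600)
  value = store.get(key)
  return value unless value.nil?
  value = yield            # the complicated query you cache by hand
  store.set(key, value, ttl)
  value
end
```

You would call it as, say, `fetch_cached(CACHE, 'article_42_related') { ... }`, keeping the TTL short enough that stale data is tolerable.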

Other options

• acts_as_cached provides a similar solution

Memcache Storage

• Memcache stores blobs

• The memcache client handles marshalling, so you can easily cache objects

• This does however mean that the objects aren’t necessarily cross-language
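Concretely, "the client handles marshalling" means Ruby objects are serialised with Marshal before being stored as a blob — which is why only another Ruby client can read them back. A sketch with a made-up hash:

```ruby
# Ruby objects become Marshal blobs on the way into memcache and are
# rebuilt on the way out; other languages can't decode this format.
artist   = { :name => 'Salvador Dali', :genre => 'Surrealism' }
blob     = Marshal.dump(artist)   # roughly what goes over the wire
restored = Marshal.load(blob)     # what a Ruby client gets back
```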

Memcache Storage

• The most obvious things to store are objects

• We cache articles

• We cache collections of articles

• We cache template data

• We cache fragments

• We don’t cache SQL queries

What we cache

• Our sites are fairly big and data-rich communities

• Almost every page has editorially controlled attributes along with user generated content

• Like...

Our Example Dataset

• Article

• Joins Artists

• Joins Locations

• Joins Genres

• Joins Related Content

• Joins Related Forum Activity

• Joins Related Gallery Data

Our Example Dataset

• Article (continues)

• ...

• Joins Media Content

• Joins Comments

• Joins ‘Rollcalls’

• Joins other secret developments

Our Example

• An article requires many data facets

• Most don’t change that often

• We also know when they change

• Yay for the Observer pattern

• User content changes much more regularly

• Can be changed from outside our controlled area (e.g. Forums)
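The Observer-pattern point can be sketched in plain Ruby (class names here are illustrative, not our actual code; in Rails the same job is done with ActiveRecord::Observer): saving an article notifies observers, and one of them expires the stale cache entry.

```ruby
# Minimal Observer sketch for cache invalidation: the model notifies
# registered observers after a save, and a cache-expiry observer
# deletes the now-stale entry so the next request repopulates it.
class Article
  attr_reader :id

  def initialize(id)
    @id = id
    @observers = []
  end

  def add_observer(observer)
    @observers << observer
  end

  def save
    # ... write the article to the database here ...
    @observers.each { |o| o.after_save(self) }
  end
end

class CacheExpiryObserver
  def initialize(cache)
    @cache = cache  # anything that responds to delete(key)
  end

  def after_save(article)
    @cache.delete("article_#{article.id}")
  end
end
```

This works precisely because we know when editorial content changes; user-generated content changed from outside our controlled area (e.g. Forums) needs timed expiry instead.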

Our Example: Summary

• Data can be loosely divided into editorially controlled and user-generated

• Cache editorially controlled content separately from user-generated content

• Simplest way to implement is in fragment caching

Fragment Caching

• Memcache allows timed expiry of fragments

• Identify areas that change infrequently and cache

• Remember to measure performance before and after

• Evidence suggests very large gains!

• Use memcache_fragments

Caching Fragments

rsharp$ sudo gem install memcache_fragments

require 'memcache_fragments'

memcache_options = {
  :compression => true,
  :debug => false,
  :namespace => 'hifibuys',
  :readonly => false,
  :urlencode => false
}

CACHE = MemCache.new memcache_options
CACHE.servers = 'localhost:11211'

Caching Fragments

ActionController::Base.fragment_cache_store = :mem_cache_store, {}
ActionController::Base.fragment_cache_store.data = CACHE
ActionController::CgiRequest::DEFAULT_SESSION_OPTIONS.merge!({ 'cache' => CACHE })

Caching Fragments

<% cache 'my/cache/key', :expire => 10.minutes do %>
  ...

<% end %>

Memcache Sessions

• We could store our session in Memcache

• Great for load balancing - share across a server farm without using a DB store

• Ideal for transient data

• Solution exists in db_memcache_store

• DB backend with memcache layer - the best of both worlds

In Summary

• Memcache gives you a distributed cache store

• Very fast and very easy to use

• Lots of ruby and rails libraries

• memcache-client

• cached_model

• db_memcache_store

• memcache_fragments

Any Questions?
