A modern HTTP accelerator for content providers
Leszek Urbański, Trader Media East Competence Center
unabridged version, PLNOG 4 – Warsaw, 2010-03-05
Modern web apps
● you know this picture...
Modern web apps
● a nice setup...
Modern web apps
● ...but the application is still slow.
● you need efficient web caching, before the traffic hits your app
● CDNs? Hardware? Squid?
Web caching
● CDNs
● expensive
● you are completely dependent on a CDN's service
● hardware
● nice, but...
● $30,000 for one BIG-IP 3600 without redundancy, support, and only 50 Mbps compression with the standard licence
Web caching
● Squid
● a forward proxy/cache with optional reverse proxying (HTTP acceleration)
● huge config files full of forward-proxy options
● it's slow
● "1970s programming"
http://varnish-cache.org/
Varnish
● a state-of-the-art reverse proxy and cache
● open source, initially developed for a Norwegian tabloid “Verdens Gang” in 2006
● Poul-Henning Kamp – architect and lead developer
● Linpro AS
Varnish
● used by TOP100 sites
● Twitter
● Photobucket
● weather.com
● answers.com
● Hulu
● Wikia
● source: Ingvar Hagelund, http://users.linpro.no/ingvar/varnish/stats-2010-01-18.txt
Varnish
● used by only one Alexa TOP100 site in Poland
● Gadu-Gadu
Architecture
● Varnish does not fight the OS kernel!
● uses virtual memory, two main stevedores:
● mmap()
● malloc()
● scales well in SMP environments
● event-based acceptor
● multi-threaded worker model
Architecture
● avoids expensive memory operations
● workers used in MRU order, session lingering
● a worker has a private set of variables on the stack
● static buffers – reused
● uses the jemalloc library; no noticeable difference vs. Google's tcmalloc
Architecture
● workspaces
● operate on pointers, do not copy data
● malloc() only for the workspaces
● obj_workspace – per object, for request/response headers and metadata. Watch out for very large headers/cookies!
● sess_workspace – per thread, for request processing
● shm_workspace – for SHM logging
Architecture
● SHM logging
● an mmap()ed file shared by all threads and logging programs
● logging without syscalls!
memcpy(p + SHMLOG_DATA, t.b, l);
/* or */
vsnprintf((char *)(p + SHMLOG_DATA), mlen + 1, fmt, ap);
Architecture
● object eviction from an LRU list
● the list requires locking for writes
● an object is only moved in the LRU list if it hasn't been moved for the last lru_interval seconds
● hitpass objects
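As a sketch of how hit-for-pass objects typically arise in 2.0-era VCL (the Set-Cookie condition is an illustrative choice, not from the slides):

```vcl
sub vcl_fetch {
  # responses that set cookies are per-user; caching them would be wrong
  if (obj.http.Set-Cookie) {
    # "pass" from vcl_fetch records the decision in the cache:
    # later lookups hit this marker object (a "hitpass") and go
    # straight to pass instead of queueing behind a fetch
    pass;
  }
}
```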
Architecture
● efficient object purging – the "ban list"
● need to purge 200,000 objects from the cache without overloading the server?
● Varnish keeps a list of purges
● every object is tested against the list, but only if requested by a client
● if it matches, it is refreshed from a backend
● its "last tested against" pointer is updated
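Pushing a purge onto the ban list is a single management-CLI command; the syntax below is the Varnish 2.0 `purge.url` form and the URL pattern is illustrative:

```shell
$ varnishadm -T localhost:6082 purge.url "^/images/2010/"
```

None of the 200,000 matching objects is touched immediately; each is tested against the entry the next time a client asks for it.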
Architecture
● results?
● microsecond-level response for cached objects
● good even for static content
● performance limit currently unknown :-)
● 75,000 reqs/s achieved at TMECC
● 143,000 reqs/s achieved by Kristian from Redpill-Linpro
Architecture
● serving a request from cache:
<... futex resumed> ) = 0 <0.629910>
futex(0x7f2a577fe2e8, FUTEX_WAKE_PRIVATE, 1) = 0 <0.000011>
ioctl(9, FIONBIO, [0]) = 0 <0.000011>
read(9, "GET /logo.png HTTP/1.0\r\n (...) 8191) = 177 <0.000016>
clock_gettime(CLOCK_REALTIME, {1265632945, 828835974}) = 0 <0.000011>
clock_gettime(CLOCK_REALTIME, {1265632945, 828986444}) = 0 <0.000011>
clock_gettime(CLOCK_REALTIME, {1265632945, 829032564}) = 0 <0.000010>
writev(9, [{"HTTP/1.1"..., 8}, (...) 12912}], 32) = 13227 <0.000039>
clock_gettime(CLOCK_REALTIME, {1265632945, 830411262}) = 0 <0.000011>
close(9) = 0 <0.000019>
futex(0x44884bf4, FUTEX_WAIT_PRIVATE, 239, NULL <unfinished ...>
● 10 system calls, 4 for the clock
Features
● run-time management and reconfiguration
$ telnet localhost 6082
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
vcl.list
200 23
active 7 boot
vcl.load new1 /etc/varnish/default.vcl
200 13
VCL compiled.
vcl.use new1
200 0
Features
● comprehensive logging and management
● varnishadm
● varnishlog
● varnishncsa
● varnishtop
● varnishstat
● varnishhist
● varnishreplay
● varnishtest
Features
● logging examples
● tags
varnishtop -i RxURL
varnishtop -i TxURL
varnishtop -i RxHeader -I '^User-Agent'
varnishlog -c -o ReqStart 10.0.0.1
varnishlog -b -o TxHeader '^X-Forwarded-For: .*10.0.0.1'
Features
● varnishstat – real-time statistics
client_conn      87737603       99.74 Client connections accepted
client_req      335496200      381.40 Client requests received
cache_hit       307936704      350.07 Cache hits
cache_hitpass      811746        0.92 Cache hits for pass
backend_conn     12311926       14.00 Backend conn. success
n_object           549675         .   N struct object
n_wrk                 100         .   N worker threads
n_expired        23826372         .   N expired objects
n_lru_nuked             0         .   N LRU nuked objects
n_wrk_failed            0        0.00 N worker threads not created
s_req           335510357      381.41 Total Requests
s_pass            2947900        3.35 Total pass
s_fetch          27317481       31.05 Total fetch
sma_nbytes     6661407561         .   SMA outstanding bytes
sma_balloc  2173616292374         .   SMA bytes allocated
sma_bfree   2166954884813         .   SMA bytes free
backend_req      27318738       31.06 Backend requests made
esi_parse               0        0.00 Objects ESI parsed (unlock)
esi_errors              0        0.00 ESI parse errors (unlock)
Features
● timing information
830 ReqEnd c 877345549 1233949945.075706005 1233949945.075754881 0.017112017 0.000022888 0.000025988
● fields: type, XID, start time, end time, accept()-to-processing, processing-to-delivery, delivery time
Features
● backend load balancing – directors
● round-robin
● random
● backend health polling – using new connections
● grace
● URL serialization
● IPv6 support
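A sketch of a round-robin director with health polling, in Varnish 2.0 syntax; the hosts, probe URL, and thresholds are illustrative:

```vcl
backend web1 {
  .host = "10.0.0.10";
  .port = "80";
  .probe = {
    .url = "/ping";    # polled over new connections
    .interval = 5s;
    .timeout = 1s;
    .window = 8;       # look at the last 8 polls...
    .threshold = 3;    # ...at least 3 must succeed
  }
}
backend web2 {
  .host = "10.0.0.11";
  .port = "80";
  .probe = {
    .url = "/ping";
    .interval = 5s;
    .timeout = 1s;
    .window = 8;
    .threshold = 3;
  }
}
director cluster1 round-robin {
  { .backend = web1; }
  { .backend = web2; }
}
sub vcl_recv {
  set req.backend = cluster1;
}
```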
Features
● no forward-proxy support – can be done, but with a huge amount of configuration magic
● flexible purging
purge req.http.host == foobar.com && req.url ~ ^/directory/.*$
purge obj.http.Cookie ~ example=true
● ESI support
Features
● Edge Side Includes
● a markup language for dynamic content assembly
● used by Akamai, IBM WebSphere, F5, Varnish
● without ESI: page-level caching decisions
● with ESI: a page can be split into separate blocks and assembled by the cache server
Features
● Edge Side Includes
● Varnish implements a small subset of ESI
● no compression support yet
● no If-Modified-Since support yet
<esi:include src="/esi/hot_news.html"/>
<esi:remove><a href="/something">something</a></esi:remove>
<!--esi<p>Hot news:<esi:include src="/hot_news.html"/></p>-->
Features
● VCL – Varnish Configuration Language
● a domain-specific language
● translated to C and compiled
● dynamically loaded
● similar to C, Perl
● operators: = == ! && || ~ !~
● character escaping like in URLs: %nn
● no user-defined variables; use HTTP headers:
set req.http.something = "";
unset req.http.something;
Features
● VCL – Varnish Configuration Language
● "normal" "concatenated" "strings" or long strings:
{"string
string"}
● synthetic {"string"}
● if () {} elsif {}
● no loops
● include "file.vcl";
● regsub(), regsuball()
Features
● VCL – Varnish Configuration Language
● user-defined subroutines
sub f {
  do_magic;
}
call f;
● no arguments / return values in subs
● return(); exclusive to internal VCL functions
● special variables: now (unix time), client.ip, server.ip, server.port, server.identity
Features
● VCL – Varnish Configuration Language
● ACLs
acl localnet {
  "localhost";
  "10.0.0.0/24";
  ! "10.0.0.1";
}
if (client.ip ~ localnet) {
  do_magic;
}
● security.vcl
● if everything else fails... embedded C!
Features
● embedded C in VCL
● example: syslog logging from VCL (don't :-)
C{
  #include <syslog.h>
}C
C{
  syslog(LOG_INFO, "Something happened at VCL line XX.");
  syslog(LOG_ERR, "Response from backend: XID %s request %s %s \"%s\" %d \"%s\" \"%s\"",
      VRT_r_req_xid(sp), VRT_r_req_request(sp),
      VRT_GetHdr(sp, HDR_REQ, "\005Host:"), VRT_r_req_url(sp),
      VRT_r_obj_status(sp), VRT_r_obj_response(sp),
      VRT_GetHdr(sp, HDR_OBJ, "\011Location:"));
}C
VCL
● request path through VCL
● vcl_recv
● vcl_pipe
● vcl_pass
● vcl_hash
● vcl_{hit,miss}
● vcl_fetch
● vcl_deliver
● http://varnish-cache.org/wiki/VCLExampleDefault
● this graph is oversimplified!
VCL
● vcl_recv
● called at the beginning, after the request has been received
● possible returns: error, pass, pipe, lookup
● example variables: req.request, req.url, req.proto, req.backend, req.backend.healthy, req.http.Header
if (req.http.host == "static.foo.com" && req.url ~ "^/static/.*") {
  set req.backend = cluster1;
} else {
  error 404 "Unknown virtual host";
}
VCL
● vcl_pipe
● called when entering pipe mode
● shifts bytes back and forth (client ↔ backend)
● possible returns: error, pipe
● example variables: bereq.request, bereq.url, bereq.proto, bereq.http.Header
● timeouts
VCL
● vcl_pass
● called when entering pass mode
● the request is passed to the backend without caching
● possible returns: error, pass
● example variables: bereq.*
VCL
● vcl_hash
● called on object lookup
● generates a user-configurable object hash
● possible returns: hash
● example variables: req.hash
sub vcl_hash {
  if (req.url ~ "^/content" && req.http.Cookie ~ "adult=true") {
    set req.hash += "adultContent";
    set req.http.X-Adult-Content = "1";
  }
}
VCL
● vcl_hit
● called after lookup when hit
● possible returns: error, pass, deliver
● example variables: obj.hits, obj.ttl
● caveat: do not modify the object here!
● example: adaptive TTLs:
if (req.http.host ~ "^images\.") {
  if (obj.hits > 5 && obj.hits < 10) {
    set obj.ttl = 8h;
  } elsif (obj.hits >= 10) {
    set obj.ttl = 2d;
  }
}
VCL
● vcl_miss
● called after lookup when missed
● possible returns: error, pass, fetch
● example variables: bereq.*
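A minimal vcl_miss sketch; stripping cookies here is an illustrative use of the writable bereq.* variables, not from the slides:

```vcl
sub vcl_miss {
  # bereq.* is writable here: don't forward client cookies
  # to the backend when fetching anonymous content
  unset bereq.http.Cookie;
  fetch;
}
```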
VCL
● vcl_fetch
● called after the object has been fetched
● possible returns: error, pass, deliver, esi
● example variables: obj.hits, obj.proto, obj.status, obj.response, obj.cacheable, obj.ttl, obj.lastuse
● obj.cacheable means: obj.status is 200, 203, 300, 301, 302, 410 or 404
● forced obj.ttl first set here
● obj. is called beresp. in trunk
VCL
● vcl_fetch
● ESI processing takes place here
sub vcl_fetch {
  if (req.url ~ "/esi/" || obj.http.X-ESI) {
    esi;
    set obj.ttl = 1d;
  } elsif (req.url == "/counter.cgi") {
    set obj.ttl = 1m;
  }
}
<!-- /esi/example.html -->
<esi:include src="/counter.cgi"/>
VCL
● vcl_deliver
● called before delivery to the client
● possible returns: error, deliver
● example variables: resp.proto, resp.status, resp.response, resp.http.HEADER
● modify headers for the client here
set resp.http.X-Served-By = server.identity;
if (obj.hits > 0) {
  set resp.http.X-Varnish-Hit = "HIT";
  set resp.http.X-Varnish-Hits = obj.hits;
} else {
  set resp.http.X-Varnish-Hit = "MISS";
}
VCL
● vcl_error
● called on errors
● possible returns: deliver
● example variables: req.*, obj.*
● customizing error pages:
sub vcl_error {
  if (req.url ~ "^/MONITOR.txt$") {
    synthetic {"MONITOR
"};
    deliver;
  }
}
VCL
● restarts
● the "restart" keyword sends the request all the way back to vcl_recv, available everywhere
sub vcl_fetch {
  if (obj.status >= 500) {
    restart;
  }
}
sub vcl_error {
  if (obj.status == 500 && req.restarts < 4) {
    restart;
  }
}
VCL
● restarts
● the "restart" keyword – you can even try another data center
sub vcl_recv {
  if (req.restarts == 0) {
    set req.backend = data_center_1;
  } elsif (req.restarts == 1) {
    set req.backend = data_center_2;
  }
}
VCL
● things to remember
● the req. data structure is available throughout the VCL (except in vcl_deliver)
● do not modify objects in vcl_hit (except for TTL)
● if unsure, translate the VCL to C
varnishd -C -f file.vcl
● look for VRT_count(sp, X) for ordering
● if you don't return in a vcl_* sub, the default VCL for that function is appended
VCL examples
● purging, "the squid way"
sub vcl_recv {
  if (req.request == "PURGE") {
    # requires an "acl purge { ... }" definition
    if (!client.ip ~ purge) {
      error 405 "Not allowed";
    }
    lookup;
  }
}
sub vcl_hit {
  if (req.request == "PURGE") {
    set obj.ttl = 0s;
    error 200 "Purged";
  }
}
sub vcl_miss {
  if (req.request == "PURGE") {
    error 404 "Not found";
  }
}
VCL examples
● saint mode (trunk only)
● do not send errors to clients
sub vcl_fetch {
  if (beresp.status >= 500) {
    set beresp.saintmode = 20s;
    restart;
  }
  set beresp.grace = 30m;
}
● saint mode will disable a backend for a specified period of time
● if all backends are unavailable – grace
VCL examples
● force grace on error
sub vcl_error {
  if (req.restarts == 0) {
    set req.http.X-Serve-Graced = "1";
    restart;
  }
}
sub vcl_recv {
  if (req.http.X-Serve-Graced && req.restarts == 1) {
    set req.backend = dead;
  }
}
● define a "dead" backend with health polling
● when restarted from vcl_error, graced content will be served
VCL examples
● URL rewriting
if (req.http.host ~ "^(www\.)?foo" && req.url ~ "^/images/") {
  set req.http.host = "images.foo";
  set req.url = regsub(req.url, "^/images/", "/");
}
● redirects (a bit of a hack)
sub vcl_recv {
  if (req.http.host ~ "^(www\.)?foo.com" && req.http.User-Agent ~ "iPhone|Nokia|Motorola") {
    error 701 "Moved temporarily";
  }
}
sub vcl_error {
  if (obj.status == 701) {
    set obj.http.Location = "http://m.foo.com/";
    set obj.status = 302;
    deliver;
  }
}
VCL examples
● caching publicly available authorized pages
sub vcl_fetch {
  if (req.http.Authorization && obj.http.Cache-Control !~ "public") {
    pass;
  }
}
● caching logged-in users (be careful!)
● http://varnish-cache.org/wiki/VCLExampleCachingLoggedInUsers
● possible with per-user caching and careful use of ESI
● separate "(not) logged in" objects from "logged in as..."
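One possible shape of the per-user trick: hash a session cookie into the cache key for the user-specific ESI fragment only, so the surrounding page stays shared. The URL and cookie name below are assumptions, not from the wiki example:

```vcl
sub vcl_hash {
  # /esi/user-box and the "session" cookie name are hypothetical
  if (req.url ~ "^/esi/user-box" && req.http.Cookie ~ "session=") {
    set req.hash += regsub(req.http.Cookie, "^.*?session=([^;]*).*$", "\1");
  }
}
```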
VCL examples
● cookie-based hashing
sub vcl_hash {
  if (req.http.Cookie ~ "language=esperanto") {
    set req.hash += "LangEsperanto";
  }
}
● result: a separate cached version of the object for requests with Cookie: language=esperanto;
● extracting the value of a cookie
● nothing more than a regexp
regsub(req.http.Cookie, "^.*?cookie=([^;]*);*.*$", "\1");
VCL examples
● serving synthetic responses
if (req.url ~ "^/MONITOR.txt") {
  error 200 "OK";
}
● allowing reloads from browsers without purging
if (req.http.Cache-Control ~ "(no-cache|no-store|private)") {
  pass;
}
● watch out for nasty bots!
● passing everything for secure URLs
if (req.url ~ "^/secure") {
  pass;
}
VCL examples
● normalizing Accept-Encoding headers for compression
if (req.http.Accept-Encoding) {
  if (req.http.Accept-Encoding ~ "gzip") {
    set req.http.Accept-Encoding = "gzip";
  } elsif (req.http.Accept-Encoding ~ "deflate") {
    set req.http.Accept-Encoding = "deflate";
  } else {
    remove req.http.Accept-Encoding;
  }
  if (req.url ~ "\.(css|js)$" && req.http.User-Agent ~ "MSIE 6") {
    remove req.http.Accept-Encoding;
  }
}
Best practices
● RFC 2616!
● not really for reverse proxies...
● Varnish is both a client and a server
● TTLs
Best practices
● object TTL control – headers from the backend
● considered in the following order:
● Cache-Control: s-maxage=<relative time>
● Cache-Control: max-age=<relative time>
● Varnish ignores all other Cache-Control headers (unless told otherwise in VCL)
● Expires: absolute time, requires synced clocks
● Expires is an HTTP/1.0 header
● Varnish will try to compensate for clock skew
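An illustrative backend response showing the precedence (the date is made up): with all three headers present, s-maxage wins, so the object's TTL is 300 s, not 60 s and not the Expires time:

```http
HTTP/1.1 200 OK
Content-Type: text/html
Cache-Control: s-maxage=300, max-age=60
Expires: Fri, 05 Mar 2010 12:00:00 GMT
```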
Best practices
● object TTL control – VCL
● set obj.ttl = x; – takes precedence over headers
● default_ttl configuration parameter
● Varnish sets the Age: header
● if in doubt, check varnishlog
● TTL tag
Best practices
● TTL tag in varnishlog
509 TTL - 1850178309 RFC 1798 1267393695 1267393694 1267395494 0 0
● fields: XID, source (RFC), TTL, time, Date, Expires, max-age, age
242 TTL c 1416303904 VCL 86400 1267393696
● fields: XID, source (VCL), TTL, time
Best practices
● caching policy
● Last-Modified / If-Modified-Since
● ETag / If-None-Match
● Vary
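The backend side of that policy in one illustrative response (all values made up): Last-Modified and ETag enable conditional revalidation, while Vary makes the cache keep one variant per listed request header:

```http
HTTP/1.1 200 OK
Last-Modified: Thu, 04 Mar 2010 10:00:00 GMT
ETag: "a1b2c3"
Vary: Accept-Encoding
Cache-Control: s-maxage=600
```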
Best practices
● compression
● Varnish leaves compression up to the backends
● gzip, deflate, none – the data set is stored up to 3 times
● Vary: Accept-Encoding
● normalize Accept-Encoding from browsers
Best practices
● sanitize request headers
● we've had requests coming in to "http://our.com/http://another.com/.*"
if (req.url ~ "^/\?http://") {
  set req.url = regsub(req.url, "\?http://.*", "");
}
● cache hit ratio went from 92% to 94%
● normalize vhosts
if (req.http.host ~ "^(www.)?example.com") {
  set req.http.host = "example.com";
}
● hit ratio and backend requests: 1% is half of 2%!
Best practices
● set Content-Length on the backends
● static files: multiple backends = multiple VFS caches
● serving large objects a.k.a. My Own YouTube
● objects are fully fetched before delivery
● use pipe
● ranges not supported
● not really suitable for serving video content
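A sketch of forcing pipe mode for large downloads (the extension list is an illustrative choice):

```vcl
sub vcl_recv {
  # hand big files straight to the backend instead of
  # fetching the whole object into the cache first
  if (req.url ~ "\.(iso|avi|flv|mp4)$") {
    pipe;
  }
}
```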
Best practices
● forced TTLs
● on heavily loaded sites – force TTLs to a few seconds on all pages (but pass secure content)
● purging
● entries on the ban list accumulate – consider forcing expiry by PURGE requests
● when forcing expiry, purge all Vary versions!
● include purging in your application design
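With the "squid way" PURGE handler from the earlier slide in place, the application can expire a page over plain HTTP; to cover Vary: Accept-Encoding variants, repeat the request per encoding. Hostname and URL are illustrative:

```shell
$ curl -X PURGE -H "Accept-Encoding: gzip" http://cache.example.com/article/123.html
$ curl -X PURGE -H "Accept-Encoding: deflate" http://cache.example.com/article/123.html
$ curl -X PURGE http://cache.example.com/article/123.html
```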
Best practices
● debugging
● add an X-Served-By header
● add other headers along the way
● beware of header traffic!
● X-Varnish header
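The X-Varnish header is an easy hit/miss probe: it carries the request's XID and, on a hit, also the XID of the request that inserted the object. The XIDs below are illustrative:

```shell
$ curl -si http://cache.example.com/logo.png | grep X-Varnish
X-Varnish: 963897708 963897681
```

Two XIDs mean a cache hit; a single XID means a miss or pass.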
Best practices
● HTTPS – use pound or perlbal
● in vcl_pipe:
set bereq.http.Connection = "close";
● drain connections quickly before restarting varnishd
sub vcl_recv {
  if (req.http.Connection != "close") {
    set req.http.Connection = "close";
    restart;
  }
}
Best practices
● graph everything, ask questions later
● YMMV: what is good for the big guys from TOP100 may not be as good for you (e.g. stevedore choice)
● test
● wget --save-headers
● curl -i
● LWP: GET -USsed
● caveat: lwp-request does: "GET http://foo/bar"
Best practices
● if everything else fails...
$ gdb /usr/sbin/varnishd core
GNU gdb 6.8-debian
This GDB was configured as "x86_64-linux-gnu"...
(gdb) bt
(…)
(gdb) frame 3
#3 0x000000000042ef64 in mgt_cli_vlu (priv=0x7fb112813c00, p=0x7fb1128d3000 "debug.health") at mgt_cli.c:270
270 xxxassert(i == strlen(p));
● don't strip Varnish binaries
● compile with --enable-debugging-symbols --enable-diagnostics
Configuration
● object hash table
● Varnish 2.0: -h classic,N
● N hash buckets – objects / 10
● a prime number
● Varnish trunk: -h critbit
● Patricia Tree
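On the command line this looks like the sketch below; the bucket count is illustrative (pick a prime near your object count divided by 10 and verify it yourself), and the listen/backend addresses are placeholders:

```shell
# Varnish 2.0 – classic hash with an explicit bucket count
varnishd -a :80 -b localhost:8080 -h classic,250007

# trunk – critbit needs no sizing
varnishd -a :80 -b localhost:8080 -h critbit
```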
Configuration
● run-time parameters (can be set from CLI)
● obj_workspace=Nbytes (dynamic in trunk) – headers, per-object overhead
● sess_workspace=Nbytes – entire header and all edits done in VCL, per thread
● shm_workspace=Nbytes – for the log, per thread
● shm_reclen=Nbytes – max SHM log record length
● session_linger=Nms – time before a worker thread is returned to its pool
● sess_timeout=Ns – persistent session timeout
Configuration
● thread_pools=N – set to the number of CPU cores
● thread_pool_add_delay=Nms – the default may be too high
● thread_pool_max and _min – a bit confusing
● max – the limit for all thread pools
● min – the limit for one thread pool
● do not set too high
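These can be given at startup with -p, or changed at run time with param.set in the management CLI. The values below are illustrative, not recommendations, and the listen/backend addresses are placeholders:

```shell
varnishd -a :80 -b localhost:8080 \
  -p thread_pools=4 \
  -p thread_pool_min=50 \
  -p thread_pool_max=1000 \
  -p thread_pool_add_delay=2
```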
OS environment
● forget about 32-bit
● malloc() better than mmap() for in-memory cache sets
● also better for larger-than-memory cache sets on Linux (YMMV)
● Varnish on virtualized guests?
● slight latency difference
● can be an issue for on-line auction sites
OS environment
● Virtualization● Varnish on a standalone system
OS environment
● Virtualization● Varnish on a Xen domU with pinned vcpus
OS environment
● I/O-related tuning on Linux
● set vm.swappiness to 0
● /var/lib/varnish/$HOSTNAME/_.vsl – the SHM log
● put the SHM log on tmpfs
● anticipatory elevator best on HDDs, noop on SSDs
● use ext2
● noatime
● swap striping
● iSCSI is great for logs
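The swappiness and tmpfs items as commands (the tmpfs size is illustrative; the mount point matches the SHM log path above):

```shell
# prefer dropping page cache over swapping out varnishd
sysctl -w vm.swappiness=0

# keep the SHM log off the disks
mount -t tmpfs -o size=256m tmpfs /var/lib/varnish
```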
OS environment
● network tuning
● run NTP
● check if your load balancer uses keep-alive
● /proc/sys/net – don't tune if you don't know what you're doing
● don't use net.ipv4.tcp_tw_reuse
● tcp_tw_recycle is even worse
● normally the socket waits 2 * MSL
● reusing causes problems with NAT routers
New features in trunk
● upcoming release: Varnish 2.1
● persistent storage (without LRU support)
● URL hashing director
● client hashing director
● critbit by default
● saint mode
● obj_workspace allocated dynamically
● req.* in vcl_deliver
● obj.* in vcl_fetch is now beresp.*
Shopping list
● http://varnish-cache.org/wiki/PostTwoShoppingList
● ESI enhancements (304s, gzip, etc.)
● compression support
● streaming in pass / fetch
● Content-Range support
● file upload buffering
● VCL cookie handling (req.cookie.foo)
● custom formats in varnishlog and varnishncsa
Shopping list
● expiry randomization
● the "lemming effect"
Support & development
● commercial support offered by Redpill-Linpro
● community support: #varnish on irc.linpro.no
● VML – Varnish Moral License
● http://phk.freebsd.dk/VML/
● not a support contract!
● help pay for Varnish development
Thank you
Questions?
Sources
● http://varnish-cache.org/
● TMECC VCL configs
● Wikia VCL configs
● http://kristian.blog.linpro.no/
● http://ingvar.blog.linpro.no/
● #varnish