This talk covers some of the tools that Etsy uses for measuring performance, how to instill a culture of performance, how Etsy tracks performance wins and regressions, and where to get started if you don't have a formalized performance team in your company. Originally presented at the Boston Web Performance Meetup on Aug 24, 2011.
Web Performance Culture and Tools at Etsy
Mike Brittain
Dir. of Engineering, Infrastructure
Etsy

Boston Web Performance Meetup
Aug 24, 2011

Overview
Etsy and Engineering
Make Performance Matter
Tools and Process
Monthly Weather Report
$38 MM in sales
1.9 MM items sold
990 MM page views
http://etsy.me/weather-report-june-2011
Engineering

Technologies
Linux, Apache, MySQL, PHP, Memcache
Solr, Squid, Hadoop, Amazon S3, EC2, EMR
Ganglia, Cacti, Nagios, Graphite, Splunk, and some of our own...

Teams
90 Engineers
3-6 engineers + product developer and/or designer

Continuous Deployment
~40 releases per day, including app code and config changes
1. Write code
2. Code review
3. Automated tests
4. Dev ⇾ QA ⇾ Pre-Prod ⇾ Prod
5. Monitor!
6. Monitor!
7. Monitor!

Data-Driven Development
45,000+ metrics
50+ dashboards
Make Performance Matter
Have a story

Business Impact

Measure
“Our own years of testing have conclusively shown that when speed of a feature or product improves, usage, quite simply, goes up.”
Abundance of research from Google, Bing, AOL, Amazon, Shopzilla, etc.
http://googleblog.blogspot.com/2009/12/this-week-in-search-121809.html

Measure
Bounce rate
Search conversion
Purchase funnel
Ad impressions, page views and tracking discrepancies
Social, engagement
Operations
Site Stability
Contention for shared resources like database, memcache, solr, or even web server processes
Measure perf for discrete pieces of your infrastructure

Morale
Happy Engineers
Getting to work

Where to Start?
Focus your efforts where it makes sense for your business

Where to Optimize?
Tiers of service
SLAs

Focus
Beware of CTS: Constant Tweaking Syndrome
Friends don’t let friends tweak without graphing

Make Performance Matter
Have your story
Tools and Process at Etsy

Process

Etsy’s Perf Team
Standardize patterns
Create tools and reports
Coordinate efforts

Server-side Performance
There is nothing worse than a blank page or a spinning thing-a-ma-jigger
95th percentile > 800 ms is a P2 bug
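Tools

Logging
Perf-related data belongs in your server logs
apache_note()

// Start a timer as early as possible in the request.
$timer_start = microtime(true);
// ... handle the request ...
// At shutdown, store the elapsed time as an Apache note so LogFormat can log it.
register_shutdown_function(function () use ($timer_start) {
    // microtime(true) returns seconds; converting to microseconds is assumed from the note name.
    $timer_diff = (int) round((microtime(true) - $timer_start) * 1000000);
    apache_note('php_microsec', (string) $timer_diff);
});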
LogFormat "%{True-Client-IP}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %{etsy_user_id}n %{php_bytes}n %{php_microsec}n %D" combined
web0060 66.249.71.110 - - [24/Aug/2011:04:16:52 +0000] "GET /listing/12189259/tropical-etched-pair-of-lampwork-glass HTTP/1.1" 200 11034 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" - 13399576 505780 554876
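Analyze
grep ... access.log | awk ...

# Average of the second-to-last field (php_microsec) for listing pages
grep "/listing/" access.log | \
  awk '{sum=sum+$(NF-1)} END {print sum/NR}'

# Maximum of the same field for listing pages
grep "/listing/" access.log | \
  awk 'BEGIN {max=0} {if ($(NF-1)>max) max=$(NF-1)} END {print max}'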
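Because the server-side target earlier in the talk is stated as a 95th percentile, the same pipeline can be extended to report one. The following is a minimal sketch and not taken from the talk: the function name `p95` is hypothetical, it uses the nearest-rank definition of a percentile, and it assumes the second-to-last field is `php_microsec`, as in the LogFormat shown earlier.

```sh
# p95: read access-log lines on stdin and print the 95th percentile
# (nearest-rank method) of the second-to-last field.
# Assumption: that field is %{php_microsec}n, as in the LogFormat above.
p95() {
  awk '{ print $(NF-1) }' | sort -n | awk '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1                 # no matching requests
      rank = int((NR * 95 + 99) / 100)    # ceil(0.95 * NR)
      print v[rank]
    }'
}

# Usage: grep "/listing/" access.log | p95
```

The result is in the same unit as the logged field. If `php_microsec` holds microseconds, the 800 ms threshold corresponds to 800000.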
Analyze
1. Capture perf data in logs: page gen., Boomerang, Gomez
2. Aggregate: Splunk, Logster, StatsD
3. Record in Graphite
http://graphite.wikidot.com

Logster
1. Read new log entries every minute
2. Parse and aggregate useful numbers
3. Send to Graphite
http://github.com/etsy
web0001 [04:28:54 2011] [error] [client 10.101.x.x] Oh noooooo!
web0001 [04:28:54 2011] [warning] [client 10.101.x.x] Gaaaaahhh!
web0001 [04:28:54 2011] [error] [client 10.101.x.x] Help me, Rhonda.
web0001 [04:28:54 2011] [error] [client 10.101.x.x] Oh noooooo!
web0001 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!
web0001 [04:28:54 2011] [error] [client 10.101.x.x] Heeeeeeellllllllllllllppppp!
web0001 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
web1101 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
web0201 [04:28:54 2011] [error] [client 10.101.x.x] You've been eaten by a grue.
web0001 [04:28:54 2011] [error] [client 10.101.x.x] Oh noooooo!
web0001 [04:28:54 2011] [fatal] [client 10.101.x.x] Gaaaaahhh!
web0201 [04:28:54 2011] [warning] [client 10.101.x.x] Gaaaaahhh!
web0034 [04:28:54 2011] [warning] [client 10.101.x.x] Oh nooooooooooo
web0001 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
web1101 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
web0201 [04:28:54 2011] [error] [client 10.101.x.x] You've been eaten by a grue.
web0055 [04:28:54 2011] [fatal] [client 10.101.x.x] Gaaaaahhh!!!
web0002 [04:28:54 2011] [warning] [client 10.101.x.x] Sky is falling.
web0089 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
web0020 [04:28:54 2011] [error] [client 10.101.x.x] Sky is falling.
web1101 [04:28:54 2011] [fatal] [client 10.101.x.x] Gaaaaahhh!
web0055 [04:28:54 2011] [warning] [client 10.101.x.x] Gaaaaahhh!
web0001 [04:28:54 2011] [warning] [client 10.101.x.x] Oh nooooooooooo
web0001 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
web0034 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
web0087 [04:28:54 2011] [fatal] [client 10.101.x.x] Sky is falling.
web0002 [04:28:54 2011] [error] [client 10.101.x.x] Oh noooooo!
web0201 [04:28:54 2011] [fatal] [client 10.101.x.x] Gaaaaahhh!
web0077 [04:28:54 2011] [warning] [client 10.101.x.x] Gaaaaahhh!
web0355 [04:28:54 2011] [warning] [client 10.101.x.x] Oh nooooooooooo
web0052 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
web0001 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
web0089 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
web0020 [04:28:54 2011] [error] [client 10.101.x.x] Sky is falling.
web1101 [04:28:54 2011] [fatal] [client 10.101.x.x] Gaaaaahhh!
web0055 [04:28:54 2011] [warning] [client 10.101.x.x] Gaaaaahhh!
web0001 [04:28:54 2011] [warning] [client 10.101.x.x] Oh nooooooooooo
web0001 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
web0034 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
[Graph: Fatals, Errors, Warnings] :)
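The three Logster steps can be approximated with standard tools. This is a minimal sketch and not Logster itself: the function name `severity_counts`, the metric prefix, and the host name in the usage comment are hypothetical, and it leaves out tracking which entries are new since the last run. It counts the bracketed severities seen in the sample error log and prints one line per metric in Graphite's plaintext format (metric path, value, Unix timestamp).

```sh
# severity_counts PREFIX TIMESTAMP
# Read error-log lines on stdin, count [fatal], [error] and [warning]
# entries, and print them as Graphite plaintext lines.
severity_counts() {
  awk -v prefix="$1" -v ts="$2" '
    /\[fatal\]/   { n["fatal"]++ }
    /\[error\]/   { n["error"]++ }
    /\[warning\]/ { n["warning"]++ }
    END {
      printf "%s.fatal %d %s\n",   prefix, n["fatal"],   ts
      printf "%s.error %d %s\n",   prefix, n["error"],   ts
      printf "%s.warning %d %s\n", prefix, n["warning"], ts
    }'
}

# Usage (hypothetical log path and host; 2003 is Carbon's default
# plaintext port):
#   tail -n 1000 error.log | severity_counts web.errors "$(date +%s)" \
#     | nc graphite.example.com 2003
```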
StatsD
Collects metrics from your app code and sticks them in Graphite
StatsD::increment("logins.success");
StatsD::timing("gearman.time", $msec);
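Under the hood, a StatsD client call becomes a small text datagram sent over UDP. Here is a minimal sketch of the two calls above, assuming the standard StatsD line format (`name:value|type`, with `c` for counters and `ms` for timers). The shell function names and the host in the usage comment are hypothetical; 8125 is StatsD's default port.

```sh
# Format a counter increment, e.g. "logins.success:1|c".
statsd_increment() {
  printf '%s:1|c\n' "$1"
}

# Format a timing in milliseconds, e.g. "gearman.time:320|ms".
statsd_timing() {
  printf '%s:%s|ms\n' "$1" "$2"
}

# Usage (hypothetical host; requires a netcat with UDP support):
#   statsd_increment logins.success  | nc -u -w1 statsd.example.com 8125
#   statsd_timing gearman.time 320   | nc -u -w1 statsd.example.com 8125
```

Sending over UDP means the application does not wait for a reply, so instrumentation adds little overhead to the request being measured.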
StatsD
[Graph: 90th pct, average, lower] :) :)
Perf Dashboard

I/O Profiler
Lightweight, inline profiling
Start and end times wrapped around service calls: databases, memcache, apc, etc.
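One way to picture what such a profiler yields: each wrapped call produces a service name plus a start and end time, which can then be summed per service. The sketch below assumes a hypothetical record format of one call per line (service, start and end in milliseconds). It is not Etsy's profiler, and the function name `io_summary` is made up for illustration.

```sh
# io_summary: read "service start_ms end_ms" records on stdin and print
# "service calls total_ms" per service, sorted by service name.
io_summary() {
  awk '
    { calls[$1]++; total[$1] += $3 - $2 }
    END { for (s in total) printf "%s %d %.1f\n", s, calls[s], total[s] }
  ' | sort
}

# Usage:
#   printf 'db 0 12\nmemcache 12 13\ndb 20 25\n' | io_summary
```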
Client-sideTestingGomez (API), Boomerang, WebPagetest
Tracks front-end best practices
Tools
ShowSlow
Tools
ShowSlowInternal instanceAutomated testingEnd-point for I/O ProfilerTrending on individual rules http://showslow.com
Other thoughts

YSlow & Page Speed
Mind your development process
3rd-party content and ga.js
“Use a CDN”

Device-Specific Design
The mobile web is very much about designing for performance
Screen size, pixel density, connection speed, wi-fi vs. cellular, browser cache size, local storage, connections per host, metered pricing, etc.

Make Performance Matter
Have your story
Use tools and process to focus your efforts
We are Hiring
http://etsy.com/careers
Software Engineering positions available in a number of teams, including Analytics, Operations, Web Performance, Payments, Core Platform, Front-End, Internal Apps, Search, Security, and more...

Thank You
Mike [email protected]
@mikebrittain
CodeAsCraft.etsy.com
github.com/etsy
graphite.wikidot.com
showslow.com