Profiling Ruby
Where and what to optimize in your code
Nasir Jamal work@HouseTrip
@_nasj
profiling helps you narrow down where optimization would be most useful
benchmarking lets you isolate optimizations and compare them against each other
Benchmarking
Realtime
puts Benchmark.realtime { 4000.times { |x| x**x } }
#=> 0.905972957611084
1. User CPU time
2. System CPU time
3. (1 + 2) i.e. User + System CPU time
4. Realtime
puts Benchmark.measure { 4000.times { |x| x**x } }
# => 0.890000 0.020000 0.910000 (0.909118)
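Benchmark.measure returns a Benchmark::Tms object, so the four values listed above can also be read one by one instead of being parsed out of the printed string. A minimal sketch; the helper name `measure_powers` is made up for illustration and is not part of the Benchmark API:

```ruby
require 'benchmark'

# Hypothetical helper: runs the workload from the slide and returns
# the Benchmark::Tms object instead of printing it.
def measure_powers(n = 4000)
  Benchmark.measure { n.times { |x| x**x } }
end

tms = measure_powers
puts "user:   #{tms.utime}"  # 1. user CPU time
puts "system: #{tms.stime}"  # 2. system CPU time
puts "total:  #{tms.total}"  # 3. user + system (child process times are included too)
puts "real:   #{tms.real}"   # 4. elapsed wall-clock time
```

Reading the fields directly is handy when the numbers feed a script or a threshold check rather than a human.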
Benchmark.bm do |bm|
  bm.report { first_algorithm }
  bm.report { second_algorithm }
  …..
end
#=>       user     system      total        real
      0.940000   0.010000   0.950000 (  0.956572)
      0.430000   0.010000   0.440000 (  0.423467)
Benchmark.bm(14) do |bm|
  bm.report("first header") { first_algorithm }
  bm.report("second header") { second_algorithm }
  …..
end
#=>                  user     system      total        real
first header     0.940000   0.010000   0.950000 (0.956572)
second header    0.430000   0.010000   0.440000 (0.423467)
examples ...
Benchmark.bmbm(20) do |bm|
  bm.report('append') do
    str1, str2, str3, str4 = 'string1', 'string2', 'string3', 'string4'
    # note: << mutates str1 in place, so str1 keeps growing across iterations
    1_000_000.times { y = str1 << str2 << str3 << str4 }
  end

  bm.report('concat') do
    str1, str2, str3, str4 = 'string1', 'string2', 'string3', 'string4'
    1_000_000.times { y = str1 + str2 + str3 + str4 }
  end

  bm.report('interpolate') do
    str1, str2, str3, str4 = 'string1', 'string2', 'string3', 'string4'
    1_000_000.times { y = "#{str1}#{str2}#{str3}#{str4}" }
  end

  bm.report('interpolate one') do
    str1, str2, str3, str4 = 'string1', 'string2', 'string3', 'string4'
    1_000_000.times { y = "string1string2string3#{str4}" }
  end
end
.bmbm reduces result skewing
bmbm first does a rehearsal run, which absorbs any initialisation and GC costs, then it does the real benchmark
#=> Rehearsal ---------------------------------------------
append            0.280000   0.000000   0.280000 (  0.294505)
concat            0.470000   0.020000   0.490000 (  0.481748)
interpolate       0.430000   0.010000   0.440000 (  0.433404)
interpolate one   0.320000   0.000000   0.320000 (  0.323479)
--------------------------------------- total: 1.530000sec
| Tests           | user     | system   | total     | real       |
|:----------------|:--------:|:--------:|:---------:|-----------:|
| append          | 0.260000 | 0.010000 | 0.270000  | (0.265732) |
| concat          | 0.400000 | 0.010000 | 0.410000  | (0.396115) |
| interpolate     | 0.400000 | 0.000000 | 0.400000  | (0.408096) |
| interpolate one | 0.280000 | 0.010000 | 0.290000  | (0.286443) |
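The rehearsal means every reported block is executed twice: once to warm things up and once for the numbers that count. A small sketch that makes this visible by counting invocations; `count_bmbm_runs` is an illustrative helper name, not something from the slides:

```ruby
require 'benchmark'

# Counts how many times bmbm executes a single reported block.
def count_bmbm_runs
  runs = 0
  Benchmark.bmbm(10) do |bm|
    bm.report('work') do
      runs += 1
      10_000.times { |i| i.to_s }
    end
  end
  runs
end

puts "block executed #{count_bmbm_runs} times"  # rehearsal + real run
```

Because of this, any block with side effects (such as the 'append' case, which mutates its input) behaves the same in both passes only if its setup is done inside the block.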
require 'date'

Benchmark.bmbm(20) do |bm|
  bm.report('gsub') do
    10_000.times { Date.today.to_s.gsub!('-', '') }
  end
  bm.report('strftime') do
    10_000.times { Date.today.strftime("%Y%m%d") }
  end
end
#=> Rehearsal -------------------------------------------
gsub       0.750000   0.000000   0.750000 (  0.751547)
strftime   1.320000   0.000000   1.320000 (  1.320621)
------------------------------------ total: 2.070000sec
|          | user     | system    | total    | real       |
|:---------|:--------:|:---------:|:--------:|:-----------|
| gsub     | 0.710000 | 0.000000  | 0.710000 | (0.709918) |
| strftime | 1.320000 | 0.000000  | 1.320000 | (1.315345) |
require 'benchmark'
require 'ostruct'

module Extendable
  def name
    @name
  end
end

class Person
  attr_accessor :name
end

# Person1 is not defined on the slides; assumed here to be a bare class
# with only a writer, so that the reader comes from Extendable.
class Person1
  attr_writer :name
end

Benchmark.bmbm(20) do |bm|
  bm.report('Class') do
    100_000.times { p = Person.new; p.name = 'Joe'; p.name }
  end
  bm.report('Extends') do
    100_000.times { p = Person1.new; p.extend Extendable; p.name = 'Joe'; p.name }
  end
  bm.report('Struct') do
    100_000.times { person2 = Struct.new(:name); p = person2.new('Joe'); p.name }
  end
  bm.report('OpenStruct') do
    100_000.times { p = OpenStruct.new(:name => 'Joe'); p.name }
  end
end
#=> Rehearsal ---------------------------------------------
Class        0.080000   0.000000   0.080000 (0.086261)
Extends      0.410000   0.000000   0.410000 (0.407723)
Struct       1.490000   0.000000   1.490000 (1.490557)
OpenStruct   1.980000   0.010000   1.990000 (1.990507)
------------------------------------ total: 3.970000sec
|            | user     | system   | total     | real       |
|:-----------|:--------:|:--------:|:---------:|:-----------|
| Class      | 0.080000 | 0.000000 | 0.080000  | (0.082448) |
| Extends    | 0.400000 | 0.000000 | 0.400000  | (0.410884) |
| Struct     | 1.480000 | 0.000000 | 1.480000  | (1.490531) |
| OpenStruct | 1.960000 | 0.010000 | 1.970000  | (1.965923) |
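One thing worth checking in the 'Struct' report: Struct.new(:name) sits inside the loop, so each iteration defines a brand-new class before instantiating it. Part of the measured time is therefore class creation rather than object creation. A sketch that separates the two, assuming made-up names (`PersonStruct`, `new_struct_class_each_time`, `reuse_struct_class`) and a smaller iteration count than the slide:

```ruby
require 'benchmark'

# Defined once, outside the timed loop (illustrative constant name).
PersonStruct = Struct.new(:name)

# Mirrors the slide: a new Struct class is created on every iteration.
def new_struct_class_each_time(n)
  n.times { klass = Struct.new(:name); klass.new('Joe').name }
end

# Reuses the class defined above, so only instantiation is timed.
def reuse_struct_class(n)
  n.times { PersonStruct.new('Joe').name }
end

Benchmark.bmbm(24) do |bm|
  bm.report('Struct.new in loop')  { new_struct_class_each_time(10_000) }
  bm.report('Struct defined once') { reuse_struct_class(10_000) }
end
```

No timings are quoted here since they depend on the Ruby version and machine; the point is only that the two variants measure different things.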
Profiling
perftools.rb
an adaptation of Google's perftools library for Ruby, by Aman Gupta
https://github.com/tmm1/perftools.rb
$ gem install perftools.rb
profiles by sampling: by default it takes 100 samples per second
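To give a feel for what "sampling" means, here is a toy sampler in plain Ruby. It is not how perftools.rb is implemented (that is a native extension); it only illustrates the idea of periodically recording where the program currently is. The names `sample_profile` and `busy_work` are invented for this sketch, and under MRI's global lock the effective rate can be well below the requested `hz`.

```ruby
# Toy sampling profiler: a background thread periodically records the
# top stack frame of the calling thread and counts how often each appears.
def sample_profile(hz: 100)
  target  = Thread.current
  counts  = Hash.new(0)
  running = true

  sampler = Thread.new do
    while running
      frame = target.backtrace_locations(0, 1)&.first
      counts[frame.label] += 1 if frame
      sleep(1.0 / hz)
    end
  end

  begin
    yield
  ensure
    running = false
    sampler.join
  end
  counts
end

# Hypothetical workload: spins for roughly the given number of seconds.
def busy_work(seconds)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + seconds
  x = 0
  x += 1 while Process.clock_gettime(Process::CLOCK_MONOTONIC) < deadline
  x
end

counts = sample_profile { busy_work(0.5) }
total  = counts.values.sum
counts.sort_by { |_, n| -n }.each do |label, n|
  puts format('%6d %5.1f%%  %s', n, 100.0 * n / total, label)
end
```

The more often a method shows up in the samples, the more time was probably spent in it, which is the same reasoning used to read the pprof output.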
examples ...
require 'perftools'

a = ''
PerfTools::CpuProfiler.start("/tmp/profiling/string_concat") do
  100_000.times { |x| a += x.to_s }
end

to see results

$ pprof.rb --text --ignore=Gem /tmp/profiling/string_concat
Total: 2939 samples
    1497  50.9%  50.9%     1501  51.1% Object#irb_binding
    1438  48.9%  99.9%     1438  48.9% garbage_collector
       4   0.1% 100.0%     1500  51.0% Integer#times

Interpreting the above columns:
1. Number of profiling samples in this function
2. Percentage of profiling samples in this function
3. Percentage of profiling samples in the functions printed so far
4. Number of profiling samples in this function and its callees
5. Percentage of profiling samples in this function and its callees
6. Function name
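Almost half of the samples in the string_concat profile land in garbage_collector. A likely contributor is that `a += x.to_s` builds a new String on every iteration and leaves the old one for the GC, whereas `<<` appends in place. A small sketch that shows the difference through object identity; the method names are illustrative only:

```ruby
# Returns how many distinct String objects `a` referred to during the loop.
def concat_with_plus(n)
  a = ''
  ids = [a.object_id]
  n.times { |x| a += x.to_s; ids << a.object_id }  # += rebinds a to a new String
  ids.uniq.size
end

def concat_with_append(n)
  a = +''  # unary plus: a mutable string even with frozen string literals
  ids = [a.object_id]
  n.times { |x| a << x.to_s; ids << a.object_id }  # << mutates the same String
  ids.uniq.size
end

puts concat_with_plus(1_000)    # more than one object
puts concat_with_append(1_000)  # a single object, mutated in place
```

Whether switching to `<<` removes most of the GC time in the profiled example would need to be confirmed by profiling again.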
to see results as a graph

$ brew install graphviz
$ pprof.rb --gif --ignore=Gem /tmp/profiling/string_concat > /tmp/profiling/string_concat.gif

the bigger the box, the more time spent there; each box shows:
1. Class Name
2. Method Name
3. local (percentage)
4. of cumulative (percentage)
slightly hairy method
And on and on ....
PerfTools::CpuProfiler.start("/tmp/profiling/property_search") do
  100.times { PropertySearch.new.search }
end
$ pprof.rb --text --ignore=Gem /tmp/profiling/property_search
Total: 6799 samples
    2598  38.2%  38.2%     2598  38.2% garbage_collector
    1761  25.9%  64.1%     3966  58.3% PropertySearch#set_price_filter_counts
     390   5.7%  69.8%     1632  24.0% Object#detect
     389   5.7%  75.6%      503   7.4% PropertySearch::PriceRange#contains?
     358   5.3%  80.8%      358   5.3% Mysql2::Result#each
     263   3.9%  84.7%      768  11.3% Object#select_values
     199   2.9%  87.6%      236   3.5% ActiveRecord::ConnectionAdapters::Mysql2Adapter#execute
     187   2.8%  90.4%      640   9.4% Property.collect_column
     148   2.2%  92.6%      148   2.2% ActiveSupport::BufferedLogger#flush
     114   1.7%  94.2%      114   1.7% Fixnum#<=
      75   1.1%  95.3%       75   1.1% Array#join
      67   1.0%  96.3%       76   1.1% Array#map
      47   0.7%  97.0%       47   0.7% ActiveRecord::ConnectionAdapters::Column#type_cast
      44   0.6%  97.7%      295   4.3% Array#collect
      32   0.5%  98.1%      388   5.7% Object#to_a
      28   0.4%  98.5%       28   0.4% PropertySearch#day_count
      15   0.2%  98.8%      140   2.1% ActiveRecord::ConnectionAdapters::Mysql2Adapter#select
       9   0.1%  98.9%     4199  61.8% PropertySearch#search
$ pprof.rb --help
Qcachegrind
How to set up?
http://langui.sh/2011/06/16/how-to-install-qcachegrind-kcachegrind-on-mac-osx-snow-leopard/
$ pprof.rb --callgrind /tmp/profiling/property_search > /tmp/profiling/property_search.callgrind
graph
To use with Rails
Valid default_printer values are pdf, text, raw, gif, callgrind
# Gemfile
gem 'rack-perftools_profiler', :require => false

# config/environment.rb
config.middleware.use ::Rack::PerftoolsProfiler,
  :default_printer => 'gif',
  :bundler => true,
  :mode => :cputime,
  :frequency => 250
profile=true will enable profiling
times=10 will hit the page 10 times
the results are stored in profile_ppp_page.txt

RACK_PROFILER=true script/server

curl -o profile_ppp_page.txt \
  "http://localhost:3000/en/rentals/107605?profile=true&times=10"
ruby-prof
$ gem install ruby-prof
OR
# Gemfile
gem 'ruby-prof', :require => false
types of measures
RubyProf.measure_mode = RubyProf::PROCESS_TIME
RubyProf.measure_mode = RubyProf::WALL_TIME
RubyProf.measure_mode = RubyProf::CPU_TIME
RubyProf.measure_mode = RubyProf::ALLOCATIONS
RubyProf.measure_mode = RubyProf::MEMORY
RubyProf.measure_mode = RubyProf::GC_RUNS
RubyProf.measure_mode = RubyProf::GC_TIME
types of printers
RubyProf::FlatPrinter
RubyProf::FlatPrinterWithLineNumbers
RubyProf::GraphPrinter
RubyProf::GraphHtmlPrinter
RubyProf::CallTreePrinter
RubyProf::CallStackPrinter
RubyProf::MultiPrinter
examples ...
GraphHtmlPrinter
result = RubyProf.profile { PropertySearch.new.search }
printer = RubyProf::GraphHtmlPrinter.new(result)
File.open("tmp/profile_data.html", 'w') { |file| printer.print(file) }
profile_data.html
CallStackPrinter
result = RubyProf.profile { PropertySearch.new.search }
printer = RubyProf::CallStackPrinter.new(result)
File.open("tmp/profile_data.html", 'w') { |file| printer.print(file) }
profile_data.html
CallTreePrinter
result = RubyProf.profile { PropertySearch.new.search }
printer = RubyProf::CallTreePrinter.new(result)
File.open("tmp/profile_data", 'w') { |file| printer.print(file) }
profile_data (qcachegrind)
In Rails

2.x.x
script/performance/benchmarker 10 'Class.method_name' 'AnotherClass.method_name'
script/performance/profiler 'Class.method_name' 10 graph
script/performance/profiler 'Class.method_name' 10 graph_html 2> property.html && open property.html

3.x.x
rails benchmarker 'Class.method_name'
rails profiler 'Class.method_name' --runs 3 --metrics cpu_time,memory

Or just use rake tasks
rake test:benchmark
rake test:profile
rake test:profile TEST=test/performance/home_page_test.rb
But from Rails 4.0, performance tests are no longer part of the default stack
https://github.com/rails/rails-perftest
Questions?
We are hiring
Thank you