46
Debugging Ruby Aman Gupta troubleshooting the VM and your app

Debugging Ruby

Embed Size (px)

DESCRIPTION

Debugging Ruby: Understanding and Troubleshooting the VM and your Application

Citation preview

Page 1: Debugging Ruby

Debugging RubyAman Gupta

troubleshooting the VM and your app

Page 2: Debugging Ruby

About Aman GuptaSan Francisco, CA

Ruby Performance Consulting(Twitter, Github, Heroku, ZenDesk)

Ruby Hero 2009

EventMachine, amqp, REE, sinbook, perftools.rb, gdb.rb

github.com/tmm1

@tmm1

Page 3: Debugging Ruby

The Ruby VMMRI- Ruby 1.8.{6,7}

REE- Ruby 1.8.7 + patches

Tested primarily on:

Linux operating system

Intel CPUs (i686 and x86_64)

Tools for C code

Page 4: Debugging Ruby

list open fileslsof

lsof -nPp <pid>

Page 5: Debugging Ruby

lsof -nPp <pid>-nInhibits the conversion of network numbers to host names for network files.

-PInhibits the conversion of port numbers to port names for network files

FD TYPE SIZE NODE NAMEcwd DIR 4096 10437557 /var/www/myapprtd DIR 4096 2 /txt REG 1925210 1089684 /usr/bin/rubymem REG 48188 7061523 /json-1.1.9/ext/json/ext/generator.somem REG 46970 7061524 /json-1.1.9/ext/json/ext/parser.somem REG 3428025 1131339 /memcached-0.17.4/lib/rlibmemcached.somem REG 152972 5303443 /mysql-2.8.1/lib/mysql_api.somem REG 154013 1089708 /usr/lib/libtcmalloc_minimal.so.0.0.0mem REG 119288 11616294 /lib/ld-2.7.so 0u CHR 303 /dev/null 1w REG 4746405529 1106245 /usr/local/nginx/logs/error.log 2w REG 4746405529 1106245 /usr/local/nginx/logs/error.log 3u IPv4 TCP 10.8.85.66:33326->10.8.85.68:3306 (ESTABLISHED) 10u IPv4 TCP 10.8.85.66:33327->10.8.85.68:3306 (ESTABLISHED) 11u IPv4 TCP 127.0.0.1:58273->127.0.0.1:11211 (ESTABLISHED) 12u REG 20182 12463107 /tmp/RackMultipart.28957.0 33u IPv4 TCP 174.36.83.42:37466->69.63.180.21:80 (ESTABLISHED)

jsonmemcached

mysqlhttp

Page 6: Debugging Ruby

trace system calls and signalsstrace

strace -ttTp <pid> -o <file>

strace -cp <pid>

Page 7: Debugging Ruby

strace -cp <pid>-cCount time, calls, and errors for each system call and report a summary on program exit.

-p pidAttach to the process with the process ID pid and begin tracing.

% time seconds usecs/call calls errors syscall------ ----------- ----------- --------- --------- ---------------- 50.39 0.000064 0 1197 592 read 34.65 0.000044 0 609 writev 14.96 0.000019 0 1226 epoll_ctl 0.00 0.000000 0 4 close 0.00 0.000000 0 1 select 0.00 0.000000 0 4 socket 0.00 0.000000 0 4 4 connect 0.00 0.000000 0 1057 epoll_wait------ ----------- ----------- --------- --------- ----------------100.00 0.000127 4134 596 total

Page 8: Debugging Ruby

strace -ttTp <pid> -o <file>-tPrefix each line of the trace with the time of day.

-ttIf given twice, the time printed will include the microseconds.

-TShow the time spent in system calls. This records the time difference between the beginning and the end of each system call.

-o filenameWrite the trace output to the file filename rather than to stderr.

01:09:11.266949 epoll_wait(9, {{EPOLLIN, {u32=68841296, u64=68841296}}}, 4096, 50) = 1 <0.033109>01:09:11.300102 accept(10, {sa_family=AF_INET, sin_port=38313, sin_addr="127.0.0.1"}, [1226]) = 22 <0.000014>01:09:11.300190 fcntl(22, F_GETFL) = 0x2 (flags O_RDWR) <0.000007>01:09:11.300237 fcntl(22, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000008>01:09:11.300277 setsockopt(22, SOL_TCP, TCP_NODELAY, [1], 4) = 0 <0.000008>01:09:11.300489 accept(10, 0x7fff5d9c07d0, [1226]) = -1 EAGAIN <0.000014>01:09:11.300547 epoll_ctl(9, EPOLL_CTL_ADD, 22, {EPOLLIN, {u32=108750368, u64=108750368}}) = 0 <0.000009>01:09:11.300593 epoll_wait(9, {{EPOLLIN, {u32=108750368, u64=108750368}}}, 4096, 50) = 1 <0.000007>01:09:11.300633 read(22, "GET / HTTP/1.1\r"..., 16384) = 772 <0.000012>01:09:11.301727 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 <0.000007>01:09:11.302095 poll([{fd=5, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout) <0.000008>01:09:11.302144 write(5, "1\0\0\0\0\0\0-\0\0\0\3SELECT * FROM `table`"..., 56) = 56 <0.000023>01:09:11.302221 read(5, "\25\1\0\1,\2\0x\234m"..., 16384) = 284 <1.300897>

Page 9: Debugging Ruby

stracing ruby: SIGVTALRM15:45:51.658164 --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---15:45:51.658244 rt_sigreturn(0x1a) = 2207807 <0.000009>15:45:51.678208 --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---15:45:51.678271 rt_sigreturn(0x1a) = 0 <0.000009>15:45:51.698161 --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---15:45:51.698216 rt_sigreturn(0x1a) = 140734552062624 <0.000009>15:45:51.718154 --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---15:45:51.718192 rt_sigreturn(0x1a) = 140734552066688 <0.000009>15:45:51.738185 --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---15:45:51.738221 rt_sigreturn(0x1a) = 11333952 <0.000008>15:45:51.758162 --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---15:45:51.758216 rt_sigreturn(0x1a) = 0 <0.000009>15:45:51.818168 --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---15:45:51.819817 rt_sigreturn(0x1a) = 1 <0.000010>15:45:51.838196 --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---

ruby 1.8 uses signals to schedule its green threads

process receives a SIGVTALRM signal every 10ms

Page 10: Debugging Ruby

stracing ruby: sigprocmask

debian/redhat compile ruby with --enable-pthread

uses a native thread timer for SIGVTALRM

causes excessive calls to sigprocmask: 30% slowdown!

% time seconds usecs/call calls errors syscall------ ----------- ----------- --------- --------- ----------------100.00 0.326334 0 3568567 rt_sigprocmask 0.00 0.000000 0 9 read 0.00 0.000000 0 10 open 0.00 0.000000 0 10 close 0.00 0.000000 0 9 fstat 0.00 0.000000 0 25 mmap------ ----------- ----------- --------- --------- ----------------100.00 0.326334 3568685 0 total

Page 11: Debugging Ruby

dump traffic on a networktcpdump

tcpdump -i eth1 -s 1500 -nqA tcp dst port 80

tcpdump -i eth0 -s 1500 -nqA tcp dst port 3306

tcpdump -i eth1 -s 1500 -w <file> tcp dst port 80

Page 12: Debugging Ruby

tcpdump -i <eth> -s <len> -nqA <expr>

-i <eth>Network interface.

-s <len>Snarf len bytes of data from each packet.

-nDon't convert addresses (host addresses, port numbers) to names.

-qQuiet output. Print less protocol information.

-APrint each packet (minus its link level header) in ASCII.

-w <file>Write the raw packets to file rather than printing them out.

<expr>libpcap expression, for example: tcp src port 80 tcp dst port 3306

tcpdump -i <eth> -w <file> <expr>

Page 13: Debugging Ruby

tcp dst port 330619:51:06.501632 IP 10.8.85.66.50443 > 10.8.85.68.3306: tcp 98E..."K@[email protected]..[......W....SELECT * FROM `votes` WHERE (`poll_id` = 72621) LIMIT 1

tcp dst port 8019:52:20.216294 IP 24.203.197.27.40105 > 174.37.48.236.80: tcp 438E...*[email protected].%&.....%0....POx..%s.oP.......GET /poll_images/cld99erh0/logo.png HTTP/1.1Accept: */*Referer: http://apps.facebook.com/realpolls/?_fb_q=1

tcpdump -w <file>

Page 14: Debugging Ruby

Google’s CPU profilergoogle-perftools

CPUPROFILE=/tmp/myprof ./myapp

export LD_PRELOAD=libprofiler.so

export DYLD_INSERT_LIBRARIES=libprofiler.dylib

pprof ./myapp /tmp/myprof

Page 15: Debugging Ruby

wget http://google-perftools.googlecode.com/files/google-perftools-1.4.tar.gztar zxvf google-perftools-1.4.tar.gzcd google-perftools-1.4

./configure --prefix=/optmakesudo make install

# for linuxexport LD_PRELOAD=/opt/lib/libprofiler.so

# for osxexport DYLD_INSERT_LIBRARIES=/opt/lib/libprofiler.dylib

CPUPROFILE=/tmp/ruby.prof ruby -e' 5_000_000.times{ "hello world" }'

pprof `which ruby` --text /tmp/ruby.prof

download

compile

profile

report

setup

Page 16: Debugging Ruby

Total: 103 samples 20 19.4% 19.4% 95 92.2% rb_yield_0 11 10.7% 30.1% 103 100.0% rb_eval 8 7.8% 37.9% 12 11.7% gc_sweep 3 2.9% 68.9% 52 50.5% rb_str_new3 3 2.9% 74.8% 3 2.9% obj_free 3 2.9% 77.7% 103 100.0% int_dotimes 3 2.9% 80.6% 12 11.7% gc_mark

pprof ruby ruby.prof --text

pprof ruby ruby.prof --gif

Page 17: Debugging Ruby

Profiling MRI10% of production VM time spent in rb_str_sub_bang

String#sub!

called from Time.parse

return unless str.sub!(/\A(\d{1,2})/, '')return unless str.sub!(/\A( \d|\d{1,2})/, '')return unless str.sub!(/\A( \d|\d{1,2})/, '')return unless str.sub!(/\A(\d{1,3})/, '')return unless str.sub!(/\A(\d{1,2})/, '')return unless str.sub!(/\A(\d{1,2})/, '')

switch to third_base gem

ThirdBase: Fast and Easy Date/DateTime class for Ruby

Page 18: Debugging Ruby

Profiling EM + threads

known issue: EM+threads = slow

memcpy??

thread context switches copy the stack w/ memcpy

EM allocates huge buffer on the stack

solution: move buffer to the heap

Total: 3763 samples 2764 73.5% catch_timer 989 26.3% memcpy 3 0.1% st_lookup 2 0.1% rb_thread_schedule 1 0.0% rb_eval 1 0.0% rb_newobj 1 0.0% rb_gc_force_recycle

Page 19: Debugging Ruby

trace library callsltrace

ltrace -ttTp <pid> -o <file>

ltrace -cp <pid>

Page 20: Debugging Ruby

% time seconds usecs/call calls function------ ----------- ----------- --------- -------------------- 48.65 11.741295 617 19009 memcpy 30.16 7.279634 831 8751 longjmp 9.78 2.359889 135 17357 _setjmp 8.91 2.150565 285 7540 malloc 1.10 0.265946 20 13021 memset 0.81 0.195272 19 10105 __ctype_b_loc 0.35 0.084575 19 4361 strcmp 0.19 0.046163 19 2377 strlen 0.03 0.006272 23 265 realloc------ ----------- ----------- --------- --------------------100.00 24.134999 82999 total

ltrace -c ruby threaded_em.rb

01:24:48.769408 --- SIGVTALRM (Virtual timer expired) ---01:24:48.769616 memcpy(0x1216000, "", 1086328) = 0x1216000 <0.000578>01:24:48.770555 memcpy(0x6e32670, "\240&\343v", 1086328) = 0x6e32670 <0.000418>

01:24:49.899414 --- SIGVTALRM (Virtual timer expired) ---01:24:49.899490 memcpy(0x1320000, "", 1082584) = 0x1320000 <0.000628>01:24:49.900474 memcpy(0x6e32670, "", 1086328) = 0x6e32670 <0.000479>

ltrace -ttT -e memcpy ruby threaded_em.rb

Page 21: Debugging Ruby

trace dlopen’d library callsltrace/libdl

http://github.com/ice799/ltrace/tree/libdl

ltrace -F <conf> -bg -x <symbol> -p <pid>

Page 22: Debugging Ruby

ltrace -F <conf> -b -g -x <sym>-bIgnore signals.

-gIgnore libraries linked at compile time.

-F <conf>Read prototypes from config file.

-x <sym>Trace calls to the function sym.

-s <num>Show first num bytes of string args.

-F ltrace.confint mysql_real_query(addr,string,ulong); void garbage_collect(void);int memcached_set(addr,string,ulong,string,ulong);

Page 23: Debugging Ruby

ltrace -x garbage_collect19:08:06.436926 garbage_collect() = <void> <0.221679>19:08:15.329311 garbage_collect() = <void> <0.187546>19:08:17.662149 garbage_collect() = <void> <0.199200>19:08:20.486655 garbage_collect() = <void> <0.205864>19:08:25.102302 garbage_collect() = <void> <0.214295>19:08:35.552337 garbage_collect() = <void> <0.189172>

ltrace -x mysql_real_query19:09:11.493395 mysql_real_query(0x19c7a500, "SELECT * FROM `users`", 21) = 0 <1.206506>19:09:16.630981 mysql_real_query(0x1c9e0500, "SET NAMES 'UTF8'", 16) = 0 <0.000324>19:09:16.631446 mysql_real_query(0x1c9e0500, "SET SQL_AUTO_IS_NULL=0", 22) = 0 <0.000322>19:09:16.654231 mysql_real_query(0x1c9e0500, "COMMIT", 6) = 0 <0.000181>

ltrace -x memcached_setmemcached_set(0x15d46b80, "Status:5456561633", 21, "\004\bo:\01{", 366) = 0 <0.001116>memcached_set(0x15d46b80, "Status:5453277696", 21, "\004\bo:\01{", 333) = 0 <0.000224>memcached_set(0x15d46b80, "Status:5435377757", 21, "\004\bo:\01{", 298) = 0 <0.001850>memcached_set(0x15d46b80, "Status:5435122010", 21, "\004\bo:\01{", 302) = 0 <0.000530>memcached_set(0x15d46b80, "Status:5407037167", 21, "\004\bo:\01{", 318) = 0 <0.000291>memcached_set(0x15d46b80, "Status:5405690802", 21, "\004\bo:\01{", 299) = 0 <0.000658>memcached_set(0x15d46b80, "Status:5343957534", 21, "\004\bo:\01{", 264) = 0 <0.000243>

Page 24: Debugging Ruby

the GNU debuggergdb

gdb <program> <pid>gdb <program>

Be sure to build with:-ggdb-O0

Page 25: Debugging Ruby

gdb walkthrough% gdb ./test-it (gdb) b averageBreakpoint 1 at 0x1f8e: file test-it.c, line 3.

(gdb) runStarting program: /Users/joe/test-it Reading symbols for shared libraries ++. doneBreakpoint 1, average (x=5, y=6) at test-it.c:33 int sum = x + y;

(gdb) bt#0 average (x=5, y=6) at test-it.c:3#1 0x00001fec in main () at test-it.c:12

(gdb) s4 double avg = sum / 2.0;(gdb) s5 return avg;

(gdb) p avg$1 = 5.5

(gdb) p sum$2 = 11

start gdbset breakpoint on function named average

run program

hit breakpoint!

show backtrace

function stack

single step

print variables

Page 26: Debugging Ruby

(gdb) where#0 0x0002a55e in rb_call (klass=1386800, recv=5056455, mid=42, argc=1, argv=0xbfffe5c0, scope=0, self=1403220) at eval.c:6125#1 0x000226ef in rb_eval (self=1403220, n=0x1461e4) at eval.c:3493#2 0x00026d01 in rb_yield_0 (val=5056455, self=1403220, klass=0, flags=0, avalue=0) at eval.c:5083#3 0x000270e8 in rb_yield (val=5056455) at eval.c:5168#4 0x0005c30c in int_dotimes (num=1000000001) at numeric.c:2946#5 0x00029be3 in call_cfunc (func=0x5c2a0 <int_dotimes>, recv=1000000001, len=0, argc=0, argv=0x0) at eval.c:5759#6 0x00028fd4 in rb_call0 (klass=1387580, recv=1000000001, id=5785, oid=5785, argc=0, argv=0x0, body=0x152b24, flags=0) at eval.c:5911#7 0x0002a7a7 in rb_call (klass=1387580, recv=1000000001, mid=5785, argc=0, argv=0x0, scope=0, self=1403220) at eval.c:6158#8 0x000226ef in rb_eval (self=1403220, n=0x146284) at eval.c:3493#9 0x000213e3 in rb_eval (self=1403220, n=0x1461a8) at eval.c:3223#10 0x0001ceea in eval_node (self=1403220, node=0x1461a8) at eval.c:1437#11 0x0001d60f in ruby_exec_internal () at eval.c:1642#12 0x0001d660 in ruby_exec () at eval.c:1662#13 0x0001d68e in ruby_run () at eval.c:1672#14 0x000023dc in main (argc=2, argv=0xbffff7c4, envp=0xbffff7d0) at main.c:48

Ruby VM stack traces

rb_eval recursively executes ruby code in 1.8

Page 27: Debugging Ruby

#include "ruby.h"

VALUEsegv(){ VALUE array[1]; array[1000000] = NULL; return Qnil;}

voidInit_segv(){ rb_define_method(rb_cObject, "segv", segv, 0);}

Debugging Ruby Segfaultstest_segv.rb:4: [BUG] Segmentation faultruby 1.8.7 (2008-08-11 patchlevel 72) [i686-darwin9.7.0]

def test require 'segv' 4.times do Dir.chdir '/tmp' do Hash.new{ segv }[0] end endend

sleep 10test()

Page 28: Debugging Ruby

$ sudo gdb ruby 23611Attaching to program: ruby, process 236110x00007fa5113c0c93 in nanosleep () from /lib/libc.so.6(gdb) cContinuing.

Program received signal SIGBUS, Bus error.segv () at segv.c:77 array[1000000] = NULL;

1. Attach to running process$ ps aux | grep rubyjoe 23611 0.0 0.1 25424 7540 S Dec01 0:00 ruby test_segv.rb

2. Use a coredump

$ sudo mkdir /cores$ sudo chmod 777 /cores$ sudo sysctl kernel.core_pattern=/cores/%e.core.%s.%p.%t

Process.setrlimit Process::RLIMIT_CORE, 300*1024*1024

$ sudo gdb ruby /cores/ruby.core.6.23611.1259781224

Page 29: Debugging Ruby

def test require 'segv' 4.times do Dir.chdir '/tmp' do Hash.new{ segv }[0] end endend

test()

(gdb) where#0 segv () at segv.c:7#1 0x000000000041f2be in call_cfunc () at eval.c:5727...#13 0x000000000043ba8c in rb_hash_default () at hash.c:521...#19 0x000000000043b92a in rb_hash_aref () at hash.c:429...#26 0x00000000004bb7bc in chdir_yield () at dir.c:728#27 0x000000000041d8d7 in rb_ensure () at eval.c:5528#28 0x00000000004bb93a in dir_s_chdir () at dir.c:816...#35 0x000000000041c444 in rb_yield () at eval.c:5142#36 0x0000000000450690 in int_dotimes () at numeric.c:2834...#48 0x0000000000412a90 in ruby_run () at eval.c:1678#49 0x000000000041014e in main () at main.c:48

Page 30: Debugging Ruby

Enough C!

What about Ruby?

Page 31: Debugging Ruby

google-perftools for rubyperftools.rb

CPUPROFILE=/tmp/myprof ruby myapp.rb

export RUBYOPT=”-r`gem which perftools | tail -1`”

pprof.rb /tmp/myprof

gem install perftools.rb

http://github.com/tmm1/perftools.rb

Page 32: Debugging Ruby

require 'sinatra'

get '/sleep' do sleep 0.25 'done'end

get '/compute' do proc{ |n| a,b=0,1 n.times{ a,b = b,a+b } b }.call(10_000) 'done'end

$ ab -c 1 -n 50 http://127.0.0.1:4567/compute$ ab -c 1 -n 50 http://127.0.0.1:4567/sleep

== Sinatra has ended his set (crowd applauds)PROFILE: interrupts/evictions/bytes = 232/0/2152

Total: 232 samples 83 35.8% 35.8% 118 50.9% Sinatra::Application#GET /compute 56 24.1% 59.9% 56 24.1% garbage_collector 35 15.1% 75.0% 113 48.7% Integer#times

Sampling profiler:

232 samples total

83 samples were in /compute

118 samples had /compute on the stack but were in another function

/compute accounts for 50% of process, but only 35% of time was in /compute itself

Page 33: Debugging Ruby

CPUPROFILE=app.profCPUPROFILE_REALTIME=1CPUPROFILE=app-rt.prof

Page 34: Debugging Ruby

redis-rb bottleneck

Page 35: Debugging Ruby

why is rubygems slow?

Page 36: Debugging Ruby

gdb with MRI hooksgdb.rb

gdb.rb <pid>

http://github.com/tmm1/gdb.rb

Page 37: Debugging Ruby

def test require 'segv' 4.times do Dir.chdir '/tmp' do Hash.new{ segv }[0] end endend

test()(gdb) ruby threads

0xa3e000 main curr thread THREAD_RUNNABLE WAIT_NONE node_vcall segv in test_segv.rb:5 node_call test in test_segv.rb:5 node_call call in test_segv.rb:5 node_call default in test_segv.rb:5 node_call [] in test_segv.rb:5 node_call test in test_segv.rb:4 node_call chdir in test_segv.rb:4 node_call test in test_segv.rb:3 node_call times in test_segv.rb:3 node_vcall test in test_segv.rb:9

Page 38: Debugging Ruby

(gdb) ruby threads list0x15890 main thread THREAD_STOPPED WAIT_JOIN(0x19ef400) 4417 bytes0x19ef4 thread THREAD_STOPPED WAIT_TIME(57.10) 6267 bytes0x19e34 thread THREAD_STOPPED WAIT_FD(5) 10405 bytes0x19dc4 thread THREAD_STOPPED WAIT_NONE 14237 bytes0x19dc8 thread THREAD_STOPPED WAIT_NONE 14237 bytes0x19dcc thread THREAD_STOPPED WAIT_NONE 14237 bytes0x22668 thread THREAD_STOPPED WAIT_NONE 14237 bytes0x1d630 curr thread THREAD_RUNNABLE WAIT_NONE

(gdb) ruby eval 1+23(gdb) ruby eval Thread.current#<Thread:0x1d630 run>(gdb) ruby eval Thread.list.size8

(gdb) ruby objects HEAPS 8 SLOTS 1686252 LIVE 893327 (52.98%) FREE 792925 (47.02%)

scope 1641 (0.18%) regexp 2255 (0.25%) data 3539 (0.40%) class 3680 (0.41%) hash 6196 (0.69%) object 8785 (0.98%) array 13850 (1.55%) string 105350 (11.79%) node 742346 (83.10%)

(gdb) ruby objects strings 140 u'lib' 158 u'0' 294 u'\n' 619 u''

30503 unique strings 3187435 bytes

Page 39: Debugging Ruby

rails_warden leak(gdb) ruby objects classes 1197 MIME::Type 2657 NewRelic::MetricSpec 2719 TZInfo::TimezoneTransitionInfo 4124 Warden::Manager 4124 MethodOverrideForAll 4124 AccountMiddleware 4124 Rack::Cookies 4125 ActiveRecord::ConnectionAdapters::ConnectionManagement 4125 ActionController::Session::CookieStore 4125 ActionController::Failsafe 4125 ActionController::ParamsParser 4125 Rack::Lock 4125 ActionController::Dispatcher 4125 ActiveRecord::QueryCache 4125 ActiveSupport::MessageVerifier 4125 Rack::Head

middleware chain leaking per request

Page 40: Debugging Ruby

mongrel sleeper thread 0x16814c00 thread THREAD_STOPPED WAIT_TIME(0.47) 1522 bytes node_fcall sleep in lib/mongrel/configurator.rb:285 node_fcall run in lib/mongrel/configurator.rb:285 node_fcall loop in lib/mongrel/configurator.rb:285 node_call run in lib/mongrel/configurator.rb:285 node_call initialize in lib/mongrel/configurator.rb:285 node_call new in lib/mongrel/configurator.rb:285 node_call run in bin/mongrel_rails:128 node_call run in lib/mongrel/command.rb:212 node_call run in bin/mongrel_rails:281 node_fcall (unknown) in bin/mongrel_rails:19

def run @listeners.each {|name,s| s.run }

$mongrel_sleeper_thread = Thread.new { loop { sleep 1 } }end

Page 41: Debugging Ruby

god memory leaks(gdb) ruby objects arrays elements instances 94310 3 94311 3 94314 2 94316 1 5369 arrays 2863364 member elements

many arrays with 90k+ elements!

5 separate god leaks fixed by Eric Lindvall with the help of gdb.rb!

43 God::Process 43 God::Watch 43 God::Driver 43 God::DriverEventQueue 43 God::Conditions::MemoryUsage 43 God::Conditions::ProcessRunning 43 God::Behaviors::CleanPidFile 45 Process::Status 86 God::Metric327 God::System::SlashProcPoller327 God::System::Process406 God::DriverEvent

Page 42: Debugging Ruby

ruby method cache

(gdb) b rb_clear_cacheBreakpoint 1 at 0x41067b: file eval.c, line 351.(gdb) cContinuing.

Breakpoint 1, rb_clear_cache () at eval.c:351351 if (!ruby_running) return;(gdb) ruby threads

0x1623000 main curr thread THREAD_RUNNABLE WAIT_NONE node_call extend_object in sin.rb:23 node_call extend in sin.rb:23 node_call GET /other in lib/sinatra/base.rb:779 node_call GET /other in lib/sinatra/base.rb:779 node_call call in lib/sinatra/base.rb:779 node_fcall route in lib/sinatra/base.rb:474

(gdb) ruby methodcache UnboundMethod#arity Hash#[]= Class#private Array#freeze Module#Integer Fixnum#< Class#is_a? Fixnum#+ Class#protected Class#>=

2028 empty slots (99.02%)

Module#extend wipes the entire method cache!

Page 43: Debugging Ruby

ruby memory leak detectorbleak_house

ruby-bleak-house myapp.rb

export RUBYOPT=”-r`gem which bleak_house | tail -1`”

bleak /tmp/bleak.5979.000.dump

gem install bleak_house

http://github.com/fauna/bleak_house

Page 44: Debugging Ruby

191691 total objectsFinal heap size 191691 filled, 220961 freeDisplaying top 20 most common line/class pairs89513 __null__:__null__:__node__41438 __null__:__null__:String2348 lib/ruby/site_ruby/1.8/rubygems/specification.rb:557:Array1508 lib/ruby/gems/1.8/specifications/gettext-1.9.gemspec:14:String1021 lib/ruby/gems/1.8/specifications/heel-0.2.0.gemspec:14:String 951 lib/ruby/site_ruby/1.8/rubygems/version.rb:111:String 935 lib/ruby/site_ruby/1.8/rubygems/specification.rb:557:String 834 lib/ruby/site_ruby/1.8/rubygems/version.rb:146:Array

BleakHouse

installs a patched version of ruby: ruby-bleak-house

unlike gdb.rb, see where objects were created (file:line)

create multiple dumps over time with `kill -USR2 <pid>` and compare to find leaks

Page 45: Debugging Ruby

Coming soon:bleak_house++

no patches or need to recompile ruby“assembly metaprogramming” to setup a trampoline on rb_new_objhttp://timetobleed.com/rewrite-your-ruby-vm-at-runtime-to-hot-patch-useful-features/

memory profilercoredump a production ruby processload the core to generate profiles of memory usage and leaks

Page 46: Debugging Ruby

Questions?

@joedamato

timetobleed.com

@tmm1

github.com/tmm1

Thanks for listening!