1. Debug Process and Analyze Process Core on Solaris
1.1 General Commands
1.2 Analyze Process Core by Using mdb (Modular Debugger)
1.3 Analyze Process Core by Using adb on Solaris
1.4 Analyze Process Core by Using gdb on Solaris
1.5 Debug Process and Analyze Core by Using truss on Solaris
1.6 Analyze Process by using dbx
1.7 Generate a Process Core Dump on Solaris
1.8 Create Process Core
1.9 Examining Memory Address Spaces with mdb on Solaris
1.10 Debug Kernel, System Calls and Processes (DTRACE)
1.11 Other Debugging Tools on Solaris
2. Debug Process and Analyze Process Core on Linux
2.1 Debug Processes by using STRACE
2.2 Analyze a Process Core Dump with gdb on Linux
2.3 Analyze the Process Core using dbx on Linux
2.4 Analyze a Core Dump Using Oprofile on Linux
2.5 Debug Libraries and Symbols on Linux
2.6 Other Debugging Tools on Linux
3. Debug Process and Analyze Process Core on HP-UX
3.1 Debug Processes by using tusc
3.2 Installing tusc on HP-UX 11.xx
3.3 Debug Processes and Core Files by using HP WDB / GDB
3.4 Debug Processes by using truss
3.5 Analyze Process Performance by using Caliper on HP-UX 11.xx
3.6 Analyze Process Performance by using Prospect on HP-UX 11.xx
3.7 Live Memory Analysis on HP-UX 11.xx by using KWDB
3.8 Other Debugging Tools on HP-UX 11.xx
4. Debug Process and Analyze Process Core on IBM AIX
4.1 Debug Processes by using proctools
4.2 Debug Processes by using trace
4.3 Debug Processes by using syscalls
4.4 Debug Processes by using watch
4.5 Debug Processes by using ProbeVue
4.6 Debug Processes by using truss
4.7 Debug Processes by using dbx
4.8 Analyze a Process Core by using KDB
4.9 Other Debugging Tools on IBM AIX
5. Debug Process and Analyze Process Core on IRIX
6. Debug Process and Analyze Process Core on Tru64
7. Generate / Analyze a Crash Dump on Solaris
7.1 Save a Crash Dump on a Panic'd System
7.2 Set Up a System to Save a Crash Dump
7.3 Crash Dump Analysis on Solaris by using MDB
7.4 Service Tool Bundle Service Crash Analysis Tool
7.5 Crash Dump Analysis on Solaris by using ADB
7.6 Crash Dump Analysis on Solaris by using Crash
7.7 Crash Dump Analysis on Solaris by using ACT
7.8 Other Crash Dump Analysis Tools on Solaris
8. Generate / Analyze a Crash Dump on HP-UX
8.1 Crash Dump Analysis by using KWDB
8.2 Remote Crash Dump Analysis
8.3 Crash Dump Analysis by using Q4
8.4 Crash Dump Analysis by using KWDB Q4 Mode
8.5 Crash Dump Analysis by using HP WDB / GDB
8.6 Crash Dump Analysis by using adb
9. Generate / Analyze a Crash Dump on Linux
9.1 Enable Saving Crash Dumps by using kexec-tools
9.2 Simulate a Panic and Save a Crash Dump
9.3 Analyze a Crash Dump by using crash
9.4 Analyze a Crash Dump by using GDB
9.5 Analyze a Crash Dump by using LKCD
9.6 Other Useful Commands
10. Generate / Analyze a Crash Dump on Linux
10.1 Set Up and Enable KDB
10.2 Analyze a Crash Dump by using KDB
11. Debugging Tools
11.1 Information
1. Debug Process and Analyze Process Core on Solaris
1.1 General Commands

Show Process Tracebacks:
pstack core

Show Process Tracebacks on a Running Process:
pstack process_id

Show Process Threads Info:
pflags core

Show Process Memory Mapping:
pmap core

Show Process Memory Mapping for a Running Process:
pmap -sx `pgrep testprog`

Show Kernel Info:
kstat -n system_misc

Check System Pages:
kstat -n system_pages

Check Processes:
prstat -Lmc 10 10 > prstat.out
more prstat.out

Debug Processes:
pargs core
pcred $$
pldd $$
psig $$
pfiles $$
pfiles pid
pstop $$
prun core
pwait pid
ptree $$
ptree pid
ptime core
pwdx $$
preap core
pgrep -u rmc

Kernel Lock Statistics (use -i 971 as the interval to avoid collisions with the clock interrupt and gather fine-grained data):
lockstat -i 971 sleep 300 > lockstat.out
lockstat -i 971 -I sleep 300 > lockstatI.out

Kernel Profiling:
lockstat -Ikw -i 997 sleep 10

CPU Trap Statistics:
trapstat -t

Gather CPU Hardware Counters per Process:
cputrack -N 20 -c pic0=DC_access,pic1=DC_miss -p 19849
cputrack -N 20 -c pic0=DC_access,pic1=DC_miss bc -l

Gather CPU Statistics:
cpustat -c pic0=Cycle_cnt,pic1=DTLB_miss 1

Check Page Size:
pagesize -a

Set Page Size Preference:
ppgsz -o heap=4M ./testprog

Segmap Hit Rate Statistics:
kstat -n segmap

Dump ELF File:
elfdump -e /bin/ls

Dump Section Headers:
elfdump -c /bin/ls

Invoke the Runtime Linker on the Specified Binary to Check which Libraries are Linked to it:
ldd netstat
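ldd behaves the same way on Linux, so the technique carries over directly; a minimal sanity check against a binary that is present on every system (assuming /bin/sh is dynamically linked):

```shell
# Print each shared library the dynamic linker would load for /bin/sh,
# one "=>" line per resolved library.
ldd /bin/sh
```

If the binary is statically linked, ldd instead reports that it is not a dynamic executable.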
Run pldd on a Running Process:
pldd $$

Get the Linked List of All Processes:
kstat -n var
mdb -k
> max_nprocs/D

Library Tracing:
apptrace ls

Check Scheduling Classes:
dispadmin -l
priocntl -l

Check Scheduling Class and Thread Priority:
ps -eLc

Check the Timeshare Dispatch Table:
dispadmin -g -c TS
1.2 Analyze Process Core by Using mdb (Modular Debugger)
mdb executable_name core_name
$C
$q

OR:
mdb core
::status
::files
::stack
::walkers
::dcmds -l
::cpu
0::print cpu_t
::walk walk_name | ::dcmd
::walk cpu | ::print cpu_t
::sizeof cpu_t
address::list

OR:
mdb -k
1.3 Analyze Process Core by Using adb on Solaris
Invoke the debugger:
adb -c core

Display the message buffer:
$<msgbuf

Get the thread list:
$<threadlist

Check the status:
$<status

Get the process crash time:
time/Y

Get the kernel memory structures:
$<kmastat

Quit the debugger:
$q
1.4 Analyze Process Core by Using gdb on Solaris
The GNU Debugger (GDB) is a powerful debugger available for all major operating systems. In recent Solaris versions, GDB ships with the installation media; the current release can be downloaded from the GNU project.
Start gdb on a core file:
gdb -c core

OR:
gdb a.out core

OR:
gdb path/to/the/binary path/to/the/core

OR at the gdb prompt:
(gdb) core core
If the executable path is not provided, the debugger uses the invocation path of the process that generated the core file, which is stored in the core file itself. If that invocation path is relative, you must specify the executable explicitly when debugging the core file.
To start debugging the process, invoke the core file at the gdb prompt:
(gdb) core core

List status-inquiry commands:
(gdb) help status

List data-examination commands:
(gdb) help data

List stack-examination commands:
(gdb) help stack

Analyze a stack frame by its number:
(gdb) frame number

View code around that frame:
(gdb) list

List local variables:
(gdb) info locals

List file-related commands:
(gdb) help files

List maintenance (internals) commands:
(gdb) help internals

List command aliases:
(gdb) help aliases

List support facilities:
(gdb) help support

List commands for running the program:
(gdb) help running

Trace program execution without stopping it (tracepoint commands):
(gdb) help tracepoints

List user-defined commands:
(gdb) help user-defined

List obscure features:
(gdb) help obscure

View the stack trace:
(gdb) backtrace

List all threads of the process at the time of the crash:
(gdb) info threads

View the specified thread:
(gdb) thread thread_id

Disassemble a specified region of memory:
(gdb) disassemble <address>

Display memory at a specified address as a string:
(gdb) x/s <address>

Display the contents of the registers:
(gdb) info registers

Display all registers, including floating-point registers:
(gdb) info all-registers

Display information about all of the shared libraries:
(gdb) info shared

Print the target currently under the debugger:
(gdb) info files
(gdb) info target

To view the source code:
(gdb) list

Start the target program:
(gdb) run

Set a breakpoint on a function:
(gdb) break sum

On a line number:
(gdb) b 25

At an offset from the current line:
(gdb) b +9
(gdb) b -1

On a memory address (use *):
(gdb) b *0x2324

Set a watchpoint on a variable or expression:
(gdb) watch x
(gdb) watch *(&x)
(gdb) watch *(<type of x> *)<addr of x>

Display a list of breakpoints and watchpoints:
(gdb) info break
(gdb) info watch

Help:
(gdb) help
1.5 Debug Process and Analyze Core by Using truss on Solaris
Trace System Calls of a Process or Command:
truss -p pid
truss -p 2975/3
truss /usr/local/sbin/snmpd

Trace System Calls, Faults and Signals of a Process and Count them:
truss -c -p pid

Trace a Process, Follow its Children and Count Syscalls, Faults and Signals:
truss -cf -p pid

Trace System Calls, Environment Strings and Timestamps for a Process (and Put it in a File):
truss -d -e -p 1873
truss -d -e -f -o /tmp/dbstart.lst -p 2522

Trace System Calls of a Process and Include a Time Delta on Each Line of Trace Output:
truss -d -D -p 1473

Trace a Process Including a Timestamp on Each Line and Include / Exclude Specific System Calls (in this case "read" syscalls):
truss -d -t read -p 1468
truss -d -t !read -p 1468

Trace a "find" and put the output in a file:
truss find . -print > find.out

Trace the "open", "close", "read", and "write" System Calls:
truss -t open,close,read,write find . -print > find.out

Trace a Shell Script:
truss -f -o truss.out spell document

Abbreviating Output:
truss nroff -mm document > nroff.out

Because 97% of the output reports lseek(), read(), and write() system calls, abbreviate it with:
truss -t !lseek,read,write nroff -mm document > nroff.out
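The same "find what dominates before filtering" step can be sketched portably on any saved trace with sort and uniq; the printf below stands in for the syscall-name column extracted from a real truss log:

```shell
# Build a frequency table of call names, most frequent first.
# Sample data substitutes for a real trace file here.
printf 'read\nread\nwrite\nread\nlseek\n' |
  sort | uniq -c | sort -rn
# the most frequent call ("read", 3 times) sorts to the top
```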
Trace library calls from within the C library:
truss -u libc

Trace all user-level calls made to any library other than the C library:
truss -u '*' -u !libc -p 1544

Trace all user-level printf and scanf function calls in the C library:
truss -u 'libc:*printf,*scanf' -p 1100

Trace every user-level function call from anywhere to anywhere:
truss -u a.out -u ld:: -u :: ...

Trace the system call activity of process #1, init:
truss -v all -p 1

Trace a Process's exec() Syscalls and Follow its Children:
truss -ftexec -p pid 2> /dev/null &

Trace System Calls of an Oracle Listener with Timestamps and Put the Output in the File "lsnrctl.truss":
truss -d -o lsnrctl.truss -p 3949

Trace All of the System Calls of syslogd into a File:
truss -o /var/tmp/syslog.truss.out -sall -p `pgrep syslogd`

Trace the System Calls and Forks, showing arguments passed to the exec calls and the environment variables:
truss -aef -p <PID>

OR:
truss -aef lsnrctl dbsnmp_start

Trace the System Calls and Forks, showing exec arguments, environment variables, and the full contents of the I/O buffer for each read() and write() on any of the specified file descriptors, following children:
truss -aef -rall -wall -p <PID>

Trace the full contents of the I/O buffer for each read() and write() on any of the specified file descriptors and follow children:
truss -rall -wall -f -p <PID>

Verbosely trace the full contents of the I/O buffer for each read() and write() on any of the specified file descriptors and follow children:
truss -wall -rall -vall -f /usr/local/sbin/snmpd

Verbosely Trace init:
truss -v all -p 1

Trace Machine Faults:
truss -mall -p 1200

Exclude Machine Faults from the Trace:
truss -m!all -p 1200

Machine Faults that Stop the Process (if one of the specified faults is incurred, truss leaves the process stopped and abandoned):
truss -Mall -p 1200

Run truss to Debug read() and write() syscalls as the Oracle Listener/DBSnmp Starts:
truss -rall -wall lsnrctl start

Count Total CPU Seconds per System Call:
truss -c dd if=500m of=/dev/null bs=16k count=2k

OR:
truss -d -u a.out,libc dd if=500m of=/dev/null bs=16k count=2k
more a.out

Trace all the syscalls, threads and API functions of a CORBA-based process:
truss -t!all -s!all -u libit_*::CORBA* -p 21922
1.6 Analyze Process by using dbx
To invoke dbx:
dbx program_name

OR:
dbx pid

OR:
dbx -a pid

OR:
dbx -d 100 program_name core_file

OR:
dbx -d 100 -a pid

OR:
dbx - `pgrep Freeway`

At the dbx prompt:
(dbx) run
(dbx) where
(dbx) status

Analyze a Process Core by Using dbx:
dbx program_name core

OR:
dbx - core

OR:
dbx a.out core

At the dbx prompt:
(dbx) run
(dbx) where
(dbx) threads
(dbx) status
(dbx) list main
(dbx) print msg
(dbx) check -access
(dbx) check -memuse
(dbx) help
(dbx) quit
1.7 Generate a Process Core Dump on Solaris
coreadm
OR:
savecore -d

If after enabling core file generation your system still does not create a core file, you may need to raise the core file size limit imposed by your shell:
ulimit -a
ulimit -c unlimited
ulimit -H -c unlimited
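A minimal sketch of how the soft and hard core-file limits interact in a shell (lowering the soft limit never needs privileges; raising it is capped by the hard limit):

```shell
# Show the hard limit (the ceiling the soft limit can be raised to),
# then disable core files for this shell and its children by
# dropping the soft limit to 0.
ulimit -H -c          # hard limit
ulimit -S -c 0        # soft limit: no core files from here on
ulimit -S -c          # prints 0
```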
Enable Applications to Generate Core Files:
coreadm -g /path-to-file/%f.%n.%p.core -e global -e process -e global-setid -e proc-setid -e log
1.8 Create Process Core
echo ls
gcore ls
1.9 Examining Memory Address Spaces with mdb on Solaris
prstat
top
ps -ef | grep pid
pmap -x 919
mdb -k

Load the dmod containing the new dcmd:
::load /wd320/max/source/mdb/segpages/i386/segpages.so

Walk through the Segments of the Process Address Space, showing Each Virtual Page in the Segment:
0t919::pid2proc | ::print proc_t p_as | ::walk seg | ::segpages

Count the Pages currently Valid for the Process:
0t919::pid2proc | ::print proc_t p_as | ::walk seg | ::segpages !grep -i " valid" | wc

Count the Pages in Memory Not currently Valid in the Page Table(s) for the Process:
0t919::pid2proc | ::print proc_t p_as | ::walk seg | ::segpages !egrep -i "inmemory" | wc

How Many Pages are Currently Not Valid (and Not in Memory):
0t919::pid2proc | ::print proc_t p_as | ::walk seg | ::segpages !egrep -i " invalid$" | wc

How Large is the Address Space (this should be the total size as reported by pmap):
0t919::pid2proc | ::print proc_t p_as | ::walk seg | ::segpages !egrep -v OFFSET | wc

How Many Pages have been Swapped Out:
0t919::pid2proc | ::print proc_t p_as | ::walk seg | ::segpages !grep -i swapped | wc
pmap -x 919
1.10 Debug Kernel, System Calls and Processes (DTRACE)
dtrace -f /usr/local/sbin/snmpd
dtrace -l -n tcp::entry
dtrace -l -m tcp
dtrace -lv -n fbt:tcp:_info:entry
dtrace -n 'ufs_read:entry { printf("%s\n",stringof(args[0]->v_path)); }'

Get which Process is making the most Syscalls:
dtrace -n 'syscall:::entry { @[execname] = count(); }'

OR:
dtrace -n 'syscall::read:entry { @[execname,pid]=count()}'

Get new Processes with their Arguments:
dtrace -n 'proc:::exec-success { trace(curpsinfo->pr_psargs); }'

Files opened by process:
dtrace -n 'syscall::open*:entry { printf("%s %s",execname,copyinstr(arg0)); }'

Pages paged in by process:
dtrace -n 'vminfo:::pgpgin { @pg[execname] = sum(arg0); }'

Minor faults by process:
dtrace -n 'vminfo:::as_fault { @mem[execname] = sum(arg0); }'

System Call Counts by Name:
dtrace -n 'syscall:::entry { @syscalls[probefunc] = count(); }'

Syscall Count by Program:
dtrace -n 'syscall:::entry { @num[execname] = count(); }'

Syscall Count by Syscall:
dtrace -n 'syscall:::entry { @num[probefunc] = count(); }'

Syscall Count by Process:
dtrace -n 'syscall:::entry { @num[pid,execname] = count(); }'

Syscalls by Type:
dtrace -n 'syscall:::entry { @[probefunc] = count(); }'

Match the syscall probe only when the execname matches the investigation target, filebench, and count by syscall name:
dtrace -n 'syscall:::entry /execname == "filebench"/ { @[probefunc] = count(); }'

Kernel:

Kernel Profiling:
dtrace -n 'profile-997ms / arg0 != 0 / { @ks[stack()]=count() }'

Counting xcalls:
dtrace -n 'xcalls { @[probefunc] = count() }'

Probe Virtual Memory Info for a Running StarOffice Process:
dtrace -P vminfo'/execname == "soffice.bin"/{@[probename] = count()}'
dtrace -s ./soffice.d

Successful Signal Details:
dtrace -n 'proc:::signal-send /pid/ { printf("%s -%d %d",execname,args[2],args[1]->pr_pid); }'

Kernel stack trace profile at 1001 Hertz:
dtrace -n 'profile-1001 { @[stack()] = count(); }'

Thread off-cpu stack trace count:
dtrace -n 'sched:::off-cpu { @[stack()] = count(); }'

Adaptive lock block time totals (ns) by kernel stack trace:
dtrace -qn 'lockstat:::adaptive-block { @[stack(5), "^^^ total ns:"] = sum(arg1); }'

Kernel function call counts for the module "zfs":
dtrace -n 'fbt:zfs::entry { @[probefunc] = count(); }'

Kernel function call counts for functions beginning with "hfs_":
dtrace -n 'fbt::hfs_*:entry { @[probefunc] = count(); }'

Kernel stack back trace counts for calls to the function "arc_read()" (for example):
dtrace -n 'fbt::arc_read:entry { @[stack()] = count(); }'

Identify kernel stacks issuing disk I/O:
dtrace -n 'io:::start { @[stack()] = count(); }'

Trace I/O errors along with the disk and error number:
dtrace -n 'io:::done /args[0]->b_flags & B_ERROR/ { printf("%s err: %d", args[1]->dev_statname, args[0]->b_error); }'

Look at what is calling semsys:
dtrace -n 'syscall::semsys:entry /execname == "filebench"/ { @[ustack()] = count();}'

Probe Functions:
dtrace -n 'syscall:::entry { @scalls[probefunc] = count() }'

Check which Process is Creating Threads:
dtrace -n 'thread_create:entry { @[execname]=count()}'

CPU:

What are the top user functions running on CPU (% usr time)?
dtrace -n 'profile-997hz /arg1/ { @[execname, ufunc(arg1)] = count(); }'

What are the top 5 kernel stack traces on CPU (shows why)?
dtrace -n 'profile-997hz { @[stack()] = count(); } END { trunc(@, 5); }'

What threads are on CPU, counted by their thread name? (FreeBSD)
dtrace -n 'profile-997 { @[stringof(curthread->td_name)] = count(); }'

What system calls are being executed by the CPUs?
dtrace -n 'syscall:::entry { @[probefunc] = count(); }'

Which processes are executing the most system calls?
dtrace -n 'syscall:::entry { @[pid, execname] = count(); }'

Get Interrupts by CPU:
dtrace -n 'sdt:::interrupt-start { @num[cpu] = count(); }'

Get Functions called by a Process (libc entry probes):
dtrace -n 'pid221:libc::entry'

Find what is Context Switching onto the CPU the Most:
dtrace -n 'sched:::on-cpu { @[execname] = count(); } profile:::tick-20s { exit(0); }'

Memory:

Tracking memory page faults by process name:
dtrace -n 'vminfo:::as_fault { @mem[execname] = sum(arg0); }'

Process allocation (via malloc()) requested size distribution plot:
dtrace -n 'pid$target::malloc:entry { @ = quantize(arg0); }' -p PID

Process allocation (via malloc()) by user stack trace and total requested size:
dtrace -n 'pid$target::malloc:entry { @[ustack()] = sum(arg0); }' -p PID

File System:

Trace file creat() calls with file and process name:
dtrace -n 'syscall::creat*:entry { printf("%s %s", execname, copyinstr(arg0)); }'

Frequency count stat() files:
dtrace -n 'syscall::stat*:entry { @[copyinstr(arg0)] = count(); }'

Tracing "cd":
dtrace -n 'syscall::chdir:entry { printf("%s -> %s", cwd, copyinstr(arg0)); }'

Count read/write syscalls by syscall type:
dtrace -n 'syscall::*read*:entry,syscall::*write*:entry { @[probefunc] = count(); }'

Syscall read(2) by file name:
dtrace -n 'syscall::read:entry { @[fds[arg0].fi_pathname] = count(); }'

Syscall write(2) by file name:
dtrace -n 'syscall::write:entry { @[fds[arg0].fi_pathname] = count(); }'

Syscall read(2) by filesystem type:
dtrace -n 'syscall::read:entry { @[fds[arg0].fi_fs] = count(); }'

Syscall write(2) by filesystem type:
dtrace -n 'syscall::write:entry { @[fds[arg0].fi_fs] = count(); }'

Syscall read(2) by process name for the "zfs" filesystem only:
dtrace -n 'syscall::read:entry /fds[arg0].fi_fs == "zfs"/ { @[execname] = count(); }'

Syscall write(2) by process name and filesystem type:
dtrace -n 'syscall::write:entry { @[execname, fds[arg0].fi_fs] = count(); } END { printa("%18s %16s %16@d\n", @); }'

Check Write Entries:
dtrace -n 'syscall::write:entry { trace(arg2) }'
dtrace -n 'fbt:ufs:ufs_write:entry { printf("%s\n",stringof(args[0]->v_path)); }'

Identify who is responsible for too much reading:
dtrace -n 'syscall::read:entry { @Execs[execname] = count(); }'
dtrace -n 'syscall::open:entry { @Open[copyinstr(arg0)] = count(); }'
dtrace -n 'syscall::exec*:entry { trace(execname); }'

Drill into Complex Structures:
dtrace -qn 'syscall::exec*:entry { printf("%5d %s\n",pid,stringof(curpsinfo->pr_psargs)); }'

Count All ioctl System Calls by Both Executable Name and File Descriptor:
dtrace -n 'syscall::ioctl:entry { @[execname, arg0] = count(); }'

Distribution of Write Sizes by Executable Name:
dtrace -n 'syscall::write:entry { @[execname] = quantize(arg2); }'

Read bytes by process:
dtrace -n 'sysinfo:::readch { @bytes[execname] = sum(arg0); }'

Write bytes by process:
dtrace -n 'sysinfo:::writech { @bytes[execname] = sum(arg0); }'

Read size distribution by process:
dtrace -n 'sysinfo:::readch { @dist[execname] = quantize(arg0); }'

Write size distribution by process:
dtrace -n 'sysinfo:::writech { @dist[execname] = quantize(arg0); }'

Disk I/O size by process:
dtrace -n 'io:::start { printf("%d %s %d",pid,execname,args[0]->b_bcount); }'
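On Linux, where these DTrace sysinfo aggregations are not available, the kernel exposes rough per-process byte counters in /proc/<pid>/io (assuming task I/O accounting is compiled in, as it is on mainstream distributions); reading your own counters needs no privileges:

```shell
# Cumulative bytes this process has caused to hit (or be read
# from) the storage layer, maintained by kernel I/O accounting.
grep -E '^(read|write)_bytes' /proc/self/io
```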
Chase the Hot Lock Caller:
dtrace -n 'pr_p_lock:entry { @s[stack()]=count() }'
dtrace -n 'pr_p_lock:entry { @s[execname]=count() }'
pgrep process_name
dtrace -n 'pid4485:libc:pread:entry { @us[ustack()]=count() }'

Check UFS Reads:
dtrace -q -n 'ufs_read:entry { printf("UFS Read: %s\n",stringof(args[0]->v_path)); }'
dtrace -q -n 'ufs_read:entry { @[execname,stringof(args[0]->v_path)]=count() }'

Show disk I/O sizes as distribution plots, by process name:
dtrace -n 'io:::start { @size[execname] = quantize(args[0]->b_bcount); }'

Processes paging in from the filesystem:
dtrace -n 'vminfo:::fspgin { @[execname] = sum(arg0); }'

Which processes are executing common I/O system calls:
dtrace -n 'syscall::*read:entry,syscall::*write:entry { @rw[execname,probefunc] = count(); }'

What is the rate of disk I/O being issued:
dtrace -n 'io:::start { @io = count(); } tick-1sec { printa("Disk I/Os per second: %@d\n", @io); trunc(@io); }'

NFSv3 count of operations by client address:
dtrace -n 'nfsv3:::op-*-start { @[args[0]->ci_remote] = count(); }'

NFSv3 count of operations by file pathname:
dtrace -n 'nfsv3:::op-*-start { @[args[1]->noi_curpath] = count(); }'

Socket Provider:

Socket accepts by process name:
dtrace -n 'syscall::accept*:entry { @[execname] = count(); }'

Socket connections by process and user stack trace:
dtrace -n 'syscall::connect*:entry { trace(execname); ustack(); }'

mib Provider:

IP event statistics:
dtrace -n 'mib:::ip* { @[probename] = sum(arg0); }'

TCP event statistics with the kernel function:
dtrace -n 'mib:::tcp* { @[strjoin(probefunc, strjoin("() -> ", probename))] = sum(arg0); }'

IP Provider:

Received IP packets by host address:
dtrace -n 'ip:::receive { @[args[2]->ip_saddr] = count(); }'

IP send payload size distribution by destination:
dtrace -n 'ip:::send { @[args[2]->ip_daddr] = quantize(args[2]->ip_plength); }'

TCP Provider:

Who is connecting to what:
dtrace -n 'tcp:::accept-established { @[args[3]->tcps_raddr, args[3]->tcps_lport] = count(); }'

Who isn't connecting to what:
dtrace -n 'tcp:::accept-refused { @[args[2]->ip_daddr, args[4]->tcp_sport] = count(); }'

What am I connecting to?
dtrace -n 'tcp:::connect-established { @[args[3]->tcps_raddr, args[3]->tcps_rport] = count(); }'

IP payload bytes for TCP send, size distribution by destination address:
dtrace -n 'tcp:::send { @[args[2]->ip_daddr] = quantize(args[2]->ip_plength); }'

MySQL:

MySQL query trace by query string:
dtrace -n 'mysql*:::query-start { trace(copyinstr(arg0)) }'

MySQL query count summary by host:
dtrace -n 'mysql*:::query-start { @[copyinstr(arg4)] = count(); }'

MySQL server: trace queries:
dtrace -qn 'pid$target::*mysql_parse*:entry { printf("%Y %s\n", walltimestamp, copyinstr(arg1)); }' -p PID

MySQL client: who's doing what (stack trace by query):
dtrace -Zn 'pid$target:libmysql*:mysql_*query:entry { trace(copyinstr(arg1)); ustack(); }' -p PID
1.11 Other Debugging Tools on Solaris
gcore:

Take a snapshot of a process:
gcore -o output_filename pid

kill:

Kill a process and generate its core dump:
kill -SEGV <pid>
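Whether a core file is actually written depends on the core-size ulimit (and, on Linux, on /proc/sys/kernel/core_pattern), but the signal delivery itself is easy to verify, since a process killed by SIGSEGV exits with status 128 + 11:

```shell
# Start a disposable process, deliver SIGSEGV, and check how it died.
sleep 60 &
pid=$!
kill -SEGV "$pid"
wait "$pid"
echo $?          # 139 if the process died of SIGSEGV (128 + 11)
```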
lsof:

Get Files Opened by the Specified Process:
lsof -p 28290
lsof -a -p 28290

Check How Many Instances of "sendmail" are Open:
lsof -c sendmail

File Descriptor Count:
ps -ef
cd /proc/28290/fd
ls -l | wc -l
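On Linux the same /proc walk can replace lsof for a quick descriptor count; for example, for the current shell:

```shell
# Each open descriptor of a process appears as a symlink under
# /proc/<pid>/fd; counting the entries gives the fd count.
ls /proc/$$/fd | wc -l   # at least 3: stdin, stdout, stderr
```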
Get Files Opened by the Specified User:
lsof -u root

Get File System iNodes:
lsof -i /fs

Check Open Files on the specified File System and the Processes that use them:
lsof /fs

Check iNode Usage on the specified File System:
lsof -i /fs

List All Open Files for the User "abe" and for the Specified Process IDs:
lsof -p 456,123,789 -u 1234,abe

Find processes with open files on the NFS filesystem /nfs/mount/point whose server is inaccessible, presuming your mount table supplies the device number for /nfs/mount/point:
lsof -b /nfs/mount/point

Send a SIGHUP signal to All of the Processes that have "/u/abe/bar" Open:
kill -HUP `lsof -t /u/abe/bar`

Ignore the Device Cache File:
lsof -Di

Get PID and command name field output for each process, plus file descriptor, file device number, and file inode number for each file of each process:
lsof -FpcfDi

List the files at descriptors 1 and 3 of every process running the lsof command for login ID "abe" every 10 seconds:
lsof -c lsof -a -d 1 -d 3 -u abe -r10

List All Files using Any Protocol on Any Port of mace.cc.org:
lsof -i @mace

List All Files using Any Protocol on the Specified Port Range of mace.cc.org:
lsof -i @mace:123-140

List All IPv4 Network Files in Use whose PID is 1234:
lsof -i 4 -a -p 1234
fuser:

Get Processes and the related Usernames Running on the /var File System:
fuser -uc /var

Get Process IDs and Login Names that have the /etc/passwd File Open:
fuser -u /etc/passwd

Report on the File System and Files, restricting output to Processes holding Non-blocking Mandatory Locks:
fuser -cn /export/foo

Kill Processes Running on the /var File System:
fuser -ku /var

Send SIGTERM to Any Processes holding a Non-blocking Mandatory Lock on the File /export/foo/my_file:
fuser -fn -s term /export/foo/my_file
Get Processes Running on the / File System and Print the Process Names and Arguments:
ps -o pid,args -p "$(fuser / 2>/dev/null)"
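The ps -o field selection itself works the same on Solaris and Linux; a quick check against the current shell, with no fuser needed:

```shell
# Print just the PID and full command line of one process
# (here: the current shell).
ps -o pid,args -p $$
```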
Report Device Usage Information:
fuser -d /dev/dsk/c0t0d0
2. Debug Process and Analyze Process Core on Linux
2.1 Debug Processes by using STRACE
Trace the "ls" Command;strace ls
Trace the "open" System Call of the "ls" Command:strace -e open ls
Trace "open" and "read" System Calls of the "ls" Command:strace -e trace=open,read ls /home
Trace rsync and Log to File:strace -o /tmp/strace_ls_output.txt rsync
Trace a Process by PID and Log to File:strace -o /tmp/strace_rsync_21.06.txt -p pid
Trace "ls" Command and Print Relative Time for System Calls:
strace -r ls
Generate Statistics Report of System Calls for "ls" Command:strace -c ls /home
Trace All System Calls which have a filename as an argument:strace -o /tmp/strace_rsync_output.txt -e trace=file -p pid
Trace All Network Related System Calls:strace -o /tmp/strace_rsync_output.txt -e trace=network -p pid
Trace All File Descriptor Related System Calls:strace -o /tmp/strace_rsync_output.txt -e trace=desc -p pid
# -e verbose=all is the default verbosity.
strace -tttT -o /tmp/s1.lst -p 2395
strace -ttT -p 5164
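The strace invocations above can be bundled into a small helper. This is a hypothetical sketch (the function name, the chosen flags, and the log path are ours, not part of strace): it traces a command with follow-forks and timestamps when strace is usable, and simply runs the command otherwise:

```shell
# Hypothetical helper (ours): run a command under strace if possible.
trace_cmd() {
    log="/tmp/trace.$$"                        # example log path
    if command -v strace >/dev/null 2>&1 \
       && strace -f -tt -o "$log" "$@" 2>/dev/null; then
        echo "trace written to $log" >&2       # trace succeeded
    else
        "$@"   # strace missing or blocked (e.g. no ptrace): run untraced
    fi
}

trace_cmd echo hello    # example: the command's own output passes through
```

Note that if strace itself fails (for example, ptrace is blocked in a container), this sketch falls back to running the command untraced, and a command that exits non-zero under strace is re-run.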
2.2 Analyze a Process Core Dump with gdb on Linux
Start gdb on core file:gdb -c core
OR:gdb a.out core
OR:gdb path/to/the/binary path/to/the/core
OR by gdb Prompt:(gdb) core core
If the executable path is not provided, the debugger uses the invocation path of the process that generated the core file, which is stored in the core file itself. If that invocation path is relative, you must specify the executable yourself when debugging the core file.
To start debugging the process, at the gdb Prompt, invoke the core file:(gdb) core core
Check Status Commands:(gdb) help status
View Data-related Commands:(gdb) help data
View Stack-related Commands:(gdb) help stack
Analyze a Stack by its Number:(gdb) frame number
View Code around that Stack:(gdb) list
List Variables:(gdb) info locals
View File-related Commands:(gdb) help files
View Maintenance Commands:(gdb) help internals
View Command Aliases:(gdb) help aliases
Check Support Facilities:(gdb) help support
Commands for Running the Program:(gdb) help running
Trace Program Execution without Stopping it (tracepoints):(gdb) help tracepoints
User-defined Commands:(gdb) help user-defined
Obscure Features:(gdb) help obscure
View the stack trace:(gdb) backtrace
List all threads of the process at the time of the crash:(gdb) info thread
View the specified thread:(gdb) thread thread_id
Disassemble a specified section of memory:(gdb) disassemble <address>
Display memory at a specified address as a string:(gdb) x/s <address>
Display the contents of the registers:(gdb) info registers
Display all registers, including floating point registers:(gdb) info all-registers
Display information about all of the shared libraries:(gdb) info shared
Print the target that is currently under the debugger:
(gdb) info files
(gdb) info target
To view the source code:(gdb) list
Start the target program:(gdb) run
Set a breakpoint:(gdb) break sum
On a line number:(gdb) b 25
At an offset from the current line:
(gdb) b +9
(gdb) b -1
On a memory address (use *):(gdb) b *0x2324
Set a watchpoint on a variable, its address, or an expression:
(gdb) watch x
(gdb) watch *(<type of x> *) <addr of x>
Display a list of breakpoints and watchpoints:
(gdb) info break
(gdb) info watch
Help:(gdb) help
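The interactive session above can also be scripted: gdb's batch mode runs the usual first-look commands non-interactively. A hedged sketch, where the binary and core paths are placeholders:

```shell
# Non-interactive core triage; the binary/core paths are examples only.
if command -v gdb >/dev/null 2>&1; then
    gdb -batch \
        -ex "bt" \
        -ex "info threads" \
        -ex "info registers" \
        /path/to/binary /path/to/core || true   # gdb exits non-zero if the files are missing
fi
```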
2.3 Analyze the Process Core using dbx on Linux
To invoke dbx:dbx a.out core
To invoke dbx:dbx program_name
OR:dbx pid
OR:dbx -a pid
OR:dbx -d 100 program_name core_file
OR:dbx -d 100 -a pid
At the dbx Prompt:
(dbx) run
(dbx) where
(dbx) threads
(dbx) status
(dbx) list main
(dbx) print msg
(dbx) check -access
(dbx) check -memuse
(dbx) help
(dbx) quit
2.4 Analyze a Core Dump Using Oprofile on Linux
OProfile is a Linux system-wide Profiling Tool to Profile and Analyze Performance and Runtime Problems with Applications, or the Kernel.
Gunzip the Kernel:
cd /boot
gunzip vmlinux-<something>.gz
Run OProfile without Profiling the Kernel:opcontrol --no-vmlinux
If you do want to Profile the Kernel:opcontrol --vmlinux=/boot/vmlinux-`uname -r`
Start Collecting Data:opcontrol --start
Dump the Collected Data:opcontrol --dump
Stop Oprofile:opcontrol --stop
If you want to Reset Profiling Counters:opcontrol --reset
Report Collected Data:opreport
To Collect More Info (per-symbol report):opreport --symbols
OR:opreport -l
To Create a Call Graph Report:opreport -c
2.5 Debug Libraries and Symbols on Linux
Trace Calls to Library Functions for the "who" Command:ltrace /usr/bin/who
Trace Calls to the Library Function for the "ls" Command and Log to File:ltrace -o ls.tr ls
Trace Library Calls together with System Calls (-S) and Log to File:ltrace -S -o ls.tr ls
Check Linked Libraries:ldd filename
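Building on ldd, here is a small hedged check (the binary path is an example we chose) that counts unresolved libraries, which ldd reports as "not found" lines:

```shell
# Count shared libraries that fail to resolve for a binary.
bin=/bin/sh                      # example binary; substitute your own
missing=$(ldd "$bin" 2>/dev/null | grep -c "not found")
echo "missing libraries: ${missing:-0}"
```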
Check Module Info:modinfo module_name.ko
The Names of the Files Containing the Object Code and Symbols for Libraries are in the ELF File.
To Read the ELF File:readelf -a program_of_interest | less
Disassemble a Program:
objdump -D -S <compiled_object_with_debug_symbols> > filename.out
objdump -d -S module_name.ko > /tmp/whatever
List Symbols:nm /usr/bin/who
2.6 Other Debugging Tools on Linux
gcore:Take a snapshot of a process:gcore -o output_filename pid
kill:Kill a process and generate its core dump:kill -SEGV <pid>
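To see the kill -SEGV recipe end to end, here is a minimal sketch (the throwaway sleep process is our choice); whether a core file actually appears depends on `ulimit -c` and the kernel's core pattern:

```shell
ulimit -c unlimited 2>/dev/null   # allow core files (a hard limit may still cap this)
sleep 60 &                        # disposable target process
pid=$!
kill -s SEGV "$pid"               # deliver SIGSEGV; the kernel writes the core dump
wait "$pid"
echo "exit status: $?"            # 128 + 11 = 139 means the process died on SIGSEGV
```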
lsof:Get File Open by the Specified Process/Command:lsof -p 28290lsof -a -p 28290
Check How Many Instances of “sendmail” are Open:lsof -c sendmail
File Descriptors Number:
ps -ef
cd /proc/28290/fd
ls -l | wc -l
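The /proc walk above can be collapsed into one line. A hedged, Linux-only sketch that relies on /proc; the current shell's PID stands in for 28290:

```shell
# Count the open file descriptors of a process via /proc (Linux only).
pid=$$                                   # example PID; substitute a real one
n=$(ls "/proc/$pid/fd" | wc -l)
echo "pid $pid has $n open file descriptors"
```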
Get File Open by the Specified User:lsof -u root
Check Open Files (with their iNode Numbers) on the specified File System and the Processes that use it:lsof /fs
List All Open Files for the User “abe” and for the Specified Process IDs:lsof -p 456,123,789 -u 1234,abe
Find processes with open files on the NFS filesystem /nfs/mount/point whose server is inaccessible, presuming your mount table supplies the device number for /nfs/mount/point:lsof -b /nfs/mount/point
Send a SIGHUP Signal to All of the Processes that have "/u/abe/bar" Open:kill -HUP `lsof -t /u/abe/bar`
Ignore the Device Cache File:lsof -Di
Get PID and command name field output for each process, file descriptor, file device number, and file inode number for each file of each process:lsof -FpcfDi
List the files at descriptors 1 and 3 of every process running the lsof command for login ID ''abe'' every 10 seconds:lsof -c lsof -a -d 1 -d 3 -u abe -r10
List All Files using Any Protocol on Any Port of mace.cc.org:lsof -i @mace
List All Files using Any Protocol on the Specified Port Range of mace.cc.org:lsof -i @mace:123-140
List All IPv4 Network Files in Use whose PID is 1234:lsof -i 4 -a -p 1234
File Open by a Process:
ps -ef
cd /proc/28290/fd
ls -lrt
Process Info:cd /proc/28290
ls -l
more status
more limits
more io
more mounts
more mountstats
fuser:Get Process IDs and Login Names that have the /etc/passwd Files Open:fuser -u /etc/passwd
Get Verbose Info Including Process IDs and Login Names that have the /etc/passwd File Open:fuser -vu /etc/passwd
Kill Processes Accessing the /var File System in Any Way:fuser -km /var
Get Processes Running on the / File System and Print the Processes Name and Arguments:ps -o pid,args -p "$(fuser / 2>/dev/null)"
If there is No Process using the specified Device, Execute Another Command:if fuser -s /dev/ttyS1; then :; else something; fi
Show All Processes at the Local Telnet Port:fuser telnet/tcp
3. Debug Process and Analyze Process Core on HP-UX
3.1 Debug Processes by using tusc
Trace a Process's System Calls:tusc pid
Trace a Process's System Calls and Count them:
tusc -c pid
tusc -cc pid
tusc -ccc pid
Trace a Process's System Calls and Count them, adding More Information:
tusc -C pid
tusc -cC pid
Trace init's System Calls, Count them adding More Information, and Print Process Names:tusc -cCn 1
Verbosely Trace a Process's System Calls and Follow Forks:
tusc -vf pid
Trace a Process's System Calls, Follow Forks and Print Process Names:tusc -fn pid
Verbosely Trace a Process's System Calls, Follow Forks and Print Process IDs:tusc -vfp pid
Trace the System Calls of "bdf /" and its Forks, Count them adding More Information, and Print Process Names:tusc -fcCn "bdf /"
Verbosely Trace the System Calls of "bdf /" and its Forks, and Print Process Names and a Timestamp for Each Syscall and Signal:tusc -vfnT "bdf /"
Trace the System Calls of "bdf /" and its Forks, Count them adding More Information, and Print Process Names and Execution Time:tusc -fcCnD "bdf /"
Trace the System Calls of "bdf /" and its Forks, and Print Process Names, Duration Time and a Timestamp for Each Syscall and Signal:tusc -fnDT "bdf /"
Verbosely Trace the System Calls of "bdf /" and its Forks, and Print Process Names, Duration Time and a Timestamp for Each Syscall and Signal:tusc -vfnDT "bdf /"
Trace the System Calls of "bdf /" and its Forks, Count them adding More Information, and Print Process Names, Execution Time and a Timestamp for Each Syscall and Signal:tusc -fcCnDT "bdf /"
Trace a Process System Calls, Follow Forks and Keep Tracing Parent even if Parent Exits:tusc -fk pid
Trace a Process System Calls, Printing Process Names and Timestamp for Each Syscall and Signal, and Detach Process if it Enters Traced Mode:tusc -tnT 455
Trace a Process System Calls Concentrating on exec() Functions:tusc -sexec pid 2> /dev/null &
Trace a Process System Calls and File Descriptors and Log to File (lsnrctl.truss):tusc -d -o lsnrctl.truss -p 3949
Trace a Process System Calls and the Specified File Descriptors and Log to File (lsnrctl.truss):tusc -dFileDescriptors -o lsnrctl.truss -p 3949
Verbosely Trace All of the System Calls of the "ps -ef" command:tusc -v -o /var/tmp/syslog.truss.out -sall ps -ef
Trace a Process System Calls, and Print Read Buffers for All of the File Descriptors:tusc -rall <PID>
Trace a Process System Calls its Forks, and Print the Read Buffers for the Specified File Descriptors:/usr/local/bin/tusc -f -r 3,4,5,6 -o /tmp/trace_results /usr/local/sbin/snmpd
Trace a Process's System Calls and its Forks, and Print Read and Write Buffers for All File Descriptors:tusc -rall -wall -f <PID>
Trace a Process's System Calls and its Forks, Print Read and Write Buffers for All File Descriptors, but don't show Interruptible Sleeping Syscalls:tusc -rall -wall -f -i <PID>
Trace a Process's System Calls and its Forks, and Print Execution Time:tusc -f -D /usr/local/sbin/snmpd
Trace a Process's System Calls and its Forks, and Print exec Arguments and Execution Time:tusc -f -a -D /usr/local/sbin/snmpd
Trace a Process's System Calls and Execution Time, and Count them:
tusc -c sqlplus "/ as sysdba" << EOF
exit;
EOF
Trace a Process's System Calls and Execution Time:
tusc -d sqlplus "/ as sysdba" << EOF
exit;
EOF
Trace Specific System Calls:tusc -s syscall_name 455
Trace Specific Signals:tusc -S signal_name 455
Execute syslog-ng, follow children, print timestamps and Send output to /tmp/tusc.out:tusc -faepo /tmp/tusc.out -v -T %H:%M:%S /opt/syslog-ng/sbin/syslog-ng
Execute sqlplus, follow children, print timestamps and Send output to /tmp/tusc.out:tusc -faepo /tmp/tusc.out -v -T %H:%M:%S sqlplus scott/tiger
Execute sqlplus, follow children and Send output to /tmp/tusc.out:tusc -faepo /tmp/tusc.out -v sqlplus scott/tiger
Attach to a running process and Send output to /tmp/tusc.out:tusc -faepo /tmp/tusc.out -v -T %H:%M:%S -p <pid>
tusc -faepo /tmp/tusc.out -v -p <pid>tusc -faepo /tmp/tusc.out -p <pid>
Unless advised otherwise, the minimum options used should be:tusc -faepo <output file> ....
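The recommended minimum options can be wrapped in a helper. This is a hypothetical sketch (the function and the truss fallback flags are ours, not part of tusc): it prefers tusc and falls back to truss on systems without it:

```shell
# Hypothetical wrapper around the recommended minimum tusc options.
trace_min() {
    out=$1; shift                  # first arg: output file; rest: "-p pid" or a command
    if command -v tusc >/dev/null 2>&1; then
        tusc -faepo "$out" "$@"
    elif command -v truss >/dev/null 2>&1; then
        truss -faeo "$out" "$@"    # assumed closest truss equivalent
    else
        echo "neither tusc nor truss found" >&2
        return 1
    fi
}

# trace_min /tmp/tusc.out -p 1234    # example usage; the PID is illustrative
```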
Verbosely Trace System Calls and Forks, Printing Environment Variables, Process Names, PIDs, Timestamps and Duration Time:tusc -e -n -p -T '%T' -D -f -v pid
Run truss on Log Files to Detect System Problems:
tail -f /var/adm/SYSLOG
tail -f /var/adm/messages
tail -f /var/log/syslog
/usr/local/bin/sstep ls
Find the PIDs of the Processes to Trace:
function get_pid {
(echo foo 0 ${1}; ps -ef) | grep ${1} | grep -v "grep *${1}" | tail -1 | awk '{if ($2 > 0) {print $2} else {print ""}}'
}
/opt/tusc/bin/tusc -o /tmp/tusc.log -v -r all -w all -p -T "%d.%m.%Y %H:%M:%S" `get_pid WorkManager`
OR with Multiple "get_pid" Calls:
/opt/tusc/bin/tusc -o /tmp/tusc.log -v -r all -w all -p -T "%d.%m.%Y %H:%M:%S" `get_pid WorkManager` `get_pid SolidDesigner` `get_pid MEls`
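The get_pid helper above leans on ps output order and a sentinel echo; here is a slightly tidier hedged variant (still ours, approximating pgrep -f with plain ps and awk):

```shell
# Hypothetical get_pid variant: print the PID of the last ps -ef entry
# whose command line matches the pattern, skipping the awk process itself.
get_pid() {
    ps -ef | awk -v pat="$1" \
        '$0 ~ pat && $0 !~ /awk/ { pid = $2 } END { if (pid) print pid }'
}

# /opt/tusc/bin/tusc -o /tmp/tusc.log -p `get_pid WorkManager`   # example usage
```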
3.2 Installing tusc on HP-UX 11.xx
Download the tusc package for your HP-UX version and architecture at the following address:http://hpux.connect.org.uk/hppd/cgi-bin/search?term=tusc&Search=Search
Create a temporary directory and upload the depot onto it:mkdir /tmp/tempo_inst
Access the temporary directory and gunzip the depot:
cd /tmp/tempo_inst
gunzip tusc-x.x-xxxx-11.xx.depot.gz
ls -l
Install the depot package by using one of the following methods:
a) By using swinstall (recommended):swinstall -s tusc-x.x-xxxx-11.xx.depot
OR
b) By manually extracting the tarball and copying its content to the appropriate directories:tar -xf tusc-x.x-xxxx-11.xx.depot
Access the bin subdir of the depot directory and copy its content to the /bin directory:
cd tusc/tusc-RUN/usr/local/bin/
cp * /bin/
Access the man subdir of the depot directory and copy its content to the /usr/local/man/man1 directory:
cd ../man/
cp man1/tusc.1 /usr/local/man/man1/
3.3 Debug Processes and Core Files by using HP WDB / GDB
The HP Wildebeest Debugger (WDB) is an HP-supported implementation of the Open Source GNU debugger (GDB).
HP WDB / GDB can be used to debug or monitor a process, but it is mostly used to analyze the core files of crashed processes and system crash dumps.
Check if HP WDB is installed:swlist -l fileset | grep -i wdb
If HP WDB is not installed, you can download the latest version (6.3) for your HP-UX version and architecture from here: you need an HP AllianceONE account with appropriate privileges.
Upload the depot file onto the server's /tmp directory, access the directory and decompress it:
cd /tmp
gunzip hpwdb.xxxx.xxxx.depot.gz
Install the depot:swinstall -s hpwdb.xxxx.xxxx.depot/*
The main paths are:
/opt/langtools/wdb
/opt/langtools/gdb
/opt/langtools/bin
To monitor/debug a process:gdb -crashdebug pid
Before analyzing a process core file, check it:
file corefile_name
strings corefile_name
Check if it’s truncated:elfdump -o -S core
To analyze a process core file generated by the snmpd daemon:gdb /usr/bin/snmpd core
OR:gdb /usr/bin/snmpd -c core
OR:gdb -c core
OR:gdb /usr/bin/snmpd
OR to start the HP WDB GUI:wdb /usr/bin/snmpd
OR to start gdb with XDB support:gdb -xdb /usr/bin/snmpd
OR to start gdb with XDB support using the Terminal User Interface:gdb -xdb -tui /usr/bin/snmpd
OR:gdb
At gdb Prompt:(gdb) core core
If the executable path is not provided, the debugger uses the invocation path of the process that generated the core file, which is stored in the core file itself. If that invocation path is relative, you must specify the executable yourself when debugging the core file.
To start debugging the process, at the gdb Prompt, invoke the core file:(gdb) core core
View the stack trace:(gdb) backtrace
List all threads of the process at the time of the crash:(gdb) info thread
View the specified thread:(gdb) thread thread_id
Disassemble a specified section of memory:(gdb) disassemble <address>
Display memory at a specified address as a string:(gdb) x/s <address>
Display the contents of the registers:(gdb) info registers
Display all registers, including floating point registers:(gdb) info all-registers
Display information about all of the shared libraries:(gdb) info shared
Print the target that is currently under the debugger:
(gdb) info files
(gdb) info target
To view the source code:
(gdb) list
Start the target program:(gdb) run
Set a breakpoint:(gdb) break sum
On a line number:(gdb) b 25
At an offset from the current line:
(gdb) b +9
(gdb) b -1
On a memory address (use *):(gdb) b *0x2324
Set a watchpoint on a variable, its address, or an expression:
(gdb) watch x
(gdb) watch *(<type of x> *) <addr of x>
Display a list of breakpoints and watchpoints:
(gdb) info break
(gdb) info watch
Force a core dump and create a core image file for the process under the debugger:(gdb) dumpcore core_filename
Pack the core file along with the relevant executable and libraries in a single tar file for core file debugging on another system:(gdb) packcore
Unpack the tar file that is generated by the “packcore” command so the debugger can use the executable and shared libraries from this bundle, when debugging the core file on a different system from the one on which the core file was originally created:(gdb) unpackcore
To Debug Memory with gdb, set heap checking options:(gdb) set heap-check [option] [on/off]
Detect leaks:(gdb) set heap-check leaks [on/off]
Detect double-frees and free improper arguments:(gdb) set heap-check free [on/off]
Check for out-of-bounds corruption:(gdb) set heap-check bounds [on/off]
Set the number of frames to be printed for leak and heap profiles:(gdb) set heap-check frame-count [num]
Produce a heap allocations report:(gdb) info heap [heap.out]
Produce a memory leak report:(gdb) info leaks [leaks.out]
Lists the potential in-block corruptions in all the freed blocks:(gdb) info corruption
Search for a Pattern in the Memory Address Space:(gdb) find &str[0], &str[15], "string_to_search"
(gdb) find &a[0], &a[10], "el", 'l'
where
&a[0] specifies the start address of the memory address range,
&a[10] specifies the end address of the memory address range,
"el", 'l' specifies the pattern.
(gdb) find /1 &int8_search_buf[0], +sizeof(int8_search_buf), 'a', 'a', 'a'
where
/1 tells the find command to display only one matching pattern,
&int8_search_buf[0] specifies the starting address,
+sizeof(int8_search_buf) specifies the ending address,
'a', 'a', 'a' specifies the pattern (expr1, expr2, expr3).
(gdb) find /b &int8_search_buf[0], &int8_search_buf[0]+sizeof(int8_search_buf), 0x61, 0x61, 0x61, 0x61
where
/b specifies that the size of the pattern is 8 bits,
&int8_search_buf[0] specifies the starting address,
&int8_search_buf[0]+sizeof(int8_search_buf) specifies the ending address,
0x61, 0x61, 0x61, 0x61 specifies the pattern (expr1, expr2, expr3, expr4).
Avoid Core File Corruption
To prevent overwriting of core files from different processes, set the kernel parameter core_addpid to 1. The core file is then stored under the name <core.pid> in the current directory. To set the kernel parameter, create a script called "corepid":
On HP-UX 11i v1 systems:
case $1 in
on) echo "core_addpid/W 1\ncore_addpid?W 1" | adb -w -k /stand/vmunix /dev/kmem;;
off) echo "core_addpid/W 0\ncore_addpid?W 0" | adb -w -k /stand/vmunix /dev/kmem;;
stat) echo "core_addpid/D\ncore_addpid?D" | adb -w -k /stand/vmunix /dev/kmem;;
*) echo "usage $0: on|off|stat";;
esac
On HP-UX 11i v2 systems:
case $1 in
on) echo "core_addpid/W 1\ncore_addpid?W 1" | adb -o -w /stand/vmunix /dev/kmem;;
off) echo "core_addpid/W 0\ncore_addpid?W 0" | adb -o -w /stand/vmunix /dev/kmem;;
stat) echo "core_addpid/D\ncore_addpid?D" | adb -o -w /stand/vmunix /dev/kmem;;
*) echo "usage $0: on|off|stat";;
esac
Then, get the current settings:./corepid stat
To enable the feature to store the core file in the file "core.pid" (set core_addpid to 1), run the script:./corepid on
Get the current settings again to check the change:./corepid stat
If you want to disable the feature (set core_addpid to 0), run the script:
./corepid off
./corepid stat
On HP-UX 11i v3 systems, use "coreadm", which allows you to specify the location and name pattern for core files created by abnormally terminating processes, including a per-process pattern for the core file name. To set the global core file settings to include the process ID and the system name in the file name of the core, and to place the core file in the specified path <path>, run:coreadm -e global -g <path>/core.%p.%n
Java Core File Debugging
HP WDB shows stack traces of mixed Java, C, and C++ programs for a Java core file. The GDB_JAVA_UNWINDLIB environment variable must be set to the path name of the Java unwind library. If the Java and system libraries used by the failed application reside in non-standard locations, then the GDB_SHLIB_PATH environment variable must be set to specify the location of the libraries.
Invoke gdb on a core file generated when running a 32-bit Java application on an Integrity system with /opt/java1.4/bin/java:gdb /opt/java1.4/bin/IA64N/java core.java
Invoke gdb on a core file generated when running a 64-bit Java application on an Integrity system with /opt/java1.4/bin/java -d64:gdb /opt/java1.4/bin/IA64W/java core.java
Invoke gdb on a core file generated when running a 32-bit Java application on PA-RISC using /opt/java1.4/bin/java:gdb /opt/java1.4/bin/PA_RISC2.0/java core.java
Invoke gdb on a core file generated when running a 64-bit Java application on PA-RISC using /opt/java1.4/bin/java:gdb /opt/java1.4/bin/PA_RISC2.0W/java core.java
3.4 Debug Processes by using truss
Trace a Process's System Calls:truss -p pid
Trace a Process's System Calls and Count them:truss -c -p pid
Trace a Process's System Calls and Count them, adding More Information:truss -C -p pid
Trace a Process's System Calls and Follow Forks:truss -f -p pid
Trace a Process's System Calls Concentrating on exec() Functions:truss -sexec -p pid 2> /dev/null &
Trace a Process's System Calls and File Descriptors and Log to File (lsnrctl.truss):truss -d -o lsnrctl.truss -p 3949
Trace All of the System Calls of the "ps -ef" command:truss -o /var/tmp/syslog.truss.out -sall ps -ef
Trace a Process's System Calls, and Print Read Buffers for All File Descriptors:truss -rall -p <PID>
Trace a Process's System Calls and its Forks, and Print Read and Write Buffers for All File Descriptors:truss -rall -wall -f -p <PID>
Trace a Process's System Calls and its Forks, and Print Execution Time:truss -f -D /usr/local/sbin/snmpd
Trace a Process's System Calls and its Forks, and Print exec Arguments and Execution Time:truss -f -a -D /usr/local/sbin/snmpd
Trace a Process's System Calls and Execution Time, and Count them:
truss -c sqlplus "/ as sysdba" << EOF
exit;
EOF
Trace a Process's System Calls and Execution Time:
truss -d sqlplus "/ as sysdba" << EOF
exit;
EOF
Verbosely Trace init:truss -v all -p 1
Run truss on a Command:truss -d date
Run truss to Debug Application Start:
truss -rall -wall lsnrctl start
truss -aef lsnrctl dbsnmp_start
nohup /opt/tusc/bin/truss -o /tmp/syslog-ng.truss -aef /usr/local/sbin/syslog-ng --debug --foreground --stderr > syslog-ng.out 2>&1 &
grep syslog-ng.conf /tmp/syslog-ng.truss
3.5 Analyze Process Performance by using Caliper on HP-UX 11.xx
HP Caliper is a general-purpose performance analysis tool for applications on HP-UX and Linux systems running on HP Integrity Servers.
If it is not installed, you can download the current Caliper version 5.5: you need an AllianceONE account with appropriate privileges.
Upload the depot file on the server, gunzip it and install it:
gunzip caliper.xx.xxxx.depot.gz
swinstall -s caliper.xx.xxxx.depot
You can use an initialization file (called .caliperinit) with Caliper, which is read automatically at startup for data collection or data reporting runs. Putting the options in an initialization file simplifies the command line you use. This file is not required, but can be useful. If the "--read-init-file" option is set to "True" in .caliperinit, Caliper will use it. You can find a sample initialization file in the Caliper home, under the examples/startup_file/caliperinit directory: rename it to .caliperinit. Here is an example of the content:
# Options applied to all report types.
application = 'myapp'
arguments = '-myarg 2'
context_lines = 0,3
summary_cutoff = 1
detail_cutoff = 5
source_path_map = '/proj/src,/net/dogbert/proj/src:/home/wilson/work'
# Report-specific options.
if caliper_config_file == 'branch':
    sort_by = 'taken'
elif caliper_config_file == 'fprof':
    sort_by = 'fcount'
    report_details = 'statement'
    context_lines = 'all'
# Apply an option to a subset of reports.
if caliper_config_file in ("fcount"):
    module_exclude = '/usr/lib/'
Caliper uses measurement configuration files that you can edit or create according to your needs; they are found in /opt/caliper:
cd /opt/caliper
ls -lrt
The measurement configuration files provided with HP Caliper and the main performance measurements they take are the following:
alat: measures and reports sampled advance load address table (ALAT) misses
branch: branch measurement
cgprof: measures and reports a call graph profile, produced by instrumenting the application code
cpu: CPU measurement and per-process metrics
cstack: call stack measurement
cycles: cycles measurement
dcache: data cache measurement
ecount: total CPU event counts measurement
fcount: function call counts measurement
fprof: function profile measurement
icache: instruction cache metrics measurement
scgprof: measures and reports an (inexact) call graph profile, produced by sampling the PMU to determine function calls
traps: collects and reports a profile of traps, interrupts, and faults
fprof (flat profile) shows the parts of the process that have the highest CPU usage:caliper fprof ./binary_name
Show the parts of the process that have the highest CPU usage reporting both source and instructions (-r all) and logging the output to file:caliper fprof -o out.txt -r all
Run the default measurement, scgprof:caliper ktrace
Run functions call count measurement:caliper fcount ktrace
CPU measurement for the specified application or process:caliper cpu my_new_app
System-wide CPU measurement (log output to file):caliper cpu -w -e 120 -o cpu.txt
Measure CPU and Memory for the specified process and report:caliper cpu -o REPORT --memory-usage=all my_app
Measure CPU and system usage for the specified process and report:caliper cpu -o REPORT --system-usage=all my_app
Create a call graph profile with HP Caliper:caliper scgprof [caliper_options] program [program_arguments]
Create a report:caliper report [options]
The overview measurement enables collecting fprof, dcache, and cstack data in one single collection run:caliper overview -o rpt my_app
Collect system-wide fprof and dcache data for a duration of 300 seconds:caliper overview -w -e 300 -o rpt
Override the sampling_spec setting in pmu_trace:caliper pmu_trace -s period,variation,cpu_event program
Override the events to be measured in ecount on HP-UX:caliper ecount -m cpu_event,cpu_event program
Override the kernel stop functions and get all frames in the cstack on HP-UX:caliper cstack --stop_functions="" program
Create a call stack profile report in the file named results.save when profiling the program enh_thr_mutex1:/opt/caliper/bin/caliper cstack -o results.save enh_thr_mutex1
To stop Caliper:kill -s INT caliper_process_ID
3.6 Analyze Process Performance by using Prospect on HP-UX 11.xx
Prospect is a performance analysis tool. On HP-UX, Prospect uses the Kernel Instrumentation (KI) tracing and Kernel Timing Clocks (KTC) package, and collects kernel data only for a "window of time". Prospect is available on HP-UX (PA-RISC 64-bit kernel).
If it is not installed, you can download the current Prospect version 2.6.1: you need an AllianceONE account with appropriate privileges.
Upload the depot file on the server, gunzip it and install it:
gunzip prospect.xx.xxxx.depot.gz
swinstall -s prospect.xx.xxxx.depot\*
You can use Prospect to profile Java applications on HP-UX. Prospect has additional features for profiling Java applications when running an HP JVM on HP-UX: to activate these features you must install HP Hotspot JVM version 1.3.1.02 or later.
Obtain symbols of JVM compiled methods in Prospect's output:prospect -V3 -foutput java -XX:+Prospect Qsort 1000000
Profile a JVM process with the specified PID:prospect -j1495 -V4 -foutput2 sleep 20
Profile the specified process or application:
prospect my_app
Verbosely profile the specified process or application:prospect -v my_proc
You can use Prospect as a statistical profiler to extract function or assembly level profiles and exact system call timings for processes of interest.
In order to use Prospect in this mode, you first need to activate KI and keep it active. This is done via the daemon mode of Prospect:prospect -P
This mode does not consume any processor resources, it is used only as a way to keep the KI trace active.
Prospect collects data over a window of time. Use KI and distill the output for the immediate child of Prospect (in this case, my_app), and output the summary, memory maps, profiles, and system call tables into a file called "output":prospect -V 2 -f output my_app
Outputs only information of the direct descendant child:prospect -V2 -f output1 my_app
Record all traces sampled in the time my_app ran into a binary file called "Tfile1":prospect -T Tfile1 my_app
Read the trace out of a file:prospect -t Tfile1 -f output42
Sample the kernel for 120 seconds and output the results to a file called "output":prospect -V k -f output sleep 120
See how the kernel is performing while a specific application is running and also how that application is performing, put the kernel profile in file "kern_output" and your user process profile (my_app) in file "proc_output":
prospect -TTfile my_app
prospect -tTfile -Vk -fkern_output
prospect -tTfile -V2 -fproc_output
Start the program to be profiled under prospect --hprof (hierarchical profile), generate a user time profile of the gzip run and save the output to file:prospect --hprof --output-file=hprof.out gzip firebolt.tar
Same, with a sampling interval of 100 ms:prospect --hprof --sampling-interval=100 gzip firebolt.tar
Generate a HP Caliper-like fprof reports:prospect --fprof --output-file fp.out ./qsort32
Attach a running process specified by process ID:prospect --fprof --output-file fp.out --attach=1234
Create a binary trace file:prospect --fprof --datafile=fp.cdf ./qsort32
Generate fprof report from the binary trace file:prospect --report --datafile=fp.cdf -o fp.out
Profile for a particular duration of time:prospect --fprof -o fp.out --duration=5 ./loop 10000000
Specify function summary cutoffs:prospect --fprof --summary-cutoff=,80 ./wordplay
Specify function details cutoff:
prospect --fprof ./qsort32 (collect mode)
prospect --report --detail-cutoff=,80 ./qsort32 (report mode)
Generate single report for a multithreaded application with the results of all threads aggregated together:prospect --fprof --thread=sum-all ./threadsthread
Report per-thread data for a multithreaded application:prospect --fprof --thread=all -o fp.out ./threadsthread
Report per-module data for a multithreaded application:prospect --fprof --per-module-data=TRUE --thread=all ./threadsthread
Exclude load modules:prospect --fprof --thread=all --module-exclude=/usr/lib ./threadsthread
Include load modules:prospect --fprof --thread=all --module-default=none --module-include=threadsthread ./threadsthread
Collect profile data till the processes specified terminate:prospect -V6 pid1,pid2,pid3 -f log
Collect profile data for a specified duration of time:prospect -V6 pid1,pid2,pid3 -f log sleep <duration>
Get a raw ASCII file of KI trace records:prospect -T BinTraceFile sleep 30prospect -t BinTraceFile -F AsciiTraceFile
Prospect KI kernel buffer freeing:
kill <prospect -P daemon>
prospect -a
Prospect KI buffer sizing:
kill <prospect -P daemon>
prospect -a
prospect -A 4194304
prospect -P
To find out how much lockable memory your system has:dmesg | grep lockable
3.7 Live Memory Analysis on HP-UX 11.xx by using KWDB
KWDB can analyze a live system to find memory leaks, performance issues and more.
Find the pathname of the currently running vmunix:kmpath
KWDB on PA requires the kernel file to be preprocessed by pxdb (change the kernel filename if it is not the standard /stand/vmunix):pxdb /stand/vmunix
Start KWDB with Q4 support to debug the kernel file, and set up the devmem target to read from /dev/mem and /dev/kmem:kwdb -q4 /stand/vmunix /dev/kmem
OR:
kwdb /stand/vmunix
(kwdb) target devmem
(kwdb) set kwdb q4 on
OR you can also run:q4 /stand/vmunix /dev/kmem
At the q4 Prompt:
q4> load struct utsname from &utsname
q4> print -tx
Get a listing of all the structures and typedefs that contain the string of characters “callout”:q4> cat callout
Get a listing of all the fields defined in a callout structure:q4> fields -cx struct callout
Load all the callout structures from the callout table:q4> load struct callout from callout max ncallout
List all the different flag fields in these structures:q4> print c_flag | sort -u
Keep only those callout structures with the PENDING_CALLOUT flag set:q4> keep c_flag & PENDING_CALLOUT
List all the different function addresses pointed to by these structures:q4> print -x var.real_callout.cc_func | sort -u
Get name of kernel routines found in the previous step:q4> examine 0x191a08 using aq4> ex 0x19e3e8 using aq4> ex 0x8c230 using a
Display the instructions of the functions:q4> conde unselect
Look into the near-term, mid-term and far-future events. Load the near-term callout headers and list the different types from the flag fields:
q4> load struct callout from callout_time_nr max ncallout until callout_time_md
q4> print c_flag | sort -u
List the absolute time fields for the headers:q4> print indexof c_abs_time_hi c_abs_time_lo
Load the mid term callout headers and print the absolute time fields for the headers:q4> load struct callout from callout_time_md max 256
Load the callout header for far-future events (there is only a single header for all far-future events) and display its contents:
q4> load struct callout from callout_time_ff
q4> print -tx
Load the linked list of structures associated with this and print types and absolute times for each of them:q4> load struct callout from c_time_next max ncallout next c_time_nextq4> print c_abs_time_hi c_abs_time_lo c_flag
Load the hash headers and display flags, times and links:q4> load struct callout from callout_hash max 256q4> print -x flag c_abs_time_lo c_time_next c_hash_next
Load two of the expired headers and display all the fields:
q4> load struct callout from callout_hash skip 256 max 2
q4> print -tx
3.8 Other Debugging Tools on HP-UX 11.xx
gcore:Take a snapshot of a process:gcore -o output_filename pid
kill:Kill a process and generate its core dump:kill -SEGV <pid>
lsof:Check Open Files on the specified File System and the Processes that Use it:lsof /fs
Check How Many Instances of “sendmail” are Open:lsof -c sendmail
Check inode Usage on the specified File System:lsof -i /fs
Check Files Opened by the Specified User:lsof -u user_name
List All Open Files for the User “abe” and for the Specified Process IDs:lsof -p 456,123,789 -u 1234,abe
Find processes with open files on the NFS filesystem /nfs/mount/point whose server is inaccessible, presuming your mount table supplies the device number for /nfs/mount/point:lsof -b /nfs/mount/point
Send a SIGHUP Signal to All of the Processes that have "/u/abe/bar" Open:kill -HUP `lsof -t /u/abe/bar`
Ignore the Device Cache File:lsof -Di
Get PID and command name field output for each process, and file descriptor, file device number, and file inode number for each file of each process:lsof -FpcfDi
List the files at descriptors 1 and 3 of every process running the lsof command for login ID ''abe'' every 10 seconds:lsof -c lsof -a -d 1 -d 3 -u abe -r10
List All Files using Any Protocol on Any Port of mace.cc.org:lsof -i @mace
List All Files using Any Protocol on the Specified Port Range of mace.cc.org:lsof -i @mace:123-140
List All IPv4 Network Files in Use whose PID is 1234:lsof -i 4 -a -p 1234
fuser:Get Processes and related Username Running on the /var File System:fuser -uc /var
Get Process IDs and Login Names that have the /etc/passwd Files Open:fuser -u /etc/passwd
Get Processes Running on the Specified Device:fuser -xc /dev/hd3
Kill Processes Running on the /var File System:fuser -ku /var
Get Processes Running on the / File System and Print the Processes Name and Arguments:ps -o pid,args -p "$(fuser / 2>/dev/null)"
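The fuser/ps composition above can be made defensive against an empty PID list. A sketch, assuming fuser's HP-UX/AIX behavior of printing PIDs to stdout and the path to stderr (the comma join for ps -p is part of the assumption):

```shell
# Capture fuser's PID list first so an empty result cannot hand ps a
# malformed -p argument; stderr (the echoed path) is discarded.
pids=`fuser / 2>/dev/null | sed 's/^ *//;s/ *$//' | tr -s ' ' ','`
if [ -n "$pids" ]; then
    ps -o pid,args -p "$pids" || echo "ps could not parse: $pids"
else
    echo "no PIDs reported (fuser missing or nothing open on /)"
fi
```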
A Debugging Example:
type midaemon
file `which midaemon`
what `which midaemon`
ldd `which midaemon`
grep -i midaemon /etc/*
grep -i midaemon /etc/init.d/*
swlist -l file | grep midaemon
lsof -c midaemon
ps -elf | sed -n '1p; /midaem[.]*on/p;'
lsof | sed -n '1p; / 17949 /p'
lsof | sed -n '1p; / 17923 /p'
tusc 2198
strings `which midaemon` | head -n 7
tail -n 30 /var/opt/perf/status.mi
4. Debug Process and Analyze Process Core on IBM AIX
4.1 Debug Processes by using proctools
Proctools are similar to Solaris ptools: see Solaris Section about ptools.
Get Process Stack Trace:procstack
Prints Pending and Held Signals for Process:procflags
Display Signal Action and Handlers for Process:procsig
Report stat and fcntl Info for All Open Files in Each Process:procfiles -n pid
Print the Current Working Directory of the Process:procwdx
Display the Process Tree:proctree
4.2 Debug Processes by using trace
The IBM AIX trace tool is conceptually similar to Linux strace.
Use trace Interactively:trace> !anycmd> q
Start trace Asynchronously:trace -a; anycmd; trcstop
Trace the System for 10 Seconds:trace -a; sleep 10; trcstop
Output Tracing Data to a Specified Log File (Instead of the Default /var/adm/ras/trcfile):trace -a -o /tmp/my_trace_log; anycmd; trcstop
Trace the Process "mydaemon" which is Currently Running:trace -A mydaemon-process-id -Pp
Trace a "cp" Command, Excluding Specific Events - in this case, lockl and unlockl functions (20e and 20f events):trace -a -k "20e,20f" -x "cp /bin/track /tmp/junk"
Trace a "cp" Command, Excluding Specific Events - in this case, lockl and unlockl functions (20e and 20f events) - and Produce a Raw Trace Output File:trace -a -k "20e,20f" -o trc_raw ; cp ../bin/track /tmp/junk ; trcstop
Trace the Hook 234 and the Hooks that Allow Seeing the Process Names (in this case, trace the event-group tidhk plus hook 234):trace -a -j 234 -J tidhk
Trace Using One Set of Buffers per Processor. The Command will Produce the Files /var/adm/ras/trcfile, /var/adm/ras/trcfile-0, /var/adm/ras/trcfile-1, etc. up to /var/adm/ras/trcfile-(n-1), where n is the number of processors in the system:
trace -aC all
Trace a Program that Starts a Daemon Process And Continue Tracing the Daemon after the Program:trace -X "mydaemon"
Capture PURR, PMC1 and PMC2:trace -ar "PURR PMC1 PMC2"
Format a trace Raw Output as a Report:trcrpt -O "exec=on,pid=on" trc_raw > cp.rpt
Format a trace Raw Output as a Report, Excluding the VMM Activity Detail:trcrpt -k "1b0,1b1" -O "exec=on,pid=on" trc_raw > cp.rpt2
Format a trace Output which Consists of Multiple Files:trcrpt -C all -r trace.out > trace.tr
Reading a trace Report:trace -a -k 20e,20f -o trc_raw
Filter the trace Report Searching for the Event ID for the open() System Call:trcrpt -j | grep -i open
Filter the trace Report by Checking the Event ID 15b:trcrpt -d 15b -O "exec=on" trc_raw
Filter the trace Report to Display Only the open() Subroutines:trcrpt -d 15b -p cp -O "exec=on" trc_raw
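The hook-ID lookup in the three steps above can be automated. A sketch (trcrpt is AIX-only, hence the fallback; the grep pattern is an assumption about how the open() template is worded):

```shell
# Look up the trace hook ID for open() from trcrpt's template list;
# report "not found" when trcrpt is unavailable or nothing matches.
hook=`trcrpt -j 2>/dev/null | grep -i open | awk '{print $1}' | head -n 1`
msg="open() hook id: ${hook:-not found}"
echo "$msg"
```

On a real AIX system the printed ID can then be fed to trcrpt -d as shown above.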
To Format a trace Output from a System as a Report on Another System, run:trcnm > trace.nm
OR Copy Also the /etc/trcfmt of the Traced System (as the Other System could have Different trace Format Stanzas):trcrpt -n trace.nm -t trcfmt_file -o newfile
And then Run trcrpt on the Other System:trcrpt -n trace.nm -o newfile
Generate a CPU Report from a trace:
curt -i trace.r -o outputfile
curt -i trace.raw -m trace.nm -o outputfile
curt -e -i trace.r -m trace.nm -n gensyms.out -o curt.out
curt -s -i trace.r -m trace.nm -n gensyms.out -o curt.out
cat curt.out
trace -n -C all -d -j 100,101,102,103,104,106,10C,134,139,200,215,419,465,47F,488,489,48A,48D,492,605,609 -L 1000000 -T 1000000 -afo trace.raw
curt -i trace.raw -n gensyms.out -o curt.out
cat curt.out
Generate an Input File for curt:
HOOKS="100,101,102,103,104,106,10C,119,134,135,139,200,210,215,38F,419,465,47F,488,489,48A,48D,492,605,609"
SIZE="1000000"
export HOOKS SIZE
trace -n -C all -d -j $HOOKS -L $SIZE -T $SIZE -afo trace.raw
export LIBPATH=/usr/ccs/lib/perf:$LIBPATH
trcon ; pthread.app ; trcstop
unset HOOKS SIZE
ls trace.raw*
trace.raw trace.raw-0 trace.raw-1 trace.raw-2 trace.raw-3
trcrpt -C all -r trace.raw > trace.r
rm trace.raw*
ls trace*
trace.r
gensyms > gensyms.out
trcnm > trace.nm
4.3 Debug Processes by using syscalls
Warning: the System Crashes if ipcrm -M sharedmemid is Run after syscalls has been Run. Run stem -shmkill instead of ipcrm -M to Remove the Shared Memory Segment.
Display the System Call Count:
syscalls -start
syscalls -c
Collect System Calls for a Program:syscalls -x /bin/ps
Trace a Process and Log to File:syscalls -o filename -p pid -start
Simulate the C Code Fragment:output=open("x", 401, 0755);write(output, "hello", strlen("hello"));
Run:syscall open x 401 0755 \; write \$0 hello \#hello
4.4 Debug Processes by using watch
Watch All Files Opened by the "bar" Command:watch -e FILE_Open /usr/lpp/foo/bar -x
Watch All Files Opened by the "bar" Command and Log to File:watch -e FILE_Open /usr/lpp/foo/bar -x -o output_file
Watch the Installation of the Specified Program:watch /usr/sbin/installp xyzproduct
4.5 Debug Processes by using ProbeVue
Start ProbeVue with a Script:probevue myscript.eprobevue <myscript.e
Running ProbeVue on a Program:probevue -X progname -A prog-arguments myscript
Format ProbeVue Output as a CSV File:probevue -X /usr/bin/tar -A "-cf /dev/null /scratch/bcobb/probevue" ./p2.e | tee t.csv
Example of a ProbeVue Script to Monitor a Program:
#!/usr/bin/probevue
double engine(int p1, int p2);
@@uft:$1:*:engine:entry
{
  printf("PID=%d TID=%d PPID=%d PGID=%d UID=%d GID=%d InKernel=%d\n", __pid, __tid, __ppid, __pgid, __uid, __euid, __kernelmode);
  printf("ProgName=%s errno=%d\n", __pname, __errno);
  printf("---\n");
  stktrace(GET_USER_TRACE,20);
  printf("+++\n");
  stktrace(PRINT_SYMBOLS|GET_USER_TRACE,20);
  exit;
}
4.6 Debug Processes by using truss
See the Solaris Section about truss:truss -deaf -o truss.out program
4.7 Debug Processes by using dbx
See the Solaris and Linux Sections about dbx:dbx exe core
4.8 Analyze a Process Core by using KDB
Start analyzing a Core:kdb dump
At kdb Prompt, Display Status:>stat
Initial CPU Context:>cpu 1
VMM Error Log:>vmlog
Process Info:>proc *
Get Threads:> thread *
pid Output:>p 3
4.9 Other Debugging Tools on IBM AIX
gcore:Take a snapshot of a process:gcore -o output_filename pid
kill:Kill a process and generate its core dump:kill -SEGV <pid>
Other:Get which Application Created the Core:lquerypv -h core 500 64
List Available Processors:bindprocessor -q
Show if the 64-bit Kernel is Active:bootinfo -K
Show whether the Hardware in Use is 32-bit or 64-bit:bootinfo -y
Check the libraries loaded by the specified process:
ps -u sj1e652a | grep WILoginproc
ldd 21922
Dump a library looking for API-type exported symbols:
dump -Tv bin/orb/shlib/libit_art5_xlc50.so 2>&1
dump -Ctv E652/bin/WIReportServer
ps -u sj1e652a | grep WILoginproc
ldd 21922
dump -Tv bin/orb/shlib/libit_art5_xlc50.so 2>&1 | grep EXP | c++filt | more
truss -t!all -s!all -u libit_*::CORBA* -p 21922
dump -Ctv E652/bin/WIReportServer | grep FUNC.*GLOB.*9.*dgWICDZ_
ps -u sj2e652s -o pid,args | grep WIReportServer
truss -t!all -s!all -u a.out::*dgWICDZ_* -p 18846 2>&1 | tee -a out.txt
cat out.txt | c++filt
pldd 18846
truss -t!all -s!all -u libclntsh -p 18846 2>&1 | tee -a out.txt
dump -H
ldd
procfiles
lockstat -IWk example_tnf 24
InterProcess Communication Facilities:ipcs
System Attributes (Entries Marked as "True" are Configurable):lsattr -l sys0 -E
Change the High/Low Water Marks for Pending Write I/Os per File:lsattr -l sys0 -a maxpout=9 -a minpout=6
Process Profiling:pprof
Paging Space Statistics:pstat -s
System Variables:pstat -T
Paging Statistics:lsps -a
Display the Path Name from an inode Number:ncheck -i <inode>
List Files and grep for the inode:ls -ail | grep <inode>
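When ncheck is not available, find can resolve an inode to a path. A sketch with example values (inode 1234 and the /tmp starting point are assumptions, not from the text):

```shell
# Resolve an inode number back to a pathname with find; -xdev keeps
# the search on one filesystem, since inode numbers repeat across
# filesystems.
inode=1234
found=`find /tmp -xdev -inum "$inode" -print 2>/dev/null | head -n 1`
echo "inode $inode: ${found:-no match under /tmp}"
```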
Report Placement of File Blocks:fileplace -pv /unix
Monitor Activity at All FileSystem Levels and Write the Results to /tmp/filemon.log:
filemon -o /tmp/filemon.log -O all
trcstop
CPU Profile:tprof
Network I/O Statistics:
netpmon -o /tmp/netpmon.log -O all
trcstop
Other monitoring tools:dkvis, nfsvis, systat, mpvis, dkstat
5. Debug Process and Analyze Process Core on IRIX
gcore:Take a snapshot of a process:gcore -o output_filename pid
kill:Kill a process and generate its core dump:kill -SEGV <pid>
par, prfstat, SystemTap, lockstat -IWk example_tnf 24
6. Debug Process and Analyze Process Core on Tru64
gcore:Take a snapshot of a process:gcore -o output_filename pid
kill:Kill a process and generate its core dump:kill -SEGV <pid>
trace, truss, atom -tool ptrace
odump -Dl, ldd, lockstat -IWk example_tnf 24, lockinfo
7. Generate / Analyze a Crash Dump on Solaris
7.1 Save a Crash Dump on a Panic’d System
Check if savecore is Enabled:/etc/init.d/sysetup
Get the Core Dump (or Crash Dump) Configuration:coreadm
Save a Crash Dump of the Running Solaris System (without actually rebooting or altering the system):savecore -Lv
OR:savecore -d
Save a Crash Dump (Rebooting the System):reboot -d
OR:uadmin 5 #
OR generate a system panic:
adb -k -w /dev/ksyms /dev/mem
-> rootdir/W 0
-> ls /
If, after enabling core file generation, your system still does not create a core file, you may need to change the file-size writing limits set by your operating system:
ulimit -a
ulimit -c unlimited
ulimit -H -c unlimited
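The ulimit adjustment can be scripted before reproducing a crash. A minimal sketch (raising the soft limit may fail without privileges, so the failure is tolerated and the resulting limit is reported):

```shell
# Raise the soft core-file size limit so the dump is not truncated;
# the hard limit (-H) caps how far the soft limit can be raised.
ulimit -c unlimited 2>/dev/null || true
corelimit=`ulimit -c`
echo "core file size limit: $corelimit"
```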
Check the Generated Crash Dump on Solaris:
ls -lrt /var/crash/sunbkl01
cd /var/crash/sunbkl01
pstack vmcore
file vmcore
strings vmcore
7.2 Setup a System to Save a Crash Dump
Disable / Enable the Saving of Crash Dumps:
dumpadm -n
dumpadm -y
Enable Compressed Crash Dump (Default):dumpadm -z on -y
Enable Uncompressed Crash Dump (it Uses Much More Space):dumpadm -z off -y
Check the dumpadm Configuration:
more /etc/dumpadm.conf
dumpadm
Setup System for Full Crash Dump: dumpadm -c all -d /dev/md/dsk/d201 -s /var/crash/vasdbs02
OR Setup System for Dumping Kernel Memory Pages Only (it Saves Space and Time, but it's Less Accurate and Less Useful for Debugging a Problem):dumpadm -c kernel -d /dev/md/dsk/d201 -s /var/crash/vasdbs02
OR Setup System for Dumping Kernel Memory Pages, plus the Memory Pages of the Process whose Thread was Currently Executing on the CPU on which the Crash Dump was Initiated. If the Thread Executing on that CPU is a Kernel Thread Not Associated with any User Process, Only Kernel Pages will be Dumped:
dumpadm -c curproc -d /dev/md/dsk/d201 -s /var/crash/vasdbs02
more /etc/dumpadm.conf
Reconfigure the Dedicated Dump Device and Directory on which Crash Dumps will be Saved:dumpadm -d /dev/dsk/c0t2d0s2 -s /var/crash/server_name
OR (on a System using SVM):dumpadm -d /dev/md/dsk/d201 -s /var/crash/vasdbs02
OR Reconfigure the Dedicated Dump Device on Swap:dumpadm -d swap
Restart the dumpadm Service and Check:
svcadm restart svc:/system/dumpadm:default
svcs -a | grep -i dumpadm
To set up a method to automatically save a crash dump on older versions of the Solaris OS, or on servers where dumpadm is not installed, you can create a script /etc/init.d/sysetup with the following content:
if [ ! -d /var/crash/`uname -n` ]
then
        mkdir -p /var/crash/`uname -n`
fi
echo 'checking for crash dump...\c '
savecore /var/crash/`uname -n`
echo ' '
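A runnable sketch of the same startup logic, pointed at a scratch directory so it can be tried on a non-Solaris machine (savecore itself is left commented out since it only exists on Solaris):

```shell
# Create the per-host crash directory if missing, then (on a real
# Solaris system) hand it to savecore. The scratch path is only for
# demonstration.
CRASHDIR="${TMPDIR:-/tmp}/crash-demo/`uname -n`"
[ -d "$CRASHDIR" ] || mkdir -p "$CRASHDIR"
echo "checking for crash dump in $CRASHDIR..."
# savecore "$CRASHDIR"    # enable on a real Solaris system
```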
7.3 Crash Dump Analysis on Solaris by using MDB
The Solaris Modular Debugger (mdb) is a powerful debugger that replaces the adb and crash utilities, which you can still find on Solaris systems alongside mdb.
Access the crash dump directory and check the files:
cd /var/crash/system_name
ls -lrt
pstack vmcore
file vmcore
strings vmcore
Invoke the mdb Debugger:mdb -k unix.0 vmcore.0
OR:mdb -k 0
At the mdb Prompt
Get the time of the crash:
*time-(*lbolt%0t100)=Y
::time/Y
Get the core information:::coreinfo
Get crash information:
::system
::status
Display the panic string:::panicinfo
Display the stack trace:::stack
Display the message buffer (containing the panic string):::msgbuf
Display the crash log:::crashlog
Get CPU information at the time of the crash:::cpuinfo -v
Get semaphore information at the time of the crash:
::ipcs
::dnlc
Display the thread list at the time of the crash:
::threadlist
::tlist killed
::tlist pctcpu
Show the kernel memory structures and the kernel memory log:
::kmastat
::kmalog
::ksemid
::kshmid
::kstat
xck filename
::mdump
/rd\* -P
::nvlist
::slist
Show memory information at the moment of the crash:
::memstat
::memerr
::meminfo tree process
::meminfo user command
::meminfo -m user
Show symbol and process information (including the process tree):
::nm
::symbols
::ps
::ps -z
::pgrep processname
::ptree
::proc
Show open files at the moment of the crash:::pfiles
Display the callouts and the memory walkers:
::callout
::walkers
Display the CPU cycle information:::cycinfo -v
Display disk, slice, partition table, SVM and ZFS information:
::vfstab
::svm -i
::svm [-s <set>] [-d <devnum>]
::zfs
Get pool information:::pool
Get NFS and shared filesystem information:
::nfs
::autofs
Get file lists at the time of the crash:::findfiles
Get cluster information:::clust
Get zone information:::zone
Get information about the previously selected structure:::whatis -P
Get network interface information at the time of the crash:
::ifconf
::netstat
Display memory dump information:
::pkma -fslL
::scatenv mdump_compression
Get the alternate CPU walk and follow it:
::scatenv alternate_cpu_walk
ffffffffaaaf8760::whatis
30018ca2d20::print -t kthread_t
2a101423cc0::findstack
300423b2000::cpuinfo -v
::walk thread
::walk thread | ::findstack
::walk cpu | ::print cpu_t cpu_thread | ::print kthread_t t_pri
0x3000b270078::print -t proc_t p_user.u_psargs
cpu0::print cpu_t cpu_disp | ::print disp_t
First or Second Address in pstack:ff21fca4::dis
Second Address in pstack:0003cb08::nmadd -f badfunc
Second Address in pstack and End Address in pstack:
0003cb08::nmadd -f -e 00020dc0 badfunc
0003cb08::dis
Get the register information:::regs
Display memory leaks and walk the kernel memory log to find leaks:
::findleaks
::walk kmem_log | ::bufctl ! grep tleak
d4db0300::whatis
0x0000000010035a94::whatis -av
::walk kmem_log | ::bufctl -a d4db0300
d4db0300::kgrep | ::whatis -av
80506c0::nmadd -f -e 80506da badfunc
Quit the debugger:$q
An alternate method to invoke the debugger is to pipe echoed commands into it:
echo "*panicstr/s" | mdb -k unix.0 vmcore.0
echo "*cmm_dbg_buf/s" | mdb -k unix.0 vmcore.0 > ./cmm_dbg_buf.out
echo "$<threadlist" | mdb -k unix.0 vmcore.0 > ./threadlist.out
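The echo-pipe technique extends naturally to batching a whole first-look checklist into one report. A sketch, assuming the unix.0/vmcore.0 file names used in the examples (errors, including a missing mdb, are kept in the report rather than aborting the loop):

```shell
# Run the usual first-look dcmds against one dump and collect the
# output, with a banner line before each dcmd's section.
for dcmd in '::status' '::msgbuf' '::panicinfo' '::stack'; do
    echo "=== $dcmd ==="
    echo "$dcmd" | mdb -k unix.0 vmcore.0 2>&1 || true
done > mdb_firstlook.out
```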
OR:fmdump -v
7.4 Service Tool Bundle Service Crash Analysis Tool
Download the Oracle Solaris Service Tool Bundle from the support.oracle.com web portal.
Untar the Package, Access the Directory, Install the Service Tool Bundle, and Choose the Components to Install (you must Select the Service Crash Analysis Tool):./install_stb.sh
Execute the Service Crash Analysis Tool (scat):
cd /var/crash/cldbrm2a
/opt/SUNWscat/bin/scat --scat_explore -a -v unix.1 vmcore.1
OR:/opt/SUNWscat/bin/scat --scat_explore -a -v 1
Access the Directory Created by scat and Analyze the Files:
cd $SCAT_EXPLORE_DATA_DIR
more panic.out
more panic_thread.out
more panic_buf.out
more analyze.out
more coreinfo.out
more cpu-L.out
more dev
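Rather than paging each report by hand, the whole set can be skimmed in one pass. A sketch (SCAT_EXPLORE_DATA_DIR is set by scat itself; the snippet defaults to the current directory when run outside scat):

```shell
# Skim the first lines of every .out report scat_explore produced.
dir="${SCAT_EXPLORE_DATA_DIR:-.}"
for f in "$dir"/*.out; do
    [ -f "$f" ] || continue      # skip the literal glob when no files match
    echo "==== $f ===="
    head -n 20 "$f"
done
```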
An alternate method to use SCAT is to access its Prompt:/opt/SUNWscat/bin/scat 0
Then, at the scat Prompt, analyze the crash dump:SolarisCAT(vmcore.1/11X)> analyze
Get the thread list:SolarisCAT(vmcore.1/11X)> threadlist
Get CPU information:SolarisCAT(vmcore.1/11X)> cpuinfo -v
Get kernel tunables:SolarisCAT(vmcore.1/11X)> tunables
Get the dispatch queues:SolarisCAT(vmcore.1/11X)> dispq
Get ZFS information:SolarisCAT(vmcore.1/11X)> zfs -e
Get ZFS ARC information:SolarisCAT(vmcore.1/11X)> zfs arc
Run Sanity Checks:scat --sanity_checks vmcore.0
scat can include an optional module from which to retrieve type information.
List Modules:ctf
Dump qlc logs (fp_logq or ssfcp_logq):qlcfc fplog|ssfcplog
Simplify Decoding ddi_devid_t (impl_devid_t) Structures in the Kernel and Display the String Representation of the devid:dev id
Display the Threads that have an Affinity Set for a CPU (Specify <cpu> to Show Only Threads with Affinity for that <cpu>):tlist affinity <cpu>
7.5 Crash Dump Analysis on Solaris by using ADB
Access the crash dump directory and check the files:
cd /var/crash/system_name
ls -lrt
pstack vmcore
file vmcore
strings vmcore
Invoke the debugger:adb -k unix.0 vmcore.0
OR:adb -k 0
At the adb Prompt
Display the message buffer:
$<msgbuf
msgbuf+14/s
msgbuf+10/s
Get the core information:$>coreinfo
Get crash information:
$>system
$>status
Display the panic string:
$>panicinfo
*panicstr/s
Show the crash log:$>crashlog
Get the thread list:$< threadlist
Check the status:$>status
Get the system crash time:$>time/Y
Get the boot time:$>lbolt/X
Get the server's information:
$<utsname
$<hw_provider/s
$<architecture/s
$<srpc_domain/s
Display the stack trace:
$<stack
$<stacktrace
Display stack calls:$<stackcalls
Display stack registers:$<stackregs
Stack traceback:<sp$<stacktrace
Check the root device:rootfs$<bootobj
Check the swapfile device:
swapfile$<bootobj
dumpfile$<bootobj
Display the CPU information:$>cpuinfo -v
Get CPUs:$<cpus
Get process on CPU:$<proconcpu
Get processes running at the moment of the crash:$<proc
Get modules:
$<modules
Show open files at the moment of the crash:::pfiles
Display the callouts and the memory walkers:
$>callout
$>walkers
Get the kernel memory structures:$>kmastat
Show memory information at the moment of the crash:
$>memstat
$>memerr
$>meminfo tree process
Show kernel memory segments:$<seglist
Get IPC information:ipcaccess/10i
Get segment map:$>segkmap/J
Show kernel address space:$>kas
Show queues:$<queue
Get filesystem list:$<vfslist
Quit the debugger:$q
An alternate method to invoke the debugger is to pipe echoed commands into it:
echo 'msgbuf$<msgbuf' | adb -k unix.0 vmcore.0
echo 'msgbuf,100/s' | adb -k unix.0 vmcore.0
echo '$c' | adb -k unix.0 vmcore.0
echo "<fp$<stackcalls" | adb -k unix.0 vmcore.0
echo "<fp$<stack" | adb -k unix.0 vmcore.0
echo "<fp$<stackregs" | adb -k unix.0 vmcore.0
echo "<fp$<stacktrace" | adb -k unix.0 vmcore.0
7.6 Crash Dump Analysis on Solaris by using Crash
The crash tool is installed as part of the Solaris operating system. The binary is located in /usr/sbin.
Access the crash dump directory and check the files:
cd /var/crash/system_name
ls -lrt
pstack vmcore
file vmcore
strings vmcore
Invoke the crash tool and output to a file:crash -d vmcore.0 -n unix.0 -w /tmp/output_filename
Invoke the crash tool to use it at the prompt:crash -d vmcore.0 -n unix.0
OR:crash -d 0
At the crash Prompt
Get the core information:>coreinfo
Get crash information:
>system
>status
Display the panic string:
>panicinfo
*panicstr/s
Show the crash log:>crashlog
Show the processes running at the moment of the crash:
>proc
>p -e
>p -l
Get the thread list at the moment of the crash:>threadlist
Check the status:>status
Get CPU information at the moment of the crash:
>cpuinfo
>cpuinfo -v
Show the buffer:>buf
Show the queues:>queue
Get kernel memory structure information at the time of the crash:>kmastat
Get memory information at the time of the crash:
>meminfo
>memerr
Quit the crash tool: <CTRL><D>
7.7 Crash Dump Analysis on Solaris by using ACT
The ACT tool analyzes a system kernel dump and generates a human-readable text summary. It is shipped with the Solaris installation media and with the Service Tool Bundle. To check whether it is installed:pkginfo | grep CTEact
Access the crash dump directory and check the files:
cd /var/crash/system_name
ls -lrt
pstack vmcore
file vmcore
strings vmcore
To invoke ACT and split the core file output into separate files in /tmp/dir:act -d /var/crash/hostname/vmcore.0 -s /tmp/dir/
OR to invoke ACT and send the output to the act_out file:act -d /var/crash/hostname/vmcore.0 > /tmp/act_out
OR to invoke ACT on a live server with output to the screen:act -l
When ACT is invoked to split the core file into the specified directory, it creates the following files:biowait, getblk, modules, msgbuf, mutex, rwlock, threads, system, summary, sunsolve
7.8 Other Crash Dump Analysis Tools on Solaris
On Solaris you can use some common binaries and commands to analyze a crash dump.
Get network status at the time of the crash:
netstat unix.0 vmcore.0
Get NFS status at the time of the crash:nfsstat -n unix.0 vmcore.0
Get ARP tables at the time of the crash:arp -a unix.0 vmcore.0
Get IPC status at the time of the crash:/usr/sbin/ipcs -C vmcore.0 unix.0
8. Generate / Analyze a Crash Dump on HP-UX
8.1 Crash Dump Analysis by using KWDB
Check the Crash Dump Directory:ls -lrt /var/adm/crash/c*
Check the INDEX file and the /etc/shutdownlog file, as they contain the "panic" statement:
cat INDEX
cat /etc/shutdownlog
Create the /etc/shutdownlog file if it does not exist:touch /etc/shutdownlog
If there's No Dump, Re-Save it:savecrash -vr /tmp
Verify that kwdb (preferred) or q4 is Installed and Loaded:
swlist -l fileset | grep -i KWDB
swlist -l file | grep contrib
swlist -l fileset | grep -i q4
If KWDB is not installed, you can download the HP official depot for your server’s HP-UX version and architecture from here (you need an HP AllianceONE account with appropriate privileges).Then upload the depot to the server and uncompress it:gunzip KWDB_3.xxxx_depot.gz
Install the depot package for Itanium-based & PA-RISC systems: swinstall -s /KWDB_3.tape_depot KWDB_3
OR for PA-RISC system:swinstall -s /kwdb.pa.depot KWDBPA_3
Analyze the Crash Dump by Using kwdb:
cd /var/adm/crash/crash.#
ls -lrt
kwdb -q4 /var/adm/crash/crash.5
At the kwdb Prompt
Check the panic string:(kwdb) examine panicstr using s
Display stack trace with pc and sp (PA-RISC only):(kwdb) pc sp
Get breakpoint info:
(kwdb) info breakpoints
(kwdb) i b
Trace event 0:(kwdb) trace event 0
Trace event 0 with input, local and output registers:(kwdb) trace -args event 0
Load structures:
(kwdb) load struct utsname from &utsname
(kwdb) print -t
Print console message buffer:(kwdb) examine &msgbuf+8 using s
Print the system crash date/time:(kwdb) examine &time using Y
How long had the system been up before the crash:(kwdb) ticks_since_boot/hz
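The ticks_since_boot/hz division can also be done by hand once the tick count has been read from the dump. A sketch with example numbers (HZ=100 is the usual HP-UX tick rate; the tick count is a made-up value, not from a real dump):

```shell
# Convert a raw ticks_since_boot value to uptime: ticks / HZ gives
# seconds, then divide by 86400 for whole days.
ticks=8640000
hz=100
secs=`expr $ticks / $hz`
days=`expr $secs / 86400`
echo "uptime: $secs seconds (about $days day(s))"
```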
System load average at the moment of the crash:
(kwdb) examine &avenrun using 3F
(kwdb) examine &real_run using 3F
What command was running the specified process:
(kwdb) load struct proc from 0xb0d240
(kwdb) examine p_cmnd using s
(kwdb) load struct proc from 0x42234040
(kwdb) print -xt p_cmnd
(kwdb) examine 0x41e4db40
(kwdb) print p_comm
How was the kernel built:(kwdb) examine &_makefile_cflags using s
Load the part of the crash event table that contains valid entries and trace them:
(kwdb) load crash_event_t from &crash_event_table until crash_event_ptr max 100
loaded 4 struct crash_event_table_structs as an array (stopped by "until" clause)
(kwdb) trace pile
Load the processor info table and trace every processor (HP-UX v11.11):
(kwdb) load mpinfo_t from mpproc_info max nmpinfo
loaded 4 struct mpinfos as an array (stopped by max count)
(kwdb) trace pile
OR (post-HP-UX v11.11 kernels):
(kwdb) load mpinfou_t from &spu_info max nmpinfo
(kwdb) pileon mpinfo_t from pikptr
(kwdb) trace pile
Load the processor information table and trace every processor:
(kwdb) load mpinfou_t from &spu_info max nmpinfo
loaded 1 union mpinfou as an array (stopped by max count)
(kwdb) pileon mpinfo_t from pikptr
loaded 1 struct mpinfo
(kwdb) trace pile
Load the process table and trace the stacks:
(kwdb) load struct proc from proc_list max nproc next
(kwdb) trace pile
Load crash event:
(kwdb) load crash_event_t from &crash_event_table until crash_event_ptr max 100
(kwdb) print cet_hpa %#x cet_event
Trace event 1:(kwdb) trace event 1
Trace event 0 with input, local and output registers:(kwdb) trace -args event 0
Load structures:
(kwdb) load struct utsname from &utsname
(kwdb) print -t
Check threads:
(kwdb) load kthread_t from kthread max nkthread
(kwdb) hist
(kwdb) load kthread_t from kthread_list max nkthread next kt_factp
(kwdb) hist
(kwdb) keep kt_cntxt_flags & TSRUNPROC
Display the stack trace for structures from the current pile, for process, processor, thread and crash event structures:
(kwdb) trace pile
(kwdb) print -tx kt_stat kt_cntxt_flags kt_flag kt_spu addrof kt_procp
(kwdb) addrof kt_procp
Check running processes (at the time the panic occurred):(kwdb) runningprocs
Display stack trace for the process at addr:(kwdb) trace process at 7032300014
Trace CPU3, its threads, spinlocks, calls, etc…:(kwdb) trace .v processor 3
Check the state of the processors:
(kwdb) load mpinfo_t from mpproc_info max nmpinfo
(kwdb) load mpinfou_t from &spu_info max nmpinfo
(kwdb) pileon mpinfo_t from pikptr
(kwdb) call it mpinfo
(kwdb) print indexof addrof threadp curstate
(kwdb) exam &mp_avenrun for nmpinfo using 3F
(kwdb) print indexof addrof held_spinlock spinlock_depth
(kwdb) load lock_t from 0x129a4c0
(kwdb) print -x sl_owner sl_lock_caller sl_unlock_caller
(kwdb) exam sl_lock_caller using a
(kwdb) exam sl_unlock_caller using a
Recall mpinfo (recall the pile named mpinfo):
(kwdb) recall mpinfo
(kwdb) print indexof spu_state
(kwdb) print indexof last_idletime last_tsharetime
(kwdb) lbolt
(kwdb) recall mpinfo
(kwdb) print mp_rq.nready_free mp_rq.nready_locked
Check the per-processor run queues:
(kwdb) print -t | grep mp_rq
(kwdb) print -t | grep mp_rq > mprq.out
(kwdb) load rtsched_info_t from &rtsched_info
(kwdb) print rts_nready rts_bestq rts_qp rts_numpri
(kwdb) print -t
(kwdb) print addrof kt_lastrun_time kt_wchan | sort -k 3n,3 | uniq -c -f2 | grep -v "^ 1" | sort
Trace the specified thread:
(kwdb) trace thread at 1532338064
(kwdb) load unwindDesc_t from &$UNWIND_START until &$UNWIND_END max 100000
(kwdb) maint info unwind panic
(kwdb) examine &_makefile_cflags using s
Check kernel memory writes and the kernel memory log:
(kwdb) kmem_writes
(kwdb) load kmem_log_t from &kmem_log max kmem_log_slots
If the crash dump analysis reveals a hardware issue, you can find the associated tombstone for the system. To save a tombstone:/usr/sbin/diag/contrib/pdcinfo
Check the tombstone:
cd /var/tombstones/
ls -lrt
more ts99
Extract the PIM information:
cstm
cstm> map
cstm> sel dev 25
cstm> info
cstm> infolog
Enter Done, Help, Print, SaveAs, or View: [Done] SA
cstm> quit
ls -l /tmp/pim.HPMC.16Nov03
8.2 Remote Crash Dump Analysis
kwdbcr -help
kwdbcr /var/adm/crash.5
kwdb -q4 [-m] vmunix remote_system:<port_number | crash_path in remote system>
kwdb -q4 [-m] vmunix
(kwdb) target crash remote_system:<port_number | crash_path in remote system>
more /var/opt/kwdb/kwdbcr.log
kwdbcr -d -l logfile
8.3 Crash Dump Analysis by using Q4
Q4 is a crash dump analysis tool shipped with the HP-UX OS installation media. It can work alone or in combination with KWDB.
Check the Crash Dump Directory:ls -lrt /var/adm/crash/*
Check the INDEX file and the /etc/shutdownlog file, as they contain the "panic" statement:
cat INDEX
cat /etc/shutdownlog
Create the /etc/shutdownlog file if it does not exist:touch /etc/shutdownlog
If there's No Dump, Re-Save it:savecrash -vr /tmp
Verify that kwdb (preferred) or q4 is Installed and Loaded:
swlist -l fileset | grep -i KWDB
swlist -l fileset | grep -i q4
swlist -l file | grep contrib
type q4
If Q4 is not installed, you can install it from the HP-UX INSTALL media. First, check that the following patches are installed on the corresponding OS versions:
HP-UX v10.20: PHCO_20261
HP-UX v11.00: PHCO_20262
HP-UX v11.11: PHCO_25723
To check whether a patch is installed, run the following command, substituting "xxxxx" with the ID of the patch you're searching for:/usr/sbin/swlist -l product | grep PHCO_xxxxx
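The per-patch check can be looped over all three prerequisite IDs at once. A sketch (swlist exists only on HP-UX, so its absence is reported rather than treated as an error):

```shell
# Check each prerequisite patch and build a one-line report.
report=""
for patch in PHCO_20261 PHCO_20262 PHCO_25723; do
    if /usr/sbin/swlist -l product 2>/dev/null | grep -q "$patch"; then
        report="$report $patch:installed"
    else
        report="$report $patch:missing-or-no-swlist"
    fi
done
echo "$report"
```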
If needed, you can download the patch from the following locations:For v10.[12]0 versions: ftp://us-ffs.external.hp.com/hp-ux_patches/s700_800/10.X/PHCO_20261
For v11.0 versions: ftp://us-ffs.external.hp.com/hp-ux_patches/s700_800/11.X/PHCO_20262
For v11.11 versions: ftp://us-ffs.external.hp.com/hp-ux_patches/s700_800/11.X/PHCO_25723
swlist -l fileset -s /cdrom | grep Q4
OS-Core.Q4 B.10.10 HP-UX Crash Dump Debugger for PA-RISC systems
Select and load it if not loaded:swinstall -vs /<CD-ROM mount point> OS-Core.Q4
Prepare the dump tools.
For HP-UX 10.20 through 11.11:
/usr/contrib/bin/q4prep -p
For HP-UX 11.20 and above:
/usr/contrib/Q4/bin/q4prep -p
For HP-UX 10.10, uncompress and untar the Q4Lib:
uncompress /usr/contrib/Q4/lib/Q4Lib.tar.Z
tar -xf /usr/contrib/Q4/lib/Q4Lib.tar
Copy the q4rc.pl sample file to /tmp:cp /usr/contrib/Q4/lib/q4lib/sample.q4rc.pl /tmp/.q4rc.pl
Once the dump tools are installed and prepared, access the crash dump directory and decompress the dump:
cd /var/adm/crash/crash.5
ls -lrt
gunzip vmunix
strings vmunix | more
file vmunix
Set the Environment:. /usr/contrib/Q4/bin/set_env
Make a check of vmunix (for HP-UX 11.20 and above):
/usr/contrib/Q4/bin/q4pxdb -s status vmunix
pxdb -s status ./vmunix
OR (for HP-UX 10.20 through 11.11):/usr/contrib/bin/q4pxdb -s status vmunix
Start analyzing the dump (for HP-UX 11.20 and above):/usr/contrib/Q4/bin/q4pxdb vmunix
OR (for HP-UX 10.20 through 11.11): /usr/contrib/bin/q4pxdb vmunix
Get the reboot history and put the output in a file:last reboot > reboot.out
Get the installed patch list and put it in a file:swlist -l product | grep -i PH > patches.out
Access the core directory:
cd /var/adm/crash/core.0
ls -lrt
Analyze the Crash Dump by Using Q4 (for HP-UX 11.20 and above):/usr/contrib/Q4/bin/q4 -p .
OR (for HP-UX 10.20 through 11.11): /usr/contrib/bin/q4 -p .
At the q4 Prompt
Include the analyze.pl script to add more analysis features:q4> include analyze.pl
Analyze the dump and put the output in a file:q4> run Analyze AU > ana.out
Check the panic cause and put the output in a file:q4> run WhatHappened > what.out
If it happened an hang, check the hang cause and put the output in a file:q4> run WhatHappened -HANG > whath.out
Exit q4 Prompt:q4> exit
If the crash dump analysis reveals a hardware issue, you can find the associated tombstone for the system.
To save a tombstone:
/usr/sbin/diag/contrib/pdcinfo
Check the tombstone:
cd /var/tombstones/
ls -lrt
more ts99
Extract the PIM information:
cstm
cstm>map
cstm>sel dev 25
cstm>info
cstm>infolog
Enter Done, Help, Print, SaveAs, or View: [Done] SA
cstm>quit
ls -l /tmp/pim.HPMC.16Nov03
Then analyze the following files:
more patches.out
more /etc/shutdownlog
more /var/tombstones/ts* (if they exist and/or if an HPMC was detected)
more /var/adm/syslog/OLDsyslog.log (if the dump was due to a hang)
more ana.out
more what.out
more whath.out
more reboot.out
more crashinfo.out
8.4 Crash Dump Analysis by using KWDB Q4 Mode
KWDB supports a superset of the commands provided by the crash dump analysis tool Q4, extending its functionality.
Check the crash dump directory:
ls -lrt /var/adm/crash/*
Check the INDEX file and the /etc/shutdownlog file, as they contain the "panic" statement:
cat INDEX
cat /etc/shutdownlog
Create the /etc/shutdownlog file if it does not exist:
touch /etc/shutdownlog
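The INDEX check above lends itself to a small helper. This is a minimal sketch only: the `panic_from_index` function name and the fabricated INDEX contents below are illustrative assumptions, not the exact savecrash format on every system.

```shell
#!/bin/sh
# Sketch: pull the panic line out of a crash INDEX file.
# panic_from_index is a hypothetical helper; the INDEX contents
# below are fabricated for the demo.
panic_from_index() {
    grep -i 'panic' "$1"     # print any line mentioning "panic"
}

cat > /tmp/INDEX.demo <<'EOF'
comment   savecrash crash dump INDEX file
version   2
hostname  hpux01
panic     panic: free: freeing free frame
dumptime  1068968415 Sun Nov 16 10:20:15 GMT 2003
EOF

panic_from_index /tmp/INDEX.demo
```

The same one-line grep also works against /etc/shutdownlog, which records the panic string at shutdown.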
If there's no dump, re-save it:
savecrash -vr /tmp
Verify that kwdb (preferred) or q4 is installed and loaded:
swlist -l fileset | grep -i KWDB
swlist -l fileset | grep -i q4
swlist -l file | grep contrib
type q4
If KWDB is not installed, you can download the official HP depot for your server's HP-UX version and architecture (you need an HP AllianceONE account with appropriate privileges). Then upload the depot to the server and uncompress it:
gunzip KWDB_3.xxxx_depot.gz
Install the depot package (Itanium-based and PA-RISC systems):
swinstall -s /KWDB_3.tape_depot KWDB_3
OR for PA-RISC systems only:
swinstall -s /kwdb.pa.depot KWDBPA_3
If Q4 is not installed, follow the instructions in section 8.3 above.
Access the crash dump directory and analyze the crash dump by using kwdb/q4:
cd /var/adm/crash/crash.#
ls -lrt
kwdb -q4 -p -m .
OR, at the kwdb prompt, activate the q4 mode:
(kwdb) set kwdb q4 on
Run "set kwdb q4 off" at the q4 prompt to disable q4 support.
At the q4 prompt, check the events that occurred immediately before and during the panic, and log to a file:
q4> run WhatHappened > what.out
If you suspect a hang occurred, check the panic events by running:
q4> run WhatHappened -HANG > whath.out
Analyze the dump and log the output to a file:
q4> run Analyze AU > ana.out
Check the panic string:
q4> examine panicstr using s
Display the stack trace with pc and sp (PA-RISC only):
q4> pc sp
Get breakpoint info:
q4> info breakpoints
q4> i b
Trace event 0:
q4> trace event 0
Trace event 0 with input, local and output registers:
q4> trace -args event 0
Load structures:
q4> load struct utsname from &utsname
q4> print -t
Print the console message buffer:
q4> examine &msgbuf+8 using s
Print the system crash date/time:
q4> examine &time using Y
How long had the system been up before the crash:
q4> ticks_since_boot/hz
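The ticks_since_boot/hz division above is plain arithmetic; a sketch of the conversion in shell (HZ=100 is the usual HP-UX clock rate, and the tick count is a made-up example, not data from a real dump):

```shell
#!/bin/sh
# Convert a ticks_since_boot value (as printed by q4) into uptime.
# HZ=100 is an assumption; TICKS is a fabricated example value.
HZ=100
TICKS=8640000

secs=$((TICKS / HZ))
printf '%d days %02d:%02d:%02d\n' \
    $((secs / 86400)) $((secs % 86400 / 3600)) \
    $((secs % 3600 / 60)) $((secs % 60))
# prints: 1 days 00:00:00
```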
System load average at the moment of the crash:
q4> examine &avenrun using 3F
q4> examine &real_run using 3F
What command was the specified process running:
q4> load struct proc from 0xb0d240
q4> examine p_cmnd using s
q4> load struct proc from 0x42234040
q4> print -xt p_cmnd
q4> examine 0x41e4db40
q4> print p_comm
How was the kernel built:
q4> examine &_makefile_cflags using s
Load the part of the crash event table that contains valid entries and trace them:
q4> load crash_event_t from &crash_event_table until crash_event_ptr max 100
loaded 4 struct crash_event_table_structs as an array (stopped by "until" clause)
q4> trace pile
Load the processor info table and trace every processor (HP-UX 11.11):
q4> load mpinfo_t from mpproc_info max nmpinfo
loaded 4 struct mpinfos as an array (stopped by max count)
q4> trace pile
OR (post-HP-UX 11.11 kernels):
q4> load mpinfou_t from &spu_info max nmpinfo
q4> pileon mpinfo_t from pikptr
q4> trace pile
Load the processor information table and trace every processor:
q4> load mpinfou_t from &spu_info max nmpinfo
loaded 1 union mpinfou as an array (stopped by max count)
q4> pileon mpinfo_t from pikptr
loaded 1 struct mpinfo
q4> trace pile
Load the process table and trace the stacks:
q4> load struct proc from proc_list max nproc next
q4> trace pile
Load the crash event table and print:
q4> load crash_event_t from &crash_event_table until crash_event_ptr max 100
q4> print cet_hpa %#x cet_event
Trace event 1:
q4> trace event 1
Load structures:
q4> load struct utsname from &utsname
q4> print -t
Check threads:
q4> load kthread_t from kthread max nkthread
q4> hist
q4> load kthread_t from kthread_list max nkthread next kt_factp
q4> hist
q4> keep kt_cntxt_flags & TSRUNPROC
Display the stack trace for structures in the current pile (process, processor, thread and crash event structures):
q4> trace pile
q4> print -tx kt_stat kt_cntxt_flags kt_flag kt_spu addrof kt_procp
q4> addrof kt_procp
Check the processes running at the time the panic occurred:
q4> runningprocs
Display the stack trace for the process at an address:
q4> trace process at 0x41978040
Trace CPU 3, its threads, spinlocks, calls, etc.:
q4> trace -v processor 3
Check the state of the processors:
q4> load mpinfo_t from mpproc_info max nmpinfo
Recall mpinfo (recall the pile specified by mpinfo):
q4> recall mpinfo
q4> print indexof spu_state
q4> print indexof last_idletime last_tsharetime
q4> lbolt
q4> recall mpinfo
q4> print mp_rq.nready_free mp_rq.nready_locked
Check the per-processor run queues:
q4> print -t | grep mp_rq > mprq.out
q4> load rtsched_info_t from &rtsched_info
q4> print rts_nready rts_bestq rts_qp rts_numpri
q4> print -t
q4> print addrof kt_lastrun_time kt_wchan | sort -k 3n,3 | uniq -c -f2 | grep -v "^ 1" | sort
Trace the specified thread:
q4> trace thread at 1532338064
q4> load unwindDesc_t from &$UNWIND_START until &$UNWIND_END max 100000
q4> maint info unwind panic
q4> examine &_makefile_cflags using s
Check kernel memory writes and log:
q4> kmem_writes
q4> load kmem_log_t from &kmem_log max kmem_log_slots
Exit the q4 prompt:
q4> exit
Run the crashinfo utility, if you have it. It may be in /usr/local/bin or /opt/sfm/tools/; search for it if you don't find it:
find / -type f | grep crashinfo
Run crashinfo and log the output to a file:
/opt/sfm/tools/crashinfo > crashinfo.out
OR:
/usr/local/bin/crashinfo > crashinfo.out
OR:
/opt/sfm/tools/crashinfo -continue | tee crash-43.log
OR pointing to the crash.# directory:
/opt/sfm/tools/crashinfo /var/adm/crash/crash.5 > crashinfo.out
If the crash dump analysis reveals a hardware issue, you can find the associated tombstone for the system.
To save a tombstone:
/usr/sbin/diag/contrib/pdcinfo
Check the tombstone:
cd /var/tombstones/
ls -lrt
more ts99
Extract the PIM information:
cstm
cstm>map
cstm>sel dev 25
cstm>info
cstm>infolog
Enter Done, Help, Print, SaveAs, or View: [Done] SA
cstm>quit
ls -lrt /tmp/pim.HPMC.16Nov03
Then analyze the following files:
more patches.out
more /etc/shutdownlog
more /var/tombstones/ts* (if they exist and/or if an HPMC was detected)
more /var/adm/syslog/OLDsyslog.log (if the dump was due to a hang)
more ana.out
more what.out
more whath.out
more reboot.out
more crashinfo.out
To install crashinfo:
crashinfo is part of the SFM (System Fault Management) bundle. There are two versions of crashinfo: crashinfo-a-2.exe (64-bit PA2.0) and crashinfo-a-i.exe (IA64). The 64-bit PA2.0 version can run on both PA and IA64 systems and can analyze both PA2.0 and IA64 crash dumps. The IA64 version only runs on IA64 systems, but can analyze crash dumps from both IA64 and PA2.0 systems. For performance reasons you may wish to use the IA64 version when running on IA64 systems.
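The version choice described above can be captured in a tiny dispatcher. A sketch only: the `pick_crashinfo` helper and the architecture strings it matches are assumptions for illustration; the binary paths are the ones named in the text.

```shell
#!/bin/sh
# Sketch: choose a crashinfo binary for a given architecture string.
# Hypothetical helper; matching on "ia64" is an assumption.
pick_crashinfo() {
    case "$1" in
        ia64) echo /opt/sfm/tools/crashinfo-a-i.exe ;;  # IA64-only build
        *)    echo /opt/sfm/tools/crashinfo-a-2.exe ;;  # PA2.0 build runs on both
    esac
}

pick_crashinfo ia64
pick_crashinfo 9000/800
```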
Check if crashinfo is installed on the system:
ls -lrt /opt/sfm/tools
Download crashinfo into the depot path:
/var/adm/crash/depot/SFM-CORE/MISC_TOOLS/opt/sfm/tools/crashinfo-a-2.exe
OR:
/opt/sfm/tools/crashinfo-a-i.exe
To run crashinfo:
/opt/sfm/tools/crashinfo > crashinfo.out
Alternatively, check and preprocess vmunix with pxdb directly:
/usr/ccs/bin/pxdb -s status ./vmunix
/usr/ccs/bin/pxdb ./vmunix
8.5 Crash Dump Analysis by using HP WDB / GDB
The HP Wildebeest Debugger (WDB) is an HP-supported implementation of the Open Source GNU debugger (GDB).
HP WDB / GDB can be used to debug or monitor a process, but it is mostly used to analyze core files from crashed processes and system crash dumps.
To analyze a system crash dump follow the steps below.
Check the crash dump directory:
ls -lrt /var/adm/crash/c*
Check the INDEX file and the /etc/shutdownlog file, as they contain the "panic" statement:
cat INDEX
cat /etc/shutdownlog
Create the /etc/shutdownlog file if it does not exist:
touch /etc/shutdownlog
If there's no dump, re-save it:
savecrash -vr /tmp
Check if HP WDB is installed:
swlist -l fileset | grep -i wdb
If HP WDB is not installed, you can download the latest version (6.3) for your HP-UX version and architecture (you need an HP AllianceONE account with appropriate privileges).
Upload the depot file to the server's /tmp directory, access the directory and decompress the file:
cd /tmp
gunzip hpwdb.xxxx.xxxx.depot.gz
Install the depot:
swinstall -s hpwdb.xxxx.xxxx.depot/*
The main paths are:
/opt/langtools/wdb
/opt/langtools/gdb
/opt/langtools/bin
Before analyzing a process core file, check it:
file corefile_name
strings corefile_name
Check if it's truncated:
elfdump -o -S core
If the core file is truncated at 2 GB, the system may not support creating files over that size on the filesystem where the crash dumps are saved. Check the syslog file to see whether the system warned that it could not finish saving the file. If that's the problem, enable support for files over 2 GB on that filesystem:
fsadm -o largefiles /filesystem_name
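The 2 GB truncation test above can be sketched as a simple size comparison. The `core_truncated` helper is hypothetical, and the demo uses a small stand-in file rather than a real core:

```shell
#!/bin/sh
# Sketch: flag a core file at or above the 2 GB largefiles limit.
LIMIT=2147483648                 # 2 GB in bytes

core_truncated() {
    size=$(wc -c < "$1")
    [ "$size" -ge "$LIMIT" ]     # true if the file hit the limit
}

echo "not a real core" > /tmp/core.demo   # stand-in, not a real core
if core_truncated /tmp/core.demo; then
    echo "possibly truncated at 2GB"
else
    echo "below 2GB limit"
fi
```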
To start analyzing the core dump:
gdb -c core
OR:
gdb
At the gdb prompt:
(gdb) core core
For commands and details refer to section “3.3 Debug Processes and Core Files by using HP WDB / GDB”.
8.6 Crash Dump Analysis by using adb
Check the crash dump directory:
ls -lrt /var/adm/crash/c*
Check the INDEX file and the /etc/shutdownlog file, as they contain the "panic" statement:
cat INDEX
cat /etc/shutdownlog
Create the /etc/shutdownlog file if it does not exist:
touch /etc/shutdownlog
If there's no dump, re-save it:
savecrash -vr /tmp
Access the crash dump directory and start analyzing the dump (replace crash.5 with the name of your crash.# directory):
cd /var/adm/crash/crash.5
ls -lrt
gunzip vmunix.gz
strings vmunix | more
file vmunix
adb -m vmunix .
OR without accessing the crash dump directory:
adb -m /var/adm/crash/crash.5/vmunix /var/adm/crash/crash.5
At the adb prompt, display the message buffer:
msgbuf+8/s
$<msgbuf
msgbuf+14s
msgbuf+10/s
Get the core information:
$>coreinfo
Get crash information:
$>system
$>status
Display the panic string:
$>panicinfo
*panicstr/s
Show the crash log:
$>crashlog
Get the thread list:
$<threadlist
Check the status:
$>status
Quit the debugger:
$>q
9. Generate / Analyze a Crash Dump on Linux
9.1 Enable Saving Crash Dump by using kexec-tools
Check the presence of the kdump tool:
yum search kexec-tools
chkconfig --list | grep kdump
more /etc/kdump.conf
OR:
/etc/init.d/kdump status
If necessary, add the line to the Yum repository on Red Hat:
vi /etc/yum.repos.d/rhel-debuginfo.repo
baseurl=ftp://ftp.redhat.com/pub/redhat/linux/enterprise/$releasever/en/os/$basearch/Debuginfo/
Enable the repository:
yum install --enablerepo rhel-debuginfo httpd-debuginfo
OR for CentOS:
vi /etc/yum.repos.d/centos-debuginfo.repo
baseurl=http://debuginfo.centos.org/$releasever/$basearch/
Enable the repository:
yum install --enablerepo centos-debuginfo httpd-debuginfo
Install kexec-tools:
yum install kexec-tools
Check or edit the /etc/kdump.conf file according to your needs:
vi /etc/kdump.conf
more /etc/sysconfig/kdump
Back up and edit /boot/grub/grub.conf, appending "crashkernel=128M@16M" to the end of the kernel line:
cp /boot/grub/grub.conf /boot/grub/grub.conf.bkp
vi /boot/grub/grub.conf
OR:
cp /boot/grub/menu.lst /boot/grub/menu.lst.bkp
vi /boot/grub/menu.lst
kernel /boot/vmlinuz-2.6.18-128.1.16.el5 ro root=LABEL=/ rhgb quiet crashkernel=128M@16M
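The grub edit above can also be done non-interactively with sed. A sketch against a throwaway copy (always work on a backup first; the sed expression assumes GNU sed and the grub.conf kernel-line syntax shown above, and /tmp/grub.conf.demo is a fabricated example file):

```shell
#!/bin/sh
# Sketch: append crashkernel=128M@16M to kernel lines in a grub
# config copy. The demo file below is fabricated.
GRUB=/tmp/grub.conf.demo
cat > "$GRUB" <<'EOF'
default=0
timeout=5
kernel /boot/vmlinuz-2.6.18-128.1.16.el5 ro root=LABEL=/ rhgb quiet
EOF

# add the parameter only to kernel lines that don't already have it
sed -e '/^kernel /{/crashkernel=/!s/$/ crashkernel=128M@16M/;}' \
    "$GRUB" > "$GRUB.new" && mv "$GRUB.new" "$GRUB"
grep '^kernel' "$GRUB"
```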
Enable the kdump service:
chkconfig kdump on
chkconfig kdump
chkconfig --list | grep kdump
OR:
/etc/init.d/kdump start
/etc/init.d/kdump status
Reboot the system:
reboot
9.2 Simulate a Panic and Save a Crash Dump
There are different ways to simulate a panic. The following are the most common:
echo 1 > /proc/sys/kernel/sysrq
echo c > /proc/sysrq-trigger
OR:
echo 1 > /proc/sys/kernel/sysrq
On the system console type:
Alt-SysRq-u
All filesystems will be re-mounted read-only: this saves the system from running fsck on all the filesystems when it reboots.
On the system console type:
Alt-SysRq-c
This will force the system to panic and a crash dump to be taken.
9.3 Analyze Crash Dump by using crash
On CentOS 5 and 6, download and install the kernel-debuginfo and kernel-debuginfo-common packages:
wget http://debuginfo.centos.org/5/`uname -i`/kernel-debuginfo-`uname -r`.i686.rpm
wget http://debuginfo.centos.org/5/`uname -i`/kernel-debuginfo-common-`uname -r`.i686.rpm
OR for Red Hat 5:
wget ftp.redhat.com/pub/redhat/linux/enterprise/5Server/en/os/x86_64/Debuginfo/kexec-tools-debuginfo-1.102pre-96.el5_5.4.x86_64.rpm
OR for SuSE:
wget ftp5.gwdg.de/pub/opensuse/repositories/Kernel:/kdump/openSUSE_11.1/x86_64/kexec-tools-debuginfo-2.0.0-58.1.x86_64.rpm
Install the packages:
rpm -Uvh kernel-debuginfo*
Check the crash dump files:
cd /var/crash/2009-06-09-20\:18
ls -lrt
file vmcore
strings vmcore
Start crash and analyze the output:
crash /usr/lib/debug/lib/modules/crashed-kernel-version/vmlinux /var/crash/2009-06-09-20\:18/vmcore
OR:
crash /usr/lib/debug/lib/modules/`uname -r`/vmlinux /var/crash/2009-06-09-20\:18/vmcore | tee /var/crash/crash3.log
At the crash prompt, view system data:
crash> sys
Get info about open files:
crash> files
Display process status:
crash> ps
Display virtual memory info:
crash> vm
View stack traces:
crash> bt -a
Display module info and load symbols and debugging data:
crash> mod
Dump kernel log buffer contents in chronological order:
crash> log
Analyze the EIP address (from the preceding output):
crash> dis -lr c04a9c34
Exit:
crash> exit
Running crash in Unattended Mode
You can run crash in unattended (non-interactive) mode by creating an input file containing the commands you want to pass to crash.
Generate an input file containing commands:
vi inputfile
bt
log
ps
exit
Run crash:
crash -i inputfile
OR:
crash < inputfile
OR:
crash <debuginfo> vmcore < inputfile > outputfile
OR:
crash <System map> <vmlinux> vmcore < inputfile > outputfile
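Putting the unattended run together: a sketch that generates a command file like the one above. The crash invocation itself is left commented out because it needs a real vmcore and a matching vmlinux; its paths are illustrative placeholders.

```shell
#!/bin/sh
# Sketch: build the command file for an unattended crash(8) session.
cat > /tmp/crash.cmds <<'EOF'
bt
log
ps
exit
EOF

# Illustrative invocation (needs a real dump; do not run as-is):
# crash /usr/lib/debug/lib/modules/$(uname -r)/vmlinux \
#       /var/crash/<dir>/vmcore -i /tmp/crash.cmds > crash.out

cat /tmp/crash.cmds
```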
9.4 Analyze Crash Dump by using GDB
Check the crash dump files:
cd /var/crash/2009-06-09-20\:18
ls -lrt
file vmcore
strings vmcore
Start gdb on the core file:
gdb -c core
OR:
gdb a.out core
OR:
gdb path/to/the/binary path/to/the/core
objdump -d -S null-pointer.ko > /tmp/whatever
OR from the gdb prompt:
(gdb) core core
At the gdb prompt, analyze the backtrace:
(gdb) bt
Check status:
(gdb) status
View data:
(gdb) data
View stacks:
(gdb) stack
Analyze a stack frame by its number:
(gdb) frame number
View the code around that frame:
(gdb) list
List local variables:
(gdb) info locals
View files:
(gdb) files
View internals:
(gdb) internals
View command aliases:
(gdb) aliases
Check support facilities:
(gdb) support
Running the program:
(gdb) running
Quit the debugger:
(gdb) quit
9.5 Analyze Crash Dump by using LKCD
The Linux Kernel Crash Dump (LKCD) project provides a reliable method of detecting, saving and examining system crashes.
Download the current lkcdutils rpm and the patches, and upload the packages to the server. The installation of LKCD requires the kernel patches to be applied, a new kernel to be built, and the LKCD utilities to be installed.
Make a copy of the kernel source directory:
cp -r /usr/src/linux-x.x.x /usr/src/linux-x.x.x.lkcd
Access the newly-created directory:
cd /usr/src/linux-x.x.x.lkcd
Test the patches:
patch -p1 --dry-run < <path>/lkcd-x.x.x.diff
If the previous command did not report any errors, apply the kernel patches:
patch -p1 < <path>/lkcd-x.x.x.diff
Configure the kernel, adding LKCD support (compiled into the kernel, not as a module) and enabling the Magic SysRq key (not mandatory, but it allows a crash dump to be created when the system hangs):
make menuconfig
Navigate to "Kernel Hacking" and press <enter>. Navigate to "Magic SysRq key" and press <space>; an asterisk should appear next to the "Magic SysRq key" line. Navigate to "Linux Kernel Crash Dump (LKCD)" and press <space> until an asterisk appears. If compression options are presented, select all available. Press <tab> <enter> until you are prompted to save the configuration; press <enter> to save and exit menuconfig.
Build the new kernel:
make dep; make bzImage
Install the kernel image:
make install
The kernel build process will have built the file Kerntypes in the kernel source directory: check whether this file was copied to the /boot directory, and if needed copy it yourself:
cp Kerntypes /boot
The kernel build also creates the file System.map in the kernel build directory, and the kernel install process copies it into /boot: check that /boot/System.map matches the copy in the kernel source directory:
diff System.map /boot/System.map
If the two files do not match, make a fresh copy in /boot:
cp System.map /boot
Reboot with the new kernel:
init 6
Once the system is up and running, check that the /proc/sys/dump directory exists:
ls -d /proc/sys/dump
If the directory is missing, the kernel has not been patched or configured properly for LKCD.
Once the kernel is patched, install the LKCD utilities rpm:
rpm -i lkcdutils-x_x-x_xxxx.rpm
Edit the system startup script, /etc/rc.sysinit on Red Hat and CentOS or /sbin/init.d/boot on SuSE (to find the system startup script for your distribution, run "grep sysinit /etc/inittab"). Locate the line:
action $"Mounting local filesystems: " mount -a -t nonfs,smbfs,ncpfs
Following this line, add:
/sbin/lkcd config
If you are using a swap partition as the dump device, the dump must be saved before swap is activated. Locate the line with the "swapon" command in the system startup script and change it like this (adding the lkcd commands above it):
/sbin/lkcd config
/sbin/lkcd save
# Start up swapping.
action $"Activating swap partitions: " swapon -a -e
Configure the device on which to save the crash dump by creating a symbolic link to the chosen device and updating the LKCD configuration:
df -k
cat /proc/partitions
ln -s /dev/sdb1 /dev/vmdump
/sbin/lkcd config
Enable the Magic SysRq key with the following command:
echo 1 > /proc/sys/kernel/sysrq
Check or edit the configuration file according to your needs:
vi /etc/sysconfig/dump
The parameter DUMP_ACTIVE must be set to 1 to enable the dump process. Set DUMP_SAVE to 1 if you want to save the memory image to disk. Define the DUMP_LEVEL: 0 dumps nothing, 1 dumps the dump header and the first 128 KB, 4 dumps everything except kernel free pages, 8 dumps all memory. Set DUMP_COMPRESS to 0 if you do not want the dump to be compressed, to 1 for RLE compression, or to 2 for gzip compression. An example dump configuration file:
DUMP_ACTIVE=1
DUMPDEV=/dev/vmdump
DUMPDIR=/var/log/dump
DUMP_SAVE=1
DUMP_LEVEL=8
DUMP_FLAGS=0
DUMP_COMPRESS=0
PANIC_TIMEOUT=5
After changing the configuration, update and enable crash dump saving:
lkcd config
Check the configuration settings:
lkcd query
Set up the service to start at boot:
chkconfig boot.lkcd on
Test LKCD. On the system console type:
Alt-SysRq-u
All filesystems will be re-mounted read-only: this saves the system from running fsck on all the filesystems when it reboots.
On the system console type:
Alt-SysRq-c
This will force the system to panic and a crash dump to be taken. If the system startup scripts don't contain the "lkcd save" command, create the dump files manually:
/sbin/lkcd save
Once the system is back up and running, check that the dump files have been created:
cd /var/log/dump/0
ls -lrt
Invoke LKCD lcrash:
/sbin/lcrash map.0 dump.0 kerntypes.0
OR:
/sbin/lcrash -n 0
At the lcrash prompt, get a list of the processes running at the time of the crash:
>>ps
Display system statistics and the log_buf array:
>>stat
Display the crash dump report:
>>report
>>report -w outfile
Display dump:
>>dump
>>dump c02e4820 8 -o
>>dump c02e4820 8 -d
>>dump c02e4820 8 -x
List opened namelists:
>>namelist
>>namelist -a /tmp/snd.o
Display module information:
>>module
>>module pcmcia_core
>>module pcmcia_core -f
>>module kernel_module -f -i 10
Display page structure information:
>>page
Evaluate and print expressions:
>>print
Dynamically load a library of lcrash commands:
>>ldcmds
Display all complete and unique stack traces:
>>strace
Display the stack trace for a task_struct:
>>trace
Display information for task_struct structs:
>>task
List symbol table information:
>>symtab -l
List symbols in the specified module:
>>symtab -l -f /tmp/my_dummy.map
Remove a symbol table:
>>symtab -r /tmp/my_dummy.map
Recreate and reload a symbol table:
>>symtab -a __ksymtab__
>>symtab -a /tmp/my_dummy.map my_dummy
>>symtab -l
Walk a linked list of kernel structures or memory blocks:
>>walk
Examine a local variable:
>>whatis DUMMY
>>print *(dummy_t*) d0000240
>>whatis dummy_s.member2
Display disassembled code:
>>dis -F memcmp
>>dis 0xc025188e 10 -f
Quit lcrash:
>>q
9.6 Other Useful Commands
Examining a running kernel after a crash can be very useful to check whether it's experiencing issues:
cat /proc/sys/kernel/tainted
If a module, a library or a program is suspected of having caused a panic, you can dump/disassemble it:
objdump -D -S <compiled_object_with_debug_symbols> > filename.out
10. Generate / Analyze a Crash Dump on AIX
10.1 Setup and Enable KDB
KDB is an interactive kernel debugger shipped with the IBM AIX operating system.
kdb allows the user to control the execution of kernel code (including kernel extensions and device drivers), and to observe and modify variables and registers. It has to be invoked via a special boot image.
The kdb command is a tool for analyzing system dumps. It is used for post-mortem analysis of system dumps, or for monitoring the running kernel.
Check the current dump device(s):
sysdumpdev -l
Start a system dump:
sysdumpstart -p
Check the minimum size for the dump device:
sysdumpdev -e
Enable KDB without invoking it at boot:
bosboot -a -d /dev/ipldevice -D
Enable KDB and invoke it at boot:
bosboot -a -d /dev/ipldevice -I
Disable KDB:
bosboot -a -d /dev/ipldevice
Check if KDB is available:
kdb
(0)> dw kdb_avail
(0)> dw kdb_wanted
Find the dump object:
lsnim -l worker
10.2 Analyze a Crash Dump by using KDB
Check the current dump device(s):
sysdumpdev -l
Check if KDB is available:
kdb
> dw kdb_avail
Find the dump object:
lsnim -l worker
Access the crash dump directory:
cd /var/crash/
ls -lrt
View the contents of the snap package:
zcat snap.pax.Z | pax -v
Extract the contents of the snap package:
zcat snap.pax.Z | pax -r
OR extract just the dump, general, and kernel subdirectories:
uncompress snap.pax.Z
zcat snap.pax.Z | pax -r ./dump ./general ./kernel
Check the timestamps of the dump and unix files:
what unix | grep _kdb_buildinfo
what dump | grep _kdb_buildinfo
what /usr/sbin/kdb_64 | grep _kdb_buildinfo
what /usr/sbin/kdb_mp | grep _kdb_buildinfo
Analyze a core:
kdb /var/adm/ras/vmcore.0 /unix
At the kdb prompt, display system statistics, including the last kernel printf() messages still in memory:
>stat
Display all of the stack frames from the current instruction, as deep as possible (interrupts, system calls, user stack):
>f
Display information about what's currently running on each processor:
>status
Display the symptom string for a dump:
>symptom
Show system log entries not processed by the log daemon:
>errpt
Show the global error-logging control information:
>errlg -g
Show the error-logging control information for the specified address:
>errlg -a address
Show dump-time trace information:
>dmptrc
Display information about the Lightweight Memory Trace (LMT):
>mtrc all -v
Display Lightweight Memory Trace (LMT) information for CPU 0:
>mtrc -C 0 -v
Dump the event buffer on channel 2, related to thread ID 14539, for an active system trace:
>trace -c 2 -t 14539
Initial CPU context:
>cpu 1
Get breakpoints:
>brk
Display the stack in raw format:
>dw @r1 90
Display all of the function addresses:
>devsw
Display data at utsname:
>dw utsname
Find the physical address of utsname:
>tr utsname
Get the machine state:
>mst
Get the machine state register:
>mrs
Dump the contents of the machine state register:
>dr msr
VMM error log:
>vmlog
Display information about component dump tables in a system memory dump:
>cdt
>cdt 11
>cdt -p 11 7
Process info:
>proc *
Display the file table:
>file
Print the "intr" symbol:
>pr -p intr
Show symbols matching "*r":
>pr -p *r
Print following the next pointer:
>pr -l next intr 30047A80
Display the inode table:
>ino
Print details of the inode pool:
>jfsnode
Print gfs slot 1:
>gfs gfs
Display either the Enhanced Journaled File System (JFS2) d-tree or x-tree structure based on the specified inode parameter:
>tree 325C1080
Print gfs slot 2:
>gfs gfs+30
Print gfs slot 3:
>gfs gfs+60
pid output:
>p 3
Get threads:
>thread *
Print the current thread:
>tpid
Show VMM free list information:
>freelist
tid output:
>th 12
kdb output:
>p *
Get the address of the symbol and the Table Of Contents section of the executable module:
>nm
>nm vmerrlog
Display the inpcb structure for TCP connections:
>tcb -s
Display the inpcb structure for UDP connections:
>udb -s
Print the socket structure for TCP and UDP sockets:
>sock -s
Display data structure (mbuf) information (mbufs store data in the kernel for incoming and outbound network traffic):
>mbuf -p
Display mbuf information and follow the packet chain:
>mbuf -a
Follow the mbuf structure within a packet:
>mbuf -n effectiveaddress
Display the list of all valid network device driver tables, giving the address of each ndd structure and the name of the corresponding network interface:
>ndd -s
Display network connections at the time of the crash:
>netstat -an
Display network interface information:
>ifnet
Display the list of kernel data structure checkers:
>check
Display information about the specified kernel data structure checker:
>check -h proc
Run the proc checker to validate the entire process table:
>check -l 7 proc
Exit KDB:
>g
11. Debugging Tools
11.1 Information
In this section you can find a collection of debugging tools for the main UNIX and Linux operating systems.
GDB:
The GNU Project Debugger allows you to see what is going on `inside' another program while it executes -- or what another program was doing at the moment it crashed. It is available for different UNIX operating systems and Linux distributions.
HP tusc:Tusc traces system calls invoked by a process. It works with HP-UX 11.0 and 11i PA-RISC systems, and HP-UX 11i HP Integrity systems. It is not supported on HP-UX 10.20. tusc is similar in functionality to truss on Solaris.
HP Wildebeest Debugger (WDB):HP WDB is an HP-supported implementation of the Open Source GNU debugger (GDB). It is available for different HP-UX versions and architectures.
Linux Kernel Crash Dump:LKCD is a project is designed to meet the needs of customers and system administrators wanting a reliable method of detecting, saving and examining system crashes. It is available for different Linux distributions.
DTrace Toolkit:
The DTrace Toolkit is a collection of DTrace scripts for debugging and deep-diving into a system; the current version is 0.99.
DTrace TazTool:
DTrace TazTool is the DTrace version of "taztool", a disk trace tool developed by Richard McDougall that takes TNF disk trace records and matches them up in pairs for the start and end of a disk transaction. DTrace TazTool can be thought of as a taztool evolution; the latest version is 0.51. If you want, you can also download taztool 1.1 (its package name is RMCtaz).
Dexplorer:
DExplorer automatically runs a collection of DTrace scripts to examine many areas of the system, and places the output in a meaningful directory structure that is tar'd and gzip'd. The current version is 0.70.
Lsof for HP:Lsof lists files, sockets, inodes, etc… opened by processes.
Lsof for Solaris:You can find lsof for Solaris 10 SPARC and x86 on http://www.sunfreeware.com: you have to create a free account to download packages from this site.You can find packages for the previous versions of Solaris on http://unixpackages.com/: packages on this site are not freeware, as you need to buy a subscription (a single-user subscription currently costs $20/year).
SE Toolkit:The SE Toolkit is a collection of scripts for performance analysis and gives advice on performance improvement. It has been a standard in system performance monitoring for the Solaris platform over the last 10 years.
XE Toolkit:The XE Toolkit is a multi-platform, network-aware, secure performance monitoring solution for tactical analysis of enterprise computing systems.
NMON:This Solaris system monitoring tool allows to perform standard SAR activity reporting and NMON activity reporting. The NMON output can be imported with Excel or RRD to output simple and efficient graphs.
Ksar:ksar is a sar graphing tool that can currently graph Linux, Mac and Solaris sar output. The sar statistics graphs can be output to a PDF file.
Sar2html:sar2html converts sar binary data to a graphical HTML format. It has a command line, a web interface and a data collection script. HP-UX 11.11, 11.23 and 11.31, Red Hat 3, 4, 5 and 6, SuSE 8, 9, 10 and 11, and Solaris 5.9 and 5.10 are supported.
Sarface:sarface is a user interface to the sysstat/sar database which reads data from sar and plots a live X11 graph via gnuplot. It mimics the command-line options of sar but can cross-plot any two or more statistics and apply simple mathematical functions to them.
Visual SAR:Visual SAR is a Java graphical interpreter of an Unix sar command. It reads a sar output from a file and show it in a graphical format. Visual SAR allow a quick interpretation of a server behavior in several days.
Sarvant:Sarvant (SAR Visual ANalysis Tool) is a python script that will analyze a sar file (from the sysstat utility, 'sar') and produce graphs using gnuplot of the collected data.
Sarparse:Sarparse is a utility based off of cacti to graph Sar metrics from remote hosts. It require NRPE and SAR to run out-of-the box but could easily be modified for any other transport.