http://d3s.mff.cuni.cz
Crash Dump Analysis 2014/2015
CHARLES UNIVERSITY IN PRAGUE
Faculty of Mathematics and Physics
SystemTap
SystemTap
  Dynamic analysis framework
  Inspired by DTrace
  Linux, GPL
  Under heavy development
  Suitable for production usage
  Motto: “Painful to use, but more painful not to”
Tracing
  Capture information about execution
  Often specialized tools (strace)
  Limited filtering
  Overwhelming amount of data
Profiling
  Sampling during execution
  Both targeted and system-wide
  Often just one metric: time
Debugging
  Full context (memory, registers, backtrace)
  Breakpoints
  Targeted
Environments
  Development
    Rich environment: tools, debuginfo
    Exclusive access and control
  Production
    Different from development
    Under load
    Valuable
    Virtualized / containerized
SystemTap to rule them all, in production
  Tracing, profiling, debugging
  System-wide
  Monitoring multiple event types (and layers)
  Safe
  Low overhead
  Controlled dependencies
Basic principles
  Events
    Interesting points in a running system
    Different abstraction levels: code line/model event
    Kernel and user space
  Actions
    Scripting language
    Access to program state, capturing, processing
    Even state modification if you are bold
Events
  Low-level
    Function entry and return
    Code lines
  High-level
    Timers
    Code tracepoints
    Tapsets
Events: Probes
  Probe: Basic SystemTap element
    Binds actions to events
  Probe specification: Defines interesting events
  Probe hit: Specific event occurrence
  Probe handler: Action to perform when hit
Language: Probes
  probe kernel.function("vfs_read") {
    log("vfs_read was called!")
  }
  Parts of the probe:
    keyword: probe
    provider: kernel
    event specification: function("vfs_read")
    event handler: the block in braces
Probes
Probes: Generic
  begin
  end
  error
  timer.jiffies()
  timer.ms()
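A minimal sketch of the generic probes working together. The one-second period and the five-tick limit are arbitrary values chosen for illustration, not something the slides prescribe:

```systemtap
global ticks

# Runs once when the script starts.
probe begin {
  printf("started\n")
}

# Fires every 1000 ms; stop after five ticks (arbitrary limit).
probe timer.ms(1000) {
  ticks++
  printf("tick %d\n", ticks)
  if (ticks >= 5)
    exit()
}

# Runs once on shutdown, after exit() or Ctrl-C.
probe end {
  printf("stopped after %d ticks\n", ticks)
}
```

Saved under a hypothetical name such as ticks.stp, it would be started with `stap ticks.stp`.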
Probes: System calls
  Need DWARF debuginfo
    syscall.NAME
    syscall.NAME.return
  Do not need DWARF debuginfo
    nd_syscall.NAME
    nd_syscall.NAME.return
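As an illustration of a syscall probe pair, the sketch below logs read() calls of a single target process. It assumes the script is started with `-x PID` or `-c 'command'` so that target() returns a meaningful PID; the choice of read() is arbitrary:

```systemtap
# name and argstr are variables set by the syscall tapset.
probe syscall.read {
  if (pid() == target())
    printf("%s[%d] %s(%s)\n", execname(), pid(), name, argstr)
}

# retstr is the decoded return value, also provided by the tapset.
probe syscall.read.return {
  if (pid() == target())
    printf("%s[%d] %s = %s\n", execname(), pid(), name, retstr)
}
```

Replacing syscall with nd_syscall in both probe points should give the DWARF-less variant of the same script.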
Probes: Kernel DWARF
  kernel.function(PATTERN)
  module(PATTERN).function(PATTERN)
  kernel.statement(PATTERN)
  module(PATTERN).statement(PATTERN)
  Provides access to context variables, arguments, reachable memory etc.
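A small sketch of what the DWARF access looks like in practice. It assumes kernel debuginfo is installed and that vfs_read() on the running kernel has a parameter named count, which is true for the usual prototype but may differ between kernel versions:

```systemtap
# Entry probe: $count is the size argument of vfs_read().
probe kernel.function("vfs_read") {
  printf("%s[%d] asks for %d bytes\n", execname(), pid(), $count)
}

# Return probe: $return is the value returned by vfs_read().
probe kernel.function("vfs_read").return {
  printf("%s[%d] got %d\n", execname(), pid(), $return)
}
```

This fires for every process on the system, so it is best stopped with Ctrl-C after a few seconds.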
Probes: Modifiers and variants
  function.return
  function.call / function.inline
  function.callee(NAME)
  function.callees(DEPTH)
Probes: Kernel DWARF-less
  kprobe.function(FUN) + .return
  kprobe.module(MOD).function(FUN) + .return
  No access to variables etc.
Probes: Kernel tracepoints
  kernel.trace(PATTERN)
  Static markers in kernel
  Faster and more reliable mechanism
  Access to macro arguments
Probes: Userspace DWARF
  process(PATH/PID).function(NAME)
  process(PATH/PID).statement(PATTERN)
  process(P).library(P).function(NAME)
  process(P).library(P).statement(PATTERN)
  Provides access to context variables, arguments, reachable memory etc.
Probes: Userspace via utrace
  process({,PATH,PID}).{begin,end}
  process({,PATH,PID}).thread.{begin,end}
  process({,PATH,PID}).syscall + .return
Probes: Userspace static markers
  process(PID/PATH).mark(LABEL)
  process(P).provider(PROVIDER).mark(LABEL)
  Static markers placed in the application by its developers
  STAP_PROBE/DTRACE_PROBE macros
  Access to macro arguments
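A sketch of probing such a marker. The binary path, provider name, and marker name below are all hypothetical, and the marker is assumed to pass at least one integer argument:

```systemtap
# Hypothetical application, provider, and marker; adjust to a real binary.
probe process("/usr/local/bin/myapp").provider("myapp").mark("request_start") {
  # $arg1 is the first argument given to the STAP_PROBE/DTRACE_PROBE macro.
  printf("%s[%d] request_start arg1=%d\n", execname(), pid(), $arg1)
}
```

The markers actually present in a binary can be listed with `stap -l` and a `process(PATH).mark("*")` pattern.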
Probes: Python
  Markers in Python runtime
  Tapset to work with Python structure
Probes: Java
  Prototype feature
  Byteman backend
  Example:
    probe java(PNAME/PID).class(NAME).method(METHOD)
Probes: Find available probes
  stap -l 'probe definition'
  Example:
    $ stap -l 'kernel.trace("kmem:*")'
Actions
Language: Elements [1/2]
  Functions: built-in, user defined, tapsets
  Probe aliases
  Control structures: conditionals, loops, ternary op
  Variables
    Scalar: string, integers, no declarations
    Multi-key associative arrays (global only)
    Global or local scope
  Aggregates
Language: Elements [2/2]
  Operators: Usual ops, member operator
  Pointer typecasting
  Conditional compilation
  Simple preprocessor macros
  Embedded C
Language: Reading
  Context and tapset functions
    pid(), execname(), print_backtrace()
  Context variables
    Prefixed by '$'
    $ stap -L 'probe kernel.DEF'
  Tapset variables
    Set by a tapset in a prologue
  Pretty printers and groups
    $$vars, $$locals, $$parms, $$parms$, $$parms$$
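The reading facilities above can be combined in one short handler. This is only a sketch; it assumes kernel debuginfo is available and exits after the first hit, because a backtrace on every vfs_read() call would flood the output:

```systemtap
probe kernel.function("vfs_read") {
  # Context functions: who triggered the probe.
  printf("%s[%d]\n", execname(), pid())
  # Pretty-printed group of all function parameters.
  printf("%s\n", $$parms)
  # Kernel backtrace of the probe point.
  print_backtrace()
  exit()
}
```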
Language: Parameters
  $ stap script.stp param1 param2 ...
  Unquoted: $1, $2...
  Strings: @1, @2...
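A sketch of a script taking two parameters: a PID, spliced in unquoted as $1, and a process name, spliced in as a string via @2. The script name and the example values in the comment are made up:

```systemtap
# Hypothetical invocation: stap params.stp 1234 bash
probe syscall.write {
  # $1 becomes the literal 1234, @2 becomes the string "bash".
  if (pid() == $1 || execname() == @2)
    printf("%s[%d] write(%s)\n", execname(), pid(), argstr)
}
```

Both parameters must be given on the command line; the script does not compile if a referenced parameter is missing.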
Language: Output
  Functions
    log()
    printf()
    sprintf()
    ...more...
Language: Aggregates
  Aggregates: Support for data accumulation and statistics
  aggregate <<< value
  array[execname(), pid()] <<< value
  @count, @sum, @min, @max, @avg
  @hist_linear(agg, LOW, HIGH, WIDTH)
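A sketch of aggregates in use: it collects the sizes of successful reads for ten seconds, then prints per-process statistics and a system-wide histogram. The probe point, the ten-second window, and the histogram bounds are illustrative choices, and $return assumes kernel debuginfo is available:

```systemtap
global sizes       # per-process statistics, keyed by name and PID
global all_sizes   # one system-wide aggregate

probe vfs.read.return {
  if ($return > 0) {
    sizes[execname(), pid()] <<< $return
    all_sizes <<< $return
  }
}

# Stop collecting after ten seconds; the end probe then reports.
probe timer.s(10) {
  exit()
}

probe end {
  foreach ([exe, p] in sizes)
    printf("%-16s %6d count=%d avg=%d max=%d\n", exe, p,
           @count(sizes[exe, p]), @avg(sizes[exe, p]), @max(sizes[exe, p]))
  # Buckets from 0 to 4096 bytes, 512 bytes wide.
  if (@count(all_sizes) > 0)
    print(@hist_linear(all_sizes, 0, 4096, 512))
}
```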
Language: Quirks
  Often quirky syntactic shortcuts
    @entry(expression): saves the value of the expression in the entry probe for use in the return probe
    foreach(v = [i,j] in array)
  Read the language reference...
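Both shortcuts can be seen in one sketch that sums the time spent inside vfs_read() per process. The probed function, the five-second window, and the limit of ten rows are arbitrary choices for illustration:

```systemtap
global total_us

probe kernel.function("vfs_read").return {
  # @entry() evaluates gettimeofday_us() at function entry and
  # makes that saved value available here in the return probe.
  total_us[execname(), pid()] += gettimeofday_us() - @entry(gettimeofday_us())
}

probe timer.s(5) {
  # us receives the stored value, exe and p receive the two keys;
  # the trailing '-' sorts by value in descending order.
  foreach (us = [exe, p] in total_us- limit 10)
    printf("%-16s %6d %d us\n", exe, p, us)
  exit()
}
```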
Tapsets
  Tapset = “Library”
  Collection of SystemTap functions and aliases
  Encapsulates internals via aliases
  Provides convenience functions
  Example: /usr/share/systemtap/tapset/linux/syscalls2.stp
Usage
Using SystemTap
  Write a script
  One-shot script execution
  Long-term result collection
  Flight-recorder mode
SystemTap passes
  Pass 1: Parser
  Pass 2: Process probes, data structures etc.
  Pass 3: Translate to C (cached)
  Pass 4: Compile as a kernel module (cached)
  Pass 5: Insert module
SystemTap translator / driver
  Command: stap
    High-level driver command
    All or some passes
    System-wide or targeted
  Examples:
    $ stap -e 'probe kernel.trace("kmem:*")
    $ stap script.stp
    $ stap -p4 script.stp
    $ stap (…) -c 'command'
    $ stap (…) -x PID
SystemTap runtime
  Command: staprun
    Loads a SystemTap probe kernel module
    stap = stap -p4 + staprun
  Examples:
    $ staprun stap_ee1d4debf0add5c64f0b095_1991.ko
Permission model
  Privileged operations
  Therefore, SystemTap needs either:
    root user
    user in groups: stapusr+stapdev
    user in groups: stapusr+stapsys
    user in groups: stapusr
Permission model
  stapusr + stapdev
    Can run SystemTap as if root
  stapusr + stapsys
    Can run prebuilt, signed modules
    Can run limited functionality scripts
      Avoid damage to system
  stapusr
    Can run prebuilt and/or signed modules
    Can run limited functionality scripts
      Disallow access to other user information
      Disallow harming other user process performance
Flight recorder and SystemTap service
  Flight recorder
    $ stap -F …
    Load module and detach...
    ...or run staprun as a daemon
  SystemTap service
    /etc/systemtap/script.d/script.stp
    /etc/systemtap/conf.d/script.conf
    $ service systemtap start script
Guru mode
  Fun, useful and a little bit perilous!
  Bypass security limits
  Embedded C
  Write to context variables
  Usage: hotfixes, hacks, showcases, fun
  Example:
    $ stap -g -e 'probe syscall.kill {if (pid==$1) { $pid = pid() } }' PID
SystemTap compile server
  Server
    Has SystemTap translator and module builder
    Has DWARF debuginfo
    Builds and signs SystemTap modules
  Client
    Has just the SystemTap client and runtime
Final words
SystemTap: Problem solving process
  Determine involved components
    Kernel, executables, libraries
  Investigate possible probe points
    What events are interesting?
    What events carry interesting data?
  Determine desired output
    Histogram, specific knowledge, data set
  Start with generic probes, then limit
SystemTap: Patterns
  Event logger: simple output on event
  Event detector: complicated filtering in probes
  Top: data collection, printing in timer.ms
  Data collection: runs until cancel, end probe
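The “Top” pattern might look like the sketch below, which counts system calls per executable and prints the busiest ones periodically. The five-second interval and the ten-row limit are arbitrary, and probing syscall.* attaches to every system call, so compilation takes a while and the overhead is noticeable:

```systemtap
global calls

# Data collection: one counter per executable name.
probe syscall.* {
  calls[execname()]++
}

# Periodic report: print the ten busiest executables, then reset.
probe timer.s(5) {
  printf("--- system calls in the last 5 s ---\n")
  foreach (exe in calls- limit 10)
    printf("%-16s %d\n", exe, calls[exe])
  delete calls
}
```

The script runs until cancelled with Ctrl-C. Moving the report into an end probe instead of the timer would turn it into the “Data collection” pattern.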
SystemTap: Resources
  man pages
    stap, stapprobes
  Examples
    https://sourceware.org/systemtap/examples/
  Tutorials and articles
    https://sourceware.org/systemtap/wiki
  Language, tapset references
    https://sourceware.org/systemtap/documentation.html