47
http://d3s.mff.cuni.cz Crash Dump Analysis 2014/2015 CHARLES UNIVERSITY IN PRAGUE faculty of mathemacs and physics faculty of mathemacs and physics SystemTap SystemTap

SystemTap SystemTap

Embed Size (px)

Citation preview

Page 1: SystemTap SystemTap

http://d3s.mff.cuni.czCrash Dump Analysis 2014/2015

CHARLES UNIVERSITY IN PRAGUE

faculty of mathematics and physicsfaculty of mathematics and physics

SystemTapSystemTap

Page 2: SystemTap SystemTap

Crash Dump Analysis 2014/2015 2SystemTap

SystemTapSystemTap

Dynamic analysis frameworkInspired by DTraceLinux, GPLUnder heavy developmentSuitable for production usage

Motto: “Painful to use, but more painful not to”

Page 3: SystemTap SystemTap

Crash Dump Analysis 2014/2015 3SystemTap

TracingTracing

Capture information about executionOften specialized tools (strace)Limited filteringOverwhelming amount of data

Page 4: SystemTap SystemTap

Crash Dump Analysis 2014/2015 4SystemTap

ProfilingProfiling

Sampling during executionBoth targeted and system-wideOften just one metric: time

Page 5: SystemTap SystemTap

Crash Dump Analysis 2014/2015 5SystemTap

DebuggingDebugging

Full context (memory, registers, backtrace)BreakpointsTargeted

Page 6: SystemTap SystemTap

Crash Dump Analysis 2014/2015 6SystemTap

EnvironmentsEnvironments

DevelopmentRich environment: tools, debuginfoExclusive access and control

ProductionDifferent from developmentUnder loadValuableVirtualized / containerized

Page 7: SystemTap SystemTap

Crash Dump Analysis 2014/2015 7SystemTap

SystemTap to rule them all, in productionSystemTap to rule them all, in production

Tracing, profiling, debuggingSystem-wideMonitoring multiple event types (and layers)SafeLow overheadControlled dependencies

Page 8: SystemTap SystemTap

Crash Dump Analysis 2014/2015 8SystemTap

Basic principlesBasic principles

EventsInteresting points in running systemDifferent abstraction levels: code line/model eventKernel and user space

ActionsScripting languageAccess to program state, capturing, processingEven state modification if you are bold

Page 9: SystemTap SystemTap

Crash Dump Analysis 2014/2015 9SystemTap

EventsEvents

Low-levelFunction entry and returnCode lines

High-levelTimersCode tracepointsTapsets

Page 10: SystemTap SystemTap

Crash Dump Analysis 2014/2015 10SystemTap

Events: ProbesEvents: Probes

Probe: Basic SystemTap elementBind actions to events

Probe specification: Defines interesting eventsProbe hit: Specific event occurrenceProbe handler: Action to perform when hit

Page 11: SystemTap SystemTap

Crash Dump Analysis 2014/2015 11SystemTap

Language: ProbesLanguage: Probes

probe kernel.function(“vfs_read”) {

log(“vfs_read was called!”)

}

keyword providerevent

specification

event handler

Page 12: SystemTap SystemTap

Crash Dump Analysis 2014/2015 12SystemTap

Language: ProbesLanguage: Probes

Probes

Page 13: SystemTap SystemTap

Crash Dump Analysis 2014/2015 13SystemTap

Probes: GenericProbes: Generic

beginend errortimer.jiffies()timer.ms()

Page 14: SystemTap SystemTap

Crash Dump Analysis 2014/2015 14SystemTap

Probes: System callsProbes: System calls

Needs DWARF debuginfosyscall.NAMEsyscall.NAME.return

Do not need DWARF debuginfond_syscall.NAMEnd_syscall.NAME.return

Page 15: SystemTap SystemTap

Crash Dump Analysis 2014/2015 15SystemTap

Probes: Kernel DWARFProbes: Kernel DWARF

kernel.function(PATTERN)module(PATTERN).function(PATTERN)kernel.statement(PATTERN)module(PATTERN).statement(PATTERN)

Provides access to context variables, arguments, reachable memory etc.

Page 16: SystemTap SystemTap

Crash Dump Analysis 2014/2015 16SystemTap

Probes: Modifiers and variantsProbes: Modifiers and variants

function.returnfunction.call / function.inlinefunction.callee(NAME)function.callees(DEPTH)

Page 17: SystemTap SystemTap

Crash Dump Analysis 2014/2015 17SystemTap

Probes: Kernel DWARF-lessProbes: Kernel DWARF-less

kprobe.function(FUN) + .returnkprobe.module(MOD).function(FUN) + .return

No access to variables etc.

Page 18: SystemTap SystemTap

Crash Dump Analysis 2014/2015 18SystemTap

Probes: Kernel tracepointsProbes: Kernel tracepoints

kernel.trace(PATTERN)

Static markers in kernelFaster and more reliable mechanismAccess to macro arguments

Page 19: SystemTap SystemTap

Crash Dump Analysis 2014/2015 19SystemTap

Probes: Userspace DWARFProbes: Userspace DWARF

process(PATH/PID).function(NAME)process(PATH/PID).statement(PATTERN)process(P).library(P).function(NAME)process(P).library(P).statement(PATTERN)

Provides access to context variables, arguments, reachable memory etc.

Page 20: SystemTap SystemTap

Crash Dump Analysis 2014/2015 20SystemTap

Probes: Userspace via utraceProbes: Userspace via utrace

process({,PATH,PID}).{begin,end}process({,PATH,PID}).thread.{begin,end}process({,PATH,PID}).syscall + .return

Page 21: SystemTap SystemTap

Crash Dump Analysis 2014/2015 21SystemTap

Probes: Userspace static markersProbes: Userspace static markers

process(PID/PATH).mark(LABEL)process(P).provider(PROVIDER).mark(LABEL)

Static markers put by developers to applicationSTAP_PROBE/DTRACE_PROBE macrosAccess to macro arguments

Page 22: SystemTap SystemTap

Crash Dump Analysis 2014/2015 22SystemTap

Probes: PythonProbes: Python

Markers in Python runtimeTapset to work with Python structure

Page 23: SystemTap SystemTap

Crash Dump Analysis 2014/2015 23SystemTap

Probes: JavaProbes: Java

Prototype featureByteman backend

Example:probe java(PNAME/PID).class(NAME).method(METHOD)

Page 24: SystemTap SystemTap

Crash Dump Analysis 2014/2015 24SystemTap

Probes: Find available probesProbes: Find available probes

stap -l 'probe definition'

Example:$ stap -l 'kernel.trace(“kmem:*”)'

Page 25: SystemTap SystemTap

Crash Dump Analysis 2014/2015 25SystemTap

ActionsActions

Actions

Page 26: SystemTap SystemTap

Crash Dump Analysis 2014/2015 26SystemTap

Language: Elements [1/2]Language: Elements [1/2]

Functions: built-in, user defined, tapsetsProbe aliasesControl structures: conditionals, loops, ternary opVariables

Scalar: string, integers, no declarationsMulti-key associative arrays (global only)Global or local scope

Aggregates

Page 27: SystemTap SystemTap

Crash Dump Analysis 2014/2015 27SystemTap

Language: Elements [2/2]Language: Elements [2/2]

Operators: Usual ops, member operatorPointer typecastingConditional compilationSimple preprocessor macrosEmbedded C

Page 28: SystemTap SystemTap

Crash Dump Analysis 2014/2015 28SystemTap

Language: ReadingLanguage: Reading

Context and tapset functions pid(), execname(), print_backtrace()

Context variablesPrefixed by '$'$ stap -L 'probe kernel.DEF'

Tapset variablesSet by a tapset in a prologue

Pretty printers and groups$$vars, $$locals, $$parms, $$parms$, $$parms$$

Page 29: SystemTap SystemTap

Crash Dump Analysis 2014/2015 29SystemTap

Language: ParametersLanguage: Parameters

$ stap script.stp param1 param2 ...Unquoted: $1, $2...Strings: @1, @2...

Page 30: SystemTap SystemTap

Crash Dump Analysis 2014/2015 30SystemTap

Language: OutputLanguage: Output

Functionslog()printf()sprintf()...more...

Page 31: SystemTap SystemTap

Crash Dump Analysis 2014/2015 31SystemTap

Language: AggregatesLanguage: Aggregates

Aggregates: Support for data accumulation and statistics

aggregate <<< valuearray[execname(), pid()] <<< value@count, @sum, @min, @max, @avg@hist_linear(agg, LOW, HIGH, WIDTH)

Page 32: SystemTap SystemTap

Crash Dump Analysis 2014/2015 32SystemTap

Language: QuirksLanguage: Quirks

Often quirky syntactic shortcuts@entry(expression): saves expression value in entry probe for usage in return probeforeach(v = [i,j] in array)

Read language reference...

Page 33: SystemTap SystemTap

Crash Dump Analysis 2014/2015 33SystemTap

TapsetsTapsets

Tapset = “Library”Collection of SystemTap functions and aliasesEncapsulates internals via aliasesProvides convenience functions

Example: /usr/share/systemtap/tapset/linux/syscalls2.stp

Page 34: SystemTap SystemTap

Crash Dump Analysis 2014/2015 34SystemTap

UsageUsage

Usage

Page 35: SystemTap SystemTap

Crash Dump Analysis 2014/2015 35SystemTap

Using SystemTapUsing SystemTap

Write a scriptOne-shot script executionLong-term result collectionFlight-recorder mode

Page 36: SystemTap SystemTap

Crash Dump Analysis 2014/2015 36SystemTap

SystemTap passesSystemTap passes

Pass 1: ParserPass 2: Process probes, data structures etc.Pass 3: Translate to C (cached)Pass 4: Compile as a kernel module (cached)

Pass 5: Insert module

Page 37: SystemTap SystemTap

Crash Dump Analysis 2014/2015 37SystemTap

SystemTap translator / driverSystemTap translator / driver

Command: stapHigh-level driver commandAll or some passesSystem-wide or targeted

Examples:$ stap -e 'probe kernel.trace(“kmem:*”)

$ stap script.stp

$ stap -p4 script.stp

$ stap (…) -c 'command'

$ stap (…) -x PID

Page 38: SystemTap SystemTap

Crash Dump Analysis 2014/2015 38SystemTap

SystemTap runtimeSystemTap runtime

Command: staprunLoads a SystemTap probe kernel modulestap = stap -p4 + staprun

Examples:$ staprun stap_ee1d4debf0add5c64f0b095_1991.ko

Page 39: SystemTap SystemTap

Crash Dump Analysis 2014/2015 39SystemTap

Permission modelPermission model

Privileged operationsTherefore, SystemTap needs either:

root useruser in groups: stapusr+stapdevuser in groups: stapusr+stapsysuser in groups: stapusr

Page 40: SystemTap SystemTap

Crash Dump Analysis 2014/2015 40SystemTap

Permission modelPermission model

stapusr + stapdevCan run SystemTap as if root

stapusr + stapsysCan run prebuilt, signed modulesCan run limited functionality scripts

Avoid damage to system

stapusrCan run prebuilt and/or signed modulesCan run limited functionality scripts

Disallow access to other user informationDisallow harming other user process performance

Page 41: SystemTap SystemTap

Crash Dump Analysis 2014/2015 41SystemTap

Flight recorder and SystemTap serviceFlight recorder and SystemTap service

Flight recorder$ stap -F …Load module and detach......or run staprun as a daemon

SystemTap service/etc/systemtap/script.d/script.stp/etc/systemtap/conf.d/script.conf$ service systemtap start script

Page 42: SystemTap SystemTap

Crash Dump Analysis 2014/2015 42SystemTap

Guru modeGuru mode

Fun, useful and a little bit perilous!Bypass security limitsEmbedded CWrite to context variablesUsage: hotfixes, hacks, showcases, fun

Example:$ stap -g -e 'probe syscall.kill {if (pid==$1) { $pid = pid() } }' PID

Page 43: SystemTap SystemTap

Crash Dump Analysis 2014/2015 43SystemTap

SystemTap compile serverSystemTap compile server

ServerHas SystemTap translator and module builderHas DWARF debuginfoBuilds and signs SystemTap modules

ClientHave just SystemTap client and runtime

Page 44: SystemTap SystemTap

Crash Dump Analysis 2014/2015 44SystemTap

Final wordsFinal words

Final words

Page 45: SystemTap SystemTap

Crash Dump Analysis 2014/2015 45SystemTap

SystemTap: Problem solving processSystemTap: Problem solving process

Determine involved componentsKernel, executables, libraries

Investigate possible probe pointsWhat events are interesting?What events carry interesting data?

Determine desired outputHistogram, specific knowledge, data set

Start with generic probes, then limit

Page 46: SystemTap SystemTap

Crash Dump Analysis 2014/2015 46SystemTap

SystemTap: PatternsSystemTap: Patterns

Event logger: simple output on eventEvent detector: complicated filtering in probesTop: data collection, printing in timer.msData collection: runs until cancel, end probe

Page 47: SystemTap SystemTap

Crash Dump Analysis 2014/2015 47SystemTap

SystemTap: ResourcesSystemTap: Resources

man pagesstap, stapprobes

Exampleshttps://sourceware.org/systemtap/examples/

Tutorials and articleshttps://sourceware.org/systemtap/wiki

Language, tapset referenceshttps://sourceware.org/systemtap/documentation.html