Host-Assisted Virtual Machine Tracing and Analysis Abderrahmane Benbachir Michel Dagenais Dec 7, 2017 École Polytechnique de Montréal Laboratoire DORSAL


Page 1: Host-Assisted Virtual Machine Tracing and Analysis

Host-Assisted Virtual Machine Tracing and Analysis

Abderrahmane Benbachir

Michel Dagenais

Dec 7, 2017

École Polytechnique de Montréal

Laboratoire DORSAL

Page 2: Host-Assisted Virtual Machine Tracing and Analysis

Agenda

Introduction

Hypertracing

Hypercall

Boot-up Analysis

Shared Memory

Virtualization Awareness

Future Work

Questions

POLYTECHNIQUE MONTREAL – Abderrahmane Benbachir (Abder)

Page 3: Host-Assisted Virtual Machine Tracing and Analysis

Monitoring Tools

[Figure: monitoring tools (LTTng, Ftrace, Perf, eBPF, SystemTap, DTrace, Strace) with the role labels Tracer, Aggregator and Offloader. Mode labels: Sync, Async. Callback sequences: "callback, serialize, write" and "callback, compute, update".]
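The two callback sequences on this slide can be illustrated with a minimal sketch. It assumes that "callback, serialize, write" describes a tracer and "callback, compute, update" describes an aggregator; the slide text alone does not confirm that pairing. The class names, the event dictionaries and the JSON encoding are hypothetical and only serve the illustration.

```python
import json
from collections import Counter


class Tracer:
    """Tracer-style callback (assumed pairing): serialize the event, then write it."""

    def __init__(self):
        self.buffer = []  # stands in for a trace file or ring buffer

    def callback(self, event):
        record = json.dumps(event, sort_keys=True)  # serialize
        self.buffer.append(record)                  # write


class Aggregator:
    """Aggregator-style callback (assumed pairing): compute a key, update a summary."""

    def __init__(self):
        self.counts = Counter()

    def callback(self, event):
        key = event["name"]    # compute
        self.counts[key] += 1  # update


# Hypothetical event stream, not taken from the slides.
events = [
    {"name": "sched_switch", "cpu": 0},
    {"name": "kvm_exit", "cpu": 1},
    {"name": "sched_switch", "cpu": 1},
]

tracer, aggregator = Tracer(), Aggregator()
for ev in events:
    tracer.callback(ev)
    aggregator.callback(ev)

print(len(tracer.buffer))       # the tracer keeps one record per event
print(dict(aggregator.counts))  # the aggregator keeps one counter per event name
```

The tracer's storage grows with the number of events, while the aggregator's storage grows only with the number of distinct keys.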

Page 4: Host-Assisted Virtual Machine Tracing and Analysis

What is hypertracing?

Hypertracing = Offloading traces from guest to host (or vice versa)

[Figure: three nested levels: Host (L0, hypervisor), Qemu virtual machine (L1, guest kernel and app) and nested Qemu virtual machine (L2, guest kernel and app). Component labels: tracer, probes, offloader, buffer. Path labels: sync, async, data, trace synchronization.]

Page 5: Host-Assisted Virtual Machine Tracing and Analysis

How to do hypertracing?

Communication Channels

Hypercall

Virtio: Virtio-serial

Shared Memory

Network: TCP/IP

[Figure: Host and Qemu virtual machine. Guest-side labels: kernel, tracer, probes, offloader with a sync path and an async path through a buffer. Host-side label: data consumer.]
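The sync and async offloader paths in the diagram can be sketched as follows. This assumes that the sync path crosses the channel once per event and that the async path fills a buffer and crosses the channel once per full buffer; the slides do not spell this out. `Channel`, `SyncOffloader`, `AsyncOffloader` and the buffer capacity are hypothetical names and values, and the channel is an in-process stand-in for a hypercall, virtio, shared memory or network transport.

```python
class Channel:
    """Hypothetical guest-to-host channel that counts how often it is crossed."""

    def __init__(self):
        self.crossings = 0
        self.received = []

    def send(self, payload):
        self.crossings += 1
        self.received.extend(payload)


class SyncOffloader:
    """Sends every event to the host as soon as the probe fires."""

    def __init__(self, channel):
        self.channel = channel

    def on_event(self, event):
        self.channel.send([event])


class AsyncOffloader:
    """Buffers events in the guest and sends them in batches."""

    def __init__(self, channel, capacity):
        self.channel = channel
        self.capacity = capacity
        self.buffer = []

    def on_event(self, event):
        self.buffer.append(event)
        if len(self.buffer) >= self.capacity:
            self.flush()

    def flush(self):
        if self.buffer:
            self.channel.send(self.buffer)
            self.buffer = []


sync_channel, async_channel = Channel(), Channel()
sync_off = SyncOffloader(sync_channel)
async_off = AsyncOffloader(async_channel, capacity=4)

for i in range(10):
    event = {"seq": i}
    sync_off.on_event(event)
    async_off.on_event(event)
async_off.flush()  # drain the partially filled buffer

print(sync_channel.crossings, async_channel.crossings)
```

Both channels end up with the same events; the difference is how many times the guest-to-host boundary is crossed.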

Page 6: Host-Assisted Virtual Machine Tracing and Analysis

Hypercall Channel


Page 7: Host-Assisted Virtual Machine Tracing and Analysis

Hypercall in a nutshell

[Figure: hypercall path between the virtual machine (L1) and the host hypervisor (L0) on the hardware, bounded by kvm_exit and kvm_entry; the Qemu layer is not involved. Cost labels: context switch overhead, hypervisor overhead.]

Page 8: Host-Assisted Virtual Machine Tracing and Analysis

Which tracer to use?

Host Tracers Overhead (hypercall)

[Chart: host tracer overhead for hypercalls, compared against a baseline.]

Page 9: Host-Assisted Virtual Machine Tracing and Analysis

Event Aggregation

Tracing hypercalls on host using LTTng

hypergraph_host: { nr = 1000, a0 = 0xffffffff89c2ce60, a1 = 0, a2 = 0, a3 = 0, vcpu_id = 2, guest_rip = 94139651574911, exit_overhead = 85 }

kvm_x86_exit: { cpu_id = 0 }, { exit_reason = 18, guest_rip = 94139651574911, isa = 1... }

kvm_x86_hypercall: { cpu_id = 0 }, { nr = 1000, a0 = 0xffffffff89c2ce60, a1 = 0, a2 = 0, a3 = 0 }

kvm_x86_entry: { cpu_id = 0 }, { vcpu_id = 2 }
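The events above suggest that one hypergraph_host record carries the fields of a kvm_x86_exit, kvm_x86_hypercall and kvm_x86_entry triple. A minimal sketch of such a merge follows. It is not the authors' implementation: the `ts` field, its values, the single-CPU event stream and the definition of `exit_overhead` as entry timestamp minus exit timestamp are all assumptions made for illustration.

```python
VMCALL_EXIT_REASON = 18  # exit_reason value shown on the slide


def aggregate(events):
    """Merge exit -> hypercall -> entry triples into single records.

    Assumes one ordered stream for a single CPU; a real implementation
    would keep this state per CPU or per vCPU.
    """
    merged = []
    pending = None
    for ev in events:
        name = ev["name"]
        if name == "kvm_x86_exit":
            if ev["exit_reason"] == VMCALL_EXIT_REASON:
                pending = {"guest_rip": ev["guest_rip"], "exit_ts": ev["ts"]}
            else:
                pending = None  # not a hypercall exit, nothing to merge
        elif name == "kvm_x86_hypercall" and pending is not None:
            for field in ("nr", "a0", "a1", "a2", "a3"):
                pending[field] = ev[field]
        elif name == "kvm_x86_entry" and pending is not None and "nr" in pending:
            exit_ts = pending.pop("exit_ts")
            pending["vcpu_id"] = ev["vcpu_id"]
            pending["exit_overhead"] = ev["ts"] - exit_ts  # assumed definition
            pending["name"] = "hypergraph_host"
            merged.append(pending)
            pending = None
    return merged


# Field values follow the slide; the timestamps are hypothetical.
stream = [
    {"name": "kvm_x86_exit", "ts": 900, "exit_reason": 1, "guest_rip": 0},
    {"name": "kvm_x86_entry", "ts": 950, "vcpu_id": 2},
    {"name": "kvm_x86_exit", "ts": 1000, "exit_reason": 18,
     "guest_rip": 94139651574911},
    {"name": "kvm_x86_hypercall", "ts": 1020, "nr": 1000,
     "a0": 0xffffffff89c2ce60, "a1": 0, "a2": 0, "a3": 0},
    {"name": "kvm_x86_entry", "ts": 1060, "vcpu_id": 2},
]

merged = aggregate(stream)
print(merged)
```

Three input events become one output record, which is consistent with the lower event counts reported for event aggregation later in the deck.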

Page 10: Host-Assisted Virtual Machine Tracing and Analysis

Nested VMs


Page 11: Host-Assisted Virtual Machine Tracing and Analysis

Nested Virtualization

[Figure: nested virtualization stack on the hardware and host hypervisor (L0), with Qemu not involved: VM #1 (L1), VM #2 (L2), VM #3 (L3), ..., VM #n-1 (Ln-1), VM #n (Ln). Cases shown: single level, two levels, three levels. Annotation: Remove Exit Multiplication (patch).]

Page 12: Host-Assisted Virtual Machine Tracing and Analysis

Micro-Benchmarks

Hypercall Overhead

Exit-handling code in the hypervisor is slower when run in L1 or L2 than the same code running in L0.

Transitions between L1, L2, ... Ln involve an exit to L0 and then an entry.

Hardware: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz (x86_64)

[Chart: hypercall overhead by nesting level. Labels: EPT-on-EPT-on-EPT, x33, x25, [+ patch].]

Page 13: Host-Assisted Virtual Machine Tracing and Analysis

Micro-Benchmarks (worst-case)

System call

Nested overhead

LTTng: 18 %

Ftrace: x2

Hypertracing + compress (2): 12.6 %

Perf: 15 %

Hypertracing: 3.6 %

Page 14: Host-Assisted Virtual Machine Tracing and Analysis

Event Compression

sched_switch example

prio -> 8 bits

state -> 8 bits

pid_max -> 15 bits

time_delta -> 32 bits

sched_switch { prev_prio = 0, prev_tid = 0, next_tid = 10, prev_state = 0, next_prio = 0, prev_comm = "swapper/1", next_comm = "migration/1" }...

sched_switch { prev_prio = 0, prev_tid = 10, next_tid = 11, prev_state = 1, next_prio = 0, prev_comm = "migration/1", next_comm = "ksoftirqd/1" }...

sched_switch { prev_prio = 0, prev_tid = 11, next_tid = 12, prev_state = 1, next_prio = 0, prev_comm = "ksoftirqd/1", next_comm = "kworker/1:0" }...

sched_switch { prev_prio = 0, prev_tid = 12, next_tid = 0, prev_state = 1, next_prio = 0, prev_comm = "kworker/1:0", next_comm = "swapper/1" }...

sched_switch { prev_prio = 0, prev_tid = 0, next_tid = 10, prev_state = 0, next_prio = 0, prev_comm = "swapper/1", next_comm = "migration/1" }

[Figure: guest and host timelines linked by hypercalls. Labels: hypercall, time delta, Guest, Host.]
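The field widths listed on this slide (prio 8 bits, state 8 bits, pid_max 15 bits, time_delta 32 bits) can be illustrated with a bit-packing sketch. The field order, the split across two arguments `a0` and `a1`, and the omission of the comm strings are assumptions made for this example and are not taken from the slides. The time delta value used below is hypothetical.

```python
PRIO_BITS, STATE_BITS, TID_BITS, DELTA_BITS = 8, 8, 15, 32


def _check(value, bits, label):
    """Reject values that would not fit in the given number of bits."""
    if not 0 <= value < (1 << bits):
        raise ValueError(f"{label}={value} does not fit in {bits} bits")
    return value


def pack_sched_switch(prev_prio, prev_state, prev_tid,
                      next_prio, next_tid, time_delta):
    """Pack the numeric sched_switch fields into two integers (a0, a1).

    a0 uses 8 + 8 + 15 + 8 + 15 = 54 bits; a1 holds the 32-bit time delta.
    """
    a0 = _check(prev_prio, PRIO_BITS, "prev_prio")
    a0 = (a0 << STATE_BITS) | _check(prev_state, STATE_BITS, "prev_state")
    a0 = (a0 << TID_BITS) | _check(prev_tid, TID_BITS, "prev_tid")
    a0 = (a0 << PRIO_BITS) | _check(next_prio, PRIO_BITS, "next_prio")
    a0 = (a0 << TID_BITS) | _check(next_tid, TID_BITS, "next_tid")
    a1 = _check(time_delta, DELTA_BITS, "time_delta")
    return a0, a1


def unpack_sched_switch(a0, a1):
    """Reverse pack_sched_switch, reading fields from the low bits upward."""
    next_tid = a0 & ((1 << TID_BITS) - 1)
    a0 >>= TID_BITS
    next_prio = a0 & ((1 << PRIO_BITS) - 1)
    a0 >>= PRIO_BITS
    prev_tid = a0 & ((1 << TID_BITS) - 1)
    a0 >>= TID_BITS
    prev_state = a0 & ((1 << STATE_BITS) - 1)
    a0 >>= STATE_BITS
    prev_prio = a0
    return prev_prio, prev_state, prev_tid, next_prio, next_tid, a1


# Second sched_switch event from the slide; time_delta is a made-up value.
fields = (0, 1, 10, 0, 11, 1234)
a0, a1 = pack_sched_switch(*fields)
print(hex(a0), a1)
print(unpack_sched_switch(a0, a1))
```

With this layout the numeric part of a sched_switch event fits in two 64-bit values, which would match the number of arguments a hypercall such as the one shown earlier (a0 to a3) can carry.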

Page 15: Host-Assisted Virtual Machine Tracing and Analysis

Event Compression


Page 16: Host-Assisted Virtual Machine Tracing and Analysis

Boot-up Analysis


Page 17: Host-Assisted Virtual Machine Tracing and Analysis

Boot-up phases

[Figure: boot-up timeline. Phases: very early, early, main, boot scripts/systemd. Function markers: trace_init(), pure_initcall(), run_init_process(). Spans: kernel boot-up, userland boot-up, external modules, forking processes.]

Page 18: Host-Assisted Virtual Machine Tracing and Analysis

Boot-up Sequence

[Figure: boot-up sequence plotted as events over time. Stages: start kernel, console, security, boot levels (early, pure, core*, postcore*, arch*, subsys*, fs*, rootfs, device*, late*), run_init_process(), userland boot-up + external modules. Duration labels: 18 ms, 50 ms, 100 ms, 140 ms, 700 ms, 900 ms, 1 sec. Tracer labels: no tracer, trace_init(), Ftrace, LTTng.]

105 million events => 9.5 GB

We did a patch for that: ftrace: very early function tracing

Page 19: Host-Assisted Virtual Machine Tracing and Analysis

Boot-up level tracing overhead


Page 20: Host-Assisted Virtual Machine Tracing and Analysis

Boot levels: Marker event

Page 21: Host-Assisted Virtual Machine Tracing and Analysis

Dynamic Analysis

Top 17 most called functions: 70.84 %

Page 22: Host-Assisted Virtual Machine Tracing and Analysis

Dynamic Analysis


Filters
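One plausible reading of these two slides is that a dynamic analysis pass finds the most frequently called functions and a filter then excludes them from tracing. The sketch below illustrates that reading only; it is an assumption, not the authors' tooling. The call stream is made up, and `top_share` and `make_filter` are hypothetical helper names.

```python
from collections import Counter


def top_share(calls, n):
    """Return the n most called function names and their share of all calls."""
    counts = Counter(calls)
    top = counts.most_common(n)
    share = sum(count for _, count in top) / len(calls)
    return [name for name, _ in top], share


def make_filter(excluded):
    """Build a predicate that keeps only functions outside the excluded set."""
    excluded = frozenset(excluded)
    return lambda name: name not in excluded


# Made-up call stream; the names are only examples.
calls = ["_raw_spin_lock"] * 6 + ["kfree"] * 3 + ["do_one_initcall"]

names, share = top_share(calls, 2)
keep = make_filter(names)
kept = [name for name in calls if keep(name)]

print(names, share)
print(kept)
```

In this toy stream, excluding the two hottest functions removes 90 % of the events, which mirrors the kind of reduction the slide attributes to the top 17 functions.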

Page 23: Host-Assisted Virtual Machine Tracing and Analysis

Boot-up Tracing

Function tracing

Host tracer  Guest tracer            Optimization & configuration          Events  Trace size   Time
-            Baseline                -                                     -       -            868.52 ms
-            Function graph          Function entries & exits*             168 M   Buffer size  22 secs
-            Hypergraph              Function entries & exits              -       -            1min 08s
LTTng        Hypergraph              Function entries & exits              322 M   9.4 GB       1min 20s
LTTng        Hypergraph              Event Aggregation (Host opt.)         105 M   6.7 GB       57 secs
LTTng        Hypergraph              Event Compression (Guest opt.)        163 M   4.8 GB       40 secs
LTTng        Hypergraph              Aggr + Comp                           52 M    3.4 GB       30 secs
LTTng        Hypergraph              Aggr + Comp + Filtering (Guest opt.)  3.2 M   207 MB       2.6 secs
LTTng        Hypergraph + Bootlevel  Boot level: early                     20 K    1.4 MB       93 ms
LTTng        Hypergraph + Bootlevel  Boot level: pure                      5 K     0.3 MB       2.6 ms
LTTng        Hypergraph + Bootlevel  Boot level: core                      8 K     0.5 MB       4 ms
LTTng        Hypergraph + Bootlevel  Boot level: postcore                  5 K     0.4 MB       2.7 ms
LTTng        Hypergraph + Bootlevel  Boot level: arch                      32 K    2.1 MB       17 ms
LTTng        Hypergraph + Bootlevel  Boot level: subsys                    115 K   7.5 MB       143 ms
LTTng        Hypergraph + Bootlevel  Boot level: fs                        2.5 M   155 MB       1871 ms
LTTng        Hypergraph + Bootlevel  Boot level: rootfs                    10 K    0.7 MB       6 ms
LTTng        Hypergraph + Bootlevel  Boot level: device                    412 K   26 MB        410 ms
LTTng        Hypergraph + Bootlevel  Boot level: late                      258 K   16 MB        87 ms

The Hypergraph + Bootlevel rows use Aggr + Comp + Filtering + Boot level (Guest opt.).

66% overhead

Friendly advice: Always use function tracing with filters.

Enable it for specific cases
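The event counts and trace sizes in the table above can be turned into reduction factors and an approximate size per event. The sketch below uses only the five LTTng + Hypergraph rows and assumes decimal units (1 GB = 10^9 bytes, 1 MB = 10^6 bytes, 1 M = 10^6 events); the slides do not state which unit convention was used.

```python
# (events, trace size in bytes) per configuration, from the LTTng + Hypergraph rows.
rows = {
    "Function entries & exits": (322e6, 9.4e9),
    "Event Aggregation": (105e6, 6.7e9),
    "Event Compression": (163e6, 4.8e9),
    "Aggr + Comp": (52e6, 3.4e9),
    "Aggr + Comp + Filtering": (3.2e6, 207e6),
}

base_events, base_size = rows["Function entries & exits"]

events_factor = {}
size_factor = {}
bytes_per_event = {}
for name, (events, size) in rows.items():
    events_factor[name] = base_events / events  # how many times fewer events
    size_factor[name] = base_size / size        # how many times smaller the trace
    bytes_per_event[name] = size / events       # average record size

for name in rows:
    print(f"{name:26s} events /{events_factor[name]:6.1f}  "
          f"size /{size_factor[name]:5.1f}  "
          f"{bytes_per_event[name]:5.1f} B/event")
```

Comparing the two factors per row shows whether an optimization mainly reduces the number of events or the size of each record.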

Page 24: Host-Assisted Virtual Machine Tracing and Analysis

Shared Memory Channel


Page 25: Host-Assisted Virtual Machine Tracing and Analysis

Zero Copy Shared Memory

Zero copy using mmap

[Figure: address-space mapping for zero copy. Components: Host, Qemu, Virtual Machine, Guest kernel, Consumer App. Address spaces: physical, host virtual, guest virtual, guest physical. Mapping labels: mmap, IVShmem.]
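The idea of zero copy through mmap can be shown with a small user-space sketch: two mappings of the same backing object see each other's writes without an explicit copy between them. This is only an analogy. A temporary file stands in for the IVShmem region, both mappings live in one process, and the names `producer` and `consumer` are hypothetical; the real setup maps the region into a guest and into a host consumer application.

```python
import mmap
import os
import tempfile

SIZE = 4096  # one page, chosen arbitrarily for the example


def demo():
    """Write through one shared mapping and read through another."""
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, SIZE)             # give the backing file a size
        producer = mmap.mmap(fd, SIZE)     # shared mapping (default on Unix)
        consumer = mmap.mmap(fd, SIZE)     # second mapping of the same file
        try:
            producer[0:5] = b"event"       # the producer writes in place
            return bytes(consumer[0:5])    # the consumer reads the same memory
        finally:
            producer.close()
            consumer.close()
    finally:
        os.close(fd)
        os.unlink(path)


seen = demo()
print(seen)
```

No read or write system call moves the payload between the two views; the consumer simply reads the pages the producer wrote.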

Page 26: Host-Assisted Virtual Machine Tracing and Analysis

Virtualization awareness

[Figure: Guest and Host timelines around a shared ring buffer, with kvm_exit and kvm_entry transitions.]

Host to Guest offloading

Work in progress ...

Synchronization?

Page 27: Host-Assisted Virtual Machine Tracing and Analysis

Future Work

Compare Shared Memory vs Hypercall vs Virtio

Shutdown analysis

Virtualization awareness

Page 28: Host-Assisted Virtual Machine Tracing and Analysis

Feedback & Questions

https://github.com/abenbachir

Page 29: Host-Assisted Virtual Machine Tracing and Analysis

References


Hypercall Implementation : https://gist.github.com/abenbachir/344822b5ba9fc5ac384cdec3f087e018

QEMU Hypertrace Patches: http://patchwork.ozlabs.org/project/qemu-devel/list/?state=&q=Hypertrace&archive

Trampoline: https://www.kernel.org/doc/ols/2009/ols2009-pages-47-54.pdf

Callstack xml analysis: https://gist.github.com/abenbachir/e813790f183945b6f74dc74ecee57c75