Virtual Machine Introspection with Xen

Tamas K Lengyel [email protected]



Virtual Machine Introspection

● Isolation

● Interpretation

● Interposition

Isolation

● From in-guest kernel/userspace

• Provided by Xen

• Buggy emulation blurs the line

● From trusted computing base (TCB)

• Possible via Xen Security Modules

• Move introspection system out from dom0!

Xen Security Modules (XSM)

● Usable since Xen 4.3 and Linux 3.8

● Disaggregate the TCB

● Available on both x86 and ARM

● Not enabled by default

Interpretation

● Reconstruct kernel/process state

● Use memory forensic techniques

● LibVMI – http://libvmi.com

00 00 00 00 9c 95 ba e0 7c b7 37 c1 6c 6f 6f 70

00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

00 00 00 00 00 00 00 00 60 ae 27 de c0 4a 80 df

e4 95 ba e0 cc 4a 80 df c0 4a 80 df 6c b0 37 c1

40 35 8e df 03 00 00 00 07 00 00 00 5c c1 c3 e0

00 00 00 00 00 00 00 00 00 70 2a de 00 00 00 00

00 00 00 00 80 7f 33 de 50 c0 c3 e0 60 c0 c3 e0

02 00 00 00 68 c0 c3 e0 02 00 00 00 00 00 00 00

00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

00 00 00 00 00 f0 c3 e0 00 00 00 00 00 a0 c3 e0

00 00 00 00 a5 26 00 00 00 00 00 00 3d 1e 00 00

00 00 00 00 02 00 00 00 90 96 ba e0 c8 8f 38 c1

d4 bf c3 e0 c8 c2 c3 e0 c8 c2 c3 e0 20 00 00 00

20 00 00 00 c8 c4 c3 e0 c8 c4 c3 e0 00 1c ba df

00 7f 33 de 00 00 00 00 58 ae 27 de 00 00 00 00

00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

00 00 00 00 a0 c2 c3 e0 a0 c2 c3 e0 30 84 99 de

b8 bd c3 e0 a8 79 3f fe 00 00 00 00 00 00 00 00

struct module

● Fields: state, list, name, mkobj, modinfo_attrs, version, src_version, holders_dir, syms, crcs, num_syms, ..., ctors, num_ctors

● state: module_state = MODULE_STATE_LIVE

● list: struct list_head (next, prev)

● name: char[60] = "loop"

● mkobj: struct module_kobject (kobj, mod, drivers_dir, mp)

● Other values shown: unsigned int = 0, void (*)() = NULL
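The interpretation step can be tried by hand on the first 80 bytes of the dump above. This is a minimal sketch, assuming a 32-bit little-endian guest in which `state` sits at offset 0, the two `list_head` pointers at offset 4, and `name` at offset 12; the function name and the returned dictionary keys are illustrative, and LibVMI derives such offsets from the guest kernel's debug information rather than hard-coding them.

```python
import struct

# First 80 bytes of the dump above, one row per string.
ROWS = [
    "00 00 00 00 9c 95 ba e0 7c b7 37 c1 6c 6f 6f 70",
    "00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00",
    "00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00",
    "00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00",
    "00 00 00 00 00 00 00 00 60 ae 27 de c0 4a 80 df",
]
RAW = bytes.fromhex(" ".join(ROWS))

# Assumed layout: u32 state, u32 list.next, u32 list.prev, char name[60].
MODULE_HEAD = struct.Struct("<III60s")


def parse_module_head(raw):
    """Decode the leading fields of a struct module from raw guest memory."""
    state, list_next, list_prev, name = MODULE_HEAD.unpack_from(raw, 0)
    return {
        "state": state,
        "list_next": list_next,
        "list_prev": list_prev,
        # The name is NUL-padded; keep only the bytes before the first NUL.
        "name": name.split(b"\x00", 1)[0].decode("ascii"),
    }


module = parse_module_head(RAW)
print(module["name"], hex(module["list_next"]), hex(module["list_prev"]))
```

Following `list_next` to the next `struct module` and repeating the parse is how the whole module list would be walked.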

Interposition

● Trap to Xen when something of interest happens within the guest

• Enable optional hardware traps

• CLTS, HLT, LGDT, LIDT, LLDT, LTR, SGDT, MOV from CR3, MOV from CR8, MOV to CR0, MOV to CR3, MOV to CR4, MOV to CR8, MOV DR, MWAIT, INT3, INTO, MTF, etc.

• See full list in Intel SDM 3c 25.1.3

Interposition

● Change access permissions in EPT

● Trap violation into Xen

● R/W/X

● With some caveats

EPT caveats

“An EPT violation that occurs during as a result of execution of a read-modify-write operation sets bit 1 (data write). Whether it also sets bit 0 (data read) is implementation-specific and, for a given implementation, may differ for different kinds of read-modify-write operations.” - Intel SDM 3c

EPT caveats

● “Why can't the hardware report the true characteristics right away?” - Jan Beulich

● “when spec says so, there is a reason but I can't tell here. :-)” - Kevin Tian

● Well... let's just mark all write violations as read violations too.

● Patched in Xen 4.5
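The workaround can be sketched as a small decoder for the EPT-violation exit qualification. The bit positions (bit 0 data read, bit 1 data write, bit 2 instruction fetch) follow the Intel SDM; the function name and the `write_implies_read` switch are illustrative and not taken from the Xen source.

```python
# EPT-violation exit qualification bits, per the Intel SDM.
EPT_READ = 1 << 0   # data read
EPT_WRITE = 1 << 1  # data write
EPT_EXEC = 1 << 2   # instruction fetch


def decode_ept_violation(qualification, write_implies_read=True):
    """Return the set of access types reported for an EPT violation.

    With write_implies_read set, every write is also reported as a read,
    which hides the implementation-specific behaviour of bit 0 for
    read-modify-write operations.
    """
    access = set()
    if qualification & EPT_READ:
        access.add("r")
    if qualification & EPT_WRITE:
        access.add("w")
        if write_implies_read:
            access.add("r")
    if qualification & EPT_EXEC:
        access.add("x")
    return access


print(sorted(decode_ept_violation(EPT_WRITE)))
```

The price of this normalization is that a pure write can no longer be told apart from a read-modify-write.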

EPT caveats

● Requires relaxing the EPT permissions

● Requires singlestepping the vCPU

● Many VMEXITs not shown in picture!

● Fixed for Xen 4.6

EPT caveats

● Race-condition if VM has multiple vCPU

● No solution for this problem prior to Xen 4.6

● New method introduced in Xen 4.6 that solves this: altp2m

altp2m

● Add support for multiple EPTs for second stage lookup!

● One table for “restricted view”

● One table for “normal view”

altp2m

● EPT pointer can be swapped in the VMCS

● No need to change EPT PTE permissions all the time

● No race condition
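Why a per-vCPU view switch avoids the race can be shown with a toy model. This is only a sketch: the class, its methods and the permission strings are made up for illustration and do not mirror the altp2m hypercall interface.

```python
class Altp2mDomain:
    """Toy model: several permission tables, one active view per vCPU."""

    def __init__(self, num_vcpus, num_views=2):
        # view id -> {gfn: permission string}; unlisted gfns default to "rwx".
        self.views = {view: {} for view in range(num_views)}
        self.active = {vcpu: 0 for vcpu in range(num_vcpus)}

    def set_perm(self, view, gfn, perm):
        self.views[view][gfn] = perm

    def switch(self, vcpu, view):
        # Models swapping the EPT pointer in this vCPU's VMCS only.
        self.active[vcpu] = view

    def allowed(self, vcpu, gfn, kind):
        """False means the access would trap into the hypervisor."""
        perm = self.views[self.active[vcpu]].get(gfn, "rwx")
        return kind in perm


NORMAL, RESTRICTED = 0, 1
dom = Altp2mDomain(num_vcpus=2)
dom.set_perm(RESTRICTED, 0x1000, "r")  # trap writes and fetches in view 1
for vcpu in (0, 1):
    dom.switch(vcpu, RESTRICTED)

trapped = not dom.allowed(0, 0x1000, "w")      # vCPU 0 faults on the write
dom.switch(0, NORMAL)                          # let vCPU 0 step over it
vcpu0_can_write = dom.allowed(0, 0x1000, "w")
vcpu1_can_write = dom.allowed(1, 0x1000, "w")  # vCPU 1 is still restricted
print(trapped, vcpu0_can_write, vcpu1_can_write)
```

With a single table, relaxing the permission for vCPU 0 would have opened the page for vCPU 1 as well during the single step.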

Interposition

● Once trapped to Xen, forward events

• Formerly known as mem_event

• Renamed and reworked as vm_event in 4.6

● Request/response via shared memory ring

• Monitor page used for VMI related events

• Two additional pages: memory sharing and paging
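The request/response pattern over a ring can be modelled in a few lines. This is a simplified sketch with free-running producer and consumer indices; the class and field names are hypothetical, and the real ring lives in a page shared between Xen and the monitoring domain rather than in a Python list.

```python
class Ring:
    """Fixed-size ring with free-running producer/consumer indices."""

    def __init__(self, size):
        self.size = size
        self.slots = [None] * size
        self.prod = 0  # advanced by the writer
        self.cons = 0  # advanced by the reader

    def put(self, item):
        if self.prod - self.cons == self.size:
            return False  # ring full, the writer has to wait
        self.slots[self.prod % self.size] = item
        self.prod += 1
        return True

    def get(self):
        if self.cons == self.prod:
            return None  # nothing pending
        item = self.slots[self.cons % self.size]
        self.cons += 1
        return item


requests, responses = Ring(4), Ring(4)

# Hypervisor side: a trapped event becomes a request (fields are made up).
requests.put({"vcpu": 0, "reason": "mem_access", "gfn": 0x1000})

# Monitor side: consume the request and answer it.
req = requests.get()
responses.put({"vcpu": req["vcpu"], "action": "continue"})

rsp = responses.get()
print(rsp)
```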

vm_event & mem_access & monitor

● Let's keep track of subsystem names

● vm_event is the underlying request/response mechanism

● mem_access memops control EPT

● monitor_op domctls control all other optional VM execution traps

Event delivery structures in 4.6

● Defined in xen/vm_event.h public header

● Easily extendable and versioned

● No more hackery

● Event response can trigger specific behavior without additional hypercalls

• Trigger emulation, singlestepping, swap altp2m...
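A versioned event with response flags might be handled roughly as follows. Everything in this sketch is hypothetical: the field layout, the version number and the flag values are invented for illustration and are not the definitions from the xen/vm_event.h header.

```python
import struct

INTERFACE_VERSION = 1  # hypothetical version number

# Hypothetical response flags, not the values from the public header.
FLAG_EMULATE = 1 << 0
FLAG_SINGLESTEP = 1 << 1
FLAG_SWITCH_ALTP2M = 1 << 2

# Hypothetical layout: version, flags, reason, vcpu_id (u32 each), gfn (u64).
EVENT = struct.Struct("<IIIIQ")


def respond(raw_request, extra_flags):
    """Validate the version and return a response carrying extra flags."""
    version, flags, reason, vcpu_id, gfn = EVENT.unpack(raw_request)
    if version != INTERFACE_VERSION:
        raise ValueError("unsupported event interface version %d" % version)
    # The requested behaviour rides along in the response itself, so no
    # separate hypercall is needed to turn it on.
    return EVENT.pack(version, flags | extra_flags, reason, vcpu_id, gfn)


request = EVENT.pack(INTERFACE_VERSION, 0, 7, 0, 0x1000)
response = respond(request, FLAG_SINGLESTEP | FLAG_SWITCH_ALTP2M)
print(EVENT.unpack(response))
```

Checking the version first is what lets the structure grow later without older monitors silently misreading newer events.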

VMI with Xen on ARM

● ARM has two-stage paging similar to EPT

● mem_access implemented for 4.6

● Some caveats:

• No singlestepping?

• Can be worked around but it's a pain

• Split-TLB ambiguities

ARM mem_access

● ARM PTEs have fewer software programmable bits as compared to EPT

● ARM mem_access requires maintaining a Radix-tree to keep track of PTEs with custom permissions

● Radix-tree keyed with GPA

ARM mem_access

● For a 2nd stage violation ARM provides the faulting GVA

● GPA only provided if fault happened during 1st stage pagetable walk

● Xen needs to translate GVA to GPA to perform Radix-tree lookup
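The lookup path described above can be sketched with plain dictionaries. This is an illustration only: a dict stands in for the Radix-tree, a second dict stands in for the guest's first-stage page tables, keying by guest frame number (GPA >> 12) is an assumption, and all names are made up.

```python
PAGE_SHIFT = 12
PAGE_MASK = (1 << PAGE_SHIFT) - 1


def gva_to_gpa(stage1, gva):
    """Translate a guest virtual address via a {vpn: gfn} mapping."""
    gfn = stage1.get(gva >> PAGE_SHIFT)
    if gfn is None:
        return None  # not mapped by the guest
    return (gfn << PAGE_SHIFT) | (gva & PAGE_MASK)


def lookup_access(custom_perms, stage1, faulting_gva):
    """Find the custom permission recorded for the page that faulted.

    custom_perms plays the role of the Radix-tree: only pages with
    non-default permissions have an entry.
    """
    gpa = gva_to_gpa(stage1, faulting_gva)
    if gpa is None:
        return None
    return custom_perms.get(gpa >> PAGE_SHIFT, "rwx")


stage1 = {0x400: 0x80}   # guest virtual page 0x400 -> guest frame 0x80
custom_perms = {0x80: "r"}  # frame 0x80 is restricted to read-only

print(hex(gva_to_gpa(stage1, 0x400123)))
print(lookup_access(custom_perms, stage1, 0x400123))
```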

ARM mem_access

● Native CPU instructions to perform GVA to GPA translation

● Performs lookup as data-fetch access

● What if we trapped an instruction-fetch access?

• In-guest translation hits iTLB

• Xen hits dTLB

ARM Split-TLB problem

● Split-TLB is a real rootkit problem

• ShadowWalker, MoRE, etc.

● Guest can load the iTLB with rootkit page and dTLB with benign page

● Flushing the TLB does not help: the iTLB translation may be lost if the page table no longer represents the cached translation
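The split-TLB ambiguity can be reproduced in a toy model with separate instruction and data TLBs. This is a deliberately simplified sketch: the class, the frame numbers and the fill policy are invented, and real TLBs are filled and evicted by hardware in ways this does not capture.

```python
class SplitTlbCpu:
    """Toy CPU with separate instruction and data TLBs over one page table."""

    def __init__(self, page_table):
        self.page_table = page_table  # vpn -> frame
        self.itlb = {}
        self.dtlb = {}

    def translate(self, vpn, kind):
        tlb = self.itlb if kind == "fetch" else self.dtlb
        if vpn not in tlb:
            # TLB miss: walk the page table and cache the result.
            tlb[vpn] = self.page_table[vpn]
        return tlb[vpn]


ROOTKIT_FRAME, BENIGN_FRAME = 0x666, 0x100
VPN = 0x40

cpu = SplitTlbCpu({VPN: ROOTKIT_FRAME})
cpu.translate(VPN, "fetch")          # executing caches the rootkit frame
cpu.page_table[VPN] = BENIGN_FRAME   # guest rewrites the PTE, no flush
cpu.translate(VPN, "data")           # reading caches the benign frame

seen_by_data_lookup = cpu.translate(VPN, "data")  # what a data-style lookup sees
actually_executed = cpu.translate(VPN, "fetch")   # what the guest runs
print(hex(seen_by_data_lookup), hex(actually_executed))
```

A translation performed as a data access therefore reports the benign frame while the guest keeps executing from the rootkit frame.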

ARM future work

● Execution tracing with mem_access may be problematic

● Use Secure Monitor Call (SMC) instruction injection!

● Similar to 0xCC injection on x86

● TODO
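The x86-style 0xCC injection that the SMC approach would mirror can be sketched on a plain byte buffer. This is a minimal illustration with made-up names; on ARM the injected SMC would be a full instruction word rather than a single byte, and the real write goes to guest memory through the hypervisor.

```python
BREAKPOINT = 0xCC  # x86 INT3 opcode


class BreakpointInjector:
    """Swap an instruction byte for a trapping opcode, keeping the original."""

    def __init__(self, memory):
        self.memory = memory  # bytearray standing in for guest memory
        self.saved = {}       # address -> original byte

    def inject(self, addr):
        if addr in self.saved:
            return  # already patched; do not save 0xCC as the original
        self.saved[addr] = self.memory[addr]
        self.memory[addr] = BREAKPOINT

    def remove(self, addr):
        # Restore the original byte so the instruction can execute.
        self.memory[addr] = self.saved.pop(addr)


guest_code = bytearray([0x55, 0x89, 0xE5, 0xC3])  # push ebp; mov ebp,esp; ret
injector = BreakpointInjector(guest_code)

injector.inject(0)
patched = bytes(guest_code)
injector.remove(0)
restored = bytes(guest_code)
print(patched.hex(), restored.hex())
```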

x86 future work

● altp2m is primarily designed to be used with Intel #VE

● VMFUNC instruction to perform EPTP switching from the guest

● Hybrid VMI

● KVM events

Lessons learnt

● Why aren't we using git pulls?

• Patches on the mailing list without a branch-off point specified

• Carving patches from mbox is a pain

• Start providing a public git branch for your series!!

Lessons learnt

● Provide build-testing for the community

• It's a waste of time to wait for review on something that's broken

• Check for style issues automatically?

• Travis-CI is OK but can time out on large series

• https://github.com/tklengyel/xen/tree/travis

Thanks!

Tamas K Lengyel [email protected] [email protected] @tklengyel

LibVMI http://libvmi.com

DRAKVUF http://drakvuf.com