PinOS: A Programmable Framework for Whole-System Dynamic Instrumentation. Prashanth P. Bungale, 14th June 2007. Joint work with Chi-Keung Luk.


PinOS: A Programmable Framework for Whole-System Dynamic Instrumentation

Prashanth P. Bungale
14th June 2007

Joint work with Chi-Keung Luk


Outline

Pin Overview

PinOS motivation and goals

Architecture

Design Issues

Evaluation

Future work


What is Pin?

A Dynamic Binary Instrumentation System
– Injects and deletes instructions in the instruction stream at run time, without source code

Programmable Instrumentation
– Provides APIs to write instrumentation tools (called PinTools) in C/C++

Multiplatform
– Supports 32-bit and 64-bit x86, Itanium
– Supports Linux, Windows, MacOS

Robust
– Instruments real-life and multithreaded applications
• Databases, search engines, web browsers

Increasingly Popular
– Over 10,000 downloads since Pin was released in June 2004


Pin Instrumentation Uses

Computer Architecture Research
– Branch predictor simulation
– Cache simulation
– Trace generation
– Instruction emulation
• E.g., emulate newly proposed instructions

Software Instrumentation
– Profiling for optimization
• Basic block counts, edge counts
– Bug checking


PinOS Goals

Extend Pin to instrument OS code as well
– Programmable through extended Pintool API

Fine-grain instrumentation of both kernel- and user-level code
– No limitation on where and what kind of instrumentation can be inserted
– Not achievable by existing probe-based tools (e.g., DTrace and Kprobes)

Only active when needed
– Attach/detach PinOS to/from the guest as and when needed

Generalized Infrastructure
– Single framework to instrument Linux, Windows, etc.


PinTool on PinOS: Tracing Memory Writes

FILE * trace;

// Print a memory write record
VOID RecordMemWrite(VOID * ip, VOID * va, VOID * pa, UINT32 size) {
    Host_fprintf(trace, "%p: W %p %p %d\n", ip, va, pa, size);
}

// Called for every instruction
VOID Instruction(INS ins, VOID *v) {
    if (INS_IsMemoryWrite(ins))
        INS_InsertCall(ins, IPOINT_BEFORE,
                       AFUNPTR(RecordMemWrite),
                       IARG_INST_PTR, IARG_MEMORYWRITE_VA, IARG_MEMORYWRITE_PA,
                       IARG_MEMORYWRITE_SIZE, IARG_END);
}

int main(int argc, char *argv[]) {
    PIN_Init(argc, argv);
    trace = Host_fopen("atrace.out", "w");
    INS_AddInstrumentFunction(Instruction, 0);
    PIN_StartProgram(); // Never returns
    return 0;
}


Architecture

[Figure: layered architecture diagram. Labels: Hardware; Xen Virtual Machine Monitor (VMM); Xen-Domain0 (Host OS, I/O); Xen-DomainU (Guest OS; PinOS with Engine, Code Cache, and PinTool)]

1. To run PinOS between guest and hardware: use Xen
2. Virtualize and present a fake processor to the guest OS


Xen 3.0 - A Convenient Environment

Uses Intel VT to run unmodified operating systems

Open-source availability

We modify Xen 3.0 to customize it for PinOS purposes:
– Steal physical and virtual memory for PinOS
– Provide I/O services to PinOS
– Hijack initial control of guest domain
– Perform PinOS attach/detach

Provides support for debugging PinOS


Stealing Physical Memory

Memory requirements
– PinOS exe, Pintool exe, code cache, PinOS stack, heap, I/O buffers

Physical Memory
– Pre-allocate a separate range of machine pages for PinOS

[Figure: mapping between physical pages and machine pages]


Stealing Virtual Memory

Steal some portion of guest address space
– Current strategy: steal part of guest’s kernel address space
• Minimizes chance of VA space conflicts

Map stolen VA space to pre-allocated pages in Xen shadow page tables

Propagate stealing to every shadow table
– i.e., in every address space ever encountered in the guest OS

Detect and report any conflicts
– No guest OS mapping activity encountered so far in stolen VA space
– Should be less of an issue with 64-bit address space
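The conflict detection mentioned above comes down to a range-overlap test between a guest mapping request and the stolen region. The following is a minimal sketch only: the type `VaRange`, the function names, and the stolen range itself are assumptions made for illustration, not PinOS's actual data structures or addresses.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical half-open virtual-address range [begin, end).
struct VaRange {
    std::uint64_t begin;
    std::uint64_t end;
};

// True when the two ranges share at least one address.
inline bool overlaps(const VaRange& a, const VaRange& b) {
    return a.begin < b.end && b.begin < a.end;
}

// Assumed example: 64 MB stolen near the top of a 32-bit kernel address space.
const VaRange kStolen = {0xF8000000ull, 0xFC000000ull};

// Would be called for each guest mapping seen while building a shadow table;
// returns true if the mapping must be reported as a conflict.
inline bool guestMappingConflicts(std::uint64_t va, std::uint64_t length) {
    VaRange request = {va, va + length};
    return overlaps(request, kStolen);
}

// Small demonstration of the check.
inline void demoConflictCheck() {
    assert(!guestMappingConflicts(0xC0000000ull, 0x1000));  // ordinary kernel page
    assert(guestMappingConflicts(0xF8000000ull, 0x1000));   // first stolen page
    assert(guestMappingConflicts(0xF7FFF000ull, 0x2000));   // straddles the lower boundary
    assert(!guestMappingConflicts(0xFC000000ull, 0x1000));  // just past the stolen range
}
```

Because the stolen range is fixed, the same test can be applied unchanged in every address space the guest creates.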


Memory Virtualization

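[Figure: the guest OS page table maps virtual pages V0, V1, …, Vi to physical pages P0, P1, …, Pi. The Xen shadow page table maps V0, V1, …, Vi, …, Vk to machine pages M0, M1, …, Mi, …, Mk, and additionally maps Vk+1, …, Vn to Mk+1, …, Mn, labeled “PinOS Memory”.]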


I/O Services for PinOS

I/O Service requirements
– PinOS’s own debugging log, Pintools’ input/output

I/O channels implemented as shared ring buffers
– PinOS writes I/O requests to buffer shared between guest and host domains
– Daemon process in host domain periodically polls and processes requests

Sharing the ring buffers
– Allocated in guest domain
– “Mapped in” by host domain

[Figure: PinOS in the guest domain and the daemon process in the host domain]
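The ring-buffer channel described above can be illustrated with a single-producer, single-consumer ring. This is a sketch under stated assumptions: the `IoRequest` layout, the `IoRing` class, and `makeRequest` are invented for illustration, and a real cross-domain buffer would live in memory shared through Xen rather than in an ordinary C++ object.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical fixed-size I/O request; the real PinOS request format is not shown in the slides.
struct IoRequest {
    int channel;       // e.g., 0 = PinOS debugging log, 1 = Pintool output
    char payload[56];  // NUL-terminated data to write
};

// Single-producer / single-consumer ring. The producer stands in for PinOS in
// the guest domain, the consumer for the polling daemon in the host domain.
template <std::size_t N>
class IoRing {
public:
    IoRing() : head_(0), tail_(0) {}

    // Producer side: returns false when the ring is full.
    bool push(const IoRequest& req) {
        std::size_t tail = tail_.load(std::memory_order_relaxed);
        std::size_t next = (tail + 1) % N;
        if (next == head_.load(std::memory_order_acquire)) return false;
        slots_[tail] = req;
        tail_.store(next, std::memory_order_release);
        return true;
    }

    // Consumer side: returns false when the ring is empty.
    bool pop(IoRequest& out) {
        std::size_t head = head_.load(std::memory_order_relaxed);
        if (head == tail_.load(std::memory_order_acquire)) return false;
        out = slots_[head];
        head_.store((head + 1) % N, std::memory_order_release);
        return true;
    }

private:
    std::array<IoRequest, N> slots_;
    std::atomic<std::size_t> head_;  // next slot to read
    std::atomic<std::size_t> tail_;  // next slot to write
};

// Builds a request, truncating text that does not fit in the payload.
inline IoRequest makeRequest(int channel, const char* text) {
    IoRequest r;
    r.channel = channel;
    std::strncpy(r.payload, text, sizeof(r.payload) - 1);
    r.payload[sizeof(r.payload) - 1] = '\0';
    return r;
}

inline void demoIoRing() {
    IoRing<4> ring;  // one slot stays empty, so at most 3 requests are pending
    IoRequest out;
    assert(ring.push(makeRequest(1, "0xc011d565: W ...")));
    assert(ring.pop(out) && out.channel == 1);
    assert(!ring.pop(out));  // the daemon finds nothing on its next poll
}
```

Keeping one slot unused lets "full" and "empty" be told apart from the two indices alone, so neither side needs a shared counter or a lock.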


PinOS Attach/Detach

Attach/Detach allows PinOS to be used only on the execution of interest
– Avoid overhead
• E.g., can avoid PinOS being active during OS boot every time
– Precision / accuracy
• Running PinOS on the entire run may pollute the collected instrumentation data

Implementing Attach
– Read entire state of guest machine
– Start PinOS activity from that point on
– Use VT support for reading and setting hidden register state

[Figure: timeline of Native execution, then Attach, PinOS execution, Detach, and Native execution again]


Code-Cache Indexing and Sharing

Pin uses VA as code cache index

In PinOS, different processes can use same VA for different code
– Virtual address alone is not sufficient to distinguish code

Option 1: <AddressSpaceID, VirtualAddress>
– Easy to implement (on x86, use the CR3 value)
– But, no sharing of code across address spaces

Option 2: <PhysicalAddress, VirtualAddress>
– Can share code across address spaces
– Persistence across application runs
– But, much more challenging to implement
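The difference between the two indexing options can be seen with a toy code cache keyed by a pair. This is an illustrative sketch, not Pin's code-cache implementation: `CodeCache`, `TraceId`, and all addresses below are assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>

typedef std::uint64_t Addr;
typedef int TraceId;  // stand-in for a pointer to translated code

// A toy code cache keyed by a pair. The first component is either the
// AddressSpaceID (CR3 value, Option 1) or the PhysicalAddress (Option 2).
class CodeCache {
public:
    CodeCache() : nextId_(0), translations_(0) {}

    // Returns the cached trace for the key, "translating" on a miss.
    TraceId lookupOrTranslate(Addr first, Addr va) {
        std::pair<Addr, Addr> key(first, va);
        auto it = cache_.find(key);
        if (it != cache_.end()) return it->second;
        ++translations_;
        TraceId id = nextId_++;
        cache_[key] = id;
        return id;
    }

    int translations() const { return translations_; }

private:
    std::map<std::pair<Addr, Addr>, TraceId> cache_;
    TraceId nextId_;
    int translations_;
};

// Two processes (distinct CR3 values) execute the same code:
// same virtual address, same physical page.
inline void demoIndexing() {
    const Addr cr3A = 0x1000, cr3B = 0x2000;   // assumed example values
    const Addr va = 0x08048000, pa = 0x7f000;  // assumed example values

    CodeCache byAddressSpace;                  // Option 1
    byAddressSpace.lookupOrTranslate(cr3A, va);
    byAddressSpace.lookupOrTranslate(cr3B, va);
    assert(byAddressSpace.translations() == 2);  // one translation per address space

    CodeCache byPhysical;                      // Option 2
    TraceId fromA = byPhysical.lookupOrTranslate(pa, va);
    TraceId fromB = byPhysical.lookupOrTranslate(pa, va);
    assert(fromA == fromB);
    assert(byPhysical.translations() == 1);    // translation shared across processes
}
```

With Option 1 every address space pays for its own copy of shared code; with Option 2 the key does not mention the address space at all, which is what makes sharing possible.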


Results on booting FC4-Linux

<PhysicalAddress, VirtualAddress> is the Clear Winner!

Code cache index    Execution time (secs)    Code cache space used (MB)
AddressSpaceID      5340                     1538
PhysicalAddress     840                      71


Correctness Issue with Trace Linking

Guest code in Process A: <V1, P1> ends with "jmp V2", and V2 maps to P2, i.e., <V2, P2>
Guest code in Process B: <V1, P1> ends with "jmp V2", and V2 maps to P3, i.e., <V2, P3>
Code cache: V1’ (translation of <V1, P1>) ends with "jmp V2’"; V2’ is the translation of <V2, P2>

Step 1: Process A is instrumented and its translation is cached.

Step 2: Process B is instrumented and finds that <V1, P1> is already translated. So, no need to re-translate.

However, the jump to V2’ is incorrect because V2 is now mapped to P3 instead of P2!


Code-Cache Indexing and Sharing

A Translated Trace in Code Cache:

V2’: Translation of <V2, P2>
    if (SoftTLB[V2] != P2) {
        // <V2, P2> is invalid.
        call PinOS();
        // Never return
    }
    // <V2, P2> is still valid.
    // Execute the rest of the trace.

SoftTLB:

VA    PA
V1    P1
V2    P3

Our solution:
– Check predicted page mapping against actual one at each trace entry
– Maintain “SoftTLB” that caches current guest page mappings
– Assign once and always use same TLB entry for a given VA->PA mapping
• So that the trace entry check can involve a constant address lookup
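One way to get the "constant address lookup" described above is to give every virtual page a slot whose address never changes, so a translated trace can embed that address directly. The sketch below assumes this reading. `SoftTlb`, `TraceEntryCheck`, and the page numbers are hypothetical, and the real check would be emitted as machine code at the trace entry rather than written in C++.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <unordered_map>

typedef std::uint64_t Page;
const Page kInvalid = ~0ull;  // marks a slot with no current mapping

class SoftTlb {
public:
    // Returns the slot assigned to a virtual page, creating it on first use.
    // std::deque keeps element addresses stable when slots are appended, so a
    // translated trace can hold the returned pointer as a constant.
    Page* slotFor(Page vpage) {
        auto it = index_.find(vpage);
        if (it != index_.end()) return it->second;
        slots_.push_back(kInvalid);
        Page* slot = &slots_.back();
        index_[vpage] = slot;
        return slot;
    }

    // Records the current guest mapping vpage -> ppage.
    void setMapping(Page vpage, Page ppage) { *slotFor(vpage) = ppage; }

private:
    std::deque<Page> slots_;
    std::unordered_map<Page, Page*> index_;
};

// What a translated trace would remember about its entry condition.
struct TraceEntryCheck {
    const Page* slot;     // fixed SoftTLB slot for the trace's virtual page
    Page predictedPpage;  // physical page the trace was translated from

    // True: run the rest of the trace. False: return to PinOS for a fresh lookup.
    bool stillValid() const { return *slot == predictedPpage; }
};

inline void demoSoftTlb() {
    const Page V2 = 2, P2 = 0x20, P3 = 0x30;  // assumed page numbers
    SoftTlb tlb;

    tlb.setMapping(V2, P2);                       // Process A: V2 -> P2
    TraceEntryCheck entry = {tlb.slotFor(V2), P2};
    assert(entry.stillValid());

    tlb.setMapping(V2, P3);                       // Process B: V2 -> P3
    assert(!entry.stillValid());                  // the linked jump must not be followed
}
```

The check itself is a single load and compare against a constant, which is why it is cheap enough to place at every trace entry.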


Coherence: Handling Page-Mapping Changes

Problem
– Guest’s page mappings may change after PinOS caches them in SoftTLB

Solution
– Xen already marks guest page-table pages as read-only and thus tracks all writes to them
– Modify Xen to inform PinOS once it figures out which page-table entries get changed
– PinOS then invalidates these page mappings in its SoftTLB
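The invalidation step can be sketched as a handler that receives the list of changed page-table entries and drops the matching cached mappings. The notification format (`PteChange`) and the class name are assumptions. The slides do not show the actual interface between the modified Xen and PinOS.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

typedef std::uint64_t Page;

// Hypothetical notification from the modified Xen: one changed page-table entry.
struct PteChange {
    Page vpage;  // virtual page whose guest page-table entry was written
};

class SoftTlbCache {
public:
    void insert(Page vpage, Page ppage) { map_[vpage] = ppage; }

    // Returns true and fills ppage when a cached mapping exists.
    bool lookup(Page vpage, Page& ppage) const {
        auto it = map_.find(vpage);
        if (it == map_.end()) return false;
        ppage = it->second;
        return true;
    }

    // Drops every cached mapping named in the notification.
    // Returns how many cached mappings were actually dropped.
    std::size_t invalidate(const std::vector<PteChange>& changes) {
        std::size_t dropped = 0;
        for (const PteChange& c : changes) dropped += map_.erase(c.vpage);
        return dropped;
    }

private:
    std::unordered_map<Page, Page> map_;
};

inline void demoInvalidate() {
    SoftTlbCache tlb;
    tlb.insert(1, 0x10);
    tlb.insert(2, 0x20);

    std::vector<PteChange> changes;
    PteChange changed = {2};
    changes.push_back(changed);
    assert(tlb.invalidate(changes) == 1);

    Page p = 0;
    assert(!tlb.lookup(2, p));             // stale mapping is gone
    assert(tlb.lookup(1, p) && p == 0x10); // untouched mapping survives
}
```

After invalidation, the next trace entry that depends on the dropped mapping fails its check and falls back to PinOS, which re-reads the guest page tables.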


Interrupt/Exception Virtualization

PinOS virtualizes interrupts and exceptions:
– Maintaining control
• Ex: Timer interrupt triggering process preemption
– Maintaining transparency
• Ex: Guest interrupt handler attempting to identify thread ID based on ESP

Install own interrupt handlers in Interrupt Descriptor Table (IDT)
– So all interrupts and exceptions are routed through PinOS

Handling interrupts (asynchronous)
– When received by PinOS, put it on a queue
– Add a pending-interrupts check at every trace entry
– Set up interrupted guest context with trace address and context
– Continue instrumentation at corresponding guest interrupt handler

Handling exceptions (synchronous)
– Recover excepting guest address and context, and set up context
– Continue instrumentation at corresponding guest exception handler
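The asynchronous path above (queue the interrupt, then deliver it at the next trace entry) might look roughly like the following. This is a simplified sketch: `InterruptVirtualizer` and `GuestContext` are hypothetical names, and the frame pushed here holds only a return address, whereas real x86 interrupt delivery also pushes EFLAGS and CS.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <queue>
#include <vector>

struct GuestContext {
    std::uint32_t eip;  // guest instruction pointer
    std::uint32_t esp;  // guest stack pointer
};

class InterruptVirtualizer {
public:
    // Called from PinOS's own IDT handler: only record the vector and return.
    void onHardwareInterrupt(int vector) { pending_.push(vector); }

    // Registers the guest handler address taken from the guest's IDT.
    void setGuestHandler(int vector, std::uint32_t handlerEip) {
        std::size_t needed = static_cast<std::size_t>(vector) + 1;
        if (handlers_.size() < needed) handlers_.resize(needed, 0);
        handlers_[static_cast<std::size_t>(vector)] = handlerEip;
    }

    // The check placed at every trace entry. ctx holds the guest state at the
    // trace's first instruction, which is a clean guest instruction boundary.
    // Returns the guest address at which instrumentation should continue.
    std::uint32_t atTraceEntry(GuestContext& ctx, std::vector<std::uint32_t>& guestStack) {
        if (pending_.empty()) return ctx.eip;
        int vector = pending_.front();
        pending_.pop();
        guestStack.push_back(ctx.eip);  // simplified frame: return address only
        ctx.esp -= 4;
        ctx.eip = handlers_.at(static_cast<std::size_t>(vector));
        return ctx.eip;
    }

private:
    std::queue<int> pending_;
    std::vector<std::uint32_t> handlers_;
};

inline void demoInterruptDelivery() {
    InterruptVirtualizer iv;
    iv.setGuestHandler(32, 0xC0100000u);  // assumed timer vector and handler address
    GuestContext ctx = {0x08048000u, 0xBFFF0000u};
    std::vector<std::uint32_t> stack;

    assert(iv.atTraceEntry(ctx, stack) == 0x08048000u);  // nothing pending
    iv.onHardwareInterrupt(32);                          // arrives mid-trace
    assert(iv.atTraceEntry(ctx, stack) == 0xC0100000u);  // delivered at next entry
    assert(stack.back() == 0x08048000u);
}
```

Deferring delivery to a trace entry means the guest only ever observes the interrupt at a guest instruction boundary, with its own ESP and EIP in the saved context.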


Exception Virtualization

Precise Exception Delivery
– In the face of “pseudo” instruction boundaries
– Log and roll back all guest-visible state changes until most recent guest instruction boundary

Faithful Exception Delivery
– While emulating instructions, conditions must be checked, and exceptions raised as guaranteed by hardware semantics

Original Guest Code      Translated Code
movw %ds, (%edx)         spill %eax
                         movw M.%ds, %ax
                         movw %ax, (%edx)
                         restore %eax
call proc                pushl <current-eip>
                         jmp xlated-proc

Legend: a boundary between translated instructions that belong to the same guest instruction is a “pseudo” instruction boundary; a boundary between the translations of two different guest instructions is a guest instruction boundary.
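The log-and-rollback idea for precise exceptions can be illustrated with a small undo log over a toy guest memory. The names `UndoLog` and `GuestMemory` and the map-based memory model are assumptions for illustration. The slides do not describe how PinOS actually records guest-visible changes.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

typedef std::map<std::uint32_t, std::uint32_t> GuestMemory;  // address -> value (toy model)

class UndoLog {
public:
    // Called at each guest instruction boundary: earlier changes become final.
    void commit() { entries_.clear(); }

    // Performs a guest-visible store while remembering the previous contents.
    void store(GuestMemory& mem, std::uint32_t addr, std::uint32_t value) {
        Entry e;
        e.addr = addr;
        auto it = mem.find(addr);
        e.existed = (it != mem.end());
        e.oldValue = e.existed ? it->second : 0;
        entries_.push_back(e);
        mem[addr] = value;
    }

    // Called when an exception arrives at a "pseudo" boundary:
    // undo, newest first, everything since the last guest boundary.
    void rollback(GuestMemory& mem) {
        for (auto it = entries_.rbegin(); it != entries_.rend(); ++it) {
            if (it->existed) mem[it->addr] = it->oldValue;
            else mem.erase(it->addr);
        }
        entries_.clear();
    }

private:
    struct Entry {
        std::uint32_t addr;
        std::uint32_t oldValue;
        bool existed;
    };
    std::vector<Entry> entries_;
};

inline void demoRollback() {
    GuestMemory mem;
    mem[0x1000] = 7;
    UndoLog log;
    log.commit();                        // guest instruction boundary

    // Translation of "call proc": the push completes, then the jump faults.
    log.store(mem, 0x0FFC, 0x08048123);  // pushl <current-eip>
    log.rollback(mem);                   // exception before the guest instruction finished

    assert(mem.count(0x0FFC) == 0);      // the push is no longer visible to the guest
    assert(mem[0x1000] == 7);
}
```

After the rollback, the guest exception handler sees state as of the last guest instruction boundary, which is what the hardware would have presented.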


Coherence: Handling Self-Modifying Code

Self-modifying code problem
– Content of a code page may change after Pin has cached that page

Write-monitoring Solution
– Standard page-table trick

Bookkeeping
– Maintain a reverse page-mapping table
• i.e., a PA -> VA mapping table
– Upon bringing in code from given physical page:
• Write-protect all virtual pages that ever map into this physical page
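The bookkeeping above can be sketched as a reverse map from physical pages to the virtual pages that alias them. This is a simplified model: `ReverseMap` is a hypothetical name, a single page number stands in for an (address space, virtual page) pair, and write protection is a set membership flag rather than a page-table permission bit.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>

typedef std::uint64_t Page;

class ReverseMap {
public:
    // Records that virtual page vpage maps to physical page ppage.
    // If code from ppage is already cached, the new alias is protected at once.
    void addMapping(Page ppage, Page vpage) {
        aliases_[ppage].insert(vpage);
        if (codePages_.count(ppage)) writeProtected_.insert(vpage);
    }

    // Called when code is brought into the code cache from ppage:
    // write-protect every virtual page known to map to it.
    void onCodeCached(Page ppage) {
        codePages_.insert(ppage);
        const std::set<Page>& vpages = aliases_[ppage];
        writeProtected_.insert(vpages.begin(), vpages.end());
    }

    // A write through vpage must trap (so cached code can be invalidated) when protected.
    bool writeTraps(Page vpage) const { return writeProtected_.count(vpage) != 0; }

private:
    std::map<Page, std::set<Page>> aliases_;  // PA -> set of VAs
    std::set<Page> codePages_;                // physical pages with cached code
    std::set<Page> writeProtected_;           // virtual pages currently write-protected
};

inline void demoReverseMap() {
    ReverseMap rm;
    rm.addMapping(0x50, 0xA);    // two virtual aliases of physical page 0x50
    rm.addMapping(0x50, 0xB);
    rm.addMapping(0x60, 0xC);    // unrelated data page

    rm.onCodeCached(0x50);
    assert(rm.writeTraps(0xA) && rm.writeTraps(0xB));
    assert(!rm.writeTraps(0xC));

    rm.addMapping(0x50, 0xD);    // alias created after the code was cached
    assert(rm.writeTraps(0xD));
}
```

Protecting every alias matters because a write through any one of them changes the bytes that the cached translation was built from.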


Experiment Setup

Environment:
– Xen 3.0.2 running on Intel VT-enabled machines
– Guest domain installed with Fedora Core 4 Linux

Benchmarks:
– Fedora Core 4 Linux boot
– Apache-bench (web server)
– Mysql-test (database server)

Pintools:
– Insmix
• Code profiler that collects basic-block and instruction mix info
– CMP$im
• Cache simulator that models a multi-level cache hierarchy
• Results in paper


Distribution of Kernel and User-level Instructions


Basic Block Count Results

Top 5 hottest kernel-level basic blocks of mysql-test-alter-table

Bbl Addr      Bbl Symbol Name               Count       Num-Ins   Ins % Contribution
0xc0111a40    delay_pit + 0x1a              93531291    2         1.17%
0xc8aac20b    ext3_do_update_inode + 0x82   10177398    6         0.38%
0xc011d58f    __might_sleep + 0x2a          5170776     5         0.16%
0xc011d57f    __might_sleep + 0x1a          5170776     4         0.13%
0xc011d565    __might_sleep                 5170776     10        0.32%


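Insmix Results

Privileged instruction counts for fc4-boot and mysql-test-alter-table. The two count columns are run together in this copy and are kept as found, in the order of the fused column header (fc4-boot, then mysql-test-alter-table).

Privileged Instruction    fc4-boot and mysql-test-alter-table (fused)
CLI                       28069918912950
STI                       8459212217286
IRETD                     574204599646
OUT                       57181551209
OUTSW                     9824990104
IN                        31994311762
INSW                      48403240
HLT                       4458207
INVLPG                    54923619
CLTS                      801350
RDTSC                     17777043
LGDT                      20
LTR                       10
INVD                      00
WBINVD                    00
RDMSR                     150
RDPMC                     00
LMSW                      00
MOV CR                    NANA
LIDT                      20
LLDT                      20
WRMSR                     00
MOV DR                    NANA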


Performance of PinOS


Related Work I

Dynamic Optimization
– Dynamo [2000], DynamoRIO [2003]
– Mojo [2000]

Software Dynamic Translation
– Strata [2003]

Dynamic Binary Analysis and Instrumentation
– Shade [1994] - SPARC & MIPS
– Walkabout [2002], Valgrind [2004]
– Pin [2005], HDTrans [2006]

Probe-based Dynamic Binary Instrumentation
– KernInst [1999], DynInst [2000], LTT [2000], DProbes [2001], KProbes [2004]
– DTrace [2004], SystemTap [2005]


Related Work II

Full Machine Simulation/Emulation
– Embra (SimOS) [1996] - MIPS
– Simics [2002]
– Bochs [2002], QEMU [2005]

Para-Virtualization
– Denali [2002], Xen [2003]

Full Virtualization
– VMware [2002]

Hardware-assisted Virtualization
– Intel Virtualization Technology (VT) [2006]
– AMD Pacifica Technology [2006]


Future Work

Make PinOS capable of instrumenting Windows

PinOS Infrastructure Support
– 64-bit support (x86_64)
– Multi-Processor support (MP)

Now that we have this powerful infrastructure, let’s write Pintools!
– Interesting Pintools include debuggers, profilers, tracing tools, etc.

Plan to release to public
– Interesting users and uses may demand further enhancements


Acknowledgments

Thanks to the entire Pin team
– For giving us a robust Pin to start with

Thanks to:
– Mark Charney
• For helping us better understand XED
• For fixing XED issues (only a few) very promptly
– Greg Lueck
• For many helpful discussions, esp. about signals
• For fixing related bugs in mainline Pin
– Prof. Jonathan Shapiro and Swaroop Sridhar
• For collaboration on initial ideas about segmentation virtualization


Thank You!

Questions?


Backup Slides…


Virtualization of System-Level State

Segmentation Support

• Segment Registers

• GDT/LDT

Paging Support

• CR3 (PDBR)

• Page-table structures

Interrupt/Exception Delivery

• IDT

Task support

• TR

EFLAGS

• Including privileged bits like IF


Review of IA-32 Memory Management


Review of segment addressing

[Figure: the segment registers CS, DS, SS, ES, FS, and GS each hold a segment selector; a selector refers to a segment descriptor in the LDT or the GDT, each a table of up to 8K entries]

Courtesy: Gregory Lueck


Review of segment addressing

Segment Selector: index | table indicator (0 – GDT, 1 – LDT) | privilege info

Segment Descriptor: base address | limit | other

Courtesy: Gregory Lueck
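The selector layout above corresponds to the standard IA-32 bit assignment: bits 15..3 hold the index, bit 2 the table indicator, and bits 1..0 the requested privilege level. A small decoding sketch follows. The struct and function names are chosen here for illustration.

```cpp
#include <cassert>
#include <cstdint>

struct SelectorFields {
    unsigned index;  // bits 15..3: entry number in the descriptor table
    bool useLdt;     // bit 2: table indicator, 0 = GDT, 1 = LDT
    unsigned rpl;    // bits 1..0: requested privilege level
};

// Splits a 16-bit segment selector into its three fields.
inline SelectorFields decodeSelector(std::uint16_t selector) {
    SelectorFields f;
    f.index = selector >> 3;
    f.useLdt = (selector & 0x4) != 0;
    f.rpl = selector & 0x3;
    return f;
}

// Byte offset of the selected 8-byte descriptor within its table.
inline unsigned descriptorOffset(std::uint16_t selector) {
    return (selector >> 3) * 8u;
}

inline void demoSelector() {
    SelectorFields f = decodeSelector(0x10);
    assert(f.index == 2 && !f.useLdt && f.rpl == 0);  // GDT entry 2, ring 0

    SelectorFields g = decodeSelector(0x0F);
    assert(g.index == 1 && g.useLdt && g.rpl == 3);   // LDT entry 1, ring 3

    assert(descriptorOffset(0x10) == 16);
}
```

The 13-bit index is what limits each descriptor table to 8K entries.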


Review of segment addressing

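[Figure: FS holds a selector (index, table indicator 1) that selects a descriptor (base address, limit, other) in the LDT. For "mov %fs:0x10, %eax", the descriptor's base address is added (+) to the offset 0x10 to form the effective address.]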

Courtesy: Gregory Lueck


Hidden Part of Segment Register

Segment register: visible part (index, GDT/LDT) | hidden part (base, limit, acc. rights)

Hidden part is “cached” from LDT / GDT
– Might be out of sync, and software depends on this!

Saving a segment register writes only the visible part to memory

Restoring reads the hidden part from GDT / LDT

Asymmetry: save / restore may change contents!

Courtesy: Gregory Lueck


Irreversible Segmentation Problem

[Figure: three successive states under the instrumentation engine]

1. Initial state: GDT[0x10] = A; DS: selector 0x10, desc. cache A
2. Guest writes B into GDT[0x10]: GDT[0x10] = B; DS: selector 0x10, desc. cache A
3. Gratuitous load performed by instrumentation system (Save DS, then Restore DS): GDT[0x10] = B; DS: selector 0x10, desc. cache B

Wrong! Should still be A as the guest has not yet explicitly performed a load into DS!


Segmentation Virtualization

[Figure: the guest issues "mov 0x10 -> ds". The descriptor at 0x10 in the guest GDT/LDT is copied into the DS Desc. Cache entry of the PinOS GDT that is active on hardware, "mov 0x2 -> ds" is issued on hardware to load the DS register, and the Emulated DS Register is updated with 0x10. The PinOS stolen entries in the hardware GDT are the CS, DS, ES, FS, GS, SS, LDTR, and TR Desc. Caches.]

Key Insight: Just virtualize hardware descriptor caches
– Don’t virtualize segmentation tables GDT/LDT at all!

As and when guest explicitly loads hardware registers:
– Copy guest segment descriptors into corresponding caches
– Issue hardware register load instructions with modified selector
– Use dynamic translation for doing this
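The scheme above can be modeled in a few lines: a guest load of DS copies the guest descriptor into a stolen hardware GDT entry and loads the real register from there, so a later save/restore by the instrumentation system refills the hidden part from that private copy. This is a toy model only. `Machine`, `kDsSlot`, and the single-character descriptors are assumptions, and 0x2 is used as a slot label taken from the slide rather than as a correctly encoded selector.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

typedef char Descriptor;  // 'A', 'B', ... stand in for full 8-byte descriptors

struct HwSegReg {
    std::uint16_t selector;  // visible part
    Descriptor cached;       // hidden part (descriptor cache)
};

struct Machine {
    std::map<std::uint16_t, Descriptor> guestGdt;  // table the guest reads and writes
    std::map<std::uint16_t, Descriptor> hwGdt;     // PinOS GDT active on hardware
    HwSegReg ds;                                   // real DS register
    std::uint16_t emulatedDs;                      // selector the guest believes is in DS
};

const std::uint16_t kDsSlot = 0x2;  // stolen entry acting as the DS descriptor cache

// Hardware semantics: loading a selector refills the hidden part from the active table.
inline void hwLoadDs(Machine& m, std::uint16_t selector) {
    m.ds.selector = selector;
    m.ds.cached = m.hwGdt.at(selector);
}

// Translation of a guest "mov sel -> ds".
inline void guestLoadDs(Machine& m, std::uint16_t guestSelector) {
    m.hwGdt[kDsSlot] = m.guestGdt.at(guestSelector);  // copy descriptor into the cache entry
    hwLoadDs(m, kDsSlot);                             // load with the modified selector
    m.emulatedDs = guestSelector;                     // what a guest read of DS must see
}

// Save/restore of DS performed by the instrumentation system itself.
inline void gratuitousSaveRestore(Machine& m) {
    std::uint16_t saved = m.ds.selector;  // only the visible part can be saved
    hwLoadDs(m, saved);                   // restoring refills the hidden part
}

inline void demoSegmentation() {
    Machine m = Machine();
    m.guestGdt[0x10] = 'A';
    guestLoadDs(m, 0x10);

    m.guestGdt[0x10] = 'B';      // guest rewrites its GDT entry, no reload yet
    gratuitousSaveRestore(m);
    assert(m.ds.cached == 'A');  // hidden part preserved
    assert(m.emulatedDs == 0x10);

    guestLoadDs(m, 0x10);        // explicit guest reload picks up B
    assert(m.ds.cached == 'B');
}
```

Because the hardware table entry only changes when the guest explicitly loads a register, guest writes to its own GDT/LDT never need to be tracked.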


Irreversible Segmentation Problem Solved

[Figure: three successive states under the instrumentation engine]

1. Initial state: guest GDT[0x10] = A; H/W GDT[0x2] = A; DS: selector 0x2, desc. cache A; emulated DS: selector 0x10
2. Guest writes B into GDT[0x10]: guest GDT[0x10] = B; H/W GDT[0x2] = A; DS: selector 0x2, desc. cache A; emulated DS: selector 0x10
3. Gratuitous load performed by instrumentation system (Save DS, then Restore DS): guest GDT[0x10] = B; H/W GDT[0x2] = A; DS: selector 0x2, desc. cache A; emulated DS: selector 0x10

Correct!


Implications of Virtualization Scheme

Gratuitous loads now performed with cached descriptors
– Ensures preservation of guest-expected hardware semantics

Allows PinOS to easily steal rest of table for own descriptors

With this scheme, no need for tracking guest table writes!

However, need to tame/emulate all segmentation instructions
– lds/es/fs/gs/ss
– mov ds/es/fs/gs/ss, […]
– mov […], ds/es/fs/gs/ss
– pop ds/es/fs/gs/ss
– push ds/es/fs/gs/ss
– lgdt, sgdt
– lldt, sldt
– lar, lsl, verr, verw
– ltr, str, task gate transfer through interrupt
– Far jumps, calls and returns, iret, sysenter and sysexit
– Software interrupt: int n, into, int 3
– Hardware interrupt / exception