San Diego Supercomputer Center
Performance Modeling and Characterization Lab (PMaC)
Pin: Building Customized Program Analysis Tools with Dynamic Instrumentation
C.K. Luk, R. Cohn, R. Muth, H. Patil, A. Klauser, G. Lowney, S. Wallace, V.J. Reddi, K. Hazelwood
Presented by: Michael Laurenzano
What is Program Instrumentation?
• Inserting extra code into an application to observe its behavior
  – Example: Cache Simulation
    for (int i = 0; i < LENGTH; i++) {
        CacheSim(&A[i]); A[i] = (double)i;
        CacheSim(&B[i]); B[i] = (double)i;
        CacheSim(&C[i]); C[i] = (double)i;
    }
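The slide does not show what CacheSim does. As a minimal sketch, assuming a direct-mapped cache with 256 lines of 64 bytes (both values are illustrative assumptions, and `DirectMappedCache` and `g_cache` are hypothetical names), the inserted call could record each reference like this:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical direct-mapped cache model. The line count and line size are
// illustrative assumptions, not values taken from the presentation.
class DirectMappedCache {
public:
    DirectMappedCache(std::size_t num_lines, std::size_t line_bytes)
        : line_bytes_(line_bytes), tags_(num_lines, 0), valid_(num_lines, false) {}

    // Record one memory reference and update the hit/miss counters.
    void access(std::uint64_t addr) {
        const std::uint64_t line = addr / line_bytes_;
        const std::size_t index = static_cast<std::size_t>(line % tags_.size());
        const std::uint64_t tag = line / tags_.size();
        if (valid_[index] && tags_[index] == tag) {
            ++hits;
        } else {
            ++misses;              // cold miss or conflict miss: fill the line
            valid_[index] = true;
            tags_[index] = tag;
        }
    }

    std::uint64_t hits = 0;
    std::uint64_t misses = 0;

private:
    std::size_t line_bytes_;
    std::vector<std::uint64_t> tags_;
    std::vector<bool> valid_;
};

// One simulated cache shared by every instrumented reference (assumed sizes).
DirectMappedCache g_cache(256, 64);

// The routine the instrumented loop calls before each store.
void CacheSim(const void* addr) {
    g_cache.access(reinterpret_cast<std::uintptr_t>(addr));
}
```

After the loop finishes, `g_cache.hits` and `g_cache.misses` hold the simulated counts for the three arrays.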
Uses of Program Instrumentation
• Code Profiles
  – Basic block/Instruction count
  – Operation results
• Microarchitectural study
  – Branch outcomes
  – Memory addresses
• Bug checking
  – Memory leaks
  – Uninitialized data
Pin System Layout
[Figure: Pin system layout. Component callouts, in the order they are introduced:]
• The code being analyzed
• Tells us where and how to perform analysis
• Combines application and pintool code to create instrumented code
• Stores the instrumented code created by the JIT
• Controls execution, maintains data structures, tracks program state
Simplified Instrumentation
• Transfer control to the VM at an application control transfer
• Look for an instrumented version of the branch target in the code cache
  – If found: execute the instrumented code
  – If not: compile the code, insert it into the code cache, execute the new code
• Repeat
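The loop above can be sketched as a toy dispatcher. This is a simplified model, not Pin's implementation: the application is reduced to the sequence of branch targets it reaches, "compiling" just records the address in the cache, and `DispatchStats` and `run_under_vm` are hypothetical names.

```cpp
#include <cstdint>
#include <unordered_set>
#include <vector>

// Counters for the two outcomes of a code cache lookup.
struct DispatchStats {
    int compiles = 0;    // target was missing: compile and insert
    int cache_hits = 0;  // target was already in the code cache
};

// Toy model: the application is reduced to the sequence of branch targets
// it reaches, and each target stands for one instrumented code sequence.
DispatchStats run_under_vm(const std::vector<std::uint64_t>& branch_targets) {
    std::unordered_set<std::uint64_t> code_cache;
    DispatchStats stats;
    for (std::uint64_t target : branch_targets) {
        if (code_cache.count(target) != 0) {
            ++stats.cache_hits;          // found: execute instrumented code
        } else {
            code_cache.insert(target);   // not found: "compile" and cache it
            ++stats.compiles;
        }
        // A real system would now execute the instrumented code for `target`
        // until the next control transfer returns control to the VM.
    }
    return stats;
}
```

In this model each distinct target is compiled once, so code that is reached repeatedly pays the compilation cost only on first use.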
Trace Linking
• Transfer control directly between traces
  – Branch target must be known statically
  – Target trace must be present in code cache
[Figure: three control-flow diagrams. Regular Execution: Sequence 1, Sequence 2. Pin w/o Trace Linking: Trace 1, Virtual Machine, Trace 2. Pin w/ Trace Linking: Trace 1, Trace 2.]
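To make the benefit concrete, here is a small sketch that counts how often the VM is entered with and without linking. It is a toy model under stated assumptions: every target is known statically, a link is installed the first time an edge between two traces is taken, and `count_vm_entries` is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

// Count VM entries for a run that visits `traces` in order.
// Without linking, every trace-to-trace transfer goes through the VM.
// With linking, the VM handles an edge once and then patches a direct jump.
int count_vm_entries(const std::vector<std::uint64_t>& traces, bool link_traces) {
    std::set<std::pair<std::uint64_t, std::uint64_t>> linked_edges;
    int vm_entries = traces.empty() ? 0 : 1;  // the VM starts the first trace
    for (std::size_t i = 1; i < traces.size(); ++i) {
        const std::pair<std::uint64_t, std::uint64_t> edge(traces[i - 1], traces[i]);
        if (link_traces && linked_edges.count(edge) != 0) {
            continue;                   // direct trace-to-trace jump, VM bypassed
        }
        ++vm_entries;                   // the VM resolves the target
        if (link_traces) {
            linked_edges.insert(edge);  // patch the branch for next time
        }
    }
    return vm_entries;
}
```

For a loop that alternates between two traces, the unlinked count grows with every iteration, while the linked count stops growing once both edges are patched.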
Trace Linking (Indirect)
• "Unknown" targets are usually somewhat predictable
  – A function typically returns to a few locations (few call sites)
  – An indirect jump usually goes to a few locations
• Try several predicted targets to see if we can avoid VM intervention
  – Short target lists are maintained for each indirect branch
  – If we exhaust this list, use the VM
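A per-branch target list can be sketched as follows. The list length and the policy of appending a new target after a VM fallback are assumptions made for illustration, and `IndirectBranchSite` is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of one indirect branch with a short list of predicted
// targets. The capacity is a parameter because the slide gives no number.
class IndirectBranchSite {
public:
    explicit IndirectBranchSite(std::size_t max_targets)
        : max_targets_(max_targets) {}

    // Returns true if the actual target matched a prediction, so control can
    // pass directly to the target trace. Returns false if the VM is needed.
    bool resolve(std::uint64_t target) {
        for (std::uint64_t predicted : predicted_) {
            if (predicted == target) {
                ++chained;
                return true;
            }
        }
        ++vm_fallbacks;                    // list exhausted: use the VM
        if (predicted_.size() < max_targets_) {
            predicted_.push_back(target);  // assumed policy: remember new target
        }
        return false;
    }

    int chained = 0;       // transfers that avoided the VM
    int vm_fallbacks = 0;  // transfers that needed the VM

private:
    std::size_t max_targets_;
    std::vector<std::uint64_t> predicted_;
};
```

A branch with few distinct targets quickly stops falling back to the VM; one with more targets than the list can hold keeps paying for the overflow cases.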
Function Cloning
• Most common indirect control transfer is a function return
• Create a function instance for each call site
  – Return address is then unique and known for each function instance
  – Turns this indirect control transfer into a direct control transfer
  – Code bloat!
• Implemented by keeping a call stack for each instrumented instruction sequence
  – Keep the last 4 entries in the call stack
  – Call stack represented as a 64-bit integer
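One way to pack the last 4 call stack entries into a 64-bit integer is to give each entry 16 bits. The 16-bit call site id is an assumption made for this sketch (the slide gives only the total width and the depth), and the function names are hypothetical.

```cpp
#include <cstdint>

// Assumption: each call site is identified by a 16-bit id, so the four most
// recent call sites fit in one 64-bit key.
using CallStackKey = std::uint64_t;

// Push a call site. Older entries shift up by 16 bits and the oldest of the
// four falls off the top of the integer.
inline CallStackKey push_call_site(CallStackKey key, std::uint16_t site_id) {
    return (key << 16) | site_id;
}

// Most recent call site id stored in the key.
inline std::uint16_t top_call_site(CallStackKey key) {
    return static_cast<std::uint16_t>(key & 0xFFFF);
}
```

Two instances of the same function reached through different recent call sites then get different keys, which is what lets each clone have a known return target.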
Register Bindings
• Register re-allocation occurs so that Pin can use registers
  – The register bindings can be different from one trace to the next
• When compiling, keep register bindings from the previous trace if possible
• When linking traces, modify the register bindings before going to the next trace
  – Usually only a few registers are mismatched in practice
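The reconciliation step at a trace link can be sketched as a comparison of two binding tables. The table layout (index is a virtual register, value is the physical register holding it) and the name `mismatched_registers` are assumptions for illustration only.

```cpp
#include <cstddef>
#include <vector>

// Assumed representation: binding[v] is the physical register that currently
// holds virtual register v.
using Binding = std::vector<int>;

// Virtual registers whose location differs between the source trace and the
// target trace. Only these need fix-up code when the traces are linked.
std::vector<std::size_t> mismatched_registers(const Binding& from, const Binding& to) {
    std::vector<std::size_t> mismatched;
    for (std::size_t v = 0; v < from.size() && v < to.size(); ++v) {
        if (from[v] != to[v]) {
            mismatched.push_back(v);
        }
    }
    return mismatched;
}
```

When the compiler reuses the previous trace's bindings, this list is empty and the link needs no extra code.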
Optimization – Inlined Analysis Routines
[Figure: Without Inlining: Application, Bridge Routine, Analysis Routine, Bridge Routine, Application. With Inlining: Application, Bridge Code, Analysis Code, Bridge Code, Application.]
• Inlining: 2 fewer calls and 2 fewer returns
• Other optimizations: constant folding, code relocation
Optimization – eflags Register Liveness
• The x86 eflags register is treated as a bit-vector containing state information
  – This register can be modified as a side effect of some instructions
• eflags might not be live when we reach the analysis routine
  – If this is the case, we do not need to save/restore it
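The liveness check can be sketched as a forward scan over the instructions that follow the instrumentation point. This is a simplified model: it treats eflags as a single unit (real x86 instructions may write only some flags), and `Insn` and `flags_live_at` are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// Simplified instruction summary. Assumption: an instruction either
// overwrites all flags or none, which ignores partial flag updates.
struct Insn {
    bool reads_flags;
    bool writes_flags;
};

// Returns true if eflags may be live just before trace[at], meaning an
// analysis call inserted there must save and restore it.
bool flags_live_at(const std::vector<Insn>& trace, std::size_t at) {
    for (std::size_t i = at; i < trace.size(); ++i) {
        if (trace[i].reads_flags) {
            return true;   // current value is consumed: must be preserved
        }
        if (trace[i].writes_flags) {
            return false;  // overwritten before any read: dead here
        }
    }
    return true;  // unknown beyond the end of the trace: stay conservative
}
```

The read check comes first so that an instruction that both reads and writes the flags counts as a use.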
Optimization – Call Scheduling
• The user can specify that the analysis routine may be placed anywhere within a given scope
  – Anywhere in an instruction, basic block, function, program, etc.
• Pin can then schedule the call where it performs best
  – Perhaps at a point where few registers need to be saved
  – How well will this actually work?
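One plausible scheduling heuristic, given the remark about saved registers, is to pick the point in the scope with the fewest live registers. The sketch below assumes the per-point live register counts are already known; `cheapest_insertion_point` is a hypothetical name and the heuristic is an illustration, not Pin's documented policy.

```cpp
#include <cstddef>
#include <vector>

// live_regs[i] is the number of registers live before instruction i in the
// scope. Returns the index with the fewest live registers, preferring the
// earliest such point. Precondition: live_regs is not empty.
std::size_t cheapest_insertion_point(const std::vector<int>& live_regs) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < live_regs.size(); ++i) {
        if (live_regs[i] < live_regs[best]) {
            best = i;
        }
    }
    return best;
}
```

Whether this pays off depends on how much the live register count actually varies inside the scope, which is the question the slide leaves open.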
Basic Pin Overhead
Effectiveness of Optimizations
Questions?