Upload
nguyenkhanh
View
216
Download
0
Embed Size (px)
Citation preview
Aggressive technology scaling and extreme chip integration significantly increase the complexity of microprocessor• Infeasible to exhaustively test when simulation
Pressure on validation team to deliver correct design to market on time is higher than ever.• 50% of microprocessor chips require extra unplanned
tape-out
3
Effective post-silicon validation needed to eliminate bugs before volume production
Random instruction tests(RIT) contribute tremendously to the detection of design bugs
5
Prototype to Volume Production
Challenges
Expected Responses
Actual Responses co
mpa
rato
r
Bug detected
Host Machine
Prototype
Actual Responses RIT
Simulation Limitations
Post-Processing (Triaging)
Blocking Bugs
6
Simulator
ISA DiversityConcept: Operations of an ISA can be performed equivalently in more than one different way.
7
Benefit: Same operation in different ways produce identical results but activate different logic paths
Application: Enable bug detection by comparing results of equivalent instructions (self-checking)
Statistics: In major ISAs, more than 75% instructions can be replaced with equivalent instructions
Example: MIPS ISA diversity
Original Instruction Equivalent Sequence Description
lw RA, addr(RB)Load word
lhu RA, addr(RB)lhu RC, (addr+2)(RB)sll RC, RC, 16or RA, RA, RC
Execute 2 load halfword unsigned instructions and places the second halfword to upper bytes.
8
Statistics
8%
15%
77%
ARM
11%
10%
79%
MIPS
8%
14%
78%
POWERPC
20%
6%
74%
X86
9
Full Equivalence
Partial Equivalence
No Equivalence
New Validation Method
ISA Diversity Database
Enhanced RIT
Hardware Replay
Mechanism
Post-processing
(triage)
New Validation Method
10
Contain equivalent Instructions for each instruction
RIT + ERIT (Equivalent RIT)+ checking code
Replace mismatch Instructions in RIT with Counterparts in ERIT
then replay
Data provided by HRM help
clustering of failure modes
Framework
Test Scenario
RandomInstruction Test
RandomInstruction Test
Generator
Enhanced RandomInstruction Test
GeneratorPrototype Triage
Debug
ISA DiversityDatabase
Enhanced log file
HardwareReplay
Mechanism
11
12
Framework (contd)
RIT
ERIT
CheckerBug
Detected
Prototype
Host machine
RITQuestion1:Why replay?
Question2:What if equivalent instruction is the offending one?
13
Enhanced RIT (Check Point)
InstructionSection 1store[1]
InstructionSection 2store[2]
……
store[k]
InstructionSection k
EquivalentInstructionSection 1estore[1]
EquivalentInstructionSection 2estore[2]
……
estore[k]
Equivalent InstructionSection k
CheckingCode
RIT ERIT
Mismatch? Check Point
14
Hardware Support
store-addr: buffer to store PC of all k store instructions.
estore-addr: buffer to store PC of all k estore instructions.
mids-queue: store every mismatch id ( 0~k ).
store counter: counts the number of stores for each run.
bypass control: control PC to bypass buggy code.
Stor
e-ad
drst
ore-
addr
esto
re-a
ddr
esto
re-a
ddr
bypass controlbypass control
store counterstore counter
hit
10
mids-queue
Mids from checking code (from a register)
monitor
25
130
0
Mismatch ids
15
Bypass Control
Run RITSC:0 -> mid-1 PC point to
Next inst ofestore[mid-1]
Until estore[mid] finish PC point to
store[mid]
“buggy” code in RITis bypassed
Remaining test still useful
16
Test Flow
Execute RIT, ERIT,Checking code
Update mids-queue
Mid == 0? N Mid > 0?
sc=0
replay(RIT,ERIT)
Y
sc==mid-1?
Y
PC=estore[mid-1]+4
Execute equivalent ops
PC=store[mid]
Execute checking code
Update mids-queue
Test PassedY N
StartFor all RITs:
17
Post-processing (triage)
Our MethodMismatch identifiers
stores’ addressDebug Engineer
Debug faster
Log information
Bypass buggy instructions
More bugs detectedwith each RIT
• PTLsim simulator: superscalar, OoO, single core x86-compatible• Bugs injected: 802 logic, 225 electrical, 1025 total• Original RIT: 154 RITS, ~4K Inst each, 616K Inst total• Comparison: Reversi & QED• Test Size: 2.4M Instructions for Reversi
1.8M Instructions for QED3.7M Instructions for Proposed
Experimental Environment
17
18
Injected Bugs Distribution
30%
52%
14%
4%
BUGS DISTRIBUTIONFetch/Decode Issue/Execute Retire Instruction&Data
Detected Bugs
19
0
200
400
600
800
1000
1200
Traditional RIT Reversi QED Proposed
detected bugs number
90.54% 88.10%
20.49%
100%
20
Validation Times
Time (Sec) Trad. RIT Reversi QED ProposedGeneration 4.460 6.310 5.530 7.680Simulation 51.000 - - -Execution 0.027 0.110 0.071 0.176
Total 55.487 6.420 5.601 7.856
Upload time, Download time and Compare time are nearly zero so neglected, But should be included for large tests.
Conclusion
Enhanced RIT
Support major ISA Fast self-checking
Hardware based replay
Bypass buggy InstFully utilize bug detect capability of RITs
Log information
Exact information on offending InstHelp locate root cause
AcceleratePost-siliconValidation!
22