© 2010 IBM Corporation

Reaching Coverage Closure in Post-silicon Validation

Allon Adir, Amir Nahir, Avi Ziv (IBM Haifa Research Lab)
Charles Meissner, John Schumann (IBM Server and Technology Group)

Haifa Verification Conference 2010
Some Verification Realities
- Complexity of designs is increasing rapidly
  - And the complexity of verification is increasing even faster
- Many designs have problems that require additional tape-outs, mostly due to functional bugs
- Design of high-end processors calls for more than one (planned) tape-out
  - Development of system and software
  - Tune-up of the manufacturing process
- Functional verification of the actual silicon is (or should be) an integral part of the verification process
Verification Flow in Post-Silicon

[Diagram: a test-template feeds a random stimuli generator, which produces tests; the tests run on the simulator/DUV with checking and assertions, yielding pass/fail results; a coverage analysis tool turns coverage information into coverage reports]

- But ...
  - Most of the compute time is spent in the environment
  - Silicon has very limited observability and controllability
Test-Template Example

Genesys-Pro: symmetric multi-processor test program generator
From Test-Template to Test

The same template yields many different tests, each pairing two processes around a shared address:

Test 1 (address 1000):
  Process 0,1: stw R2, 1000; stw R11, 1000; ...; stw R3, 1000; stw R2, 1000
  Process 0,0: lwa 1000, R1; stw R5, 1000; stw R6, 1000; ...; lwa 1000, R15
Test 2 (address 2500):
  Process 0,1: stw R1, 2500; stw R9, 2500; ...; stw R9, 2500
  Process 0,0: lwa 2500, R5; lwa 2500, R9; lwa 2500, R7; ...; lwa 2500, R15
Test 3 (address 6A00):
  Process 0,1: stw R4, 6A00; stw R5, 6A00; ...; stw R1, 6A00
  Process 0,0: lwa 6A00, R1; stw R9, 6A00; stw R9, 6A00; ...; stw R12, 6A00
Test 4 (address 1FF0):
  Process 0,1: stw R9, 1FF0; stw R8, 1FF0; ...; stw R9, 1FF0; stw R8, 1FF0
  Process 0,0: lwa 1FF0, R3; stw R9, 1FF0; lwa 1FF0, R9; ...; stw R12, 1FF0
Test 5 (address BAA4):
  Process 0,1: stw R1, BAA4; stw R1, BAA4; ...; stw R7, BAA4; stw R3, BAA4
  Process 0,0: lwa BAA4, R3; lwa BAA4, R10; lwa BAA4, R10; ...; stw R4, BAA4
Test 6 (address C118):
  Process 0,1: stw R2, C118; stw R7, C118; ...; stw R7, C118
  Process 0,0: lwa C118, R2; stw R1, C118; lwa C118, R1; ...; stw R1, C118
Bridging the Pre- and Post-Silicon Gap
- We need a unified methodology for the pre- and post-silicon domains
  - Share the goals using a common test-plan
  - Easy sharing and transfer of information between the platforms
    - E.g., for debug purposes
  - Proven pre-silicon methodologies are a good starting point
- Each platform has its own solutions, adapted to its characteristics
Post-Silicon Coverage
- Coverage is one of the main means for monitoring the quality and progress of the verification process
  - Helps identify areas in the DUV that are not verified or only lightly verified
  - Tracks and drives the progress of the verification process
- Coverage relies heavily on observing behaviors of the DUV
  - Which makes it difficult to implement in a post-silicon environment
Post-silicon Coverage – Possible Solutions
- Use in-silicon coverage monitors
  - Take advantage of existing monitors
    - Performance monitors, etc.
  - Add dedicated coverage monitors
- Adding coverage monitors to silicon has a negative effect on timing, power, area, ...
- Unlike in-silicon checkers, coverage monitors are not really useful in the field
- This solution is therefore limited to a small number of really important coverage monitors
Regression Suites
- Regression suites are sets of testcases or test-templates that are run periodically to ensure increasing verification quality
  - Interesting testcases and test-templates are harvested
  - Harvesting is based on quality measures
    - E.g., coverage
- Two major types of regression suites
  - Deterministic, based on testcases
    - Known properties
    - Hard to maintain, sensitive to changes
  - Probabilistic, based on test-templates
    - Easier to maintain
    - Exact behavior is unknown
Regression Suites in Post-silicon
- The coverage of regression suites is known a priori
  - Because the suites are built from previously executed testcases or test-templates
- No need to measure coverage of the suites
  - Limited observability is no longer an issue
- Question: How do we know in advance the coverage of a given test / test-template?
- Answer: Execute the test / test-template on a platform that allows coverage measurement
“Guaranteed” Coverage Implementation
- Run post-silicon generation tools (exercisers) on a pre-silicon platform
  - Simulation is too slow to run exercisers
    - Need a faster execution platform: emulation, acceleration
- Collect coverage data and harvest interesting testcases or test-templates
  - For example, tests that contribute to coverage or tests that reveal bugs
- Use the harvested tests / test-templates as (part of) the post-silicon regression
What to Harvest?
- Running the same test in simulation and on silicon may not produce the exact same behavior
  - Slight differences in models, asynchronous interface behavior
- Need "smart harvesting"
  - Instead of harvesting specific tests, harvest templates that provide a non-negligible probability of hitting events
  - The large number of silicon cycles converts these probabilities to near certainty
  - Harvest specific tests only in special cases
    - No proof of non-negligible probability, e.g., event hit just once in many cycles
    - Test found an interesting bug

Assume:
1. The execution speed ratio is 10^5
2. The probability of hitting the target coverage event in a 10-minute run is 0.1%

Then the probability of not hitting the event in 10 minutes on silicon is
(1 - 1/1,000)^100,000 ≈ 4 x 10^-44
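The slide's arithmetic is easy to check; a minimal sketch of the calculation (the variable names are illustrative, and independent runs are assumed):

```python
# If one 10-minute template run hits an event with probability p = 0.1%, and
# silicon runs ~10^5 times faster than the acceleration platform, then 10
# minutes on silicon amounts to roughly 100,000 independent chances to hit it.
p_hit_per_run = 1 / 1000        # 0.1% chance per pre-silicon run
speedup = 10 ** 5               # silicon vs. pre-silicon speed ratio

# Probability of missing the event in every one of the 100,000 effective runs.
p_miss_on_silicon = (1 - p_hit_per_run) ** speedup
print(p_miss_on_silicon)        # on the order of 4e-44, matching the slide
```

The exact value is about 3.5e-44, which the slide rounds to 4 x 10^-44.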
Regression Suite Algorithm
- For each event e and for each test-template t
  - If t covers e more than 3 times, Cov(e, t) = 1
  - Else, Cov(e, t) = 0
    - 3 is a small constant used to avoid counting events hit by chance
    - The resulting Cov matrix is a 0-1 coverage matrix of test-templates
- Solve the (deterministic) set cover problem
  - There are many known algorithms for this problem
  - A simple greedy algorithm works well
- The solution is a regression suite of test-templates
POWER7 Implementation
- The design was partitioned into groups of related test-plan items
  - With the goal of creating a separate regression suite for each group
- Test-templates were assigned costs according to their relevance to the group
  - Templates related to the group have the lowest cost
    - These templates are identified a priori by the bring-up team
  - Templates for other units have the highest cost
- The set cover algorithm minimizes the cost of the suite
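The cost-minimizing selection can be sketched as a weighted greedy set cover. The cost-per-new-event rule, template names, and cost values below are illustrative assumptions, not details from the talk:

```python
def greedy_weighted(cov, cost):
    """cov: {template: set of events}; cost: {template: relevance cost}.
    Repeatedly pick the template with the cheapest coverage gain, i.e. the
    lowest cost per newly covered event."""
    uncovered = set().union(*cov.values())
    suite, total = [], 0
    while uncovered:
        best = min((t for t in cov if cov[t] & uncovered),
                   key=lambda t: cost[t] / len(cov[t] & uncovered))
        suite.append(best)
        total += cost[best]
        uncovered -= cov[best]
    return suite, total

# Group-relevant templates get low cost; templates for other units high cost.
cov = {"lsu_1": {"e1", "e2"}, "lsu_2": {"e2", "e3"}, "fxu_1": {"e1", "e2", "e3"}}
cost = {"lsu_1": 1, "lsu_2": 1, "fxu_1": 10}
print(greedy_weighted(cov, cost))  # (['lsu_1', 'lsu_2'], 2)
```

Although fxu_1 alone covers all three events, the two cheap LSU templates win, which mirrors the slide's intent: suites for a group should prefer that group's own templates.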
Post-silicon Coverage Platform

[Diagram: the pre-silicon platform (random stimuli generator → tests → simulator/DUV with checking and assertions → pass/fail, plus coverage analysis tool → coverage information → coverage reports) is shown side by side with the post-silicon platform (exerciser generating random stimuli on silicon); Exercisers-on-Accelerator combines the two by running the exerciser on an accelerator]
Exercisers-on-Accelerators (EoA)
- Use post-silicon tools (exercisers) on a pre-silicon platform (accelerators)
  - Note, this is not the only way to utilize accelerators
- Give the post-silicon tools the benefits of the pre-silicon environment
  - Added observability and controllability
- Added benefits
  - Early validation and tune-up of post-silicon tools
  - Post-silicon tools contribute to the pre-silicon verification effort
  - The EoA platform can be used to recreate and analyze bugs found on silicon with the same exercisers
  - Really bridges the gap between pre- and post-silicon verification
Results
- Coverage-driven EoA was used in the verification of the POWER7 processor
- Encouraging pre-silicon results
  - EoA was fully integrated into the pre-silicon flow
  - EoA reached a high level of coverage
    - Almost as high as simulation
  - The coverage-driven approach led to
    - Finding many holes in the activation of the exercisers
    - Finding bugs in the exercisers
    - Finding bugs in the design
      - Including some juicy ones that escaped simulation
- Post-silicon regression suites sped up the bring-up process and increased confidence in its quality
POWER7 Core Coverage Results
Unit       | Unit Sim | Core Sim | EoA    | Total
IFU        | 96.79%   | 96.77%   | 94.99% | 98.65%
ISU        | 96.48%   | 92.49%   | 92.78% | 97.42%
FXU        | 99.60%   | 84.72%   | 85.85% | 99.85%
FPU        | 97.44%   | 98.15%   | 90.20% | 99.58%
LSU        | 94.33%   | 91.04%   | 85.32% | 98.66%
PC         | 92.51%   | 76.95%   | 55.23% | 93.51%
Core Total | 96.18%   | 92.78%   | 88.70% | 98.06%
Post-Silicon Regression for POWER7 LSU
Regression | Test Plan Items | Coverage Events | Templates: Directed | Unit | Other
1          | 6               | 132             | 2                   | 0    | 0
2          | 9               | 91              | 1                   | 1    | 0
3          | 2               | 162             | 0                   | 1    | 0
4          | 11              | 181             | 5                   | 0    | 0
5          | 5               | 175             | 4                   | 0    | 1
6          | 8               | 76              | 2                   | 1    | 0
Summary
- Coverage is an important tool for assessing the quality and progress of the verification process
- Measuring coverage post-silicon is difficult because of the limited observability
- We proposed a method for reaching coverage closure in post-silicon validation
  - Based on creating test-template regression suites in a pre-silicon environment using post-silicon exercisers
- Very good results from usage in POWER7
  - EoA provided high verification value
  - Generated regression suites improved confidence in post-silicon bring-up