Stochastic Modeling for Random Testing Dvcon2009 Final 9009

Embed Size (px)

Citation preview

  • 8/14/2019 Stochastic Modeling for Random Testing Dvcon2009 Final 9009

    1/6

    Sample-path Coverage: Stochastic Model Application

    for Random Testing

    Victor Besyakov

    Tundra Semiconductor Corporation603 March Rd, Kanata, ON, CANADA, K2K 2M5

    ([email protected])

    Abstract This paper proposes an innovative ASIC verification

    coverage method, which measures the effectiveness and quality

    of random simulations. The Sample-Path Coverage (SPC)

    method is based on Design Under Test (DUT) statistical

    modeling. This method is comprised of two stages: collecting

    simulation statistics for multiple random tests, and using

    probability metrics for similarity assessment between

    corresponding runs. The benefits of SPC include optimized

    computer farm usage time during ASIC verification and

    increased random testing efficiency.

    I. INTRODUCTIONConstrained random-based verification methodology is the

    primary tool for the verification of large ASIC designs.However, when extensive large design verification is notfeasible, reasonable DUT state space coverage can beachieved by using appropriate constrained random testscenarios. In this case, the common practice is to measureverification effectiveness by functional coverage [1].

    The widely accepted constrained random-basedverification flow can be divided into three major steps:

    Step 1: Run random test Step 2: Check the functional/code coverage. When a

    desirable coverage has been achieved, verification isdeclared completed, (otherwise Step 3)

    Step 3: Adjust the random stimulus generators by usingconstraints and go to Step1

    As it was stated at [2] Part of problem with randomtesting is that the distribution of random events tends to followa traditional bell-shaped curve; adding more and more randomtests does little to guarantee that new states of the design arecovered. In fact, random tests typically exercise the samedesign space over and over again, with only slight variations.

    Although this problem can be partially solved by

    monitoring code and functional coverage, there are someissues related to constrained random verification that cannotbe resolved by using conventional methods. For example, thefollowing questions related to the test duration still remainopen:

    1. How long should a single random test run?2. Which type random test is better - long or short?3. If a random test is run longer, what is the

    probability of exploring the new design states?

    This paper proposes an innovative ASIC verificationcoverage method, called Sample-path Coverage (SPC) thataddresses these questions by measuring the effectiveness ofthe random simulations. The idea behind SPC is to

    approximate random stimulus generator outputs with anergodic Markov Chain (MC). For a long simulation run,ergodic MC will always converge to the unique stationarymode. This unique MC property enables the gathering ofinformation which shows what design states will be coveredwith higher probability and what design states will, possibly,not be covered at all. Also, MC representation opens upopportunities to apply probabilistic metrics that show howstatistically close two simulations are getting to each other.The SPC coverage information and probabilistic similaritymetrics are then used for random tests optimization.

    This article discusses similarity metrics based onKullbackLeibler divergence. We present them along withempirical statistical similarity criterion. With this analysis, it ispossible to get to decide either to continue the currentsimulation or to stop and generate a new, more effectivesimulation.

    The paper is organized into the following sections:

    The constrained random generator is introduced as abuffer queue data structure probabilistic model.

    Different probabilistic metrics are reviewed. Sample-path coverage framework and usage examples

    are presented.

    II. SAMPLE-PATH PROBABILISTIC MODEL OF THERANDOM GENERATOR

    Lets denote the output of the random generator as asequence of system events:

    , ,1 2 3,..., k E={e}=(e e e e ) .

  • 8/14/2019 Stochastic Modeling for Random Testing Dvcon2009 Final 9009

    2/6

    Fig. 1. Markov chain

    0

    -1k E

    n n+1

    w

    1. k 0 initialize system event counter

    n 0 initialize system clock counter

    C 0 initialize system time

    2. set system eventif k has been changed e F {U}

    n n+1 update clock counter

    C C update system time

    t D

    k

    w n

    k

    update waiting event timer(e )

    if t C if waiting timer is exparied

    e () execute system event

    k k+1 update event counter

    endif

    3. goto 2

    Fig. 2. Algorithm. Random generator as a dynamic probabilistic system.

    Where:

    {e}- is the sample path (events trajectory) characterized byits the cumulative distribution function F (cdf):

    F {e}=P(E e) .E

    ei is variable that represents an individual transaction.

    U is output of generic uniform random generator.

    F {U} is inverse cdf of E .-1

    E C is a system clock (system sampling time).

    When considering generator output as a one buffer queuedata structure, we can define conditions for system eventexecution as:

    n w w k C t , t D(e ) .

    Where tw is waiting queue time for event k.D( )

    ke is the

    operator that associates an individual system event withcorresponding waiting time.

    A simulation algorithm for stand-alone sample pathgenerations can be written as it shown at Fig. 2.

    Fig. 3. Sample-path Trajectory

    Fig. 4. Transition Probability Matrices for Two Simulations (Sim1,

    Sim2).

    Usually, the underlying process model of a discrete-eventstochastic system is generalized as a discrete, irreducible,ergodic Markov chain (MC) [14], where each state emits afixed system event (see Fig. 1. ).

    The sample path of the MC is fully characterized by theinitial states probability and by the one step transitionprobabilities organized as a transition probability matrix:

    1,1 1,2 1,3 1,n

    2,1 2,2 2,3 2,n

    n,1 n,1 n,n-1 n,n

    P P P ... P

    P P P ... PP= .

    ... ... ... ... ...

    P P ... P P

    The two figures illustrate the algorithm outcomes for twosimulation runs using the constrained system scenariogenerator described in [3]. Fig. 3. shows the sample-pathtrajectory, where the system event is equal to the transactionvariable (axis Y). The transaction variable combines into thesingle categorical representation three transaction parameters:transaction type (read, write... etc.), size of data payload, andtransaction destination. The second figure (Fig. 4. ) shows acolored transition probability matrix of the same sample-paths.

    One of the MC properties that make it useful for ASICfunctional verification is a stationary distribution (also calledan equilibrium distribution, or invariant measure). Generally,MC with a probability matrix P will converge in the long runsimulation to unique vector , which satisfies the linear system[13] [8]:

  • 8/14/2019 Stochastic Modeling for Random Testing Dvcon2009 Final 9009

    3/6

    TABLE I. MCMCDESCRIPTIVE STATISTIC

    Statistics J(P1) KLR(P1) J(P2) KLR(P2) J(P1,P2) KLR(P1,P2)

    Mean 0.002991 0.001495 0.003012 0.001506 1.966826 0.980818

    Standard Error 1.85E-05 9.25E-06 1.73E-05 8.66E-06 0.001156 0.000588

    Median 0.003001 0.001501 0.003032 0.001516 1.967595 0.980939

    Mode 0.002843 0.001488 0.003116 0.001475 N/A 0.972879

    Standard Deviation 0.000262 0.000131 0.000245 0.000123 0.016352 0.008314

    Range 0.00139 0.000695 0.00119 0.000595 0.082082 0.043575Min 0.002378 0.001189 0.002451 0.001226 1.931369 0.961709

    Max 0.003768 0.001884 0.003641 0.001821 2.013451 1.005284

    =P . Stationary distribution has a direct impact on the sample-

    path behavior. After equilibrium has been achieved, for thenon-uniform stationary distribution, the likelihood to have anon-repeated, unique sequence of sample-path elements startsto decline. As a result of that, the possibility of discoveringnew design space points by continuing the same simulationrun declines as well. Using this fact, it is possible to find theoptimal test length and decide when to stop the currentsimulation and start the new simulation.

    III. SAMPLE-PATH PROBABILISTIC METRICSProbabilistic metrics have been heavily studied, and

    successfully applied, in several areas such as queuing theory,finance, digital signal processing, and control theory[9][10][11].

    Among all the candidate metric functions, the Kullback-Leibler distance (information divergence, relative entropy) isgenerating considerable interest from a viewpoint of theMarkov process theory, and particularly, the Hidden MarkovModels theory [6]. According to the Kullback-Leiblermeasure, distance between two probability functions P1(x)and P2(x) calculates as:

    11 2 1

    2x

    P (x)KL(P ,P ) = P (x)ln .

    P (x)

    Kullback-Leibler distance is ideally suited as a distribution

    similarity estimator [7] for the following reasons:

    The KL-distance is a generalization of standard tests ofstatistical differences like the t-test, chi-square.

    It is an example of the f-divergence [12], one among aclass of distributional distances that include theHellinger distance and has various geometricinvariance properties.

    It has strong correlation with maximum likelihoodestimator and therefore can be used withinoptimization process.

    By definition, metric or distance (d) between twoprobability functions P1(x) and P2(x), is required to satisfy thefollowing conditions:

    1. d(P1(x), P2(x)) 0 (non-negativity)2. d(P1(x), P2(x)) = 0 if and only if P1(x)=P2(x)

    (reflexivity)

    3. d(P1(x), P2(x)) = d(P2(x), P1(x)) (symmetry)

    4. d(P1(x), P3(x)) d(P1(x), P2(x)) + d(P2(x), P3(x))(subadditivity / triangle inequality).

    The Kullback-Leibler distance itself cannot be called a realmetric between two distributions function, because it does notsatisfy the triangle inequality and it is not symmetric [5][6].

    The triangle inequality metric violation, leads to theconclusion that decreasing relative entropy does notnecessarily imply that two distributions are getting closertogether. In our case, this fact can be ignored, because theKullback-Leibler distance will not be used as a proximitymeasure of two distributions.

    The second metric violation is solved by usingsymmetrization that combines KL(P1, P2) and KL(P2, P1)distances into the one equation. The most popularsymmetrizations of the Kullback-Leibler measure are:

    By simple average called j-divergence [4] [5]1 2 2 1

    1 2

    KL(P ,P )+KL(P ,P )J(P ,P ) = .

    2(1)

    By the geometric mean [5]1 2 1 2 2 1G(P ,P ) = KL(P ,P )+KL(P ,P ) .

    By harmonic mean called resistor average [5][6]-1 -1 -1

    1 2 1 2 2 1KLR(P ,P )=(KL(P ,P ) +KL(P ,P ) ) . (2)It is also possible to find other types of symmetrization in

    literature. Some of these types may include optimizations stepsas an example [6]:

    min 1 2 1 2 2 1D (P ,P ) =min{KL(P ,P ),KL(P ,P )} . The previously mentioned j-divergence (1) and resistor

    average (2)give the best results when applied to the SPC. Thedescriptive statistics (see TABLE I. ) gathering over 1000MonteCarlo Markov Chain (MCMC) simulations for two

    distributions Sim1 (P1) and Sim2 (P2) strengthen this fact(refer to TABLE I. Standard Error and Range). Moreover, thesame descriptive statistics are allowed to define the empiricalbound for KullbackLeibler similarity measures as a 0.005threshold level (refer to TABLE I. Min, Max and Range).

    IV. SAMPLE-PATH COVERAGE FRAMEWORKIn contrast to the conventional metrics, which have

    absolute meaning on the corresponding coverage space (codecoverage, functional coverage), SPC is a relative metric.

  • 8/14/2019 Stochastic Modeling for Random Testing Dvcon2009 Final 9009

    4/6

    Fig. 5. MCMC for in process Sample Path Coverage.

    Fig. 6. Crossbar Switching Fabric DUT.

    It compares different tests by using stochastic equality as ameasure of similarity and MC as a stochastic approximation of

    the simulation process. The simplified workflow algorithm forSPC includes the following steps:

    Step1: Initialization

    Run the first simulation as long as possible.Suggested time constraints for a long simulation run (if

    transition probability matrix is unknown) are: maximumallowed time for overnight regression, maximum allowed sizeof the simulations log, or maximum allowed disc spaceutilization.

    Collect statistics from the identified random generators,compute transition probability matrix, and store it.

    Step 2: Next simulations run

    Run the next simulation

    Fig. 7. Indirect SPC trajectory. State - number of available input buffers,

    ai- arriving events (operatorA ), di -departure events (operator B ).

    Collect the statistics on the fly and compose partialsimulation transition matrices for defined time intervalor for certain numbers of transactions.

    Step 3: Make a comparison. All stored matrices vs. currenttransition matrix by using Kullback-Leibler measures

    Case 1: If metrics are rising above 0.005 thresholdlevel, then run this simulation as long as it possibleand at the end of simulation, save current transitionmatrix for the next comparison round (Fig. 5. c).

    Case 2: If the metrics are falling below the thresholdlevel (due to the statistical equality of the twosimulations), then stop this simulation and dismiss thecurrent transition matrix. The likelihood for this

    simulation exercising the new design states that havenot been covered last time is low. The actualexperiment results show it is possible to save up to60% of the machine cycles (Fig. 5. a, Fig. 5. b).

    Step 4: Repeat steps 2 to 4.

  • 8/14/2019 Stochastic Modeling for Random Testing Dvcon2009 Final 9009

    5/6

    0

    0

    1. n 0 initialize system clock counter

    k 0 initialize system event counter

    S 0 initially no arrivals, set up queue counter

    C 0 initialize system sampling time (system clock)

    2. n n+1 advance to next system clo

    n n+1

    k

    k k

    1 2 k n

    k n

    ck

    C C update system clock 3. if e () customer arrived

    then k k+1 advance to the next system event

    arrival, system state has been changedS A(e )

    4. e {e ,e ,...,e } for all arrivals at time C

    if t C

    n-1 n

    k

    n n

    - C condition for departure for all arrivals at time C

    t is a sojourn time of event k

    departure, system state has been changedthen S B(e )

    (no concurrent departures allowed)

    5. goto 2

    Fig. 8. Algorithm. Generalization of the internal DUT FSM as a dynamic probabilistic system.

    V. APPLICATION OF THE SAMPLE-PATH COVERAGEMETHOD TO THE INTRINSIC DUTFSM

    Applications of SPC method include, but are not limitedto, the assessment of testbench (TB) stimulus randomgenerator quality. Generally speaking, all FSM like TB orDUT components (FIFO, queues, etc) can be modeled by theMarkov process and consequently could be treated as sample-path random generators. The following example explains theapplication of the SPC method to the input buffer crossbarswitching fabric-oriented design, where buffer release processis taken as a sample-path random generator.

    Assume that the DUT (Fig. 6. ) operation starts when theincoming transaction submits a request for service to theswitching fabric end-point. If the input end-point queue is full,the transaction will be asked to retry until the free queueelement is available. When the transaction is accepted by theDUT, it places the transaction into the first available inputqueue entry. Then transaction will be sent forward through theswitching fabric to the appropriate end-point port. After therequest/response pair is completed, the transaction is removedfrom the queue, freeing the space for the next pending request.Using discrete event dynamic system terminology, we candescribe a pair of events trajectory and the system reaction onit, as a stochastic model (Fig. 8. ) The model is parameterized

    by:

    The state change process:

    n{S :n = 0,1,2...} . where n - is state space number

    It represents system reaction to the to the sample pathdefined by k system events

    1 2 kE=(e ,e ,...,e ) .

    And has state change process operators A(e) and B(e)

    S A(e) ,S B(e) .

    {Yt; t 0},the output process. It is defined by

    t n n n+1Y S ;T t T . It has the same state space as Sn and it connects the state

    changes to the time which they occur

    Assume that the output of the continuous time process Fig.8. is equal to the arrival-departure event counting process.Then:

    Operator

    nA(e ) becomes an input buffer counterdecrement function

    n+1 nA(e): S S 1 .

    Operator nB(e ) becomes an input buffer counterincrement function

    n+1 nB(e): S S 1 . +

    Fig. 7. shows an example of the sample path trajectory witha queue counter as an output process.

    The arrival/departure process is well approximated by thestationary ergodic semi-Markov process for queue size equalto m with embedded transition matrix (3) and sojourn timedistribution function (4) specified for each state Sn. [13]

    1,2 1,3 1,m

    2,1 2,3 2,m

    m,1 m,1 m,m-1

    0 P P ... P

    P 0 P ... PP= .

    ... ... ... ... ...

    P P ... P 0

    (3)

    n r nF(S ;t) = P{t } .C (4)

  • 8/14/2019 Stochastic Modeling for Random Testing Dvcon2009 Final 9009

    6/6

    nF(S ;t) is the probability that upon entering a state the

    model stays there (tr is a sojourn time) for at most Cn timeunits [14][15].

    The rest of the SPC flow is still the same:

    1. Run the initial MC simulation, compose and store theinitial transition matrix.

    2. Run the second simulation and compose the currenttransition matrix.

    3. Using probability metric, make a comparison: allstored matrices with current transition matrix.

    4. Save or discard the current transition matrices basedon 0.005 probabilistic metric threshold levels.

    5. Run the next simulation and repeat steps 2-4.Compare to the TB random generator, chosen intrinsic

    DUT process has the following advantages:

    Numbers of states for the MC transition matrices areless compared to TB random generator. As a result ofthat, it is more likely to achieve a stationary modeduring one simulation run.

    It is correlated with both input sample-path distributionfunction and with DUT functionality (Ref. to thesemi-Markov process definition (3) and(4).)

    It filters out simulations that have small impact on theactual DUT functionality. Digital design has atendency to be insensitive to the small sample-pathperturbation and what can look like a differentsample-path behavior (metric > 0.005) at the TBrandom generator output, can be the same in terms ofbuffer release process (metric < 0.005).

    VI. RESULTS AND CONCLUSIONSIn this paper, a new coverage metric approach named

    Sample-path Coverage (SPC) was introduced.

    This method has been applied (as a post simulationprocess) for host-bridge systems. We used two approaches toevaluate quality of the constrained random test suit. The firstwas based on the TB random generator as a SPC randomgenerator and the second was based on the buffer releaseprocess (similar to one that has been described in section I).

    Both approaches show that by using SPC methods, it ispossible to create a verification strategy, and determine:

    How long we need to run single test. How to save computer farm time and leave out the test

    that exercises the same design area over and overagain. Actual experiment results show that SPC savesup to 60% of machine cycles compared to thetraditional approach.

    ACKNOWLEDGMENTS

    Special thanks to Mr. David Shleifman for his numerousdiscussions, suggestions, and remarks and to Ms. Rhyno forthe proofreading.

    REFERENCES

    [1] A promising approach to overcome the verification gap of modern SoCdesigns.Whitepapers.Verisityhttp://www.verisity.com/resources/whitepaper/soc_nec.html

    [2] DAC 1999 Panel: Functional Verification - Real Users, Real Problems,Real Opportunities

    [3] Victor Besyakov, David Shleifman Constrained Random TestEnvironment for SoC Verification using VERA SNUG 2002 Boston.

    [4]

    Fuglede, B. Topsoe, F. Jensen-Shannon divergence and Hilbert spaceembedding

    [5] D. H. Johnson and S. Sinanovic. Symmetrizing the Kullback-Leiblerdistance. IEEE Transactions on Information Theory, 2001.

    [6] Hershey, J.R. Olsen, P.A. Rennie, S.J. Variational Kullback-Leiblerdivergence for Hidden Markov modelsAutomatic Speech Recognition& Understanding, 2007. ASRU.

    [7] Tamraparni Dasu, Shankar Krishnan, Suresh Venkatasubramanian, andKe Yi An Information-Theoretic Approach to Detecting Changes inMulti-Dimensional Data Streams 38th Symposium on the Interface ofStatistics, Computing Science, and Applications (Interface '06),Pasadena, CA, May, 2006.

    [8] Edited by Michiel Hazewinkel Encyclopaedia of Mathematics.http://eom.springer.de/s/s087260.htm

    [9] S T. Rachev Probability Metrics and the Stability of Stochastic Models.John Wiley and Sons, Chichester, 1991.

    [10] S.T. Rachev, F. J. Fabozzi, S.V. Stoyanov The Ideal Risk,Uncertainty, and Performance Measures John Wiley & Sons, 2008

    [11] Steeb. Hackensack Mathematical tools in signal processing with C++& Java simulations Willis-Hans, NJ, World Scientific, 2005.

    [12] F-divergence Wikipedia, the free encyclopediahttp://en.wikipedia.org/wiki/F-divergence

    [13] Rubinstein, Reuven Y., Kroese, Dirk P. The Cross-Entropy Method AUnified Approach to Combinatorial Optimization, Monte-CarloSimulation and Machine Learning Springer-Verlag New York, LLC,2004.

    [14] Markov chain Wikipedia, the free encyclopediahttp://en.wikipedia.org/wiki/Markov_chain

    [15] Edited by Michiel Hazewinkel Encyclopaedia of Mathematics.http://eom.springer.de/S/s084220.htm