
Stochastic Simulation Tool for VLSI Final Report (continuing project), Spring 2017

By

Luis E. Martinez

Sergio Graniello

Christopher Chapin

Department of Electrical and Computer Engineering

Colorado State University Fort Collins, Colorado 80523

Project Supervisor – Dr. Sourajeet Roy

Approved by -


Abstract

The word stochastic, defined as “synonymous to random…. pertaining to chance” [1], is a very powerful word when applied to any design problem. Randomness complicates any problem by adding an extra degree of complexity which requires careful reconsideration of even the most accurate models. Uncertainty rules the universe around us, and it becomes greatly magnified at the nano and micro scales, where even the smallest of deviations can have unprecedented impacts.

The stochastic nature of manufacturing electrical boards and components becomes most obvious at the micro and nano scales, where even the slightest variation in the dimensions or properties of a component can cause unacceptable output variations for circuits. Typically, the Monte Carlo algorithm is employed to characterize the span of the possible variations within the output of a circuit with stochastic error. Being a sampling-based method, Monte Carlo is able to deliver highly accurate results at the cost of time, with some of our results for simple amplifier circuits taking up to 8 hours to fully generate for 15,000 samples. While still the industry standard, Monte Carlo is clearly a less-than-optimal solution for analyzing the statistical boundaries of a system. It is easy to see that the time cost of Monte Carlo will grow as the complexity of circuits and the number of components with stochastic error within these systems increase.

Polynomial Chaos expansions, a special construction within a Hilbert space, have been widely utilized in fields ranging from fluid dynamics to internal medicine to model the randomness within a system and predict the range of outputs that the system may yield. The polynomial expansions can be used to model any system with a finite second moment and therefore, as we have shown with our results, are a strong candidate for modeling circuit systems, with results that can be as good as those given by Monte Carlo. By modeling a low noise amplifier with a third degree polynomial expansion, our team was able to produce a mean and statistical boundaries for the circuit system that matched the results extracted using Monte Carlo to within 99%, and to do so within just a few minutes, as opposed to the 6+ hour period that Monte Carlo required to deliver the same statistical moments.

With the groundwork done and a solid algorithm to follow for using polynomial expansions to model circuit systems, scaling the algorithm to work for a large number of random parameters is our goal for the coming semesters. Scaling the problem up to VLSI implementations, while easy in theory, will present many obstacles, primarily that of memory usage on a computer system, and therefore additional computation time or the requirement of a more powerful machine. By creating additional algorithms to assess the importance of each degree of freedom to the overall statistical moments of a system, we hope to simplify the computation and essentially generate a low-fidelity model of a complex circuit whose statistical moments approximate those of the full polynomial expansion, albeit with shorter computation time and a lesser demand for memory.


Table of Contents

Abstract

Chapter I – Introduction

Chapter II – The Monte Carlo Algorithm and Implementation

1. Objectives
2. Monte Carlo Analysis Method
3. Implementation of Monte Carlo
4. Monte Carlo Results for Low Noise Amplifier
5. Summary of Findings

Chapter III – Polynomial Chaos Expansions and Implementation

1. Objectives
2. Polynomial Chaos Model
3. Implementation of Polynomial Chaos Model
4. Polynomial Chaos Results for Low Noise Amplifier
5. Summary of Findings

Chapter IV – Dimension Reduction and Future Work

1. Dimension Reduction Algorithms
2. Future Work

Chapter V – Conclusions

Appendix A – Abbreviations

Appendix B – Budget

Appendix C – Timeline

References


Chapter I - Introduction

Our education, whether we realize it or not, revolves around the concept of determinism. Determinism, as defined by Merriam-Webster [1], is a doctrine by which everything in existence is governed by discernible laws, which directly implies that everything in the universe can be accurately characterized and predicted given enough information and the appropriate laws. While this can be helpful in simplifying concepts and ideas that need to be understood before other ideas can be built on them, it often presents an overly simplified and limiting perception of reality. As engineers, we are expected to make these kinds of simplifications in all types of problems in order to come up with “good enough” approximations and solutions to the problems we face.

Unfortunately, an in-depth analysis of a system is often the only means of determining what we can consider good enough and what needs to be improved. Such is the case in circuit design, especially in the hugely complex field of VLSI circuit design and optimization. As the scale of components in a system shrinks and the number of components increases (microprocessors have surpassed tens of billions of transistors as of the writing of this paper), determining when an electrical system is good enough becomes increasingly difficult. That is, in the real world, it is not good enough to deliver a solution that works only in optimal cases. It is crucial that engineers meet specifications in both best and worst case scenarios, and as such it is important to be able to accurately and efficiently quantify the performance of a system as a function of the input error that is generated within manufacturing tolerances. It can be difficult to imagine just how much error exists in each system we create, but when one considers that our transistors are nearing the single-atom scale, it becomes easy to imagine just how much variation can exist between individual transistors and, therefore, the larger components they make up.

As with any computation-heavy problem, our computers excel at following sophisticated algorithms created to evaluate problems with different input parameters. One of these algorithms, Monte Carlo, has been used for decades for modeling error propagation in circuit systems. Though an extremely reliable and foolproof algorithm, Monte Carlo requires huge numbers of samples to generate accurate statistical moments, as our characterization of the algorithm in Chapter II shows in great detail.

Our project name, Tools for Stochastic VLSI Simulations, stems from the biggest challenge that electrical and computer engineers face today: the error propagation analysis of VLSI systems, systems for which Monte Carlo simulations can be extremely taxing on the time and memory resources of even the most powerful supercomputers available. Polynomial Chaos shows promise in solving the problem of time cost, or at least in providing a method by which engineers can approximate the statistical moments of a system to determine whether time and effort should be spent further investigating the system with more accurate algorithms.


The nature of, and the success we found with, the polynomial chaos expansions is further explored in Chapter III, along with the abstract concept of the Hilbert space and its spanning Hermite polynomial set from which our modified polynomial chaos theory is derived. In general, the goal of the team is to provide a concrete implementation of polynomial chaos expansions that provides insight into the following two concepts that our project will focus on:

1. Unlike Monte Carlo, Polynomial Chaos is not a sampling-based algorithm. As such, it is reasonable to predict that the accuracy of a simulation using Polynomial Chaos will not be a function of the number of samples that we are able to produce. What this means for us is that Polynomial Chaos will have a lower time and computation cost than Monte Carlo. This will be especially important in complex circuit analysis, where a single Monte Carlo simulation may require several minutes to complete, without even taking into account that several thousand of these have to be completed. As circuit complexity increases and the number of Monte Carlo samples required grows, Polynomial Chaos will likely hold a significant edge in terms of simulation time.

2. Because Polynomial Chaos relies on a polynomial expansion to simulate a circuit, we will end up with an easily tweakable circuit model. Monte Carlo, which relies on sampling, does not allow for easy modification of a circuit that is being simulated.

Chapter II - The Monte Carlo Analysis Method

1. Objectives

The purpose of using the Monte Carlo method is to provide us with an accurate model that can be used as a test benchmark for Polynomial Chaos theory. With this, it is possible to study different types of circuits and see the level of accuracy and speed Polynomial Chaos can provide for random variations in comparison to Monte Carlo. The results computed using the Monte Carlo method were crucial for showing the validity of Polynomial Chaos for predicting error propagation when modeling anything from simple filters to the large arrays of transistors in modern day integrated circuits that rely on Very Large Scale Integration (VLSI).

Our work for the first half of the semester was mostly focused on creating test benchmarks using the Monte Carlo method. Using Matlab and HSpice, we were able to create thousands of netlists for specific circuits and simulate them to output voltages at different nodes. Using the HSpice toolbox provided with the CPPSim System Simulator, it was possible to transfer the data output by HSpice to Matlab for statistical computations.


2. Monte Carlo Method

The Monte Carlo method is currently adopted as the industry standard and can be traced back to Enrico Fermi implementing the algorithm to calculate neutron diffusion in the 1930s [3]. While the Monte Carlo algorithm then was not as developed or widely used as the algorithm we have today, Fermi's work is considered the cornerstone of the modern-day Monte Carlo methods that are widespread throughout the industry.

In the early 1940’s, Stan Ulam developed an algorithm that is most similar to modern day Monte Carlo while playing a game of solitaire [3]. He attempted to calculate the likelihood of winning a game based on the layout presented to him. After struggling to use combination calculations, he formed an idea of using computers as tools to calculate the percent chance of winning the layout. [3] Ulam and his colleague John Von Neumann then took this idea and suggested using computer experiments based on chance to research aspects of nuclear physics that produced chaotic results. This project was thus known as the Monte Carlo, which is a reference to the casino in Monaco. [3]

Nick Metropolis, a designer of next generation computer controls at the time, became fascinated with the Monte Carlo method and how it handled stochastic modeling. His fascination led him to the development of a computing system that was better able to handle the algorithms [3], thus modernizing the method yet again. As a result, Monte Carlo began spreading outside the fields of physics and gambling and into other branches of science and modeling research.

The fascination with and work done on Monte Carlo simulations led Fermi, Ulam, and Von Neumann to publish a paper in 1949, forming the modern sequential basis for Monte Carlo. After this, many papers on Monte Carlo began to appear in different fields, and in 1953 the first major Markov chain Monte Carlo (MCMC) paper was published [3], thus founding the modern algorithm that we are interested in studying. Monte Carlo continues to develop and grow in other fields such as modern statistics and artificial intelligence. In the last 20 years, the Monte Carlo method has become the gold standard for producing statistical analysis in many fields, including engineering, finance, and physics, due to its low level of complexity and the high level of accuracy it can achieve.

3. Implementation of Monte Carlo

The Monte Carlo method relies heavily on simulating a multitude of netlists created using fixed and variable, Gaussian-distributed component values. To start our computation, we created a test netlist in a text file using the structure presented in the HSpice manual. HSpice then utilizes the input netlist to fill matrices that can be used to solve the system of equations that describes a circuit, through LU decomposition and integration formulas. Netlists consist of columns of several key fields that describe the position and values of circuit elements. For example, the leftmost column describes the type of element, using a letter and a number to differentiate between different instances of the same component. The next two columns describe the nodes between which each element is placed, which tells HSpice where to put the attributes of each element in each of the matrices. Our graduate and independent study students, Angela and Xiang, were responsible for working with HSpice and its documentation in order to figure out how to model different components in netlists, ranging from simple resistors to complicated PMOS models. Once their netlist for each circuit was tested and perfected, we were responsible for writing a script to perform six functions:

a. Generate samples from a Gaussian distribution to model different values of capacitance, inductance, etc.
b. Generate a netlist in the format constructed, using the different samples in their appropriate rows and columns
c. Utilize Matlab to feed text commands to the Windows PowerShell to operate separate instances of HSpice and run each simulation
d. Utilize functions from the HSpice toolbox to extract node voltages from HSpice output files
e. Compile statistics from the node voltages, including the mean and variance
f. Plot the mean and statistical bounds

Once we made a single netlist work correctly, we implemented the structure of the netlist into a Matlab script in order to produce multiple netlists with varying degrees of error. For the most part, coding the scripts was a simple process. Text files are an easy way to write netlists thanks to our ability to open and view them from any computer, which allowed us to test our code easily. Thankfully, Matlab allows for easy creation and modification of text files using the “fopen” function. The script uses a random number generator to create a normally distributed number between -1 and 1. This number is then multiplied by the magnitude of variation of each quantity that is to be varied and added to the nominal value in order to create the value with its error. The value could then be added to a string where it belonged in the netlist, and each of the strings could be compiled together in order. This process was repeated thousands of times for each run of the script in order to generate the samples required for the results to converge, as sketched below.
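A minimal sketch of this generation step is shown below. The file names, component list, and netlist layout are illustrative placeholders, not our exact LNA netlist.

```matlab
% Sketch of Monte Carlo netlist generation (illustrative component values).
% Each component value is perturbed by a Gaussian random number scaled by
% its tolerance, then written into a fresh netlist text file.
nominal   = [1e3, 1e-12, 1e-9];        % example R, C, L nominal values
tolerance = [0.05, 0.10, 0.10];        % fractional variation per component
prefix    = {'R1 1 2', 'C1 2 0', 'L1 2 3'};

for sample = 1:15000
    fid = fopen(sprintf('netlist_%d.sp', sample), 'w');
    fprintf(fid, '* Monte Carlo sample %d\n', sample);
    for k = 1:numel(nominal)
        % value + (random number) * (magnitude of the quantity's variation)
        value = nominal(k) * (1 + tolerance(k) * randn);
        fprintf(fid, '%s %g\n', prefix{k}, value);
    end
    fprintf(fid, '.tran 10p 10n\n.end\n');
    fclose(fid);
end
```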

In order to test each generated netlist, we required use of the Windows PowerShell. By using Matlab's external command function, we were able to open instances of HSpice and feed a netlist to each. HSpice would then simulate each circuit, generate an output file, and close itself. For each simulation, we utilized the HSpice toolbox to pull values from each output file as floating point numbers and store them in a vector. A pointer was set up to keep track of the location of each element of the vector in order to ensure that the data plotted corresponded to the appropriate output function of the circuit. A sketch of this loop follows.
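The run-and-extract loop looked roughly like the sketch below; `loadsig` and `evalsig` are functions from the CPPSim HSpice toolbox, and the signal name `v_out` and the file names are placeholders for the node and files of interest.

```matlab
% Sketch of driving HSPICE from Matlab and harvesting node voltages.
% Assumes hspice is callable from the shell and the CPPSim HSpice
% toolbox is on the Matlab path.
numSamples = 15000;
vout = [];                                     % one row per sample
for sample = 1:numSamples
    % Feed a text command to the shell to run one HSpice instance
    system(sprintf('hspice netlist_%d.sp -o out_%d', sample, sample));
    % Pull the transient waveform out of the HSpice output file
    sig = loadsig(sprintf('out_%d.tr0', sample));
    vout(sample, :) = evalsig(sig, 'v_out');   % node voltage vs. time
end
```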

At the same time as the vectors were being generated for plotting, each node voltage was being taken into account for a mean and standard deviation calculation, given by the following formulas:
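These are the standard sample estimators over the N Monte Carlo samples, where $V_j(t)$ is the node voltage extracted from the jth netlist:

$$\mu(t) = \frac{1}{N}\sum_{j=1}^{N} V_j(t), \qquad \sigma(t) = \sqrt{\frac{1}{N-1}\sum_{j=1}^{N}\big(V_j(t) - \mu(t)\big)^2}$$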


Once our information was stored in our results vector, we evaluated the mean plus three standard deviations and the mean minus three standard deviations, which gives the statistical bounds of the simulated circuit: values that represent the range of outputs that can be expected from the circuit with certain error parameters in its components.

4. Monte Carlo Results for the Low Noise Amplifier

The circuit we primarily focused on was a low noise amplifier with parameters given in Figure 2. The circuit had variability in ten out of its thirteen dimensions. As shown in the table, each variable has a different percent of variation, with the transistors having a maximum of 20%. To run this simulation successfully, it was necessary to use a 10 nanosecond time interval with 5000 steps. The resolution was set to 10 picoseconds, and the AC source was given a 1 V amplitude and a 1 GHz frequency. The resulting output graph can be seen in Figure 3.

The low noise amplifier was fed with a sinusoidal signal, which was appropriately amplified in the simulation. The output plot for the low noise amplifier shows the mean and the statistical bounds of the circuit. The lower boundary shows some clipping toward the peaks, likely due to the output railing against the lower supply voltage.

5. Summary of Findings

The results produced by Monte Carlo are both accurate and effective. The biggest issue found when using Monte Carlo was its computational intensity. To put it into perspective, running 15000 simulations of the LNA circuit took 8 hours. Our investigation into the causes of the huge time cost of the simulations started in our VLSI algorithms class, ECE 442.

As we learned in class, circuit simulations are carried out by mathematical operations on matrices that contain the information of the circuit we are interested in. Circuit elements are modeled using “stamps” that are derived from a modified nodal analysis. Using these stamps, we are able to derive a differential equation for each circuit in the following form.
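In the standard modified nodal analysis form, which these stamps assemble into, this is

$$\mathbf{C}\,\frac{d\mathbf{x}(t)}{dt} + \mathbf{G}\,\mathbf{x}(t) = \mathbf{B}\,\mathbf{u}(t)$$

where $\mathbf{x}(t)$ holds the node voltages and branch currents, $\mathbf{C}$ collects the stamps of the dynamic elements (capacitors and inductors), $\mathbf{G}$ the stamps of the static elements, and $\mathbf{u}(t)$ the independent sources.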

The complexity of each system of equations depends on the number of elements used in each design, which helps explain why the LNA simulations from our results section took several hours. Because the LNA had a significant number of elements, the matrices in its system of equations were approximately ten times the size of the matrices for the RC filter. This translates to a longer computation time in HSpice due to the number of row operations that have to be carried out in order to triangulate each matrix. Furthermore, a circuit with high amounts of variation requires a larger number of iterations in order to allow the results to converge properly. This is where the biggest limitation of Monte Carlo exists. For extremely large circuits, as is the case in any modern VLSI circuit where the transistor count alone can stretch well into the billions, the Monte Carlo algorithm represents an unbelievably high time cost. For the LNA circuit, the HSpice computation time we experienced was just 2 seconds per iteration, resulting in just over 8 hours for a total of 15000 permutations of variability. Seeing as the computational cost for LU decomposition scales with the cube of the number of rows and columns in a square matrix, it is easy to see the massive issue that computation time becomes. Our LNA circuit had slightly over 10 nodes, meaning that our square matrices had about 10 rows and columns. Doubling the number of nodes to 20 by adding a few extra components increases the computational complexity by a factor of 8, meaning that it is entirely realistic for the HSpice computation time of the hypothetical circuit to increase from 2 seconds to over 10 seconds per iteration. That change alone would increase the theoretical computation time to slightly over 40 hours. Having the number of nodes grow into the thousands for a mid-complexity integrated circuit would increase simulation time by a factor of thousands.
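To make the scaling concrete with the numbers above:

$$\frac{T_{20\ \text{nodes}}}{T_{10\ \text{nodes}}} \approx \Big(\frac{20}{10}\Big)^{3} = 8, \qquad 15000 \times 10\ \text{s} = 150000\ \text{s} \approx 41.7\ \text{hours}$$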

While the computation of our statistical bounds was time consuming, it provided us with an accurate model. Upon developing an expansion for the LNA circuit with Hermite polynomials and plotting the solution of the polynomials for the circuit, we are able to compare the statistical bounds generated by each process in terms of accuracy and processing time.

Chapter III - The Polynomial Chaos Expansions and Implementation

1. Objectives

Our plan for the semester included two primary goals: to demonstrate polynomial chaos' relevance in circuit simulations, and to demonstrate that polynomial chaos has some advantages over traditional Monte Carlo simulations. Using results from last semester's Monte Carlo analysis as a benchmark, our goal was to demonstrate that with our own Polynomial Chaos model we could produce accurate results and statistical moments (compared to those generated with Monte Carlo), which effectively shows by direct example that polynomial expansions are a valid solution to the time constraints and issues that Monte Carlo displays.

2. The Polynomial Chaos Model

The general idea behind Polynomial Chaos theory is to utilize specialized polynomials that, given an appropriately scaled infinite expansion using Hermite basis functions as a basis in a Hilbert space, can model the randomness of any system with a finite second statistical moment, thus eliminating the need for a repeated sampling method such as that seen in Monte Carlo analysis. The idea is very similar to a Fourier expansion, which models the behavior of a system in the frequency domain. That is, for any system there is an infinite sum of components that approximates the behavior of the same system in a different algebraic space. This transform-like property allows us to use different models in different vector spaces, with each model having its own advantages and disadvantages for different applications. Where normally an ideal output of the system is given as G(t) when ignoring


any stochastic effects on the system, we can assume that there is a modified output G(t, ξ) that factors in the stochastic behavior of the system. Using a separation of variables technique, G(t, ξ) can be expressed as:

$$G(t,\xi) \approx G_o(t)\,\phi_o(\xi) + G_1(t)\,\phi_1(\xi) + \dots + G_p(t)\,\phi_p(\xi) = \sum_{i=0}^{p} G_i(t)\,\phi_i(\xi)$$

where $\phi_i(\xi)$ are known basis functions for a Normal (Gaussian) distribution taken from the Polynomial Chaos model, and $G_i(t)$ are unknown coefficients to be solved for by the model using a method of Linear Regression. Below, the Hermite polynomials used as the basis functions for random variables of a normal distribution are shown. The higher the order k of the polynomials used to model randomness in the system, the greater the accuracy of the model outputs, at the cost of a greater number of computations.

As p → ∞, the sum

$$\sum_{i=0}^{p} G_i(t)\,\phi_i(\xi)$$

will fit the actual output $G(t,\xi)$ exactly, but for our modified Polynomial Chaos model a finite number of terms is sufficient to accurately approximate the system via the coefficients $G_i(t) = x_i(t)$. For n random variables in a system, using polynomials of degree k to model the randomness,

$$p = \frac{(n+k)!}{n!\,k!} - 1$$

terms in the sum are sufficient to get a good approximation of the actual system output. The values ξ that the random variables take to model the randomness of the system, known as Quadrature Nodes, are derived from the Polynomial Chaos model itself. Using the “classical” method of computation for Polynomial Chaos theory, every Quadrature Node generated by the model is used in computing the system output, where the total number generated by the model is given by


$$Q = (k+1)^n$$

As the number of random variables n increases in the system, the total number of nodes increases exponentially, making the classical Polynomial Chaos method too computationally expensive for larger problems with many degrees of freedom (as in the case of our LNA circuit, with n = 10 sources of error).

Our modified Polynomial Chaos theory instead uses machine learning, utilizing a Fedorov search algorithm to select an optimal subset of the total available Quadrature Nodes in the model to give a “good enough” approximation of the output in a fraction of the time it would take to compute the output with every available node. With our modified model,

$$N = 2(p+1)$$

gives a sufficient number of nodes for an accurate output. For these N Quadrature Nodes $\xi^{(1)}, \xi^{(2)}, \dots, \xi^{(N)}$ of a given random variable, then:

$$G(t,\xi)\Big|_{\xi=\xi^{(j)}} \approx \sum_{i=0}^{p} G_i(t)\,\phi_i\big(\xi^{(j)}\big)$$

Running these calculations N times, once for each Quadrature Node, and splitting the different components of the equation into different matrices:

A: basis functions evaluated at the Quadrature Nodes, $\phi_i(\xi^{(j)})$
x: system coefficients, $G_i(t) = x_i(t)$
B: system outputs, $G(t,\xi)\big|_{\xi=\xi^{(j)}}$

we obtain the matrix equation Ax = B, where the A and B matrices are known and the system coefficients x are solved for using the method of Linear Regression:

$$\underbrace{\begin{bmatrix} \phi_o(\xi^{(1)}) & \cdots & \phi_p(\xi^{(1)}) \\ \vdots & \ddots & \vdots \\ \phi_o(\xi^{(N)}) & \cdots & \phi_p(\xi^{(N)}) \end{bmatrix}}_{A} \underbrace{\begin{bmatrix} x_o(t) \\ \vdots \\ x_p(t) \end{bmatrix}}_{x} = \underbrace{\begin{bmatrix} G(t,\xi^{(1)}) \\ \vdots \\ G(t,\xi^{(N)}) \end{bmatrix}}_{B}$$

Note that this matrix equation is for only a one dimensional case. The general method remains the same for a larger dimensional problem, including our n = 10 LNA circuit, but with larger matrices to accommodate the larger number of Quadrature Nodes and the larger combination of basis functions and random variables. The A matrix is obtained directly from the basis functions and Quadrature Nodes given by the model and is calculated in Matlab. The B matrix values, which for this simulation are output voltages from our LNA circuit, are obtained by running N different HSPICE simulations, one for each Quadrature Node selected by the machine learning algorithm. The values of the different circuit components are calculated using the equation

$$P_i = M_i + S_i\,\xi_i$$

where $P_i$ is the calculated value of the circuit parameter of interest for the given Quadrature Node, $M_i$ is the average expected value of the parameter, and $S_i$ is the amount of deviation allowed in the parameter.

Once A and B are obtained using Matlab and HSPICE, solving for the system coefficients x using matrix algebra, we obtain the equation

$$x = (A^T A)^{-1} A^T B$$

as a means for finding the system coefficients.
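In Matlab, this regression step is a single least-squares solve. The sketch below assumes `A` (the N × (p+1) basis evaluations) and `B` (the N × Nt matrix of HSPICE output waveforms) have already been assembled.

```matlab
% Solve A*x = B for the PC coefficients in the least-squares sense.
% A: N x (p+1), basis functions evaluated at the quadrature nodes
% B: N x Nt,    one simulated output waveform per quadrature node
x = (A' * A) \ (A' * B);   % normal equations, x = inv(A'*A)*A'*B
% Equivalently, and better conditioned numerically:
x = A \ B;                 % Matlab's built-in least-squares solve
```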

Once the coefficients of the system are found, they can be used to directly obtain the Mean and Standard Deviation of the system output, avoiding the repeated sampling method to obtain these values in the Monte Carlo analysis. The mean is obtained starting with the classic definition:

$$\text{Mean} = \int G(t,\xi)\,P(\xi)\,d\xi = \sum_{i=0}^{3} \int x_i\,\phi_i(\xi)\,P(\xi)\,d\xi = \sum_{i=0}^{3} x_i \int \phi_i\,\phi_o\,P(\xi)\,d\xi$$

Using the fact that the basis functions are orthogonal to each other and the definition of orthogonality, which states

$$\int \phi_i(\xi)\,\phi_j(\xi)\,P(\xi)\,d\xi = \begin{cases} 0 & \text{for } i \neq j \\ 1 & \text{for } i = j \end{cases}$$

the expression reduces to

$$\sum_{i=0}^{3} x_i \int \phi_i\,\phi_o\,P(\xi)\,d\xi = x_o$$

where $x_o$ is the first coefficient generated by the Linear Regression method in the x matrix. Similarly, the Variance (and Standard Deviation) of the output are obtained:

$$\operatorname{Var}(G(t,\xi)) = \int \big(G(t,\xi) - \operatorname{Mean}(G(t,\xi))\big)^2\,P(\xi)\,d\xi = \int \Big(\sum_{i=0}^{3} x_i\,\phi_i(\xi) - x_o\,\phi_o(\xi)\Big)^{\!2} P(\xi)\,d\xi$$

$$= \int \Big(\sum_{i=1}^{3} x_i\,\phi_i(\xi)\Big)^{\!2} P(\xi)\,d\xi = \sum_{j=1}^{3}\sum_{i=1}^{3} \int x_i\,\phi_i(\xi)\,x_j\,\phi_j(\xi)\,P(\xi)\,d\xi$$

Again, using the fact that the basis functions are orthogonal to each other, this reduces to

$$\operatorname{Var} = x_1^2 + x_2^2 + x_3^2 + \dots + x_m^2$$

for m coefficients. Then

$$\operatorname{Std.\ Dev.}(G(t,\xi)) = \sqrt{\operatorname{Var}(G(t,\xi))} = \sqrt{x_1^2 + x_2^2 + x_3^2 + \dots + x_m^2}$$
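Translated into code, extracting the statistics from the coefficients is immediate. This sketch assumes `x` is the (p+1) × Nt coefficient matrix from the regression above, with the $x_o$ row first.

```matlab
% Mean and standard deviation of the output, directly from PC coefficients.
mu    = x(1, :);                        % x_o(t): the mean over time
sigma = sqrt(sum(x(2:end, :).^2, 1));   % sqrt(x_1^2 + ... + x_m^2)
upper = mu + 3*sigma;                   % statistical bounds, matching the
lower = mu - 3*sigma;                   % Monte Carlo +/- 3 sigma plots
```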

Each coefficient generated by the model is a function of time over whatever window we choose to analyze our circuit (in the case of the LNA circuit, 0 to 5000 ps), likewise allowing us to obtain functions over time for the Mean and Standard Deviation of the output, putting the results in line with what you would expect from the Monte Carlo analysis. The next couple of sections detail these results obtained from the Polynomial Chaos model, taking the functions Mean(t) and Std. Dev.(t) calculated from the coefficients and comparing them side by side with the results from the repeated sampling of Monte Carlo.

3. Implementing the Model in Matlab and HSPICE

Our modified version of Polynomial Chaos can be represented with the general matrix equation Ax = B. To calculate our basis functions (the A matrix), we used Hermite polynomials, a class of polynomials orthogonal with respect to a Gaussian weight function; they therefore correspond to a normal distribution of the random variables and can be obtained through the Hermite polynomial generating formula

$$H_k(\xi) = (-1)^k\,e^{\xi^2/2}\,\frac{d^k}{d\xi^k}\Big(e^{-\xi^2/2}\Big)$$

Once the Hermite polynomials are found, they are evaluated at the quadrature points selected by our machine learning algorithm and placed inside our A matrix. HSPICE solutions are obtained as a set of output voltages calculated for each simulation of the circuit with different values of the random variables, and then put into the B matrix. Finally, using the Linear Regression method, we solve for our system coefficients in the form of the x matrix using the matrix equation. Once we have our coefficients, we can solve for our mean and standard deviation as shown in Section 2 of this chapter.
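Rather than differentiating the generating formula symbolically, the polynomials can be evaluated with the equivalent three-term recurrence $H_{k+1}(\xi) = \xi H_k(\xi) - k\,H_{k-1}(\xi)$; a sketch:

```matlab
function H = hermite_eval(k, xi)
% Evaluate the probabilists' Hermite polynomial of order k at the
% points xi, using the recurrence H_{k+1} = xi.*H_k - k*H_{k-1}.
Hprev = ones(size(xi));            % H_0(xi) = 1
if k == 0, H = Hprev; return; end
H = xi;                            % H_1(xi) = xi
for j = 1:k-1
    Hnext = xi .* H - j * Hprev;   % three-term recurrence step
    Hprev = H;
    H     = Hnext;
end
end
```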

To create our script, we used the same format that we used for Monte Carlo to create our netlists. The biggest change was replacing the Gaussian distribution sampling associated with each degree of freedom with the calculated quadrature points. The other portion of the code that needed to change was the calculation of the mean and standard deviation, from a repeated-sampling calculation to a calculation based on the PC coefficients.

4. Polynomial Chaos Results for Low Noise Amplifier

The low noise amplifier used the same constraints for the Polynomial Chaos model as it did for the Monte Carlo method, as shown in Chapter II Section 4. The biggest difference was that instead of using a Gaussian distribution resulting in 15000 iterations, we used 572 netlists created by implementing our calculated quadrature points. The results for our standard deviation and mean can be seen in the figures below. Each plot has two waveforms, one showing the Polynomial Chaos results and the other showing the Monte Carlo results.


As can be seen in the graphs, Polynomial Chaos produced results that were nearly identical to the Monte Carlo results with 1/180th of the computation time. There were some slight differences between the PC and MC plots for the standard deviations, but these were insignificant enough not to cause any issues. If this difference were ever to cause issues, the coefficients can be computed at a higher polynomial order to increase accuracy even further. In turn, this will lead to a slightly longer computation time, but overall it will still be much faster than the Monte Carlo method.

5. Summary of Findings

Overall, our results from the Polynomial Chaos model, compared against our work done in the Fall semester, have been encouraging. As seen in the previous section, the results from the PC model converge very well with the verified, accurate results from the Monte Carlo analysis, and in a fraction of the computation time required by the latter, since our PC model does not require a repeated sampling method to obtain the Mean and Standard Deviation.

We have found that, with the 10 degrees of freedom in our low noise amplifier, a 3rd order Hermite polynomial models the randomness of the circuit very well, requiring 572 Quadrature Nodes and, likewise, 572 HSPICE computations to fit the random variables to a normal distribution and produce accurate outputs from the model. Compared to the 15000+ HSPICE iterations that needed to be run using the Monte Carlo method for similar results, our new Polynomial Chaos model is far superior in terms of computation time. As previously stated, there is some concern with how well our PC model will scale to an even larger circuit in terms of the number of computations required and accuracy, but we still expect our modified PC model, which requires only a small subset of the total Quadrature Nodes generated, to scale well with a larger circuit. Additionally, our work on optimization of the code and on Dimension Reduction Algorithms will carry into the next semester and should further help with scaling our model to a larger problem.
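As a check on the node count, plugging n = 10 and k = 3 into the formulas from Section 2 gives

$$p + 1 = \frac{(10+3)!}{10!\,3!} = \frac{13 \cdot 12 \cdot 11}{6} = 286, \qquad N = 2(p+1) = 572$$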

Chapter IV - Dimension Reduction and Future Work

1. Dimension Reduction Algorithms

The idea behind dimension reduction within our Polynomial Chaos model is to analyze which random variables within our system contribute the least to the variation of the circuit output, then eliminate those variables entirely from the model and see how much our outputs of Mean and Standard Deviation are impacted. Ideally, all the variables with a low impact on the circuit output can be eliminated without greatly reducing the accuracy of the model, while saving computation time in the process. As discussed in the previous chapter, a circuit with n random variables, using a kth order polynomial as part of the model, requires

$$N = 2(p+1) = 2\,\frac{(n+k)!}{n!\,k!}$$

Quadrature Nodes in order for our machine learning algorithm to obtain an accurate estimation of the system coefficients using Linear Regression. By reducing n using methods of Dimension Reduction, the accuracy of the model output can remain roughly the same while greatly reducing the number of nodes required for the computation.

In order to see which variables can be eliminated from the model, the overall variance of the output needs to be obtained and then compared to the variance contributed by each individual parameter, using a method called ANOVA. However, requiring the variance of the full model output defeats the purpose of using dimension reduction to begin with, because it would still require running the full system model in order to obtain the variance. As a compromise, a “low fidelity” version of the actual Polynomial Chaos model is derived that requires only a fraction of the time to compute compared to the full version, even for higher dimensions, and produces an accurate enough variance to use with the ANOVA method in comparing circuit variables. The idea behind generating the low fidelity model is to keep the lower order terms that are dependent on only one random variable, which are a small fraction of the total number of terms normally used in approximating the actual system output. In a higher dimensional problem, the remaining terms, called “coupling terms,” can be dependent on two or more random variables in the system, but contain far less information about the system's randomness compared to the terms dependent on one random variable. So, by eliminating these coupling terms from the system completely, we lose some accuracy in the model, but obtain from the remaining terms a low fidelity model that is good enough for our purposes of dimension reduction. Below, the approximation using lower order terms only is shown.

$$G(t,\xi) \approx G_o(t) + G_1(t,\xi_1) + \dots + G_n(t,\xi_n), \qquad G_i(t,\xi_i) = \sum_{k=1}^{m} x_i^{(k)}(t)\,\phi_{(k)}(\xi_i)$$

where $x_i^{(k)}(t)$ are the low fidelity coefficients we are interested in solving for to analyze the variance of the system output.

Solving for these low fidelity coefficients, which are generated exclusively from the low order terms of the model, is simpler than using the method of Linear Regression with the

potentially huge matrix equations used in the full model.

$$x_i^{(k)}(t) = \int \big[G(t,\xi_i) - x_o(t)\big]\,\phi_{(k)}(\xi_i)\,P(\xi_i)\,d\xi_i \approx \sum_{k=0}^{m} w_k\,G(t,\xi_i = r_k)\,\phi_{(k)}(r_k)$$

where m is the order of polynomial used in the model (m = 3 for our LNA simulations), $w_k$ are weights obtained for a one dimensional case from the model, and $r_k$ are Quadrature Nodes obtained for a 1D case where all random variables in the circuit other than $\xi_i$ are set to zero. $\phi_{(k)}(\xi_i = r_p)$ is easily obtained as the kth order basis function evaluated at the pth root (1D Quadrature Node, p = 0...3, giving four roots for the 1D case). The last quantity needed to solve for the coefficients, $G(t,\xi_i = r_k)$, is obtained from our HSPICE simulation of the LNA circuit while setting all random variables $\xi_1, \xi_2, \xi_3, \dots$ except for $\xi_i$ to zero.

In total, for the low fidelity model, 31 coefficients are generated (3 coefficients per $\xi_i$ for i = 1, ..., 10, plus $x_o(t)$), as opposed to the hundreds that would be solved for using the full model.

16

Page 18: Stochastic Simulation Tool for VLSI - projects-web.engr ...projects-web.engr.colostate.edu/ece-sr-design/AY16/VLSI/Senior...Using HSpice toolbox provided by CPPSim System Simulator,

Once these coefficients are obtained, the variance comparison using ANOVA can be done. For each parameter in the circuit modeled by a random variable (10 for our LNA circuit), a sensitivity index is generated: a ratio between 0 and 1 showing how much the parameter contributes to the overall variation of the circuit output. Ideally, parameters with a low sensitivity index (close to 0) can be eliminated.

$$S_i(t) = \frac{\operatorname{Var}[G_i(t,\xi_i)]}{\operatorname{Var}[G(t,\xi)]} = \frac{\big(x_i^{(1)}\big)^2 + \dots + \big(x_i^{(m)}\big)^2}{\sum_{i=1}^{n}\Big[\big(x_i^{(1)}\big)^2 + \dots + \big(x_i^{(m)}\big)^2\Big]}$$

giving a ratio of the variance of one random variable, $\xi_i$, to the variance of all random variables in the system. After the sensitivity indices are obtained, an average value over the entire time scale (t from 0 ps to 5000 ps, $N_t$ time points, for our LNA circuit) is taken as the final metric of the ith random variable's contribution to the variance of the system:

$$A_i = \frac{1}{N_t}\sum_{k=1}^{N_t} S_i(t_k)$$
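A sketch of this final step, assuming the low fidelity coefficients $x_i^{(k)}(t)$ have been collected into an n × m × Nt array `xlow`:

```matlab
% Sensitivity indices from the low fidelity PC coefficients.
% xlow(i,k,t): coefficient x_i^(k) at time step t  (n x m x Nt array)
num = squeeze(sum(xlow.^2, 2));   % n x Nt: per-variable variance share
S   = num ./ sum(num, 1);         % S_i(t); each column sums to 1
Ai  = mean(S, 2);                 % time-averaged index A_i per variable
% Random variables with A_i close to 0 are candidates for elimination.
```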

2. Future Work

Apart from putting the Dimension Reduction Algorithm into practice on a high dimensional problem, which is a work in progress, we have several other points of emphasis that we would like to focus on going forward. The first is improving the flexibility and modularity of the Matlab code implementing the Polynomial Chaos model, so that it works seamlessly with a larger number of circuits and so that tweaking the model, such as adjusting the order of the polynomial expansion used, is easier to do within the code itself.

Next, we are hoping to further optimize our machine learning algorithm to reduce the amount of time needed to select optimal Quadrature Nodes while obtaining an accurate output from our model. Currently, with our 10-dimensional case, the algorithm run time is very manageable (on the scale of several minutes), but as Polynomial Chaos is applied to larger problems this time can rise dramatically.

Finally, we want to carry the project on to an even larger, more complex circuit than the LNA simulation used this semester, with more degrees of freedom than the n = 10 case we are working with now. Utilizing a combination of Dimension Reduction and our further optimized search algorithms, we hope to demonstrate that our Polynomial Chaos model is applicable to true VLSI applications, analyzing circuits with 50 or more random variables with a high degree of accuracy and lower computation time when compared to the Monte Carlo analysis.


Chapter V - Conclusion

The VLSI project team was able to make excellent progress over the spring semester; we were able to meet all of our requirements and deadlines while delivering work that met or exceeded Professor Roy's expectations. Furthermore, we were able to organize and create materials to make next year's team transition smooth so that they are able to continue the research as efficiently as possible. The team, led by Luis Martinez, Sergio Graniello, and Christopher Chapin on a timed rotation, was able to effectively learn and implement Polynomial Chaos expansions on our planned amplifier circuit and generated results that matched the accuracy of Monte Carlo better than we expected, in just a fraction of the time.

We attribute the success of our project partially to learning the inner workings of circuit simulators in Professor Roy's VLSI numerical algorithms course. By understanding how the simulators work behind the scenes, and the computational cost their analysis algorithms incur to solve even a single Monte Carlo sample, we gained a broader understanding of why it is important to minimize the number of samples needed to get accurate results. While Polynomial Chaos has yet to be embraced by the Electrical Engineering community, we believe that the research being conducted provides promising results that will help establish polynomial chaos expansions as a valid method for modeling error propagation in circuits. We hope that, in time, the electrical engineering community will find a way to embrace and implement the algorithm in an industry setting to help optimize circuit modeling.

We hope to continue driving progress in this research with our findings through next year's group, led by Christopher, by delving into reduction methods that will lower the computation time of Polynomial Chaos even further. By optimizing the polynomial expansions, next year's group will begin testing larger circuits that will provide more insight into whether polynomial chaos can be implemented for VLSI successfully. Overall, we are extremely happy with the results of our project and the management and teamwork experience we gained from working together. We are hopeful that the project will find great success under Chris and any other future classes that may work on it.

Appendix A - Abbreviations

AC – Alternating Current

GHz – Gigahertz


IC – Integrated Circuit

IEEE – Institute of Electrical and Electronics Engineers

LNA – Low Noise Amplifier

LU Decomposition – Lower / Upper matrix Decomposition

MCMC – Markov Chain Monte Carlo

MHz – Megahertz

MNA – Modified Nodal Analysis

NAND – Not-And Logic

PC – Polynomial Chaos

RC – Resistor-Capacitor Filter

V – Volts

VLSI – Very Large Scale Integration

Appendix B - Budget

Our budget was confined to the $200 granted per student. As the initial $400 were spent on HSpice licenses for labs, we were left with the additional $200 that our team was awarded as a result of having an additional student (Christopher Chapin) join our team. Since our team had already purchased everything we needed for the year with our initial $400 budget we were able to continue and finish the spring 2017 semester without any additional expenses. This means that we were able to meet our budget goals for the school year.

Given the nature of our project (being entirely programming and research based), there was no need for detailed spending plans or cost sheets. We were fortunate to be working on a project that required no financial planning, and thanks to the money awarded to us by the department, we were able to simply reimburse Professor Roy for the costs of the software used in the project and his research.

Appendix C - Timelines

Spring - Fall 2017 timeline


● January 2017 - Theory for Polynomial Chaos Expansions covered
● February 1st, 2017 - Project resumes testing and development
● March 10th, 2017 - Polynomial expansions for LNA circuit
● April 1st, 2017 - Comparison between PC and MC, plots and numbers
● April 14th, 2017 - Engineering Days presentations
● April 10th, 2017 - Final deliverables for Roy (statistical coefficients and weights)
● September 2017 - Refresh and background lectures for current state of project
● By September 30th, 2017 - Dimension reduction theory re-covered
● October 2017 - Low fidelity modeling for LNA
● November 2017 - Script modeling for higher dimensional case using low-fidelity models and polynomial expansions
● December 2017 - Finalizing plots for winter presentation

Initial Project Timeline


References

[1] Merriam-Webster, “Merriam-Webster Online Dictionary,” [Online]. Available: https://www.merriam-webster.com/dictionary/stochastic. [Accessed 27 April 2017].

[2] Intel Corporation, “New 7th Gen Intel Core Processor: Built for the Immersive Internet,” 30 August 2016. [Online]. Available: https://newsroom.intel.com/editorials/new-7th-gen-intel-core-processor-built-immersive-internet/. [Accessed 27 April 2017].

[3] University of Lancaster, “Monte Carlo: A brief simulation,” [Online]. Available: https://www.lancaster.ac.uk/pg/jamest/Group/intro2.html. [Accessed 26 April 2017].

[4] P. Manfredi, High-Speed Interconnect Models with Stochastic Parameter Variability, Ph.D. dissertation, Politecnico di Torino, 2013.

[5] S. Oladyshkin, “Data-driven uncertainty quantification using the arbitrary polynomial chaos expansion,” Reliability Engineering & System Safety (2012): 179-190. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0951832012000853.
