DesignCon 2010

Enabling Early Design of Complex Power Delivery Networks Using Spatially-Nonuniform Finite-Difference Method

Dr. Mosin Mondal, University of Washington
Dr. Bruce Archambeault, IBM
Prof. Vikram Jandhyala, University of Washington


Abstract

Early design and optimization of power delivery networks (PDNs) are becoming even more critical for complex beyond-die environments. For iterative early design of well-designed structures at moderate frequencies, transverse electromagnetic (TEM)-based 2.5D tools, such as the multilayered finite difference method (MFDM) [1, 2], are more suitable than 3D full-wave tools in terms of scale and speed while providing good accuracy. However, the application of the standard MFDM to complex structures is a challenge owing to the dual needs of fine refinement and overall efficiency. In this paper an enhanced MFDM using spatially nonuniform discretization is presented to overcome this challenge.

Author(s) Biography

Dr. Mosin Mondal received the Ph.D. degree in Electrical Engineering from the University of Washington in 2009. He received the M.S. degree from Rice University and the B.E. degree from Jadavpur University, India. He worked with Cadence Design Systems India Pvt. Ltd. (2001-2004) as a Member of the Technical Staff. He also worked with IBM, Research Triangle Park, NC, as an Engineering Co-op in 2007. His research interests include power and signal integrity, circuit-EM simulation and interconnect modeling. He is a recipient of the Best Paper Award in EPEP, 2007 and IEEE EMC Society President’s Memorial Award, 2008, 2009.

Dr. Bruce Archambeault is an IBM Distinguished Engineer at IBM in Research Triangle Park, NC. He received his B.S.E.E. degree from the University of New Hampshire in 1977 and his M.S.E.E. degree from Northeastern University in 1981. He received his Ph.D. from the University of New Hampshire in 1997. His doctoral research was in the area of computational electromagnetics applied to real-world EMC problems. He is the author of the book “PCB Design for Real-World EMI Control” and the lead author of the book titled “EMI/EMC Computational Modeling Handbook”. Dr. Archambeault is an IEEE Fellow.

Prof. Vikram Jandhyala is Associate Professor and Director of ACE Lab at UW EE. He is a recipient of an NSF CAREER award, a NASA inventor award, an outstanding research advisor award from UW EE, and graduate research awards from IEEE Microwave Society and the University of Illinois. He has published more than 150 papers and is founder of Physware, a venture-funded startup in electronic design automation. His research has been funded by DARPA, NSF, SRC, WRF, NASA, LLNL, DoD, SBIRs, and several industrial sponsors. His research interests include computational electromagnetics, integral equations, fast multilevel N-body methods, electronic design automation, signal and power integrity, EMI/EMC, multiphysics applications, synthesis and optimization, EM-circuit cosimulation, parallel and multicore algorithms, and field theoretic methods on graphs. He chairs the UW EE Professional Masters Program and regularly presents short courses in industry, conferences, and national labs.


Introduction

The need for rapid analysis techniques continues to grow with the increasing complexity of beyond-die power delivery networks (PDNs), since multiple design iterations are typically required for designing such complex PDNs. 3D full-wave tools, such as method of moments (MoM) [3], finite element method (FEM) [4], finite difference time domain (FDTD) [5] or the partial element equivalent circuit (PEEC) [6], are not amenable to fast design iterations because of their high computational expense. PDNs fall in the category of planar circuits [7] since the separation between the planes is much smaller than a wavelength. This structural property of PDNs was exploited to develop various fast techniques tuned to the niche application of PDN analysis, which include the cavity resonance method [8], the transmission line method [9], the finite-difference method [10], etc. In particular, the multilayered finite-difference method (MFDM) [1, 2], which combines multiple layers with transmission line based circuit elements using a uniformly discretized Cartesian grid with square cells, is a powerful method. For reasonably well designed PDNs, MFDM is shown to have accuracy comparable to 3D full-wave solvers with significantly less computational time.

However, the uniform square cells pose a severe limitation on the capacity of the method whenever fine and irregular features such as via pads, voids, cutouts, etc. exist in the model. While creating a uniformly discretized mesh, the method is forced to use the finest feature size as the cell size, resulting in a high node count and consequently large time and memory requirements. Therefore, iterative early design and optimization become difficult for such structures with the uniformly discretized method.

In this paper, we extend the already well-known MFDM by using a spatially nonuniform discretization scheme and a subsequent equivalent circuit formulation to drastically reduce the node count. The method can handle irregular non-rectangular features such as triangles, holes, etc. by using adaptive staircase approximations. Moreover, the sharp spatial variation of voltage around port locations, which is difficult to capture using a conventional uniform grid unless it is extremely refined, is efficiently handled by recursive subdivision around the port locations. The method benefits from the use of coarse cells all over the power and ground planes except in areas with fine features or irregularities. As with the original MFDM, this method is capable of handling additional lumped circuit elements attached to the PDNs, such as decoupling capacitors, and can produce a SPICE netlist for simulations in conjunction with other circuits.

The speedup offered by the nonuniform discretization will facilitate decoupling capacitor optimization, an important part of PDN design flow, which may require up to a few thousand iterations. To further increase the speed of iterations inside the optimization loop, a macromodeling technique is used. The macromodeling is based on Schur complements, as previously used for power grid analysis in [11, 12]. The combination of the nonuniform MFDM and macromodeling provides a powerful tool for PDN design, analysis and optimization, and shows the clear value of a well-designed TEM-based 2.5D tool as a useful complement to full-wave 3D simulation.


1. Nonuniform Discretization

Nonuniform discretization is known for FDTD [18], and variable-sized cells were proposed for PDNs in [20], though not in a generalized manner. A new discretization scheme for MFDM-like methods is proposed in this section with the aim of reducing the number of cells required to model a given PDN, so as to achieve higher speed and greater capacity while maintaining the same level of accuracy. Cells of size λ/N, where λ is the smallest wavelength inside the dielectric medium corresponding to the highest simulation frequency and N is the number of sections per wavelength (typically 20), provide reasonably accurate results as long as the features of the model are not smaller than the cell size. However, features smaller than the cell size occur frequently in PDNs. For example, surface mounted discrete (SMD) decoupling capacitors are connected to the power planes through via-connectors of diameter Dv, typically of the order of 10 mils, as shown in Figure 1(a). Based on the original MFDM formulation [1], two possible uniform discretization schemes with square cells are shown in Figure 1(b) and 1(c). The λ/N scheme shown in Figure 1(b) extends the via diameter from Dv to λ/N, whereas discretization using the finest feature size Dv (Figure 1(c)) preserves the via diameter but increases the number of cells considerably. Similar problems occur when the pins forming the PDN ports are smaller than λ/N.

To get an estimate of the required node count, consider a board level PDN consisting of a pair of planes measuring 10" × 8" separated by FR4 material of dielectric constant 4.0, with a highest simulation frequency of 1 GHz. The cell size corresponding to λ/20 discretization will be 295 mils, requiring 952 cells altogether (34 × 28). However, since the via diameter and pin sizes are substantially smaller than the cell size, the discretization will be too coarse to represent the original structure. To accurately model the PDN, the original MFDM is forced to use cells of size 10 mils all over the layout, resulting in 800,000 cells.
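The node-count estimate above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming a vacuum speed of light of 299,792,458 m/s and that partial cells at the board edge are rounded up; the function names are illustrative and do not come from the paper.

```python
import math

C0 = 299_792_458.0           # speed of light in vacuum, m/s
MIL_PER_M = 1000.0 / 0.0254  # mils per metre

def coarse_cell_mils(f_max_hz, eps_r, sections_per_wavelength=20):
    """Cell size lambda/N in mils at the highest simulation frequency."""
    wavelength_m = C0 / (f_max_hz * math.sqrt(eps_r))
    return wavelength_m / sections_per_wavelength * MIL_PER_M

def uniform_cell_count(width_mils, height_mils, cell_mils):
    """Cells in a uniform grid; partial cells at the edges are rounded up."""
    return math.ceil(width_mils / cell_mils) * math.ceil(height_mils / cell_mils)

cell = coarse_cell_mils(1e9, 4.0)                 # about 295 mils
coarse = uniform_cell_count(10_000, 8_000, cell)  # 34 x 28 cells
fine = uniform_cell_count(10_000, 8_000, 10.0)    # 1000 x 800 cells
print(round(cell), coarse, fine)
```

Under these assumptions the sketch gives the 295 mil cell size, 952 coarse cells and 800,000 fine cells quoted in the text.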

Figure 1: Different discretization schemes for a PDN with a capacitor and via.


The proposed nonuniform discretization scheme reduces the number of required nodes by local refinement near fine features over a coarse λ/N grid. This is depicted in Figure 1(d), where the cell containing the via-connection is locally refined, thereby maintaining the size and location of the via. Apart from the local refinement, the nonuniform grid also ensures that the boundary of any feature, such as cutouts and holes, is preserved by varying the cell size at the coarse grid level. Thus, in the most general case, the coarse cells could assume any size to preserve model accuracy, with the maximum cell size being λ/N. Figure 1(e) corresponds to the situation where the coarse cell is much larger than the via diameter, which happens in the example provided above (295 mil vs. 10 mil).

It can be observed by experimentation with a finely discretized model that a sharp spatial variation of node voltage occurs around the via-connections and port pins. To capture the effect of small pin/via dimensions, the port and via locations need to be surrounded by a few layers of cells of their own dimension for accurate results. A two-level recursive nonuniform discretization is used in Figure 1(e), where the via location is locally surrounded by n layers (n=2 in the figure) of similar cells. Corresponding to each via/pin location, (2n+1)² + 7 additional cells will be introduced, while the total number of cells will also depend on how the nonuniform discretization adapts the grid locations to fit the entire geometry. Compared to the fine uniform discretization scheme, the nonuniform discretization will require far fewer cells to provide similar accuracy.

The algorithm for the nonuniform 2.5D Cartesian discretization is described as follows. The idea is to produce a layerwise consistent mesh. In other words, the same rectangle will be repeated vertically for all metal layers where the centroid of the rectangle does not fall within any hole. This can be achieved by making vertical cuts after laying out all layers on a single layer. For the set of rectangles, triangles and holes, first all the discontinuities along the x and y axes are found. Care should be taken if two rectangles can be merged to create a bigger rectangle, in which case the discontinuity can be disregarded. The minimum and maximum values of the discontinuities represent the bounding box. For each discontinuity along an axis, the distance (d) to the next discontinuity is determined. If the distance is less than the maximum cell size specified by the user (Δ), no new grid location is created. However, if the distance is more than the maximum cell size, then n new grid locations are created, where n is the smallest integer equal to or greater than d/Δ. At the junction of two cell sizes that differ widely from each other (say, 5 mil and 100 mil), it is recommended to insert a cell of intermediate size. After the first level (coarse) grid is constructed, fine features such as pins and vias are located within the coarse grid and the cell is subdivided locally following the same procedure.
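The coarse-grid step of this algorithm can be sketched along one axis as follows. This is only an illustration under stated assumptions: the function name `grid_lines` is hypothetical, a gap wider than Δ is interpreted as being split into ceil(d/Δ) equal sections, and the rectangle merging, intermediate-size adaptation and second-level refinement around pins and vias are left out.

```python
import math

def grid_lines(discontinuities, max_cell):
    """Return sorted grid coordinates along one axis.

    Every geometric discontinuity is kept as a grid line. A gap d wider
    than max_cell is split into ceil(d / max_cell) equal sections
    (an interpretation of the rule in the text, not the authors' code).
    """
    pts = sorted(set(discontinuities))
    lines = [pts[0]]
    for a, b in zip(pts, pts[1:]):
        d = b - a
        n = math.ceil(d / max_cell) if d > max_cell else 1
        for k in range(1, n + 1):
            lines.append(a + d * k / n)
    return lines

# Hypothetical x-axis discontinuities in mils: board edges at 0 and 2000,
# and a 10 mil wide feature between 1000 and 1010.
x_lines = grid_lines([0, 1000, 1010, 2000], max_cell=295)
print(x_lines)
```

Applying the same routine to the y axis and taking the Cartesian product of the two coordinate lists gives the first-level cells; no resulting cell is wider than Δ, and the 10 mil feature keeps its own boundaries.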

2. Generalized Cell Model

With the nonuniform discretization, the assumption of square cells [1] does not hold any more. Instead of a global cell size h, each cell is now characterized by two lengths lx and ly along the x and y axis, respectively. Therefore, the RLGC elements have to be computed for each cell. However, the cell parameters can be computed at the beginning of the analysis and the frequency dependent admittance matrix can be stamped using these parameters at each frequency of interest. Therefore, this step has insignificant impact on the performance.


Figure 2(a) shows the general multilayered cell consisting of N+1 metal planes, numbered from 0 at the bottom to N at the top. The thickness of metal layer i is ti and the conductivity is denoted by σi. For any plane pair i, consisting of metal planes i-1 and i, the planes are separated by a dielectric layer of thickness di with dielectric constant εi and loss tangent tan(δi). Corresponding to the N plane pairs, there will be N layers of RLGC elements stacked on top of each other, as shown in Figure 2(b). All inductance elements are defined with respect to the lowermost layer ("ground layer"). The resistance of the ground layer is also considered in the equivalent circuit. It should be borne in mind that R and L, being the series elements between two cells, depend on the length of the cell along the x and y directions in the nonuniform formulation, whereas the C and G elements depend on the area. Assuming non-magnetic material, the parameters in RLGC layer i along the x and y axes are given by the following simple transmission line based equations, similar to those described in [1]:

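$$R_{x,i} = \frac{l_x}{2\,l_y}\left(\frac{1}{\sigma_i t_i} + \sqrt{\frac{j\omega\mu_0}{\sigma_i}}\right) \tag{1}$$

$$R_{y,i} = \frac{l_y}{2\,l_x}\left(\frac{1}{\sigma_i t_i} + \sqrt{\frac{j\omega\mu_0}{\sigma_i}}\right) \tag{2}$$

$$L_{x,i} = \frac{l_x}{2\,l_y}\,\mu_0 d_i + L_{x,i-1} \tag{3}$$

$$L_{y,i} = \frac{l_y}{2\,l_x}\,\mu_0 d_i + L_{y,i-1} \tag{4}$$

$$C_i = \frac{\varepsilon_i\, l_x l_y}{d_i} \tag{5}$$

$$G_i = \omega C_i \tan(\delta_i) \tag{6}$$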

Figure 2: (a) Definition of a multilayered generalized cell (b) Circuit model for the i-th layer, inductance and resistance along y are not shown.


It can be noted that (3) and (4) are defined recursively. It is also important to note that plane pair i has mutual inductance equal to Lx/y,i-1 for i ≥ 1 [1]. For the lowest layer, we have Lx,0 = Ly,0 = 0. The unit cell and equivalent circuit presented in [1] can be derived from this general case by substituting lx = ly = h. One important property of the multilayer formulation, which makes the admittance matrix extremely sparse and therefore fast to solve, is the manner in which the mutual coupling occurs. The mutual coupling appears only vertically when two cells are connected, and there is no mutual coupling between two pairs of connected cells. In [1], the admittance matrix for the vertical coupling between two connected cells is formulated using the indefinite admittance matrix, without considering the resistances. A more general solution is presented below for the generalized nonuniform cell of Figure 2. Since the cells are nonuniform in the current context, the connection between two general cells is discussed first.

Figure 3 shows two general cells, Cell1 and Cell2, of sizes (lx1, ly1) and (lx2, ly2), connected along the x axis through their boundaries along the y axis in all layers. For unequal ly1 and ly2, current from Cell1 flows to Cell2 through the common boundary of length, say, ly. Thus, the series impedance contributed by the cells needs to be weighted. A simple approximation is made here: the impedance contribution of Cell1 is weighted by ly1/ly, and that of Cell2 by ly2/ly. Thus, for the general case, the series impedance of metal layer i is given by the following equation:

$$Z_{x,i} = \frac{l_{y1}}{l_y}\left(R_{x,i,1} + j\omega L_{x,i,1}\right) + \frac{l_{y2}}{l_y}\left(R_{x,i,2} + j\omega L_{x,i,2}\right) \tag{7}$$

The method for filling up the admittance matrix due to the resistive and inductive (both self and mutual) elements between two cells is stated below. The elements under consideration are shown in Figure 4 along with node naming conventions and currents. The Rs and Ls terms in Figure 4 indicate the total resistive and inductive components of Zx,i in (7). Let us denote the currents and voltages through the nodes on Cell1 (Cell2) by the vectors I1 (I2) and V1 (V2), respectively, where $I_1 = [I_{N,1},\ I_{N-1,1},\ \ldots,\ I_{1,1}]$ and so on. The relationship between I1 and V1 considering the mutual coupling is given by (8) below. Note that Rx,0 is added to each term in the matrix in (8) since the analysis is performed by making the lowest plane the reference plane, and the current through each branch returns through the reference branch.

Figure 3: 3D Connection between two unequal cells along x axis.


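$$I_1 = \left(\begin{bmatrix}
Z_{x,N} & j\omega Ls_{x,N-1} & 0 & \cdots & 0 \\
j\omega Ls_{x,N-1} & Z_{x,N-1} & j\omega Ls_{x,N-2} & \cdots & 0 \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
0 & \cdots & j\omega Ls_{x,2} & Z_{x,2} & j\omega Ls_{x,1} \\
0 & \cdots & 0 & j\omega Ls_{x,1} & Z_{x,1}
\end{bmatrix} + R_{x,0}\right)^{-1} V_1 = Y_1 V_1 \tag{8}$$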

By noting that $I_{i,1} = -I_{i,2}$ for $1 \le i \le N$, the admittance matrix Y12 for all nodes in Cell1 and Cell2 can finally be written as:

$$Y_{12} = \begin{bmatrix} Y_1 & -Y_1 \\ -Y_1 & Y_1 \end{bmatrix}. \tag{9}$$

Simplifying (8) in conjunction with (9) for square cells and ignoring the resistance terms yields the equation described in [1,2]. It should be mentioned here that the computational cost of this formulation is not a performance bottleneck, since the size of the matrix in (8) equals the number of layers. For cell connections along the y direction, simply replace x by y in (1)-(8).
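The two steps in (8) and (9) can be illustrated with a small dense sketch: add the ground-layer resistance to every entry of the layer impedance matrix, invert it, and arrange the result in indefinite admittance form over the nodes of both cells. The helper names and the numeric branch values below are hypothetical, and a plain Gauss-Jordan inverse stands in for whatever the authors' C++ implementation uses.

```python
import math

def invert(m):
    """Gauss-Jordan inverse with partial pivoting (works for complex entries)."""
    n = len(m)
    a = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(m)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        piv = a[c][c]
        a[c] = [v / piv for v in a[c]]
        for r in range(n):
            if r != c:
                f = a[r][c]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[c])]
    return [row[n:] for row in a]

def branch_admittance(z_layers, r_ground):
    """Invert the layer impedance matrix after adding r_ground to every entry."""
    return invert([[v + r_ground for v in row] for row in z_layers])

def two_cell_stamp(y1):
    """Indefinite admittance block [[Y1, -Y1], [-Y1, Y1]] over both cells' nodes."""
    top = [row + [-v for v in row] for row in y1]
    bottom = [[-v for v in row] + row for row in y1]
    return top + bottom

omega = 2 * math.pi * 1e8
# Hypothetical branch values for two plane pairs (not taken from the paper).
z_layers = [[2e-3 + 1j * omega * 3e-10, 1j * omega * 1e-10],
            [1j * omega * 1e-10, 2e-3 + 1j * omega * 1e-10]]
y1 = branch_admittance(z_layers, r_ground=1e-3)
y12 = two_cell_stamp(y1)
```

Each row of the stamped block sums to zero, as expected of an indefinite admittance matrix, and the block is only 2N×2N for N plane pairs, which is why this step is cheap compared with the global solve.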

Figure 4: Series connection elements between Cell1 and Cell2 connected along x axis. Nodes on Cell1 and Cell2 are marked by subscripts ‘1’ and ‘2’, respectively.

3. Macromodeling for Decoupling Capacitor Optimization

Optimal placement of decoupling capacitors [13,14] is an integral part of designing PDNs for high performance systems, and is accomplished by an optimization process. The number, locations and types of decoupling capacitors are updated in each iteration in the inner loop of the decoupling capacitor optimization routine, whereas the rest of the PDN remains unchanged. While re-solving for the new configuration is straightforward, it is extremely beneficial if the problem can be formulated and solved incrementally. This is indeed done using macromodels, as described in [15,16], where the macromodels are generated by model order reduction of a large resistance-inductance/susceptance-capacitance (RLC/RKC) model obtained by extraction. In this work, a macromodeling technique around the nonuniform MFDM is presented for fast decoupling capacitor optimization. The macromodel is created using a Schur complement [17] based formulation. The Schur complement arises naturally in problems decomposed into interconnected partitions and is well known for its application in inverting large matrices [17] and subsequently in solving systems of linear equations. Though not mentioned explicitly, the macromodeling technique presented in [11] is based on Schur complement computation. The macromodeling technique presented in this work in the context of MFDM is similar to that presented in [11] in the context of modified nodal analysis (MNA).

Figure 4(a) shows a schematic diagram of a PDN consisting of a number of decoupling capacitors and ports defined by a set of pins. Each pin and each capacitor-plane junction has a unique node number in the nonuniform MFDM analysis. The first step toward forming the macromodel is to partition the nodes into two parts, as shown in Figure 4(b). Partition2 contains all nodes corresponding to pins and junctions between capacitors and planes, whereas Partition1 contains the rest of the nodes. Even though the port locations do not change during the decoupling optimization phase, the pin-nodes are placed in Partition2 so that all nodes of interest are grouped in a single partition. Let the number of nodes in Partition1 and Partition2 be p and q, respectively, where p+q equals N, the total number of nodes in the whole system. MFDM directly forms an N×N admittance matrix Y for the overall model, which can be represented by a 2×2 block matrix after the partition is formed:

$$Y = \begin{bmatrix} Y_{11} & Y_{12} \\ Y_{21} & Y_{22} \end{bmatrix}, \tag{10}$$

where Y11, Y12, Y21, Y22 are p×p, p×q, q×p and q×q matrices, respectively. Formation of the partition ensures that Y12 and Y21 remain unchanged. According to the MFDM formulation, a current source is applied at a given port location and the resulting node voltage is calculated to obtain the port impedance. For m ports, the above computation needs to be repeated for each port, which can be represented in matrix form by YV = I, where I and V are N×m matrices representing the m current and voltage vectors, respectively. If the current and voltage matrices are represented in the block format as $I = [I_1^T\ \ I_2^T]^T$ and $V = [V_1^T\ \ V_2^T]^T$, where ‘T’ denotes the transpose, then the following form is obtained:

$$\begin{bmatrix} Y_{11} & Y_{12} \\ Y_{21} & Y_{22} \end{bmatrix} \begin{bmatrix} V_1 \\ V_2 \end{bmatrix} = \begin{bmatrix} I_1 \\ I_2 \end{bmatrix} \tag{11}$$

Figure 4: (a) PDN with capacitors and ports. (b) Nodes partitioned into two sets.

It can be derived from (11) that the voltage vector V2 for nodes in Partition2 is given by:

$$V_2 = \left(Y_{22} - Y_{21}Y_{11}^{-1}Y_{12}\right)^{-1}\left(I_2 - Y_{21}Y_{11}^{-1}I_1\right) = S^{-1}\left(I_2 - Y_{21}Y_{11}^{-1}I_1\right). \tag{12}$$

In the above equation, S represents the Schur complement of Y11 in Y, denoted by (Y/Y11) [17]. Since no node representing a pin belongs to Partition1, V1 is of no interest for evaluating port impedances, which justifies including all pin-nodes in Partition2. The evaluation of V2 at each iteration using (12) involves three steps: (i) formation of the q×q Schur complement matrix $S = Y_{22} - Y_{21}Y_{11}^{-1}Y_{12}$, (ii) inversion of the Schur complement, and (iii) multiplication of the inverted Schur complement by $I_2 - Y_{21}Y_{11}^{-1}I_1$. The matrix $Y_{21}Y_{11}^{-1}Y_{12}$ needs to be computed only once at the beginning. Since Y11 is a large matrix, explicit inversion and multiplication with Y12 would be inefficient. Instead, the sparse matrix solver at the core of the MFDM solver can be used, as in [11], to solve the equation Y11X = Y12, where $X = Y_{11}^{-1}Y_{12}$. For step (iii), it can be noted that I1 is always zero, since current sources are applied only at the ports and all pins forming the ports are included in Partition2. Therefore, step (iii) reduces to the multiplication of the inverted Schur complement with the sparse matrix I2. Effectively, the solution of the N×N system is obtained by inverting a q×q matrix with N ≫ q, where the effect of Partition1 is captured by the $Y_{21}Y_{11}^{-1}Y_{12}$ matrix.

A decoupling capacitor with its via connections is generally modeled using a series combination of effective series resistance (ESR), effective series inductance (ESL) and a capacitor. The inductance associated with the loop formed by the via connectors is lumped into the ESL value. When the number, location and types of capacitors are changed in each iteration, the admittance values of corresponding nodes are updated in Y22.
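The split between one-time and per-iteration work can be sketched on a toy system. The 5-node real-valued matrix below is invented for illustration (three Partition1 nodes, two Partition2 nodes), the helper names are hypothetical, and a small dense Gauss-Jordan routine stands in for the sparse solver; the point is only that the reduced q×q solve reproduces the Partition2 voltages of the full N×N solve when I1 = 0.

```python
def solve(a, b):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def macromodel_setup(y, p):
    """One-time work: solve Y11 X = Y12, then return Y21 X (a q x q matrix)."""
    y11 = [row[:p] for row in y[:p]]
    y12 = [row[p:] for row in y[:p]]
    y21 = [row[:p] for row in y[p:]]
    return matmul(y21, solve(y11, y12))

def port_voltages(y22, coupling, i2):
    """Per-iteration work: V2 = (Y22 - Y21 Y11^-1 Y12)^-1 I2, assuming I1 = 0."""
    s = [[u - w for u, w in zip(r1, r2)] for r1, r2 in zip(y22, coupling)]
    return solve(s, i2)

# Toy symmetric, diagonally dominant admittance matrix (illustrative values).
Y = [[ 4.0, -1.0,  0.0, -1.0,  0.0],
     [-1.0,  4.0, -1.0,  0.0, -1.0],
     [ 0.0, -1.0,  4.0,  0.0, -1.0],
     [-1.0,  0.0,  0.0,  3.0,  0.0],
     [ 0.0, -1.0, -1.0,  0.0,  3.0]]
p = 3
coupling = macromodel_setup(Y, p)          # computed once
y22 = [row[p:] for row in Y[p:]]           # the only block that changes later
i2 = [[1.0, 0.0], [0.0, 1.0]]              # unit current at each Partition2 node
v2 = port_voltages(y22, coupling, i2)
full = solve(Y, [[0.0, 0.0]] * p + i2)     # reference: full N x N solve
```

In an optimization loop only `y22` would be updated with the capacitor admittances before calling `port_voltages` again, so each iteration costs a q×q solve rather than an N×N one.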

4. Numerical Results

The nonuniform discretization based MFDM and the Schur complement based macromodeling technique described in the previous sections were implemented in C++ using an efficient sparse matrix solver library, PARDISO [19]. In this section, the accuracy of the nonuniform discretization based MFDM is first compared against measurement results. Next, the computational benefits of the nonuniform MFDM are highlighted. Finally, the efficiency and memory requirements of the macromodeling technique are demonstrated. The experiments are performed using one core of a dual-core 2.5 GHz Opteron machine running 32-bit GNU/Linux.


4.1 Results for nonuniform MFDM

Different PDN structures, ranging from simple rectangular plane pairs to complex-shaped structures, were analyzed using the nonuniform MFDM. Two of the structures were fabricated and impedance values were measured by Dr. Jingook Kim at Missouri University of Science and Technology.

The first of the fabricated PDNs is shown in Figure 5 with port and decoupling capacitance locations. The pins are of size 10 mils × 10 mils, and the vias connecting the decoupling capacitors are of diameter 10 mils. Without the decoupling capacitors, the PDN is simulated for the frequency range 100kHz to 2GHz using 20 divisions per wavelength with adapted cells near the port locations, the gap and the non-Cartesian boundary. A total of 8056 nodes are created, requiring 180ms per frequency point and 20MB of peak memory. Simulated impedance values, both self and transfer, are compared against the measured values in Figure 7 (left column). Good agreement can be noticed for all three cases. For simulating the same PDN with 20 decoupling capacitors, additional refinement around the capacitors is required by the nonuniform MFDM, generating 11060 nodes. This simulation requires 250ms per frequency point and 22MB of peak memory. Comparison with measurement data for the PDN with capacitors is shown in the right column of Figure 7, which also shows good agreement. Note that the conventional MFDM, with uniform discretization, requires cells of size 10 mils all over the PDN, leading to more than 1,400,000 nodes (with and without capacitors). This requires 121 seconds per frequency point and 2139MB of memory. The computational advantage of the nonuniform MFDM is evident.

Figure 5: First PDN structure with port and capacitor locations.

Figure 6: Second PDN structure with port and capacitor locations.


The other fabricated PDN, having the same pin and via sizes, is shown in Figure 6 with port and decoupling capacitance locations. For the same frequency range and simulation setup, 4000 nodes are generated, requiring 80ms per frequency point and 12MB of peak memory. Comparisons with measurement data with and without capacitors are shown in Figure 8, demonstrating good matching. The computational requirements for the case with 18 capacitors are 100ms per frequency point and 14MB, corresponding to 6900 nodes. The uniform MFDM needs more than 500,000 nodes and would therefore require 30 seconds per frequency point and 822MB of RAM. The savings in memory and runtime are considerable for this case, too.

Figure 7: Comparison of impedances for PDN of Figure 5 without capacitors (left column) and with capacitors (right column).

Figure 8: Comparison of impedances for PDN of Figure 6 without capacitors (left column) and with capacitors (right column).


The other power plane configurations suggested for the special session on power plane analysis in DesignCon 2010 were also analyzed with nonuniform MFDM. Computational requirements and comparison with uniform MFDM for these configurations are shown in Table 1 for maximum frequency of 1GHz (except for 5 and 6, which are analyzed up to 2GHz):

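| Configuration | Runtime per frequency (s), nonuniform | Runtime per frequency (s), uniform | Memory requirement (MB), nonuniform | Memory requirement (MB), uniform |
|---|---|---|---|---|
| 1A | 30.6×10^-3 | 100.1 | 12 | 1780 |
| 1B | 41.9×10^-3 | 100.1 | 13 | 1780 |
| 1C | 56.5×10^-3 | 100.1 | 15 | 1780 |
| 2A | 40.7×10^-3 | 155.6 | 13 | 2585 |
| 2B | 63.2×10^-3 | 155.6 | 15 | 2585 |
| 2C | 84.2×10^-3 | 155.6 | 16 | 2585 |
| 3A | 74.3×10^-3 | Does not fit in 32-bit machines | 15 | Does not fit in 32-bit machines |
| 3B | 102.5×10^-3 | Does not fit in 32-bit machines | 17 | Does not fit in 32-bit machines |
| 3C | 144.7×10^-3 | Does not fit in 32-bit machines | 19 | Does not fit in 32-bit machines |
| 4A | 66.9×10^-3 | Does not fit in 32-bit machines | 15 | Does not fit in 32-bit machines |
| 4B | 678.5×10^-3 | Does not fit in 32-bit machines | 45 | Does not fit in 32-bit machines |
| 5A | 180.0×10^-3 | 121.1 | 20 | 2139 |
| 5B | 250.2×10^-3 | 121.1 | 22 | 2139 |
| 6A | 80.4×10^-3 | 30.2 | 12 | 822 |
| 6B | 100.3×10^-3 | 30.2 | 14 | 822 |

Uniform MFDM values are reported once per configuration group (for example 1A-1C) and are repeated here on each row of the group.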

Table 1: Runtime and memory requirements of nonuniform and uniform MFDM.

It can be seen from Table 1 that the nonuniform method not only improves the speed of MFDM but also increases its capacity to handle large cases, as in configurations 3A-3C, 4A and 4B.

4.2 Results for macromodeling

Figure 9 shows an example power bus, with the port located at the center, used to demonstrate the effectiveness of the macromodeling technique. A symmetric grid of 80 capacitors is considered, as depicted in the figure. The model is analyzed over the frequency range 1 MHz to 2 GHz using 60 logarithmically spaced frequency points. First, the PDN is analyzed with regular uniform MFDM using 800,000 nodes. Although a macromodel could be created for this discretization, doing so would be expensive, so the macromodel is generated using nonuniform MFDM instead.

Figure 10 compares the port impedance obtained from the three methods mentioned above. The coarse uniform MFDM uses 20 divisions per wavelength everywhere and therefore produces inaccurate results. As expected, the results obtained by macromodeling match those of nonuniform MFDM exactly, which in turn agree with the results produced by finely discretized uniform MFDM.

Computational requirements for this test case are shown in Table 2 for each of the above three methods. The table reports the total time required for the analysis at all 60 frequencies. The regular MFDM requires almost an hour to finish the evaluation. Even though the nonuniform method is much faster than conventional MFDM, macromodeling speeds it up even more, consuming only 60 ms for the entire problem, or 1 ms per frequency point. This provides the speed required in the core of an optimizer. It should be mentioned that an initial setup time of 111.12 seconds was required to generate the macromodel. During optimization, numerous configurations will be tested using the same macromodel, and more than 15 iterations per second can be achieved for the above test case. Note that macromodeling alone produces an almost 240X speedup over nonuniform MFDM, and combined with nonuniform MFDM the overall speedup over conventional MFDM is about 59,000X.
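The reuse of one macromodel across many capacitor configurations can be illustrated with a generic Schur-complement port reduction. The sketch below is not the paper's implementation: the 8-node chain network, the admittance values, the port indices and all function names are hypothetical stand-ins for the MFDM nodal system at a single frequency. It shows only the general idea that internal nodes are eliminated once, after which each trial capacitor changes a single entry of a small port matrix.

```python
def solve(a, b):
    """Solve a @ x = b (lists of rows) by Gaussian elimination with partial pivoting."""
    n, m = len(a), len(b[0])
    aug = [list(a[r]) + list(b[r]) for r in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + m):
                aug[r][c] -= f * aug[col][c]
    x = [[0j] * m for _ in range(n)]
    for r in range(n - 1, -1, -1):  # back substitution
        for c in range(m):
            s = aug[r][n + c] - sum(aug[r][k] * x[k][c] for k in range(r + 1, n))
            x[r][c] = s / aug[r][r]
    return x

def identity(n):
    return [[1 + 0j if r == c else 0j for c in range(n)] for r in range(n)]

def schur_reduce(y, ports):
    """Port admittance macromodel: Y_pp - Y_pi * inv(Y_ii) * Y_ip."""
    inner = [k for k in range(len(y)) if k not in ports]
    y_ii = [[y[r][c] for c in inner] for r in inner]
    y_ip = [[y[r][c] for c in ports] for r in inner]
    w = solve(y_ii, y_ip)  # inv(Y_ii) @ Y_ip, computed once
    return [[y[p][q] - sum(y[p][inner[k]] * w[k][j] for k in range(len(inner)))
             for j, q in enumerate(ports)] for p in ports]

def chain_admittance(n, y_series, y_shunt):
    """Nodal admittance matrix of an n-node chain (hypothetical stand-in for a plane grid)."""
    y = [[0j] * n for _ in range(n)]
    for k in range(n):
        y[k][k] += y_shunt
    for k in range(n - 1):
        y[k][k] += y_series
        y[k + 1][k + 1] += y_series
        y[k][k + 1] -= y_series
        y[k + 1][k] -= y_series
    return y

def port_impedance(y_red, cap_admittance, cap_port=1):
    """Input impedance at port 0 after adding a capacitor admittance at cap_port."""
    y = [row[:] for row in y_red]
    y[cap_port][cap_port] += cap_admittance
    return solve(y, identity(len(y)))[0][0]

# One-time setup: eliminate the internal nodes (arbitrary illustrative values).
y_full = chain_admittance(8, 2 - 1j, 0.01j)
ports = [0, 5]  # observation port and capacitor site
y_macro = schur_reduce(y_full, ports)

# Cheap per-configuration step: only the 2x2 macromodel is touched.
cap_admittances = [0j, 1 / (0.03 + 0.5j), 1 / (0.03 - 2j)]
z11_by_cap = [port_impedance(y_macro, y_cap) for y_cap in cap_admittances]
```

In this toy case the expensive elimination happens once in `schur_reduce`, while each candidate capacitor costs only a 2×2 solve, which mirrors the split between setup time and per-iteration time described above.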

| Scheme | Nodes | Time | Memory |
|---|---|---|---|
| Fine uniform MFDM | 800000 | 3543.6 s | 1090 MB |
| Nonuniform MFDM | 19935 | 14.34 s | 32 MB |
| Nonuniform and macromodel | - | 60 ms | 136 MB |

Table 2: Runtime and memory requirements of different schemes.
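The speedup figures quoted in the text follow directly from the Table 2 runtimes, and the short calculation below reproduces them. The break-even estimate at the end is not from the paper: it assumes the 111.12 s setup cost is paid once and that each tested configuration costs one full 60-point sweep. The function name `break_even_configurations` is illustrative.

```python
import math

# Values copied from Table 2 and the accompanying text.
N_FREQ = 60
T_UNIFORM_S = 3543.6      # fine uniform MFDM, all 60 frequencies
T_NONUNIFORM_S = 14.34    # nonuniform MFDM, all 60 frequencies
T_MACRO_S = 0.060         # macromodel evaluation, all 60 frequencies
T_SETUP_S = 111.12        # one-time macromodel generation

speedup_vs_nonuniform = T_NONUNIFORM_S / T_MACRO_S   # "almost 240X" in the text
speedup_vs_uniform = T_UNIFORM_S / T_MACRO_S         # "59,000X" in the text
per_point_ms = 1e3 * T_MACRO_S / N_FREQ              # time per frequency point
sweeps_per_second = 1.0 / T_MACRO_S                  # "more than 15 iterations per second"

def break_even_configurations(t_setup, t_direct, t_macro):
    """Smallest n for which t_setup + n * t_macro < n * t_direct (assumed cost model)."""
    return math.floor(t_setup / (t_direct - t_macro)) + 1

n_break_even = break_even_configurations(T_SETUP_S, T_NONUNIFORM_S, T_MACRO_S)
```

Under that assumed cost model, the macromodel pays for its setup after only a handful of configurations when compared with rerunning nonuniform MFDM each time.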

Figure 9: Example power bus with 80 decoupling capacitors.

Figure 10: Impedance plot for the power bus shown in Figure 9 using three different schemes.


Conclusions

In this paper, it was demonstrated that uniform discretization, unless extremely refined, can introduce significant error because it models fine features inaccurately. As a solution, a nonuniform discretization scheme was proposed to enhance modeling accuracy within the multilayered finite-difference framework. An algorithm for producing the nonuniform discretization and the subsequent circuit element estimation were described in a generalized form. It was demonstrated that the finite-difference method with nonuniform discretization can produce accurate results with high computational speed and less memory, which also extends the capacity of MFDM to much bigger problems. A macromodeling technique using a Schur complement-based formulation around the finite-difference framework was also presented. It provides an efficient way of incrementally analyzing the PDN with decoupling capacitors, yielding orders-of-magnitude speedup. The combination of the nonuniform finite-difference method and macromodeling provides a useful early design and optimization framework for PDNs.

Appendix-I: Results for other test configurations

Impedance profiles for the other configurations listed in Table 1, but not shown in Section 4, are shown in this appendix. Each configuration is analyzed for five different values of dielectric thickness, td, in mils. For all configurations, the layout, |Z11|, |Z12| and |Z22| are shown. All capacitors are 0.1 μF, with ESR = 30 mΩ and ESL = 1.3 nH (including connection inductance above the planes).
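For orientation, the capacitor values quoted above can be turned into an impedance curve. The sketch assumes the usual series R-L-C capacitor model, which the text does not state explicitly, and borrows the 1 MHz to 2 GHz, 60-point logarithmic sweep from Section 4.2; the appendix plots may use a different range. The function names are illustrative.

```python
import math

# Capacitor parameters as stated in the appendix text.
C_FARAD = 0.1e-6
ESR_OHM = 30e-3
ESL_HENRY = 1.3e-9

def cap_impedance(freq_hz):
    """Impedance of a series R-L-C capacitor model (assumed model) at freq_hz."""
    omega = 2 * math.pi * freq_hz
    return complex(ESR_OHM, omega * ESL_HENRY - 1 / (omega * C_FARAD))

def log_sweep(f_start, f_stop, n_points):
    """Logarithmically spaced frequencies including both end points."""
    ratio = f_stop / f_start
    return [f_start * ratio ** (k / (n_points - 1)) for k in range(n_points)]

# Series self-resonant frequency, where the reactance cancels and |Z| = ESR.
srf_hz = 1 / (2 * math.pi * math.sqrt(ESL_HENRY * C_FARAD))

freqs = log_sweep(1e6, 2e9, 60)
z_mag = [abs(cap_impedance(f)) for f in freqs]
```

With these values the self-resonance falls near 14 MHz. Below it the capacitor looks capacitive, and above it the 1.3 nH ESL dominates.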

Case 1A


Case 1B

Case 1C


Case 2A

Case 2B


Case 2C

Case 3A


Case 3B

Case 3C


Case 4A

Case 4B


References

[1] A. Engin, K. Bharath, M. Swaminathan, M. Cases, B. Mutnury, N. Pham, D. de Araujo, and E. Matoglu, “Finite-difference modeling of noise coupling between power/ground planes in multilayered packages and boards,” in Proceedings of Electronic Components and Technology Conference (ECTC), pp. 1262–1267, 2006.

[2] A. Engin, K. Bharath, and M. Swaminathan, “Multilayered finite difference method (MFDM) for modeling of package and printed circuit board planes,” IEEE Transactions on Electromagnetic Compatibility, vol. 49, no. 2, p. 441, 2007.

[3] S. Rao, D. Wilton, and A. Glisson, “Electromagnetic scattering by surfaces of arbitrary shape,” IEEE Transactions on Antennas and Propagation, vol. 30, no. 3, pp. 409–418, 1982.

[4] J. M. Jin, The finite element method in electromagnetics. New York: Wiley, 1993.

[5] K. S. Yee, “Numerical solution of initial boundary value problems involving Maxwell's equations in isotropic media,” IEEE Transactions on Antennas and Propagation, vol. 14, no. 3, pp. 302–307, 1966.

[6] A. E. Ruehli, “Equivalent Circuit Models for Three-Dimensional Multiconductor Systems,” IEEE Transactions on Microwave Theory and Techniques, vol. 22, no. 3, pp. 216–221, 1974.

[7] T. Okoshi and T. Miyoshi, “The Planar Circuit–An Approach to Microwave Integrated Circuitry,” IEEE Transactions on Microwave Theory and Techniques, vol. 20, no. 4, pp. 245–252, 1972.

[8] N. Na, J. Choi, S. Chun, M. Swaminathan, and J. Srinivasan, “Modeling and transient simulation of planes in electronic packages,” IEEE Transactions on Advanced Packaging, vol. 23, no. 3, pp. 340–352, 2000.

[9] H. Wu, J. Meyer, K. Lee, and A. Barber, “Accurate power supply and ground plane pair models,” IEEE Transactions on Advanced Packaging, vol. 22, no. 3, pp. 259–266, 1999.

[10] O. Ramahi, V. Subramanian, and B. Archambeault, “A simple finite difference frequency-domain (FDFD) algorithm for analysis of switching noise in printed circuit boards and packages,” IEEE Transactions on Advanced Packaging, vol. 26, no. 2, pp. 191–198, 2003.

[11] M. Zhao, R. Panda, S. Sapatnekar, and D. Blaauw, “Hierarchical analysis of power distribution networks,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 21, no. 2, pp. 159–168, 2002.

[12] Q. Zhou, K. Sun, K. Mohanram, and D. C. Sorensen, “Large power grid analysis using domain decomposition,” in Proceedings of Design, Automation and Test in Europe, vol. 1, pp. 1–6, 2006.

[13] L. Smith, R. Anderson, D. Forehand, T. Pelc, and T. Roy, “Power distribution system design methodology and capacitor selection for modern CMOS technology,” IEEE Transactions on Advanced Packaging, vol. 22, no. 3, pp. 284–291, 1999.

[14] J. Knighten, B. Archambeault, J. Fan, G. Selli, L. Xue, S. Connor, and J. Drewniak, “PDN Design Strategies: II. Ceramic SMT Decoupling Capacitors–Does Location Matter?” IEEE EMC Society Newsletter, vol. 208, pp. 56–67, 2006.

[15] H. Zheng, B. Krauter, and L. Pileggi, “On-Package Decoupling Optimization with Package Macromodels,” in Proceedings of Custom Integrated Circuits Conference, pp. 723–726, 2003.

[16] J. Chen and L. He, “Efficient In-Package Decoupling Capacitor Optimization for I/O Power Integrity,” IEEE Transactions on Computer-aided Design, vol. 26, no. 4, pp. 734–738, April 2007.

[17] F. Zhang, The Schur complement and its applications. Springer, 2005.

[18] T. A. Manteuffel and A. B. White Jr., “The numerical solution of second-order boundary value problems on nonuniform meshes,” Mathematics of Computation, pp. 511–535, 1986.

[19] O. Schenk and K. Gärtner, “Solving unsymmetric sparse systems of linear equations with PARDISO,” Future Generation Computer Systems, vol. 20, no. 3, pp. 475–487, 2004.

[20] I. Novak, J. Miller, and E. Blomberg, “Simulating complex power-ground plane shapes with variable-size cell SPICE grids,” in Proceedings of IEEE Electrical Performance of Electronic Packaging, pp. 265–268, 2002.
