[IEEE TENCON'94 - 1994 IEEE Region 10's 9th Annual International Conference on: 'Frontiers of Computer Technology' - Singapore (22-26 Aug. 1994)] Proceedings of TENCON'94 - 1994 IEEE

A New Matrix Solution Technique for Large-scale Circuit Simulation on Multi-processor System

X. D. Jia and R. M. M. Chen

Department of Electronic Engineering, City Polytechnic of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong

Abstract

In this paper a new matrix solution technique is proposed for large-scale circuit simulation. This technique combines direct and iterative relaxation-based methods and is well suited to parallel processing. To make the technique robust and generally applicable, a heuristic method to identify the tight couplings and strong feedbacks among the sub-circuits is proposed. The identification is done by setting a tolerance for each equation. The tolerance computation requires only forward and backward substitution and can be done in parallel. A simple example illustrating the technique is also given.

I. Introduction

Circuit simulation is one of the most important CAD tools for IC design. It is also a very time-consuming task. In the early days of circuit simulation, direct matrix solution, the most commonly used approach [1], was acceptable since it represented a relatively small part of the total simulation time. With the size of today's VLSI circuits, this is not always true. Because the cost of direct matrix solution grows super-linearly with circuit size, some large circuits spend 90% or more of the simulation time on matrix solution.

The new generation of parallel processors offers an alternative for high-performance circuit simulation. Much effort has been devoted over the last decade to developing direct circuit simulation algorithms for parallel processing [2,3,4]. For direct methods, however, a large number of sub-circuits results in a large interconnection-level matrix equation that must be solved in the serial phase, so both the speed-up factor and the efficiency of the system are limited. The most encouraging published result so far may be the speed-up of 16 achieved on a 64-processor system [3].

Relaxation-based methods avoid the direct solution of a large sparse matrix equation and are relatively easy to implement on parallel processors [2]. Their execution times typically grow much more slowly as circuit size increases. Unfortunately, they are not guaranteed to converge for all circuit matrices and frequently converge slowly. This convergence problem has limited their use in practice. Recently we proposed a technique to improve the convergence speed of relaxation-based circuit simulation on tightly coupled circuits [5]. In this paper, we develop that technique further into a new general matrix solution technique. To make the technique robust and generally applicable, a heuristic method to identify the tight couplings and strong feedbacks among the sub-circuits is proposed. The identification is done by setting a tolerance for each equation. The tolerance computation requires only forward and backward substitution and can be done in parallel. The new matrix solution technique combines direct and iterative relaxation-based methods and is well suited to parallel processing.

II. Background Information

Let the linear or linearized circuit equation be

$$A x = y \tag{1}$$

where $A$ is an $n \times n$ circuit equation matrix and $x$ is the unknown $n$-vector to be solved for a given $n$-vector $y$.

This equation may be solved by a direct method using LU factorization together with a sparse matrix technique [1]. But the solution time $T$ grows super-linearly with the circuit size $n$ according to

$$T = n^{\alpha}, \quad \text{where } \alpha \approx 1.2 \text{ to } 2.$$

A typical relaxation-based method splits $A$ into two matrices:

$$A = M - N \tag{2}$$

The $k$-th iterative equation based on a given splitting is then

$$M x^{(k+1)} = N x^{(k)} + y \tag{3}$$


For Jacobi and Gauss-Seidel relaxation methods, M is diagonal and lower triangular respectively.
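The two classical splittings can be sketched in a few lines. The following is a minimal illustration in pure Python, not the authors' implementation; the function names and the 3x3 test system are hypothetical and chosen only so that the exact solution is known.

```python
from typing import Callable, List

Vector = List[float]
Matrix = List[List[float]]


def jacobi_sweep(A: Matrix, x: Vector, y: Vector) -> Vector:
    """Solve M x_new = N x + y with M = diag(A) and N = M - A."""
    n = len(A)
    return [
        (y[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
        for i in range(n)
    ]


def gauss_seidel_sweep(A: Matrix, x: Vector, y: Vector) -> Vector:
    """Solve M x_new = N x + y with M = lower triangle of A (diagonal included)."""
    n = len(A)
    x_new = list(x)
    for i in range(n):
        # Entries j < i already hold new values; entries j > i still hold old ones.
        s = sum(A[i][j] * x_new[j] for j in range(n) if j != i)
        x_new[i] = (y[i] - s) / A[i][i]
    return x_new


def relax(sweep: Callable[[Matrix, Vector, Vector], Vector],
          A: Matrix, y: Vector, sweeps: int) -> Vector:
    """Apply a fixed number of relaxation sweeps starting from x = 0."""
    x = [0.0] * len(y)
    for _ in range(sweeps):
        x = sweep(A, x, y)
    return x


# Hypothetical, strictly diagonally dominant system with exact solution [1, 2, 3].
A_demo = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
y_demo = [6.0, 12.0, 14.0]
x_jacobi = relax(jacobi_sweep, A_demo, y_demo, 100)
x_gauss_seidel = relax(gauss_seidel_sweep, A_demo, y_demo, 100)
```

Both sweeps approach the same solution here because the test matrix is strictly diagonally dominant; the only difference between them is which entries of $A$ are placed in $M$.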

The rate of convergence of any general iterative method of the form (3) depends on the spectral radius of $M^{-1}N$, defined by [8,9]:

$$\rho(M^{-1}N) = \max\{\, |\lambda| : \lambda \in (M^{-1}N) \,\} \tag{4}$$

where the notation $\lambda \in (M^{-1}N)$ means that $\lambda$ ranges over the set of eigenvalues of $M^{-1}N$. The limit of the average rate of convergence is the asymptotic rate of convergence, $R_\infty$,

$$\lim_{k \to \infty} R(B^k) = R_\infty(B) = -\ln \rho(B), \tag{5}$$

where $B = M^{-1}N$ [9]. Computing the spectral radius $\rho(B)$ can be very time-consuming for large $B$. However, it can be shown [9] that $\rho(B) \le \|B\|$. One valid choice for $\|B\|$ is the row sum norm. Thus,

$$\rho(B) \le \|B\|_\infty = \max_{i} \sum_{j} |b_{ij}|.$$

$$A = \begin{bmatrix} A_{11} & A_{12} & \cdots & A_{1r} \\ A_{21} & A_{22} & \cdots & A_{2r} \\ \vdots & \vdots & \ddots & \vdots \\ A_{r1} & A_{r2} & \cdots & A_{rr} \end{bmatrix}$$

The row sum norm can also be used as a sufficient condition for convergence: if $\|B\| < 1$, the iterative method converges for the particular splitting of $A$ into $M$ and $N$.
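As a small illustration of this sufficient condition, the sketch below evaluates the row sum norm of $B = M^{-1}N$ for the point Jacobi splitting, where $M^{-1}$ is trivial to apply. The helper names are hypothetical; `A_weak` is an invented weakly coupled matrix, and `A_tight` is a 2x2 block with a strong mutual coupling of the kind discussed in this paper.

```python
from typing import List

Matrix = List[List[float]]


def jacobi_iteration_matrix(A: Matrix) -> Matrix:
    """B = M^{-1} N for the point Jacobi splitting M = diag(A), N = M - A."""
    n = len(A)
    return [
        [0.0 if i == j else -A[i][j] / A[i][i] for j in range(n)]
        for i in range(n)
    ]


def row_sum_norm(B: Matrix) -> float:
    """Largest sum of absolute values taken over the rows of B."""
    return max(sum(abs(b) for b in row) for row in B)


A_weak = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
A_tight = [[1.01, -1.0], [-1.0, 1.01]]

norm_weak = row_sum_norm(jacobi_iteration_matrix(A_weak))
norm_tight = row_sum_norm(jacobi_iteration_matrix(A_tight))
```

For `A_weak` the norm is 0.5, so convergence is guaranteed and fast. For `A_tight` the norm is about 0.990: the condition still holds, but only just, which is consistent with the slow convergence expected when tightly coupled nodes are left in $N$.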

The form of $M$ obtained from splitting $A$ is important for improving the rate of convergence of the process. $M$ should be in such a form that $x^{(k+1)}$ is easy to solve for from

(8)

For general circuits, point-wise Gauss-Seidel and Gauss-Jacobi will converge if $A$ is strictly or irreducibly diagonally dominant, and most circuit matrices are of this type. Unfortunately their convergence is often slow, since the spectral radii of their iteration matrices are frequently close to one. Block Gauss-Seidel and block Gauss-Jacobi relaxation are approaches to this problem. In these methods, the circuit is partitioned into a large number of sub-circuits by identifying groups of strongly coupled nodes. Relaxation is used at the inter-sub-circuit level and a direct method is employed for the sub-circuit solution. These methods work well for MOS circuits. But for circuits which contain tightly coupled nodes and/or strong feedbacks, they still converge slowly or result in sub-circuits so large that the same shortcoming as the direct method appears [2,10].

III. New Technique

In this paper we employ the partitioning algorithm [4,6] for the solution of (7), so that $M$ is allowed to contain large and remote off-diagonal elements which represent the tight couplings and strong feedbacks among sub-circuits. These elements may cause $\rho$ to be very large if they are contained in $N$. How to select


$$x^i = M_{ii}^{-1}\,(y^i - D^i z^i) \tag{11}$$

where all items on the right-hand side of (11) are known except $z^i$. Define an $m$-vector $z$ whose elements consist of all the elements of $z^i$, for $i = 1, 2, \ldots, r$, without any duplication in terms of the elements of $x$. It was shown [6] that

$$z = (I_m - H)^{-1} v \tag{12}$$

where the elements of the $H$ matrix and the $v$ vector can be determined from

$$E^i = -M_{ii}^{-1} D^i \tag{13}$$

and

$$v^i = M_{ii}^{-1} y^i \tag{14}$$

respectively [6]. Equation (12) is called the interconnection level equation. Once $z$ is calculated from (12), all sub-circuit solutions can be computed in parallel from (11) to form $x^{(k+1)}$, which is the solution of (9) at the $(k+1)$-th iteration of the relaxation process. This solution is then used as the known vector in (7) again to solve for a new $x^{(k+1)}$.

Since the interconnection level equation (12) and all sub-circuit equations (11) are solved by using LU factorization, in the $k$-th iteration of the relaxation process (where $k > 1$) only forward and backward substitutions are required.

Since the effect of tight coupling or strong feedback on the $i$-th sub-circuit from nodes outside the sub-circuit is removed by solving for $z$ from (12), the convergence speed of the relaxation process can be much improved. How to identify the off-diagonal elements that should be contained in the $M$ matrix is therefore very important for our new method. If all the off-diagonal elements of $A$ are contained in $M$, our technique becomes the direct method using partitioning without relaxation. If $M$ does not contain any off-diagonal element outside the sub-circuit blocks, our method is the same as the normal block Jacobi relaxation.
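The effect of moving selected off-diagonal elements into $M$ can be sketched on a small dense example. This is an illustration under stated assumptions, not the paper's partitioned solver: `solve` is a dense Gaussian elimination standing in for sparse LU with the interconnection-level equation, every sub-circuit is a single node, and the 3x3 matrix with a strong coupling between nodes 0 and 2 is invented for the demonstration.

```python
from typing import Callable, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def solve(M: Matrix, b: Vector) -> Vector:
    """Dense Gaussian elimination with partial pivoting (stand-in for sparse LU)."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def split(A: Matrix, keep_in_M: Callable[[int, int], bool]) -> Tuple[Matrix, Matrix]:
    """Return (M, N) with A = M - N; the diagonal always stays in M."""
    n = len(A)
    M = [[A[i][j] if i == j or keep_in_M(i, j) else 0.0 for j in range(n)]
         for i in range(n)]
    N = [[M[i][j] - A[i][j] for j in range(n)] for i in range(n)]
    return M, N


def relax(A: Matrix, y: Vector, keep_in_M: Callable[[int, int], bool],
          tol: float = 1e-10, max_iter: int = 1000) -> int:
    """Iterate M x_new = N x + y and return the number of iterations used."""
    n = len(A)
    M, N = split(A, keep_in_M)
    x = [0.0] * n
    for k in range(1, max_iter + 1):
        rhs = [sum(N[i][j] * x[j] for j in range(n)) + y[i] for i in range(n)]
        x = solve(M, rhs)
        residual = max(abs(sum(A[i][j] * x[j] for j in range(n)) - y[i])
                       for i in range(n))
        if residual < tol:
            return k
    return max_iter


# Hypothetical matrix: nodes 0 and 2 are tightly coupled, node 1 only weakly.
A_demo = [[2.0, -0.1, -1.0], [-0.1, 2.0, -0.1], [-1.0, -0.1, 2.0]]
y_demo = [1.0, 1.0, 1.0]

it_direct = relax(A_demo, y_demo, lambda i, j: True)             # M = A
it_tight = relax(A_demo, y_demo, lambda i, j: {i, j} == {0, 2})  # tight pair in M
it_jacobi = relax(A_demo, y_demo, lambda i, j: False)            # M = diag(A)
```

With everything in $M$ the loop stops after one pass, which is the direct method. Keeping only the tightly coupled pair in $M$ needs noticeably fewer iterations than the Jacobi splitting, while $M$ stays much sparser than $A$.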

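$$M = \begin{bmatrix} a_{11} & a_{12} & 0 & 0 & a_{15} & 0 \\ a_{21} & a_{22} & 0 & 0 & 0 & 0 \\ 0 & 0 & a_{33} & a_{34} & 0 & 0 \\ 0 & 0 & a_{43} & a_{44} & 0 & 0 \\ 0 & 0 & 0 & 0 & a_{55} & a_{56} \\ 0 & 0 & 0 & 0 & a_{65} & a_{66} \end{bmatrix}$$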

IV. Identification of Tight Couplings and Large Feedbacks among Sub-circuits

Theoretically, if an off-diagonal element remaining in $N$ causes a large spectral radius $\rho(B)$, then this element represents a tight coupling or strong feedback and should be moved from $N$ to $M$. However, spectral radii are very difficult to calculate. The row sum norm, which is easy to compute, can be used as a sufficient condition for convergence. Therefore a heuristic procedure based on a rule similar to that of [7] can be derived to identify tight couplings and strong feedbacks among sub-circuits. This identification process is done by setting a tolerance $T$ for each equation. Any off-diagonal element in the

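$$N = -\begin{bmatrix} 0 & 0 & a_{13} & a_{14} & 0 & a_{16} \\ 0 & 0 & a_{23} & a_{24} & a_{25} & a_{26} \\ a_{31} & a_{32} & 0 & 0 & a_{35} & a_{36} \\ a_{41} & a_{42} & 0 & 0 & a_{45} & 0 \\ 0 & 0 & a_{53} & a_{54} & 0 & 0 \\ a_{61} & 0 & 0 & a_{64} & 0 & 0 \end{bmatrix}$$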

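If $a_{35}$ is the only nonzero element of $N$, the only nonzero column of $B = M^{-1}N$ is the fifth one, with entries of magnitude $|e_{3,i}|\,|a_{35}|$, where $e_3$ is the third column of $M^{-1}$, and the row sum norm $\|B\|$ is $(\max_{1 \le i \le 6} |e_{3,i}|)\,|a_{35}|$. If $T_N$ is the row sum norm value that we expect to give a fast convergence rate and $(\max_{1 \le i \le 6} |e_{3,i}|)\,|a_{35}| < T_N$, then $a_{35}$ will not cause a larger row sum norm than expected and should remain in $N$. Otherwise it should be moved from $N$ to $M$. The vector $e_3$ can be calculated by solving the equation

$$M e_3 = s_3 \tag{15}$$

where $s_3 = [0, 0, 1, 0, 0, 0]^T$. Similarly, for $a_{31}$, $a_{32}$ and $a_{36}$ the same vector is obtained, and thus the tolerance for the third equation, $T_3$, is defined as

$$T_3 = T_N / (\max_{1 \le i \le 6} |e_{3,i}|). \tag{16}$$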


Although this has been derived for a simple example of a $6 \times 6$ matrix, it generalizes to any $n \times n$ matrix. In general, for the $p$-th equation we have

$$T_p = T_N / (\max_{1 \le i \le n} |e_{p,i}|) \tag{17}$$

and

$$M e_p = s_p \tag{18}$$

where $s_p$ is a vector whose $p$-th element is one and all the other elements are zero.

$$A = \begin{bmatrix} 1.0040 & -0.001 & 0 & -1 & 0 & 0 \\ -0.1010 & 0.1100 & -0.001 & 0 & 0 & -0.001 \\ 0 & -0.001 & 1.003 & -0.0101 & 0 & 0 \\ -1 & 0 & -0.01 & 1.0101 & -0.01 & 0 \\ 0 & 0 & 0 & -0.01 & 1.01 & -1 \\ 0 & 0 & 0 & 0 & -1.0 & 1.01 \end{bmatrix}$$

It should be noted that these formulas are derived by considering only one off-diagonal element. The actual circuit matrix may have more than one off-diagonal element in $N$, and this may make the rule more complicated. However, a conservative selection of $T_N$, say 0.025, prevents this from being a problem [7].
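The tolerance rule of (17) and (18) can be sketched as follows. This is only an illustration: `solve` is a dense Gaussian elimination used in place of the stored LU factors, the index `p` is zero-based, and the 2x2 matrix `M_demo` is a hypothetical example small enough to check by hand.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def solve(M: Matrix, b: Vector) -> Vector:
    """Dense Gaussian elimination with partial pivoting (stand-in for LU factors)."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def tolerance(M: Matrix, p: int, T_N: float) -> float:
    """T_p = T_N / max_i |e_{p,i}|, where M e_p = s_p and s_p is the p-th unit vector."""
    n = len(M)
    s_p = [1.0 if i == p else 0.0 for i in range(n)]
    e_p = solve(M, s_p)
    return T_N / max(abs(v) for v in e_p)


# Hypothetical M: e_0 = [0.5, 0] and e_1 = [-0.125, 0.25].
M_demo = [[2.0, 1.0], [0.0, 4.0]]
T_demo = [tolerance(M_demo, p, 0.1) for p in range(2)]
```

With $T_N = 0.1$ the two tolerances come out as 0.2 and 0.4. An off-diagonal element $a_{pj}$ currently in $N$ would then be moved to $M$ when $|a_{pj}| > T_p$.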

When deriving (17) and (18), $M$ is assumed to be known. For circuit simulation, the $M$ matrix created during the previous Newton iteration may be used as $M$ for computing the tolerance. This is reasonable because a limit on the truncation error can be set in circuit simulation as a safeguard against drastic changes of the circuit matrix between time points, and any significant change will in any case require many Newton iterations for the simulation to converge. Therefore the LU factorization results for all sub-circuit matrices and for the $(I_m - H)$ matrix in (12) from the previous Newton iteration can be used to calculate $e_p$ from (18), and only forward and backward substitution operations are needed. The tolerance computation can also be done in parallel. Moreover, since the right-hand side of (18) contains only one nonzero element, for some sub-circuits the right-hand side elements are all zero and the substitution operations can be skipped. For the equations corresponding to internal nodes, the computation of $T_p$ can also be skipped, since there is no off-diagonal element in these equations. We therefore need to compute $T_p$ only $I$ times, where $I$ is the number of interconnecting nodes of the circuit.

V. The Procedure

The main procedure of our matrix solution technique is described as follows:

/** SOLVING LINEARIZED MATRIX AT NEWTON ITERATION **/

for each nonzero off-diagonal element a_ij, 1 <= i <= n and 1 <= j <= n    /** parallel **/
    if |a_ij| > T_i, move a_ij to M matrix, else a_ij remains in N matrix
end for

for each sub-circuit i, 1 <= i <= r    /** parallel **/
    form M_ii and D^i
    LU factorization for M_ii
end for
form H through computing E^i
LU factorization for (I_m - H)

set x^(k) = x^(0)
Convergenceflag <- false
while Convergenceflag = false
    set y^(k) = N x^(k) + y and form v through computing v^i    /** parallel **/
    forward and backward substitution for L_(I-H) U_(I-H) to obtain z
    for each sub-circuit i, 1 <= i <= r    /** parallel **/
        forward and backward substitution for L_Mii U_Mii to obtain x^i
        check convergence
    end for
    if all sub-circuits converge, Convergenceflag <- true
    x^(k) = x^(k+1)
end while

for each equation, say the p-th equation, 1 <= p <= n    /** parallel **/
    if this equation corresponds to a connecting node
        form s_p and v
        forward and backward substitution for L_(I-H) U_(I-H) to obtain z
        for each sub-circuit i, 1 <= i <= r
            form the right-hand side of (11) with y = s_p
            if the right-hand side has a nonzero element
                forward and backward substitution for L_Mii U_Mii to obtain e_p^i
            end if
        end for
        select max |e_p,i|, 1 <= i <= n, and compute tolerance T_p by (17)
    end if
end for
/** next Newton iteration **/
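The first step of the procedure, assigning each nonzero entry of $A$ to $M$ or $N$, can be sketched as below. The function name, the `block_of` partition list and the 4x4 test matrix are hypothetical; the sketch assumes that entries inside a sub-circuit always belong to $M$ and that the tolerances $T_i$ are already available from the previous Newton iteration.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def build_splitting(A: Matrix, block_of: List[int],
                    T: List[float]) -> Tuple[Matrix, Matrix]:
    """Return (M, N) with A = M - N.

    block_of[i] is the sub-circuit index of node i and T[i] is the tolerance
    of equation i. Entries inside a sub-circuit always stay in M; an entry
    that links two sub-circuits goes to M only when |a_ij| > T[i].
    """
    n = len(A)
    M = [[0.0] * n for _ in range(n)]
    N = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if A[i][j] == 0.0:
                continue
            if block_of[i] == block_of[j] or abs(A[i][j]) > T[i]:
                M[i][j] = A[i][j]
            else:
                N[i][j] = -A[i][j]  # sign follows A = M - N
    return M, N


# Hypothetical example: two sub-circuits {0, 1} and {2, 3}.
A_demo = [
    [1.0, -0.1, -0.5, 0.0],
    [-0.1, 1.0, 0.0, -0.001],
    [-0.5, 0.0, 1.0, -0.1],
    [0.0, -0.001, -0.1, 1.0],
]
M_demo, N_demo = build_splitting(A_demo, [0, 0, 1, 1], [0.01] * 4)
```

In this example the strong couplings between nodes 0 and 2 exceed the tolerance and are placed in $M$, while the weak couplings between nodes 1 and 3 are left in $N$ for the relaxation to handle.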

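and

$$N = \begin{bmatrix} 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0.001 & 0 & 0 & 0.001 \\ 0 & 0.001 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0.01 & 0 \\ 0 & 0 & 0 & 0.01 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$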


Since there are some tight couplings and strong feedbacks among blocks, the iteration converges very slowly and the spectral radius is 0.998.

By comparing with the diagonal elements, $a_{1,4}$ and $a_{4,1}$ are identified as strong feedbacks but $a_{4,5}$ and $a_{5,4}$ are not. Therefore $a_{1,4}$ and $a_{4,1}$ are moved from $N$ to $M$. Although the convergence rate is improved, it is still slow. The corresponding row sum norm and spectral radius are 0.9092 and 0.8347 respectively. This is because $a_{4,5}$ and $a_{5,4}$ are also critical for the splitting. Following the identification procedure, we choose $T_N$ to be 0.1. The results of $(\max_{1 \le i \le n} |e_{5,i}|)$ and $(\max_{1 \le i \le n} |e_{4,i}|)$ are 90.91 and 76.51 respectively from (18), and the results of $T_5$ and $T_4$ are 0.0011 and 0.0013 respectively from (17). Since $|a_{5,4}| = |a_{4,5}| = 0.01$, we have $|a_{5,4}| > T_5$ and $|a_{4,5}| > T_4$, so $a_{4,5}$ and $a_{5,4}$ are contained in $M$ based on the technique. The splitting using the technique is

$$M = \begin{bmatrix} 1.0040 & -0.001 & 0 & -1 & 0 & 0 \\ -0.1010 & 0.1100 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1.003 & -0.0101 & 0 & 0 \\ -1 & 0 & -0.01 & 1.0101 & -0.01 & 0 \\ 0 & 0 & 0 & -0.01 & 1.01 & -1 \\ 0 & 0 & 0 & 0 & -1.0 & 1.01 \end{bmatrix}$$

and

$$N = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0.001 & 0 & 0 & 0.001 \\ 0 & 0.001 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

The spectral radius is 0.0083 and the iteration converges in three iterations.
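The three spectral radii quoted in this example can be related to iteration counts through the asymptotic rate of convergence in (5). The sketch below is a rough estimate only: the error-reduction target of $10^{-6}$ is an assumption made here, since the paper does not state its convergence criterion, and `iterations_for` is a hypothetical helper.

```python
import math


def iterations_for(rho: float, reduction: float = 1e-6) -> int:
    """Smallest k with rho**k <= reduction, using R_inf = -ln(rho) from (5)."""
    rate = -math.log(rho)  # asymptotic rate of convergence
    return math.ceil(-math.log(reduction) / rate)


# Spectral radii reported for the three splittings of the example.
estimates = {rho: iterations_for(rho) for rho in (0.998, 0.8347, 0.0083)}
```

Under this assumption the estimate is about 6901 iterations for $\rho = 0.998$, 77 for $\rho = 0.8347$ and 3 for $\rho = 0.0083$, which is in line with the three iterations reported for the final splitting.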

VI. Conclusion

A new matrix solution technique for circuit simulation on multi-processor systems is proposed. This technique combines the direct method with the relaxation-based method to overcome the shortcomings of both. A heuristic procedure to identify tight couplings and strong feedbacks among sub-circuits is also derived. This procedure aims to ensure that the row sum norm of the splitting used in the technique is small enough to achieve a fast convergence rate. The identification process is straightforward and can be performed in parallel. The simple numerical example demonstrates the effectiveness of the identification procedure employed in this technique.

Acknowledgment

The authors are grateful to the University and Polytechnic Grants Committee of Hong Kong for the financial support of this research project.

References

[1] L. W. Nagel, "SPICE2: A computer program to simulate semiconductor circuits," Univ. California, Berkeley, ERL Memo ERL-M520, May 1975.

[2] R. A. Saleh et al., "Parallel Circuit Simulation on Supercomputers," Proc. IEEE, Vol. 77, No. 12, pp. 1915-1931, Dec. 1989.

[3] T. Nakata et al., "CENJU: A Multiprocessor System for Modular Circuit Simulation," Computing Systems in Engineering, Vol. 1, No. 1, pp. 101-109, 1990.

[4] R. M. M. Chen et al., "Running SPICE in Parallel," Proc. 1991 IEEE International Symposium on Circuits and Systems, Singapore, pp. 880-883, June 1991.

[5] R. M. M. Chen and X. D. Jia, "A Technique to Improve the Convergency Speed of Relaxation-based Circuit Simulation on Tightly Coupled Circuits," Proc. 1993 IEEE International Symposium on Circuits and Systems, Chicago, May 1993.

[6] R. M. M. Chen, "Solving a Class of Large Sparse Linear Systems of Equations by Partitioning," Proc. 1973 IEEE International Symposium on Circuit Theory, Toronto, Canada, pp. 223-226, April 1973.

[7] R. Burch et al., "A New Matrix Solution Technique for General Circuit Simulation," IEEE Trans. Computer-Aided Design, Vol. 12, No. 2, pp. 225-241, February 1993.

[8] G. H. Golub and C. F. Van Loan, Matrix Computations, 2nd Edition, Johns Hopkins University Press, 1989.

[9] R. S. Varga, Matrix Iterative Analysis, Englewood Cliffs, NJ: Prentice-Hall, 1962.

[10] M. P. Desai and I. N. Hajj, "On the convergence of block relaxation methods for circuit simulation," IEEE Trans. on Circuits and Systems, Vol. 36, No. 7, pp. 948-958, July 1989.

[11] A. Lumsdaine et al., "A Band Relaxation Algorithm for Reliable and Parallelizable Circuit Simulation," Proc. ICCAD-88, pp. 408-411, November 1989.
