Solving [Specific Classes of] Linear Equations using Random Walks

Haifeng Qian

Sachin Sapatnekar

Definition of diagonal dominance

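Matrix A is diagonally dominant if, for every row i,

$a_{ii} \ge \sum_{j \ne i} |a_{ij}|$

Consider a system of linear equations

$A \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}$ (abbreviated as A x = b)

Can solve A x = b rapidly using a random walk analogy if A is diagonally dominant (but not singular, of course!)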

Why do I care?

Diagonally dominant systems arise in several contexts in CAD (and in other fields), for example:
  Power grid analysis
  VLSI placement
  ESD analysis
  Thermal analysis
  FEM/FDM analyses
  Potential theory

The idea for random walk-based linear solvers has been around even in the popular literature:
  R. Hersh and R. J. Griego, “Brownian motion and potential theory,” Scientific American, pp. 67-74, March 1969.
  P. G. Doyle and J. L. Snell, Random walks and electrical networks, Mathematical Association of America, Washington DC, 1984. http://math.dartmouth.edu/~doyle/docs/walks/walks.pdf

Dirichlet’s problem

Dirichlet problem: an example from thermal analysis
  Given a body of arbitrary shape, and complete information about temperature on the surface: find the temperature at an internal point
  Temperature is a harmonic function: the temperature at a point is the average of the temperature at the surrounding points

Shizuo Kakutani (1944): solution of the Dirichlet problem
  Brownian motion starting from a point (say, a)
  Take an award T = temperature of the first boundary point hit (say, b)
  Find E[T]
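A discrete analogue of Kakutani's construction can be sketched in Python. This is only a minimal sketch under assumptions that are not in the slides: the body is a square grid, Brownian motion is replaced by a simple random walk to the four neighbors, and the names `dirichlet_estimate` and `boundary_temp` are hypothetical.

```python
import random

def dirichlet_estimate(start, size, boundary_temp, num_walks=5000, seed=0):
    """Estimate the temperature at `start` on a grid with points 0..size.

    Assumption: points with x or y equal to 0 or `size` are the boundary,
    and `boundary_temp(x, y)` returns the known surface temperature there.
    """
    rng = random.Random(seed)
    steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    total = 0.0
    for _ in range(num_walks):
        x, y = start
        # Walk until the first boundary point is hit.
        while 0 < x < size and 0 < y < size:
            dx, dy = rng.choice(steps)
            x, y = x + dx, y + dy
        total += boundary_temp(x, y)  # the award T for this walk
    return total / num_walks  # sample average approximates E[T]
```

With the boundary temperature set to the x coordinate (a harmonic function on the grid), the estimate at the interior point (3, 5) of a grid of size 10 should come out close to 3.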

A Direct Solver

Stochastic solvers

Stochastic solver methodology:
  Equations ↔ Random event(s)
  Variables ↔ Expectation of random variables
  Approximate solution ↔ Average of random samples

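Example: Power grid analysis

Network of resistors and current sources

The equation at node x is

$g_1 (V_1 - V_x) + g_2 (V_2 - V_x) + g_3 (V_3 - V_x) + g_4 (V_4 - V_x) = I_x$

Alternatively

$V_x = \frac{g_1}{\sum_i g_i} V_1 + \frac{g_2}{\sum_i g_i} V_2 + \frac{g_3}{\sum_i g_i} V_3 + \frac{g_4}{\sum_i g_i} V_4 - \frac{I_x}{\sum_i g_i}$

[Figure: node x connected to neighbors 1, 2, 3, 4 through conductances g1, g2, g3, g4, with current source Ix at x]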

Mapping this to the random walk “game”

Solving a grid amounts to solving, at each node:

$V_x = \sum_{i=1}^{\mathrm{degree}(x)} \frac{g_i}{\sum_{j=1}^{\mathrm{degree}(x)} g_j} V_i - \frac{I_x}{\sum_{j=1}^{\mathrm{degree}(x)} g_j}$

or

$V_{\mathrm{pad}} = V_{DD}$ or $0$

Given:
  A network of roads
  A motel at each intersection
  A set of homes

Random walk:
  Walk one (randomly chosen) road every day
  Stay the night at a motel (pay for it!)
  Keep going until reaching home
  Win a reward for reaching home!

Problem: find the expected amount of money in the end as a function of the starting node x

$f(x) = E[\text{money in pocket at the end, from node } x]$

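Random walk overview

For every node

$f(x) = p_{x,1} f(1) + p_{x,2} f(2) + p_{x,3} f(3) + p_{x,4} f(4) - m_x$, where $m_x$ is the motel price at node x

or

$f(\text{home}) = \text{award}$

[Figure: node x with neighbors 1, 2, 3, 4, reached with probabilities px,1, px,2, px,3, px,4]

Random walk game ↔ Linear equation set

M walks from i-th node → Take average → This yields xi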
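The "M walks, take average" recipe can be sketched for a general system A x = b. The sketch below assumes one common mapping that the slides do not spell out for a general matrix: from node i the walk moves to node j with probability -a_ij/a_ii, the motel at i pays b_i/a_ii (a negative price), and the leftover probability leads to a home with award 0. It also assumes positive diagonal entries, non-positive off-diagonal entries, and enough diagonal dominance that every walk ends. The function name is hypothetical.

```python
import random

def random_walk_solve_entry(A, b, i, num_walks=20000, rng=None):
    """Estimate x[i] of A x = b as the average gain of walks started at node i."""
    rng = rng or random.Random(0)
    n = len(A)
    total = 0.0
    for _ in range(num_walks):
        node, gain = i, 0.0
        while node is not None:
            gain += b[node] / A[node][node]  # motel payment at this node
            u, acc, nxt = rng.random(), 0.0, None
            for j in range(n):
                if j != node and A[node][j] != 0.0:
                    acc += -A[node][j] / A[node][node]  # transition probability
                    if u < acc:
                        nxt = j
                        break
            node = nxt  # None means the walk reached a home with award 0
        total += gain
    return total / num_walks
```

For A = [[4, -1, -1], [-1, 4, -1], [-1, -1, 4]] and b = [2, 2, 2], whose exact solution is x = (1, 1, 1), the estimate for entry 0 should land near 1. Only the requested entry is computed, which is the locality property of this kind of solver.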

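Example

Game: every home pays an award of $1; motel prices are $0.2 at A, $0.05 at B, $0.022 at C, $0.04 at D
  From A: to C with probability 0.67, to a home with 0.33
  From B: to C with probability 0.5, to a home with 0.5
  From C: to A with probability 0.44, to B with 0.44, to D with 0.12
  From D: to C with probability 0.2, to a home with 0.8

Equations:

$x_A = 0.67 x_C + 0.13$
$x_B = 0.5 x_C + 0.45$
$x_C = 0.44 x_A + 0.44 x_B + 0.12 x_D - 0.022$
$x_D = 0.2 x_C + 0.76$

Solution: xA=0.6 xB=0.8 xC=0.7 xD=0.9

Equivalent game with the same transition probabilities: every home pays $0; the motels pay +$0.13 at A, +$0.45 at B, +$0.76 at D, and charge $0.022 at C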

Play the game

Walk results (pocket after each step, starting from A):
  $0 → -$0.2 → -$0.222 → -$0.272 → $0.728
  $0 → -$0.2 → $0.8
  $0 → -$0.2 → -$0.222 → -$0.422 → -$0.444 → -$0.494 → -$0.516 → -$0.556 → $0.444
  $0 → -$0.2 → -$0.222 → -$0.422 → $0.578
  $0 → -$0.2 → -$0.222 → -$0.262 → -$0.284 → -$0.484 → -$0.506 → -$0.556 → -$0.578 → -$0.618 → $0.382
  $0 → -$0.2 → -$0.222 → -$0.262 → $0.738
  ……

Average → 0.6
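The example game can be replayed in a few lines of Python. This is a sketch, not the authors' code: the assignment of motel prices and home edges to nodes is inferred from the example equations (for instance 0.13 = 0.33 × $1 - $0.2 at A), and the names `MOTEL`, `MOVES`, `play_once` and `estimate` are hypothetical.

```python
import random

# Motel price paid on each visit, and moves out of each node with their
# probabilities. "HOME" stands for any home; every home awards $1.
MOTEL = {"A": 0.2, "B": 0.05, "C": 0.022, "D": 0.04}
MOVES = {
    "A": [("C", 0.67), ("HOME", 0.33)],
    "B": [("C", 0.5), ("HOME", 0.5)],
    "C": [("A", 0.44), ("B", 0.44), ("D", 0.12)],
    "D": [("C", 0.2), ("HOME", 0.8)],
}
AWARD = 1.0

def play_once(start, rng):
    """Play one walk and return the money in the pocket at the end."""
    pocket, node = 0.0, start
    while node != "HOME":
        pocket -= MOTEL[node]  # pay for the night
        targets = [t for t, _ in MOVES[node]]
        weights = [w for _, w in MOVES[node]]
        node = rng.choices(targets, weights=weights)[0]
    return pocket + AWARD

def estimate(start, num_walks=20000, seed=0):
    """Average over many walks; approximates the solution entry for `start`."""
    rng = random.Random(seed)
    return sum(play_once(start, rng) for _ in range(num_walks)) / num_walks
```

Calling `estimate("A")` should give a value close to the slide's 0.6, and `estimate("D")` a value close to 0.9.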

Reusing computations: avoiding repeated walks (when solving for all xi values)

Qian et al., DAC2003

[Figure: a walk from a start node reaches a previously calculated node, whose value f = E[gain] is known; that node becomes a new home with award = f = E[gain]]

Benefit: more and more homes
  shorter and shorter walks
  one walk = average of multiple walks
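The "new home" trick can be sketched as follows. The sketch reuses the assumed mapping for a general A x = b (move to j with probability -a_ij/a_ii, collect b_i/a_ii at node i, leftover probability ends the walk with award 0); the function names are hypothetical and the node order is simply 0, 1, 2, ..., which the slides do not prescribe.

```python
import random

def next_node(A, node, rng):
    """Pick the next node, or None if the walk ends at a zero-award home."""
    u, acc = rng.random(), 0.0
    for j, a in enumerate(A[node]):
        if j != node and a != 0.0:
            acc += -a / A[node][node]
            if u < acc:
                return j
    return None

def solve_all_with_reuse(A, b, num_walks=20000, seed=0):
    """Solve for every entry; each solved node becomes a home for later walks."""
    rng = random.Random(seed)
    n = len(A)
    solved = {}  # node -> estimated value, used as the award of a new home
    for start in range(n):
        total = 0.0
        for _ in range(num_walks):
            node, gain = start, 0.0
            while node is not None:
                if node in solved:
                    gain += solved[node]  # stop early at a previously calculated node
                    break
                gain += b[node] / A[node][node]
                node = next_node(A, node, rng)
            total += gain
        solved[start] = total / num_walks
    return [solved[i] for i in range(n)]
```

Later entries see more homes, so their walks are shorter on average; the price is that errors in earlier estimates feed into later ones.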

Reusing computations: “Journey record” (when solving for multiple right hand sides)

Qian et al., DAC2003

Keep a record: motel/award list

New RHS: Ax = b2
  Update motel prices, award values
  Use the record: pay motels, receive awards
  New solution

Benefit: no more walks

Only feasible after trick #1
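A stripped-down journey record can be sketched like this. It is a simplification, not the DAC 2003 scheme: it keeps the record for a single start node, all homes have award 0 (so only motel visits are recorded), and it does not use the new-home trick that the slide says makes the record feasible in practice. The names `build_record` and `solve_from_record` are hypothetical.

```python
import random

def build_record(A, start, num_walks=20000, seed=0):
    """Count the nights spent at each motel over all walks from `start`."""
    rng = random.Random(seed)
    n = len(A)
    nights = [0] * n
    for _ in range(num_walks):
        node = start
        while node is not None:
            nights[node] += 1
            u, acc, nxt = rng.random(), 0.0, None
            for j in range(n):
                if j != node and A[node][j] != 0.0:
                    acc += -A[node][j] / A[node][node]
                    if u < acc:
                        nxt = j
                        break
            node = nxt
    return nights, num_walks

def solve_from_record(A, record, b):
    """Re-price the motels for a right-hand side b and replay the record."""
    nights, num_walks = record
    return sum(nights[j] * b[j] / A[j][j] for j in range(len(b))) / num_walks
```

The walks are generated once; each new right-hand side then costs one pass over the record instead of a fresh set of walks.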

Error vs. runtime tradeoffs

Industrial circuit: 70729 nodes, 31501 bottom-layer nodes
VDD net true voltage range: 1.1324 to 1.1917

[Plot: error margin (mV), axis 0 to 20, versus runtime (seconds), axis 0 to 60]

Advantage
  Locality: solve single entry

Weakness
  Error ~ M^{-0.5}
  About 3% error must be tolerated to be faster than direct/iterative solvers
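The M^{-0.5} behavior is the usual Monte Carlo standard error. As a rough sketch, assuming the M walk gains are independent with standard deviation σ, and writing ε for the target error (both symbols are introduced here and are not in the slides):

```latex
\text{error} \approx \frac{\sigma}{\sqrt{M}}
\quad\Longrightarrow\quad
M \approx \left(\frac{\sigma}{\epsilon}\right)^{2}
```

So halving the error takes about four times as many walks, which is why the stand-alone stochastic solver is attractive mainly for approximate solutions.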

Example application: Power grid analysis

Exact DC analysis: Solve G X = E

Very expensive to solve for millions of nodes, eventually prohibitive

Simple observations:
  VDD and GND pins all over chip surface (C4 connections)
  Most current drawn from “nearby” connections

A preconditioned iterative solver

$A x = b \;\Rightarrow\; T A x = T b$, s.t. $T A \approx I$ and the $T$-vector product is cheap to compute

Current techniques

Preconditioning

Popular choice: Incomplete LU, $A \approx L'U'$, $T = (L'U')^{-1}$

Placement/thermal matrices
  Symmetric positive definite
  Popular choice: ICCG with
    Different ordering
    Different dropping rules

Current techniques

[Figure: sparsity pattern of a matrix during incomplete factorization, with finished parts marked "done"]

Rules
  Pattern
  Min value
  Size limit

Stochastic preconditioning

The Hybrid Solver combines:
  Stochastic Solver: stochastic preconditioning
  Iterative Solver: adopt an iterative scheme

Special case today
  Symmetric
  Positive diagonal entries
  Negative off-diagonal entries
  Irreducibly diagonally dominant
These are sufficient, NOT necessary, conditions
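The four conditions can be checked mechanically on a small dense matrix. The sketch below makes two assumptions of its own: "negative off-diagonal entries" is read as "non-positive" (zeros are allowed, as in any sparse matrix), and "irreducibly diagonally dominant" is read in the usual sense of a connected matrix graph, weak dominance in every row, and strict dominance in at least one row. The function name is hypothetical.

```python
def satisfies_special_case(A, tol=1e-12):
    """Return True if A meets the four sufficient conditions listed above."""
    n = len(A)
    strict_row = False
    for i in range(n):
        if A[i][i] <= 0:
            return False  # diagonal must be positive
        off_sum = 0.0
        for j in range(n):
            if j == i:
                continue
            if abs(A[i][j] - A[j][i]) > tol or A[i][j] > 0:
                return False  # not symmetric, or a positive off-diagonal entry
            off_sum += -A[i][j]
        if A[i][i] < off_sum - tol:
            return False  # row is not diagonally dominant
        if A[i][i] > off_sum + tol:
            strict_row = True
    # Irreducibility: the graph of nonzero off-diagonal entries is connected.
    seen, stack = {0}, [0]
    while stack:
        i = stack.pop()
        for j in range(n):
            if j != i and A[i][j] != 0 and j not in seen:
                seen.add(j)
                stack.append(j)
    return strict_row and len(seen) == n
```

For example, [[4, -1, -1], [-1, 4, -1], [-1, -1, 4]] passes, while [[1, -1], [-1, 1]] fails because no row is strictly dominant (the matrix is singular).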

Sequential Monte Carlo

A. W. Marshall 1956, J. H. Halton 1962

Approx. solve: $A x = b$ → Stochastic Solver → $x'$

Error & residual: $A x - A x' = b - A x'$, i.e. $A (x - x') = b - A x'$; with $y = x - x'$ and $r = b - A x'$ this reads $A y = r$

Approx. solve: $A y = r$ → Stochastic Solver → $y'$

New solution: $x' + y'$

Benefit: $\|r\|_2 \ll \|b\|_2$, $\|y\|_2 \ll \|x\|_2$; same relative error = lower absolute error
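The refinement loop can be sketched with a plain random walk estimator standing in for the stochastic solver. Assumptions made here and not taken from the slides: the same mapping as before (move to j with probability -a_ij/a_ii, collect b_i/a_ii at node i, leftover probability ends the walk), fresh walks in every round rather than a journey record, and the hypothetical names `walk_gain`, `stochastic_solve` and `sequential_monte_carlo`.

```python
import random

def walk_gain(A, b, start, rng):
    """Gain of one walk: collect b[k]/A[k][k] at every visited node k."""
    node, gain = start, 0.0
    while node is not None:
        gain += b[node] / A[node][node]
        u, acc, nxt = rng.random(), 0.0, None
        for j in range(len(A)):
            if j != node and A[node][j] != 0.0:
                acc += -A[node][j] / A[node][node]
                if u < acc:
                    nxt = j
                    break
        node = nxt
    return gain

def stochastic_solve(A, b, num_walks, rng):
    """Approximate solve: average num_walks gains for every entry."""
    return [sum(walk_gain(A, b, i, rng) for _ in range(num_walks)) / num_walks
            for i in range(len(A))]

def sequential_monte_carlo(A, b, rounds=4, num_walks=400, seed=0):
    """Repeatedly solve A y = r for the current residual and add y to x."""
    rng = random.Random(seed)
    n = len(A)
    x = [0.0] * n
    for _ in range(rounds):
        r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        y = stochastic_solve(A, r, num_walks, rng)  # first round: r = b, y = x'
        x = [xi + yi for xi, yi in zip(x, y)]
    return x
```

Each round has roughly the same relative error, but it is relative to a shrinking residual, so a few cheap rounds can reach an accuracy that a single solve would need far more walks for.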

This is computationally easy!

The refinement flow:
  $A x = b$ → Stochastic Solver → $x'$
  $A (x - x') = b - A x'$, i.e. $A y = r$
  $A y = r$ → Stochastic Solver → $y'$
  New solution: $x' + y'$

The residual r is just a new right hand side (New RHS: Ax = b2), so the journey record applies:
  Keep a record: motel/award list
  Update motel prices, award values
  Use the record: pay motels, receive awards
  New solution

Can show that this amounts to preconditioned Gauss-Jacobi

So, why G-J? Why not CG/BiCG/MINRES/GMRES?

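What’s on the journey record?

$A y = r_k$ → Solver using the record → $y_k = T r_k$, applied to the full vectors $(y_1, y_2, \ldots, y_N)$ and $(r_1, r_2, \ldots, r_N)$

The record relates $y_k$ and $r_k$ through two triangular matrices, $Z$ and $Q$

[Figure: sparsity patterns of a lower triangular matrix and a unit upper triangular matrix]

Row i comes from the walks that start at node i:
  $Q_{i,j}$: from the ratio (# walks from i to j) / (# walks from i)
  $Z_{i,j}$: from the ratio (# nights at motel j) / (# walks from i), divided by a diagonal entry of A

These are “UL” factors.

Relation to LU factors? Just a matrix reordering!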

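LDL factorization

What we need for symmetric A:

$\mathrm{rev}(A) = L_{\mathrm{rev}(A)} U_{\mathrm{rev}(A)} = L_{\mathrm{rev}(A)} D_{\mathrm{rev}(A)} L_{\mathrm{rev}(A)}^{T}$

What we have: $Q$ and $Z$, which correspond (up to reordering, transposition and inversion) to the triangular factors $L_{\mathrm{rev}(A)}$ and $U_{\mathrm{rev}(A)}$

How to find $D_{\mathrm{rev}(A)}$? Easy; details omitted here

Easy extension to asymmetric matrices exists

Recall:
  $Q_{i,j}$: from the ratio (# walks from i to j) / (# walks from i)
  $Z_{i,j}$: from the ratio (# nights at motel j) / (# walks from i), divided by a diagonal entry of A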

Compare to existing ILU

Existing ILU
  Gaussian elimination
  Drop edges by pattern, value, size

Error propagation
  A missing edge affects subsequent computation
  Exacerbated for larger and denser matrices

[Figure, two panels: node a with neighbors b1 to b5, and the neighbors b1 to b5 alone]

Superior because …
  Each row of L is independently calculated
  No knowledge of other rows
  Responsible for its own accuracy
  No debt from other steps

Test setup

Quadratic placement instances
  Set #1: matrices and rhs’s by an industrial placer
  Set #2: matrices by UWaterloo placer on ISPD02 benchmarks, unit rhs’s

LASPACK: ICCG with ILU(0)

MATLAB: ICCG with ILUT
  Approx. Min. Degree ordering
  Tuned to similar factorization size

Same accuracy: $\|b - A x\|_2 / \|b\|_2 < 10^{-6}$

Complexity metric: # double-precision multiplications
  Solving stage only

Comparison

Matrix                                m4      m5      ibm17   ibm18
Dimension                             4.6e5   8.8e5   2.5e5   2.7e5
# Entries                             8.2e6   9.4e6   1.8e6   1.7e6
ICCG with ILU(0)    mul. per iter.    1.9e7   2.3e7   4.8e6   4.8e6
                    # iterations      110     159     137     191
                    total # mul.      2.1e9   3.7e9   6.6e8   9.1e8
ICCG with ILUT      mul. per iter.    2.5e7   3.9e7   8.5e6   8.0e6
                    # iterations      55      82      47      65
                    total # mul.      1.4e9   3.2e9   4.0e8   5.2e8
Hybrid RW+CG        mul. per iter.    2.5e7   3.7e7   8.6e6   8.5e6
                    # iterations      13      12      20      23
                    total # mul.      3.3e8   4.4e8   1.7e8   2.0e8
Speedup over ICCG-ILU(0)              6.3     8.4     3.8     4.7
Speedup over ICCG-ILUT                4.2     7.1     2.3     2.7

Physical runtimes on P4-2.8GHz

Ckt   Dimension   # Entries   Precond. runtime (s)   Solving runtime (s)
m1    2.7e5       3.2e6       20.77                  6.22
m2    4.3e5       5.2e6       33.00                  11.90
m3    3.5e5       5.5e6       21.67                  9.73
m4    4.6e5       8.2e6       46.91                  17.07
m5    8.8e5       9.4e6       68.90                  26.09

Reasonable preconditioning overhead
  Between about 2.2X and 3.3X the solving time on these circuits
  One-time cost, amortized over multiple solves

Scaling trend

Newer results

Examples generated from Sparskit
  Finite difference discretization of 3D Laplace’s equation under Dirichlet boundary conditions

S: #nonzeros in preconditioner (similar in all cases)
C1: Condition # of original matrix
C2: Condition # after split preconditioning

[Results shown for examples Ex1 to Ex6]

Conclusion

Direct solver
  Locality property
  Reasonable for approximate solutions

Hybrid solver
  Stochastically preconditioned iterative solver
  Independent row/column estimates for LU factors
  Better quality than same-size traditional ILU

Extension to non-diagonally dominant matrices?
  Some pointers exist (see Haifeng Qian’s thesis)
  Needs further work…

Thank you!

Downloads
  Solver package: http://www.ece.umn.edu/users/qianhf/hybridsolver
  Thesis: http://www-mount.ece.umn.edu/~sachin/Theses/HaifengQian.pdf

Thanks also (and especially) to…

Haifeng Qian

(He will pick up the 2006 ACM Outstanding Dissertation Award in Electronic Design Automation in San Diego next week)