MATEX: A Distributed Framework of Transient Simulation for Power Distribution Networks

Hao Zhuang 1*, Shih-Hung Weng 2, Jeng-Hau Lin 1, Chung-Kuan Cheng 1

1. Computer Science & Engineering Dept., University of California, San Diego, CA
2. Facebook Inc., Menlo Park, CA

* Email: [email protected]

DAC14_MATEX_PowerDistributionNetworkSimulationSlides


DESCRIPTION

These are the draft slides we used for the DAC 2014 presentation. Abstract: We propose MATEX, a distributed framework for transient simulation of power distribution networks (PDNs). MATEX uses a matrix exponential kernel with Krylov subspace approximations to solve the differential equations of linear circuits. First, the whole simulation task is divided into subtasks based on decompositions of the current sources, in order to reduce computational overhead. These subtasks are then distributed to different computing nodes and processed in parallel. Within each node, after the matrix factorization at the beginning of the simulation, the adaptive time-stepping solver runs without extra matrix re-factorizations. MATEX overcomes the stiffness limitation of previous matrix exponential-based circuit simulators by using the rational Krylov subspace method, which allows larger step sizes with smaller Krylov subspace dimensions and greatly accelerates the whole computation. MATEX outperforms both traditional fixed and adaptive time-stepping methods, e.g., achieving around 13X speedup over the trapezoidal framework with a fixed time step on the IBM power grid benchmarks.



Page 2: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

Outline

• Problem Formulation
• MATEX Framework
  • Circuit Solver
    • Matrix Exponential Kernel
    • Krylov Subspace Accelerations for PDNs
  • Distributed Framework
    • Linear System's Superposition Property and Parallel Processing
    • Reduce Krylov Subspace Computations
• Experimental Results
• Conclusions

Page 3: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

Problem Formulation for PDN Transient Simulation

[Figures: PDN structure; RLC model]

Linear differential equations:

$$\mathbf{C}\dot{\mathbf{x}}(t) = -\mathbf{G}\mathbf{x}(t) + \mathbf{B}\mathbf{u}(t)$$

• $\mathbf{C}$: capacitance/inductance matrix
• $\mathbf{G}$: conductance matrix
• $\mathbf{x}(t)$: voltage/current vector
• $\mathbf{B}$: input selection matrix
• $\mathbf{u}(t)$: input current sources (vector)

Tens of millions or billions of unknowns

Page 4: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

Previous Work

Low order approximations, e.g. Trapezoidal method (TR):

$$\left(\frac{\mathbf{C}}{h} + \frac{\mathbf{G}}{2}\right)\mathbf{x}(t+h) = \left(\frac{\mathbf{C}}{h} - \frac{\mathbf{G}}{2}\right)\mathbf{x}(t) + \mathbf{B}\,\frac{\mathbf{u}(t+h) + \mathbf{u}(t)}{2}$$

TR with fixed time step $h$ was used by the top solvers in the TAU'12 power grid (PG) simulation contest:

• Efficient for IBM PG Benchmarks
• Only one matrix factorization for transient stepping
• Process forward and backward substitutions to calculate $\mathbf{x}(t+h)$

Time step size $h$ is determined by:

• Input transition distances, which define the upper bound of the time step, e.g. $h_2 = \min(h_1, h_2, h_3)$
• Stiffness of systems
• Local truncation error (LTE)

[Figure: a pulse input example with transition distances $h_1$, $h_2$, $h_3$]
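The fixed-step TR scheme can be sketched in a few lines. This is a minimal illustration, not the contest solvers' code: the function name `tr_fixed_step`, the dense 2-node RC values, and the constant source are assumptions chosen only to make the example runnable. It shows the point made on this slide, that $(\mathbf{C}/h + \mathbf{G}/2)$ is factorized once and every step is a forward and backward substitution.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def tr_fixed_step(C, G, B, u, h, n_steps, x0):
    """Trapezoidal stepping with a single LU factorization of (C/h + G/2)."""
    lhs = lu_factor(C / h + G / 2.0)       # factorized once, reused at every step
    rhs_mat = C / h - G / 2.0
    x = x0.copy()
    out = [x.copy()]
    for k in range(n_steps):
        t = k * h
        rhs = rhs_mat @ x + B @ (u(t + h) + u(t)) / 2.0
        x = lu_solve(lhs, rhs)             # forward and backward substitution only
        out.append(x.copy())
    return np.array(out)

# Hypothetical 2-node RC circuit driven by one constant current source.
C = np.diag([1.0, 2.0])
G = np.array([[3.0, -1.0], [-1.0, 2.0]])
B = np.array([[1.0], [0.0]])
u = lambda t: np.array([1.0])
xs = tr_fixed_step(C, G, B, u, h=1e-3, n_steps=1000, x0=np.zeros(2))
```

Changing `h` in this sketch would change `C / h + G / 2.0` and force a new factorization, which is why fixed stepping is attractive for TR.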

Page 5: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

Our Matrix Exponential Method

Analytical solution [Weng et al., IEEE TCAD 2012]:

$$\mathbf{x}(t+h) = e^{h\mathbf{A}}\mathbf{x}(t) + \int_0^h e^{(h-\tau)\mathbf{A}}\,\mathbf{b}(t+\tau)\,d\tau$$

where $\mathbf{A} = -\mathbf{C}^{-1}\mathbf{G}$ and $\mathbf{b}(t) = \mathbf{C}^{-1}\mathbf{B}\mathbf{u}(t)$.

Input sources are piecewise linear (PWL):

$$\mathbf{x}(t+h) = e^{h\mathbf{A}}\left(\mathbf{x}(t) + \mathbf{F}(t,h)\right) - \mathbf{P}(t,h)$$

where

$$\mathbf{F}(t,h) = \mathbf{A}^{-1}\mathbf{b}(t) + \mathbf{A}^{-2}\,\frac{\mathbf{b}(t+h) - \mathbf{b}(t)}{h}, \qquad \mathbf{P}(t,h) = \mathbf{A}^{-1}\mathbf{b}(t+h) + \mathbf{A}^{-2}\,\frac{\mathbf{b}(t+h) - \mathbf{b}(t)}{h}$$

Here $e^{h\mathbf{A}}$ is a matrix exponential, while $\mathbf{x}(t) + \mathbf{F}(t,h)$ and $\mathbf{P}(t,h)$ are vectors.
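The PWL update can be checked on a toy system. The sketch below is an illustration under stated assumptions: `expm_pwl_step`, the 2-state matrix `A`, and the ramp input are hypothetical, and dense `expm` and `inv` calls are only sensible at toy sizes. Because the formula is exact for a linear input, one large step and ten small steps over the same ramp should agree to rounding error.

```python
import numpy as np
from scipy.linalg import expm

def expm_pwl_step(A, x, b_t, b_th, h):
    """One step x(t) -> x(t+h) for x' = A x + b(t), with b linear on [t, t+h]."""
    A_inv = np.linalg.inv(A)                    # dense inverse: toy sizes only
    slope = (b_th - b_t) / h
    F = A_inv @ b_t + A_inv @ A_inv @ slope     # F(t, h)
    P = A_inv @ b_th + A_inv @ A_inv @ slope    # P(t, h)
    return expm(h * A) @ (x + F) - P

# Hypothetical stable 2-state system with a ramp input b(t) = b0 + s * t.
A = np.array([[-3.0, 1.0], [0.5, -1.0]])
b0, s = np.array([1.0, 0.0]), np.array([2.0, -1.0])
b = lambda t: b0 + s * t
x0 = np.array([0.2, -0.1])

one_big_step = expm_pwl_step(A, x0, b(0.0), b(1.0), 1.0)

x = x0
for k in range(10):                             # ten small steps over the same ramp
    x = expm_pwl_step(A, x, b(0.1 * k), b(0.1 * (k + 1)), 0.1)
```

Unlike a low order scheme, the step size here affects cost but not accuracy, as long as the input stays linear inside the step.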

Page 6: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

Advantage in Accuracy

[Figure: comparison against the reference solution]

With the same $h$, the matrix exponential method reaches the reference solution, while Backward Euler does not.
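A scalar case makes the accuracy gap concrete. This worked example is not from the slides; the test equation $\dot{x} = -\lambda x$ with no input and the choice $\lambda h = 1$ are assumptions made for illustration.

```latex
\dot{x} = -\lambda x, \quad \lambda > 0
\qquad\Longrightarrow\qquad
x_{\mathrm{BE}}(t+h) = \frac{x(t)}{1 + \lambda h},
\qquad
x_{\exp}(t+h) = e^{-\lambda h}\, x(t).

\text{For } \lambda h = 1:\quad
x_{\mathrm{BE}}(t+h) = 0.5\, x(t),
\qquad
x_{\exp}(t+h) = e^{-1} x(t) \approx 0.368\, x(t).
```

The matrix exponential update reproduces the exact decay for any $h$, while the Backward Euler result approaches it only as $\lambda h \to 0$.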

Page 7: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

Not $e^{\mathbf{A}}$, but $e^{\mathbf{A}}\mathbf{v}$ [Weng et al., IEEE TCAD 2012]

Computing $e^{\mathbf{A}}$ is very expensive when $\mathbf{A}$ is large!

$e^{\mathbf{A}}\mathbf{v}$: Matrix Exponential and Vector Product (MEVP), efficiently approximated via Krylov subspace (MEXP)

Standard Krylov subspace:

$$K_m(\mathbf{A}, \mathbf{v}) = \mathrm{span}\{\mathbf{v}, \mathbf{A}\mathbf{v}, \mathbf{A}^2\mathbf{v}, \ldots, \mathbf{A}^{m-1}\mathbf{v}\}$$

Basis generation: $\mathbf{V}_m = [\mathbf{v}_1, \mathbf{v}_2, \cdots, \mathbf{v}_m]$

Arnoldi process and matrix reduction:

$$\mathbf{A}\mathbf{V}_m = \mathbf{V}_m\mathbf{H}_m + h_{m+1,m}\,\mathbf{v}_{m+1}\mathbf{e}_m^{T}$$

MEVP is computed by

$$e^{\mathbf{A}}\mathbf{v} \approx \|\mathbf{v}\|_2\,\mathbf{V}_m e^{\mathbf{H}_m}\mathbf{e}_1$$

Time stepping only by scaling $h$:

$$e^{h\mathbf{A}}\mathbf{v} \approx \|\mathbf{v}\|_2\,\mathbf{V}_m e^{h\mathbf{H}_m}\mathbf{e}_1$$
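The Arnoldi-based MEVP above can be sketched as follows. This is an illustrative implementation, not the authors' MATLAB code: the function names `arnoldi` and `mevp`, the random symmetric test matrix, and the choice `m=30` are assumptions. The basis is built once and then reused for two step sizes by scaling $h$ in the small exponential.

```python
import numpy as np
from scipy.linalg import expm

def arnoldi(matvec, v, m):
    """Build an orthonormal basis V_m of K_m(A, v) and the Hessenberg matrix H_m."""
    n = v.size
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    beta = np.linalg.norm(v)
    V[:, 0] = v / beta
    for j in range(m):
        w = matvec(V[:, j])
        for i in range(j + 1):                  # modified Gram-Schmidt
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-12:                 # invariant subspace found
            return V[:, : j + 1], H[: j + 1, : j + 1], beta
        V[:, j + 1] = w / H[j + 1, j]
    return V[:, :m], H[:m, :m], beta

def mevp(V, H, beta, h):
    """Approximate e^{hA} v as ||v||_2 V_m e^{h H_m} e_1."""
    e1 = np.zeros(H.shape[0])
    e1[0] = 1.0
    return beta * (V @ (expm(h * H) @ e1))

# Hypothetical non-stiff, stable test matrix.
rng = np.random.default_rng(0)
n = 50
M = rng.standard_normal((n, n))
A = -(M @ M.T) / n - np.eye(n)
v = rng.standard_normal(n)

V, H, beta = arnoldi(lambda y: A @ y, v, m=30)
approx_h1 = mevp(V, H, beta, 0.1)               # same basis, two step sizes
approx_h2 = mevp(V, H, beta, 0.5)
```

Only the small matrix `H` is exponentiated, so the cost of changing `h` is independent of the circuit size.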

Page 8: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

Algorithm of Computing $\mathbf{x}(t+h)$

| Method | $\mathbf{X}_1$ | $\mathbf{X}_2$ | $\mathbf{L}, \mathbf{U}$ |
|---|---|---|---|
| MEXP | $\mathbf{C}$ | $\mathbf{G}$ | $\mathrm{lu\_decomp}(\mathbf{X}_1)$ |

A PDN is a linear system, so the input matrices $\mathbf{X}_1$, $\mathbf{L}$, $\mathbf{U}$ do not change. $\mathrm{lu\_decomp}(\mathbf{X}_1)$ is done only once for the whole simulation.

Page 9: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

Problem #1: Stiff PDN Circuits

PDNs are usually highly stiff circuits:

• Generalized eigenvalues spread over a wide range within the spectrum of $\mathbf{A}$ ($\mathbf{A} = -\mathbf{C}^{-1}\mathbf{G}$).
• The standard Krylov subspace needs a very large number of basis vectors to approximate MEVP.


Page 11: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

Standard Krylov subspace (MEXP)

(a) Standard Krylov basis (MEXP):

$$K_m(\mathbf{A}, \mathbf{v}) = \mathrm{span}\{\mathbf{v}, \mathbf{A}\mathbf{v}, \mathbf{A}^2\mathbf{v}, \ldots, \mathbf{A}^{m-1}\mathbf{v}\}, \qquad \mathbf{A} = -\mathbf{C}^{-1}\mathbf{G}$$

[Figure (a): eigenvalues of $\mathbf{A}$ in the complex plane (axes Re, Im). Legend: eigenvalues of $\mathbf{A}$ with small-magnitude real components; eigenvalues of $\mathbf{A}$ with large-magnitude real components.]

Page 12: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

Standard Krylov subspace (MEXP)

(a) Standard Krylov basis (MEXP):

$$K_m(\mathbf{A}, \mathbf{v}) = \mathrm{span}\{\mathbf{v}, \mathbf{A}\mathbf{v}, \mathbf{A}^2\mathbf{v}, \ldots, \mathbf{A}^{m-1}\mathbf{v}\}$$

[Figure (a): eigenvalues of $\mathbf{A}$ in the complex plane (axes Re, Im). Legend: eigenvalues of $\mathbf{A}$ with small-magnitude real components; eigenvalues of $\mathbf{A}$ with large-magnitude real components.]

• The eigenvalues with large magnitude correspond to the fast modes of the circuit's dynamical behavior.
• The standard Krylov basis tends to capture these large-magnitude eigenvalues.

Page 13: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

Standard Krylov subspace (MEXP)

(a) Standard Krylov basis (MEXP):

$$K_m(\mathbf{A}, \mathbf{v}) = \mathrm{span}\{\mathbf{v}, \mathbf{A}\mathbf{v}, \mathbf{A}^2\mathbf{v}, \ldots, \mathbf{A}^{m-1}\mathbf{v}\}$$

[Figure (a): eigenvalues of $\mathbf{A}$ in the complex plane (axes Re, Im). Legend: eigenvalues of $\mathbf{A}$ with small-magnitude real components; eigenvalues of $\mathbf{A}$ with large-magnitude real components.]

• The eigenvalues with small-magnitude real components define the major dynamical behavior of circuits.
• More basis vectors are needed in order to characterize these eigenvalues.

Page 14: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

Inverted Krylov subspace (I-MATEX)

(a) Standard Krylov basis (MEXP):

$$K_m(\mathbf{A}, \mathbf{v}) = \mathrm{span}\{\mathbf{v}, \mathbf{A}\mathbf{v}, \mathbf{A}^2\mathbf{v}, \ldots, \mathbf{A}^{m-1}\mathbf{v}\}$$

(b) Inverted Krylov basis (I-MATEX):

$$K_m(\mathbf{A}^{-1}, \mathbf{v}) = \mathrm{span}\{\mathbf{v}, \mathbf{A}^{-1}\mathbf{v}, \mathbf{A}^{-2}\mathbf{v}, \ldots, \mathbf{A}^{-m+1}\mathbf{v}\}$$

[Figure, panels (a) and (b): eigenvalues of $\mathbf{A}$ in the complex plane (axes Re, Im). Legend: eigenvalues of $\mathbf{A}$ with small-magnitude real components; eigenvalues of $\mathbf{A}$ with large-magnitude real components.]

Page 15: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

Inverted Krylov subspace (I-MATEX)

(a) Standard Krylov basis (MEXP):

$$K_m(\mathbf{A}, \mathbf{v}) = \mathrm{span}\{\mathbf{v}, \mathbf{A}\mathbf{v}, \mathbf{A}^2\mathbf{v}, \ldots, \mathbf{A}^{m-1}\mathbf{v}\}$$

(b) Inverted Krylov basis (I-MATEX):

$$K_m(\mathbf{A}^{-1}, \mathbf{v}) = \mathrm{span}\{\mathbf{v}, \mathbf{A}^{-1}\mathbf{v}, \mathbf{A}^{-2}\mathbf{v}, \ldots, \mathbf{A}^{-m+1}\mathbf{v}\}$$

[Figure, panels (a) and (b): eigenvalues of $\mathbf{A}$ in the complex plane (axes Re, Im). Legend: eigenvalues of $\mathbf{A}$ with small-magnitude real components; eigenvalues of $\mathbf{A}$ with large-magnitude real components.]

The inverted Krylov subspace is more likely to capture these "important" eigenvalues.

Page 16: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

Rational Krylov subspace (R-MATEX)

(a) Standard Krylov basis (MEXP):

$$K_m(\mathbf{A}, \mathbf{v}) = \mathrm{span}\{\mathbf{v}, \mathbf{A}\mathbf{v}, \mathbf{A}^2\mathbf{v}, \ldots, \mathbf{A}^{m-1}\mathbf{v}\}$$

(c) Rational Krylov basis (R-MATEX):

$$K_m\left((\mathbf{I} - \gamma\mathbf{A})^{-1}, \mathbf{v}\right) = \mathrm{span}\{\mathbf{v}, (\mathbf{I} - \gamma\mathbf{A})^{-1}\mathbf{v}, (\mathbf{I} - \gamma\mathbf{A})^{-2}\mathbf{v}, \ldots, (\mathbf{I} - \gamma\mathbf{A})^{-m+1}\mathbf{v}\}$$

[Figure, panels (a) and (c): eigenvalues of $\mathbf{A}$ in the complex plane (axes Re, Im). Legend: eigenvalues of $\mathbf{A}$ with small-magnitude real components; eigenvalues of $\mathbf{A}$ with large-magnitude real components.]

• The rational Krylov subspace is still likely to capture these "important" eigenvalues.
• More robust numerical properties.

Page 17: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

Error trend of R-MATEX

• Reference: directly compute $e^{h\mathbf{A}}$
• Approximation: MEVP via R-MATEX

$$\mathrm{error} = \left| e^{h\mathbf{A}}\mathbf{v} - \|\mathbf{v}\|_2\,\mathbf{V}_m e^{h\mathbf{H}_m}\mathbf{e}_1 \right|$$

[Figure: error vs. $m$ vs. $h$]

Page 18: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

Same Algorithm with Different Input Matrices

| Method | $\mathbf{X}_1$ | $\mathbf{X}_2$ | $\mathbf{H}_m$ |
|---|---|---|---|
| MEXP | $\mathbf{C}$ | $\mathbf{G}$ | $\mathbf{H}_m$ |
| I-MATEX | $\mathbf{G}$ | $\mathbf{C}$ | $\mathbf{H}'^{-1}_m$ |
| R-MATEX | $\mathbf{C} + \gamma\mathbf{G}$ | $\mathbf{C}$ | $(\mathbf{I} - \tilde{\mathbf{H}}_m^{-1})/\gamma$ |

Still only one $\mathbf{L}, \mathbf{U} = \mathrm{lu\_decomp}(\mathbf{X}_1)$
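The table can be read as one Arnoldi routine with three input choices. The sketch below is an interpretation under assumptions, not the authors' algorithm listing: the function `krylov_mevp`, the `sign` handling (MEXP and I-MATEX iterate with $-\mathbf{X}_1^{-1}\mathbf{X}_2$, R-MATEX with $+\mathbf{X}_1^{-1}\mathbf{X}_2$), the toy matrices with $\mathbf{C} = \mathbf{I}$, and the value `gamma=0.1` are all illustrative choices. It uses the identity $(\mathbf{I} - \gamma\mathbf{A})^{-1} = (\mathbf{C} + \gamma\mathbf{G})^{-1}\mathbf{C}$ for $\mathbf{A} = -\mathbf{C}^{-1}\mathbf{G}$.

```python
import numpy as np
from scipy.linalg import expm, lu_factor, lu_solve

def krylov_mevp(C, G, v, h, m, method="R-MATEX", gamma=0.1):
    """Approximate e^{hA} v, A = -C^{-1} G, with a single LU factorization of X1."""
    if method == "MEXP":          # Arnoldi operator: A = -C^{-1} G
        X1, X2, sign = C, G, -1.0
    elif method == "I-MATEX":     # Arnoldi operator: A^{-1} = -G^{-1} C
        X1, X2, sign = G, C, -1.0
    else:                         # R-MATEX operator: (I - gamma A)^{-1} = (C + gamma G)^{-1} C
        X1, X2, sign = C + gamma * G, C, 1.0
    lu = lu_factor(X1)            # the only factorization in the whole routine
    n = v.size
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    beta = np.linalg.norm(v)
    V[:, 0] = v / beta
    k = m
    for j in range(m):
        w = sign * lu_solve(lu, X2 @ V[:, j])
        for i in range(j + 1):    # modified Gram-Schmidt
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-12:   # invariant subspace found
            k = j + 1
            break
        V[:, j + 1] = w / H[j + 1, j]
    Vk, Hk = V[:, :k], H[:k, :k]
    if method == "MEXP":          # map the reduced matrix back to an approximation of A
        Hm = Hk
    elif method == "I-MATEX":
        Hm = np.linalg.inv(Hk)
    else:
        Hm = (np.eye(k) - np.linalg.inv(Hk)) / gamma
    e1 = np.zeros(k)
    e1[0] = 1.0
    return beta * (Vk @ (expm(h * Hm) @ e1))

# Hypothetical stiff toy problem: C = I, G symmetric with eigenvalues from 1 to 1e5.
rng = np.random.default_rng(1)
n = 40
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
lam = np.logspace(0, 5, n)
G = (Q * lam) @ Q.T
C = np.eye(n)
v = rng.standard_normal(n)
h = 1.0
exact = Q @ (np.exp(-h * lam) * (Q.T @ v))    # reference from the known eigendecomposition

errors = {}
for method in ("MEXP", "I-MATEX", "R-MATEX"):
    approx = krylov_mevp(C, G, v, h, m=20, method=method)
    errors[method] = np.linalg.norm(approx - exact) / np.linalg.norm(exact)
```

The `errors` dictionary lets the three variants be compared at the same subspace dimension. On a stiff problem like this one, the rational variant would be expected to need far fewer basis vectors than the standard one, in line with the slides' argument.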

Page 19: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

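Test cases: RC circuits with different stiffness

• $m_a$: average dimension of Krylov subspace ($\mathbf{V}_m$, $\mathbf{H}_m$)
• $m_p$: peak dimension of Krylov subspace ($\mathbf{V}_m$, $\mathbf{H}_m$)
• Err(%): relative error compared to the reference solution
• Speedups are brought by Krylov subspace reduction

Stiffness: $\dfrac{|\mathrm{Re}\{\lambda_{\min}(\mathbf{A})\}|}{|\mathrm{Re}\{\lambda_{\max}(\mathbf{A})\}|}$

| Stiffness | Method | $m_a$ | $m_p$ | Err(%) | Speedup/MEXP |
|---|---|---|---|---|---|
| $2.1\times10^{16}$ | MEXP | 211.4 | 229 | 0.510 | 1X |
| $2.1\times10^{16}$ | I-MATEX | 5.7 | 14 | 0.004 | 2616X |
| $2.1\times10^{16}$ | R-MATEX | 6.9 | 12 | 0.004 | 2735X |
| $2.1\times10^{12}$ | MEXP | 154.2 | 224 | 0.004 | 1X |
| $2.1\times10^{12}$ | I-MATEX | 5.7 | 14 | 0.004 | 583X |
| $2.1\times10^{12}$ | R-MATEX | 6.9 | 12 | 0.004 | 611X |
| $2.1\times10^{8}$ | MEXP | 148.6 | 223 | 0.004 | 1X |
| $2.1\times10^{8}$ | I-MATEX | 5.7 | 14 | 0.004 | 229X |
| $2.1\times10^{8}$ | R-MATEX | 6.9 | 12 | 0.004 | 252X |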

Page 20: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

Problem #2: Initial Vector Change

MEVP $= e^{\mathbf{A}}\mathbf{v}$, where $\mathbf{v}$ is the initial vector of $K_m\left((\mathbf{I} - \gamma\mathbf{A})^{-1}, \mathbf{v}\right)$.

Once $\mathbf{v}$ changes, we need to recompute $K_m$ for MEVP.

Page 21: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

Problem #2: Initial Vector Change

MEVP $= e^{\mathbf{A}}\mathbf{v}$, where $\mathbf{v}$ is the initial vector of $K_m\left((\mathbf{I} - \gamma\mathbf{A})^{-1}, \mathbf{v}\right)$. Once $\mathbf{v}$ changes, we need to recompute $K_m$ for MEVP.

In the circuit solver,

$$\mathbf{x}(t+h) = e^{h\mathbf{A}}\left(\mathbf{x}(t) + \mathbf{F}(t,h)\right) - \mathbf{P}(t,h)$$

where

$$\mathbf{F}(t,h) = \mathbf{A}^{-1}\mathbf{b}(t) + \mathbf{A}^{-2}\,\frac{\mathbf{b}(t+h) - \mathbf{b}(t)}{h}$$

The initial vector is $\mathbf{v} = \mathbf{x}(t) + \mathbf{F}(t,h)$. It changes when the input sources cannot keep the previous trend.

Page 22: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

Problem #2: Initial Vector Change

MEVP $= e^{\mathbf{A}}\mathbf{v}$, where $\mathbf{v}$ is the initial vector of $K_m\left((\mathbf{I} - \gamma\mathbf{A})^{-1}, \mathbf{v}\right)$. Once $\mathbf{v}$ changes, we need to recompute $K_m$ for MEVP.

$$\mathbf{F}(t,h) = \mathbf{A}^{-1}\mathbf{b}(t) + \mathbf{A}^{-2}\,\frac{\mathbf{b}(t+h) - \mathbf{b}(t)}{h}$$

The initial vector changes when the input sources cannot keep the previous trend.

[Figure: a pulse input example. The dashed lines mark the places where the initial vector changes, called "transition spots".]

Page 23: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

Problem #2: Initial Vector Change

MEVP $= e^{\mathbf{A}}\mathbf{v}$, where $\mathbf{v}$ is the initial vector of $K_m\left((\mathbf{I} - \gamma\mathbf{A})^{-1}, \mathbf{v}\right)$. Once $\mathbf{v}$ changes, we need to recompute $K_m$ for MEVP.

In the circuit solver,

$$\mathbf{x}(t+h) = e^{h\mathbf{A}}\left(\mathbf{x}(t) + \mathbf{F}(t,h)\right) - \mathbf{P}(t,h)$$

where

$$\mathbf{F}(t,h) = \mathbf{A}^{-1}\mathbf{b}(t) + \mathbf{A}^{-2}\,\frac{\mathbf{b}(t+h) - \mathbf{b}(t)}{h}$$

The initial vector is $\mathbf{v} = \mathbf{x}(t) + \mathbf{F}(t,h)$. It changes when the input sources cannot keep the previous trend.

The many input current sources in a PDN make the initial vector change frequently, which triggers Krylov subspace generations and consumes runtime (the trouble maker).


Page 25: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

Input sources, the trouble maker

A PDN with three input current sources.


Page 27: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

Input sources, the trouble maker

Some definitions

• Local Transition Spot (LTS): for one input source, its transition spots.
• Global Transition Spot (GTS): the union of all LTS.
• Snapshot: for one input source, a spot in GTS but not in its LTS.

[Figure: a PDN with three input current sources.]

Page 28: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

Input sources, the trouble maker

Some definitions

• Local Transition Spot (LTS): for one input source, its transition spots.
• Global Transition Spot (GTS): the union of all LTS.
• Snapshot: for one input source, a spot in GTS but not in its LTS.

[Figure: a PDN with three input current sources.]

When the circuit is simulated with all input sources as a whole, every GTS triggers a Krylov subspace generation.

Page 29: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

Input sources, the trouble maker

How about simulating the circuit with each individual source, then summing the results later by superposition?

[Figure: a PDN with three input current sources.]

Some definitions

• Local Transition Spot (LTS): for one input source, its transition spots.
• Global Transition Spot (GTS): the union of all LTS.
• Snapshot: for one input source, a spot in GTS but not in its LTS.
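The three definitions translate directly into set operations. The sketch below is a small illustration; the function name `transition_spots`, the source names, and the breakpoint times (in picoseconds) are hypothetical and do not come from the figure.

```python
def transition_spots(sources):
    """Return (LTS per source, GTS, snapshots per source) from PWL breakpoint times."""
    lts = {name: sorted(set(times)) for name, times in sources.items()}
    gts = sorted(set().union(*lts.values()))          # union of all LTS
    snapshots = {}
    for name, spots in lts.items():
        own = set(spots)
        snapshots[name] = [t for t in gts if t not in own]   # in GTS but not in this LTS
    return lts, gts, snapshots

# Hypothetical breakpoints (in ps) for three input current sources.
sources = {
    "i1": [0, 10, 20, 30],
    "i2": [0, 15, 25, 30],
    "i3": [0, 10, 40],
}
lts, gts, snapshots = transition_spots(sources)
```

Each source has fewer LTS than there are GTS in this example, so simulating sources separately means fewer points at which that source's input trend changes.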

Page 30: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

Reduce the Krylov subspace generation chances and reuse subspace

• For one input source, the number of LTS is much smaller than the number of GTS.
• Meanwhile, the solution at each snapshot must be recorded for the later superposition.
• Compute the snapshots without extra Krylov subspace generations.

Page 31: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

Reduce the Krylov subspace generation chances and reuse subspace

Given a previous solution $\mathbf{x}(t)$

Page 32: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

Reduce the Krylov subspace generation chances and reuse subspace

To compute the solutions at the snapshots, $\mathbf{x}(t+h_1)$ and $\mathbf{x}(t+h_2)$, without Krylov subspace generations

[Figure: snapshots $\mathbf{x}(t+h_1)$ and $\mathbf{x}(t+h_2)$ at step sizes $h_1$ and $h_2$ from $t$]

Page 33: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

Reduce the Krylov subspace generation chances and reuse subspace

Generate $\mathbf{V}_m$ and $\mathbf{H}_m$ at $t$

Page 34: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

Reduce the Krylov subspace generation chances and reuse subspace

Use $\mathbf{V}_m$ and $\mathbf{H}_m$, scaling $h$ to $h_1$ and $h_2$ for MEVP, until reaching the next LTS:

$$\mathbf{x}(t+h_1) = \|\mathbf{v}\|_2\,\mathbf{V}_m e^{h_1\mathbf{H}_m}\mathbf{e}_1 - \mathbf{P}(t,h_1)$$

$$\mathbf{x}(t+h_2) = \|\mathbf{v}\|_2\,\mathbf{V}_m e^{h_2\mathbf{H}_m}\mathbf{e}_1 - \mathbf{P}(t,h_2)$$

No matrix factorizations during this adaptive stepping!
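The reuse works because, inside one linear segment of the input, the slope $(\mathbf{b}(t+h) - \mathbf{b}(t))/h$ is constant, so $\mathbf{F}(t,h)$ and the initial vector $\mathbf{v} = \mathbf{x}(t) + \mathbf{F}(t,h)$ do not depend on $h$. The sketch below illustrates this under assumptions: it uses a plain standard-Krylov `arnoldi` helper without breakdown handling on a small non-stiff matrix, whereas R-MATEX would use the rational basis, and all names and values are hypothetical.

```python
import numpy as np
from scipy.linalg import expm

def arnoldi(A, v, m):
    """Orthonormal basis V_m and Hessenberg H_m of K_m(A, v), plus ||v||_2."""
    n = v.size
    V = np.zeros((n, m))
    H = np.zeros((m, m))
    beta = np.linalg.norm(v)
    V[:, 0] = v / beta
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):                  # modified Gram-Schmidt
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        if j + 1 < m:
            H[j + 1, j] = np.linalg.norm(w)
            V[:, j + 1] = w / H[j + 1, j]
    return V, H, beta

# Hypothetical system; the input is linear on the segment: b(t + tau) = b0 + s * tau.
rng = np.random.default_rng(2)
n = 20
M = rng.standard_normal((n, n))
A = -(M @ M.T) / n - np.eye(n)
A_inv = np.linalg.inv(A)                        # dense inverse: toy sizes only
b0, s = rng.standard_normal(n), rng.standard_normal(n)
x_t = rng.standard_normal(n)

F = A_inv @ b0 + A_inv @ A_inv @ s              # independent of h within the segment
v = x_t + F
V, H, beta = arnoldi(A, v, 15)                  # generated once at t

def snapshot(h):
    """x(t + h) from the stored basis: ||v|| V_m e^{h H_m} e_1 - P(t, h)."""
    P = A_inv @ (b0 + s * h) + A_inv @ A_inv @ s
    e1 = np.zeros(H.shape[0])
    e1[0] = 1.0
    return beta * (V @ (expm(h * H) @ e1)) - P

x_h1 = snapshot(0.2)                            # two snapshots, no new basis
x_h2 = snapshot(0.7)
```

Each call to `snapshot` costs one small matrix exponential and one matrix-vector product with `V`, with no new Krylov basis.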

Page 35: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

MATEX's Distributed Framework

Page 36: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

More aggressive! Each computing node is responsible for one set of bumps.

36
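The superposition step can be illustrated on a toy system. The sketch below is an assumption-laden illustration rather than the MATEX framework itself: it uses dense `expm` stepping, and the helpers `simulate` and `make_source`, the 3-node matrix, and the two pulse-like sources are hypothetical. Each source group is simulated separately from a zero initial state on the GTS grid, and the summed results are compared with a simulation of all sources together.

```python
import numpy as np
from scipy.linalg import expm

def simulate(A, b_of_t, x0, times):
    """Exact stepping for x' = A x + b(t), with b piecewise linear between `times`."""
    A_inv = np.linalg.inv(A)                    # dense inverse: toy sizes only
    xs = [x0]
    for t0, t1 in zip(times[:-1], times[1:]):
        h = t1 - t0
        slope = (b_of_t(t1) - b_of_t(t0)) / h
        F = A_inv @ b_of_t(t0) + A_inv @ A_inv @ slope
        P = A_inv @ b_of_t(t1) + A_inv @ A_inv @ slope
        xs.append(expm(h * A) @ (xs[-1] + F) - P)
    return np.array(xs)

def make_source(node, times, values, n):
    """PWL source injected at `node` (hypothetical helper)."""
    def b(t):
        out = np.zeros(n)
        out[node] = np.interp(t, times, values)
        return out
    return b

A = np.array([[-2.0, 1.0, 0.0], [1.0, -3.0, 1.0], [0.0, 1.0, -2.0]])
groups = [
    make_source(0, [0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 1.0, 0.0], 3),
    make_source(2, [0.0, 1.5, 2.5, 3.0], [0.0, 0.5, 0.5, 0.0], 3),
]
gts = [0.0, 1.0, 1.5, 2.0, 2.5, 3.0]            # union of both sources' breakpoints
x0 = np.zeros(3)

whole = simulate(A, lambda t: sum(g(t) for g in groups), x0, gts)
summed = sum(simulate(A, g, x0, gts) for g in groups)   # one subtask per group
```

The per-group simulations share no data until the final sum, which matches the slides' remark that no synchronization is needed during the simulation.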

Page 37: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

Experimental Results

Test cases: IBM power grid benchmarks

• TR: Trapezoidal method with fixed time step
• MATEX: circuit solver uses R-MATEX

Environment:

• Linux workstations, Intel Core™ i7-4770 3.40 GHz processor, 32 GB memory
• Implemented in MATLAB 2013
• Easy to emulate a distributed environment (no synchronization during the simulation)

Page 38: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

Experimental Results

TR with h = 10 ps:

| Design | t1000 (s) | ttotal (s) |
|---|---|---|
| ibmpg1t | 5.94 | 6.20 |
| ibmpg2t | 26.98 | 28.61 |
| ibmpg3t | 245.92 | 272.47 |
| ibmpg4t | 329.36 | 368.55 |
| ibmpg5t | 408.78 | 428.43 |
| ibmpg6t | 542.04 | 567.38 |

MATEX:

| Design | # Grp | trmatex (s) | trtotal (s) | Avg Err. | Speedup t1000/trmatex | Speedup ttotal/trtotal |
|---|---|---|---|---|---|---|
| ibmpg1t | 100 | 0.50 | 0.85 | 2.5E-5 | 11.9X | 7.3X |
| ibmpg2t | 100 | 2.02 | 3.72 | 4.3E-5 | 13.4X | 7.7X |
| ibmpg3t | 100 | 20.15 | 45.77 | 3.7E-5 | 12.2X | 6.0X |
| ibmpg4t | 15 | 22.35 | 65.66 | 3.9E-5 | 14.7X | 5.6X |
| ibmpg5t | 100 | 35.67 | 54.21 | 1.1E-5 | 11.5X | 7.9X |
| ibmpg6t | 100 | 47.27 | 74.94 | 3.4E-5 | 11.5X | 7.6X |

• Avg Err.: average difference over all output nodes, compared to the solutions provided by the IBM Power Grid Benchmarks.
• Speedup t1000/trmatex: transient stepping runtime speedup of MATEX over TR.
• Speedup ttotal/trtotal: total simulation runtime speedup of MATEX over TR.
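The speedup columns follow from the runtimes by simple division, which a few lines can confirm. The dictionary names below are hypothetical; the numbers are the transient stepping runtimes from the tables on this slide.

```python
# Transient stepping runtimes in seconds, copied from the tables above.
tr_t1000 = {"ibmpg1t": 5.94, "ibmpg2t": 26.98, "ibmpg3t": 245.92,
            "ibmpg4t": 329.36, "ibmpg5t": 408.78, "ibmpg6t": 542.04}
matex_trmatex = {"ibmpg1t": 0.50, "ibmpg2t": 2.02, "ibmpg3t": 20.15,
                 "ibmpg4t": 22.35, "ibmpg5t": 35.67, "ibmpg6t": 47.27}

# Per-design speedup of MATEX over TR for transient stepping.
speedups = {name: tr_t1000[name] / matex_trmatex[name] for name in tr_t1000}
mean_speedup = sum(speedups.values()) / len(speedups)
```

The unweighted mean over the six designs comes out at roughly 12.5, consistent with the "around 13X" figure quoted for transient stepping.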


Page 41: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

Contributions

• A new time-integration kernel is applied with improved Krylov subspace-based MEVP approximations for PDNs.
  • Adaptive time stepping without matrix re-factorization during the transient (stepping) simulation.
  • This feature cannot be achieved by a low order approximation strategy, e.g., trapezoidal (TR), due to the explicitly embedded $h$ in $\left(\frac{\mathbf{C}}{h} + \frac{\mathbf{G}}{2}\right)$.
• Distributed computing framework
  • Decompose the simulation task based on LTS, then do superposition using GTS and snapshots to form the final solution.
  • Exploit the advantages of large time stepping, and also reduce and reuse Krylov subspaces.
• Results on IBM PG benchmarks
  • Compared to TR with a fixed time step (10 ps), the speedup of transient stepping is 13X on average.
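The re-factorization point can be stated side by side. The comparison below restates formulas from earlier slides; treating $\gamma$ as a constant that is fixed before the simulation is an assumption made here for illustration.

```latex
\text{TR:}\qquad
\Big(\tfrac{\mathbf{C}}{h} + \tfrac{\mathbf{G}}{2}\Big)\,\mathbf{x}(t+h) = \cdots
\quad\Rightarrow\quad \text{the factorized matrix depends on } h.

\text{R-MATEX:}\qquad
\mathbf{X}_1 = \mathbf{C} + \gamma\mathbf{G}, \qquad
\mathbf{x}(t+h) = \|\mathbf{v}\|_2\,\mathbf{V}_m e^{h\mathbf{H}_m}\mathbf{e}_1 - \mathbf{P}(t,h)
\quad\Rightarrow\quad h \text{ enters only through the small } m \times m \text{ exponential.}
```

With $\gamma$ held fixed, changing $h$ leaves $\mathbf{X}_1$ and its LU factors untouched, whereas any change of $h$ in TR calls for a new factorization.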

Page 42: DAC14_MATEX_PowerDistributionNetworkSimulationSlides

THANK YOU
