Calculating Mpi Pi

8/11/2019 Calculating Mpi Pi

1/13

Calculating !in Parallel

Using MPIAiichiro Nakano

Collaboratory for Advanced Computing & Simulations

Department of Computer Science

Department of Physics & Astronomy

Department of Chemical Engineering & Materials ScienceUniversity of Southern California

Email: [email protected]. Task decomposition (parallel programming= who does what)2. Scalability analysis


2/13

Integral Representation of

dx

4

1+ x2=

d"

cos2"

4

1+ tan2"

0

# / 4

$ =01

$ 4d"0# / 4

$ = #


3/13

Numerical Integration of

Integration

Discretization:

= 1/N: step = 1/NBINx

i= (i+0.5)

(i= 0,,N-1)

4

1+ x2dx

0

1" = #

4

1+ xi2

i=0

N"1

# $ % &

#include

#define NBIN 10000000

void main() {

int i; double step,x,sum=0.0,pi;

step = 1.0/NBIN;

for (i=0; i


4/13

Parallelization: Who Does What?

...for (i=myid; i


5/13

Parallel Running Timeglobal_pi.c: NBIN = 107, on hpc-login2

How Efficient Is the Parallel Program?

#PBS -l nodes=16:ppn=1,arch=x86_64

...

np=$(cat $PBS_NODEFILE | wc -l)

mpirun -np $np -machinefile $PBS_NODEFILE ./global_pi

mpirun -np 8 -machinefile $PBS_NODEFILE ./global_pi





6/13

Parallel Efficiency

Execution time:T(W,P)W: WorkloadP: Number of processors

Speed:

Speedup:

Efficiency:

How to scale WPwithP?

S(W,P) =W

T(W,P)

SP =S(WP,P)

S(W1,1)=

WPT(W1,1)

W1T(WP,P)

EP =SP

P=

WPT(W

1,1)PW1T(WP,P)


7/13

Fixed Problem-Size Scaling

WP= Wconstant (strong scaling)

Speedup:

Efficiency:

Amdahls law:f(= sequential fraction of the workload)limits the asymptotic speedup

SP =T(W,1)

T(W,P)

EP =T(W,1)

PT(W,P)

T(W,P) = fT(W,1) +(1" f)T(W,1)

P

"SP =T(W,1)

T(W,P)=

1

f + (1# f) /P

"SP # 1

fP#$( )


8/13

Isogranular Scaling

WP= Pw(weak scaling)w= constant workload per processor (granularity)

Speedup:

Efficiency:

SP =S(P w,P)

S(w,1)=P w /T(P w,P)

w /T(w,1)=

P T(w,1)

T(P w,P)

EP =SP

P=

T(w,1)

T(P w,P)


9/13

Analysis of Global_Pi Program

Workload#

Number of quadrature points,N (or NBIN in

the program)

Parallel execution time onPprocessors:

> Local computation#

N/P

> Butterfly computation/communication in global()#

logP

T(N,P) = Tcomp(N,P)+ Tglobal(P)

="N

P

+ #logP

for (i=myid; i


10/13


Speedup:

Efficiency:

SP =T(N,1)

T(N,P)=

"N

"N/P + #logP=

P

1+#

"

P logP

N

EP=

SP

P=

1

1+"

#

P logP

N

global_pi.c: N= 107, on hpc-login2

SP =T(N,1)

T(N,P)

T(N,P) vs. P

EP =T(N,1)

PT(N,P)


11/13


Speedup model:EP

=

SP

P=

1

1+ "#

P logP

N

global_pi.c: N= 107, on hpc-login2


12/13

Runtime Variance among Ranks


13/13

Isogranular Scaling

n=N/P= constant

Efficiency:

global_pi_iso.c: N/P= 107, on HPC

EP =T(n,1)

T(nP,P)=

an

"n + #logP=

1

1+#

"nlogP

T(P n

,P

) vs. P

EP =T

(n,1)

T(P n,P)

Documents

Calculating Mpi Pi