Collaboration of Parafrase-2 and NaraView for Effective Parallelization Supports

Purpose

Validation of NaraView as a parallelization support tool

Background

As parallelizing compilers develop, the need for parallelization support tools increases. We developed NaraView, which visualizes program information, as a parallelization support tool.

Contents

1. NaraView

2. Validation Method

3. Result of Parallelization

4. Validation

5. Conclusion

1. NaraView

Visualizes the internal representations of the parallelizing compiler Parafrase-2

Four types of view

1. Program Structure View
2. Source Code View
3. Hierarchical CFG View
4. Data Dependence View

They provide interactive parallelization between users and the compiler

Parafrase-2

Parafrase-2 is a parallelizing compiler that was developed at the University of Illinois.

Passfile: a file that specifies the program analyses and optimizations to apply

Program → Parafrase-2 → Parallelized Program

The user's strategy determines execution efficiency

Program Structure View

Legend: Loop, Expression, Condition, Expression (concurrent)

Data Dependence View

Legend: write access, read access, write & read access, data dependence
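The categories in this legend can be illustrated with a small Python sketch. It is not how Parafrase-2 or NaraView computes dependences: it compares variable names only, ignores array subscripts and loop-carried dependences, and the names `Stmt`, `access_kind`, `dependences`, and `LOOP_BODY` are hypothetical. The three statements model the loop body shown later for line-13.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stmt:
    """One statement, described only by the variable names it writes and reads."""
    name: str
    writes: frozenset
    reads: frozenset


def access_kind(stmt, var):
    """Classify how a statement touches a variable, using the legend's wording."""
    if var in stmt.writes and var in stmt.reads:
        return "write & read access"
    if var in stmt.writes:
        return "write access"
    if var in stmt.reads:
        return "read access"
    return "no access"


def dependences(stmts):
    """List (earlier, later, kind, variable) for statement pairs in program order.

    Name-based approximation: array subscripts and loop-carried
    dependences are deliberately ignored.
    """
    found = []
    for i, s1 in enumerate(stmts):
        for s2 in stmts[i + 1:]:
            for v in sorted(s1.writes & s2.reads):
                found.append((s1.name, s2.name, "flow", v))
            for v in sorted(s1.reads & s2.writes):
                found.append((s1.name, s2.name, "anti", v))
            for v in sorted(s1.writes & s2.writes):
                found.append((s1.name, s2.name, "output", v))
    return found


# Hypothetical model of the three statements in the loop at line-13.
LOOP_BODY = [
    Stmt("S1", frozenset({"w"}), frozenset()),           # w(j,1) = 0.d0
    Stmt("S2", frozenset({"e"}), frozenset({"a"})),      # e(j) = a(k,j)
    Stmt("S3", frozenset({"s"}), frozenset({"e", "s"})), # s = e(j)*e(j) + s
]

for dep in dependences(LOOP_BODY):
    print(dep)
```

In this model the only dependence between distinct statements is the flow of `e` from S2 to S3, and `s` is both written and read by S3.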

2. Method of Validation

Parallelize a real application with Parafrase-2 and NaraView in three different ways

Application Program

Extended Huckel calculation program

3 Ways of Parallelization

1. Use only the default passfile
2. Use a modified passfile
3. Use a modified passfile and hand optimize

Finally, compare the three results to validate the approach

Extended Huckel calculation program

Optimizes molecular geometry parameters. It consists of 8 subroutines.

Simple calculation: clean.f, sortds.f, sortdk.f, matrix.f, matpr.f, smplhk.f, srtvdd.f

Main calculation: hoqrv2.f

3. Result of Parallelization

Case1: Use only default passfile

Result (Program Structure View)

Some non-parallelized loops remain

To Eliminate Data Dependence

Add appropriate code transformations to the default passfile

Code transformation: modifies internal representations to optimize program execution time

How to choose appropriate code transformations?

Observe the data dependences in the program in detail

The program structure view and data dependence view of NaraView are useful for observing them

Case2: Use a modified passfile

Source lines 13-16:

      DO 10 j = k + 1, n
        w(j,1) = 0.d0
        e(j) = a(k,j)
   10   s = e(j) * e(j) + s

Example (line-13)

Non-parallelized loop (program structure view)

Observe the data dependence view corresponding to this loop to find out how to parallelize it

Variables in the data dependence view: w, a, e, s

1. Loop Distribution
2. Scalar Expansion

Add the two transformations to the default passfile
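The effect of the two transformations on the loop at line-13 can be sketched in Python. This is an illustration under stated assumptions, not the output of Parafrase-2: the function names and the expanded array `s_x` are hypothetical, `w` stands in for the column `w(j,1)`, and `0 <= k <= n` is assumed. Index 0 of each list is unused so that `j` matches the 1-based Fortran indices.

```python
def original(a, k, n, s0=0.0):
    """Transcription of the loop at line-13: one loop, scalar accumulator s."""
    w = [1.0] * (n + 1)  # nonzero start so the zeroing is observable
    e = [0.0] * (n + 1)
    s = s0
    for j in range(k + 1, n + 1):
        w[j] = 0.0
        e[j] = a[k][j]
        s = e[j] * e[j] + s  # every iteration reads and writes the same s
    return w, e, s


def transformed(a, k, n, s0=0.0):
    """Same computation after loop distribution and scalar expansion."""
    w = [1.0] * (n + 1)
    e = [0.0] * (n + 1)

    # Loop distribution: the w statement gets its own loop, whose
    # iterations are independent of each other.
    for j in range(k + 1, n + 1):
        w[j] = 0.0

    # Scalar expansion: s becomes an array s_x with one element per
    # iteration, so iterations no longer share a single scalar.
    s_x = [0.0] * (n + 1)
    s_x[k] = s0
    for j in range(k + 1, n + 1):
        e[j] = a[k][j]
        s_x[j] = e[j] * e[j] + s_x[j - 1]

    return w, e, s_x[n]


if __name__ == "__main__":
    size = 7
    a = [[float((3 * i + 5 * j) % 7) for j in range(size)] for i in range(size)]
    print(original(a, 2, 6, 1.5) == transformed(a, 2, 6, 1.5))
```

In this sketch the expanded array still carries a recurrence through `s_x[j - 1]`, which matches the shape of the `cptmp_346` statement that appears in Case3.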

Result of Case2

Program Structure View

Two more loops are parallelized

Case3: Use a Modified Passfile and Hand Optimize

Source lines 21-24:

      DO 347 j = k + 1, n
        e(j) = a(k,j)
        cptmp_346(j) = e(j) * e(j)
     1                 + cptmp_346(j-1)
  347 CONTINUE

Example (line-21)

Display the corresponding data dependence view

      DO 347 j = k + 1, n
        e(j) = a(k,j)
  347 CONTINUE
      DO 390 j = k + 1, n
        cptmp_346(j) = e(j) * e(j)
     1                 + cptmp_346(j - 1)
  390 CONTINUE

Example of Modification (line-21)

After the modification, Parafrase-2 can parallelize this loop.
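The hand modification is a loop distribution, and a short Python sketch can show why it helps. The function names and the `reverse_first` parameter are hypothetical, the list `c` stands in for `cptmp_346`, and `0 <= k <= n` is assumed. That the first loop is the one Parafrase-2 parallelizes is an assumption of this sketch: the slide does not say which loop is meant, but the second loop still reads `c[j - 1]`.

```python
def fused(a, k, n):
    """Loop at line-21 before the hand modification: both statements in one loop."""
    e = [0.0] * (n + 1)
    c = [0.0] * (n + 1)  # stands in for cptmp_346
    for j in range(k + 1, n + 1):
        e[j] = a[k][j]
        c[j] = e[j] * e[j] + c[j - 1]
    return e, c


def distributed(a, k, n, reverse_first=False):
    """Same computation after distributing the loop by hand."""
    e = [0.0] * (n + 1)
    c = [0.0] * (n + 1)

    # First loop: each iteration touches only e[j] and a[k][j], so the
    # iteration order does not matter (reverse_first demonstrates this).
    indices = range(k + 1, n + 1)
    for j in (reversed(indices) if reverse_first else indices):
        e[j] = a[k][j]

    # Second loop: a recurrence on c[j - 1], so it must run in order.
    for j in range(k + 1, n + 1):
        c[j] = e[j] * e[j] + c[j - 1]
    return e, c


if __name__ == "__main__":
    size = 7
    a = [[float((3 * i + 5 * j) % 7) for j in range(size)] for i in range(size)]
    print(fused(a, 2, 6) == distributed(a, 2, 6))
    print(fused(a, 2, 6) == distributed(a, 2, 6, reverse_first=True))
```

Running the first loop in reverse order and getting the same result is a simple check that its iterations are independent, which is the property a parallelizing compiler needs.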

Result of Case3

Program Structure View

4. Validation

[Figure: Parallelization process at line-13 for Case1, Case2, and Case3]

Effectiveness of each method

[Table: The Number of Loop Iterations at line-13 and line-69 for case1, case2, and case3, given as expressions in n]

5. Conclusion

These results indicate that visualized program information is effective for parallelization support

Future Work

It is difficult to decide which transformations are appropriate for parallelization.

Propose candidate transformations

Select one of them using visualized program information