Function Level Parallelism Lead by Data Dependencies

Sean Rul, Hans Vandierendonck and Koen De Bosschere

Ghent University, ELIS-PARIS, Sint-Pietersnieuwstraat 41, 9000 Gent, Belgium

Sean Rul is supported by the Institute for the Promotion of Innovation through Science and Technology in Flanders

sean.rul@UGent.be http://www.elis.ugent.be/~srul

[Figure: speedup (axis from 0.0 to 4.0) for Compression, Decompression and Total; series: Original, Heterogeneous, Homogeneous]

Conclusion

Results

Problem

Applications

[Figure: functions m, f, h, g, i, l, j, k grouped into clusters, with intercluster and intracluster data streams; label: FPGA]

Method

Matching parallel constructs

Call Graph, Interprocedural Data Flow Graph, Data Sharing Graph

Abstracting profiled information

Parallelizing

Too much information

Profile

Sequential program

Multithreaded program

Besides parallelizing sequential programs:

•Hybrid hardware/software or embedded systems

•Data partitioning on Cell processor

Program Bzip2 (SPEC2000) with reference input

Executed on a quad Itanium® system

[Figure: graph of functions m, f, h, g, i, l, j, k annotated with # executions (x 10 to x 100) and % execution time (1% to 20%)]

[Figure: data sharing graph linking functions m, f, h, g, i, l, j, k to data structures ds1 to ds9 through Read and Write edges; data structures are marked as cluster private or cluster shared]

•New microprocessor generation: Increase in parallel computing power

•Sequential programs: Cannot exploit these resources

•Parallelizing by hand: Difficult and time consuming

•Let the compiler do it: Set up a framework for parallelism detection

•Call graph and interprocedural data flow graph are useful for detecting parallel constructs

•Data sharing graph reveals data affinity between functions

•Future work:
- Find new parallel constructs
- Investigate bidirectional data streams
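To make the notion of data affinity concrete, here is a minimal sketch of how a data sharing graph could be built from profiled accesses. It is not the authors' implementation: the trace format, the accesses in `TRACE`, and the helper names `build_data_sharing_graph` and `data_affinity` are all assumptions made for illustration, and the affinity score (the smaller of two functions' access counts on each shared data structure) is just one plausible choice.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical profile trace: (function, data structure, access kind).
# The names echo the poster's figure labels; the accesses are invented.
TRACE = [
    ("f", "ds1", "write"),
    ("g", "ds1", "read"),
    ("g", "ds2", "write"),
    ("h", "ds2", "read"),
    ("h", "ds2", "read"),
    ("i", "ds3", "write"),
    ("j", "ds3", "read"),
]


def build_data_sharing_graph(trace):
    """Return {data structure: Counter of (function, kind) access counts}."""
    graph = defaultdict(Counter)
    for func, ds, kind in trace:
        graph[ds][(func, kind)] += 1
    return graph


def data_affinity(graph):
    """Score each function pair by its accesses to common data structures.

    Assumed metric: for every data structure two functions both touch,
    add the smaller of their two access counts.
    """
    affinity = Counter()
    for accesses in graph.values():
        per_func = Counter()
        for (func, _kind), count in accesses.items():
            per_func[func] += count
        for a, b in combinations(sorted(per_func), 2):
            affinity[(a, b)] += min(per_func[a], per_func[b])
    return affinity


graph = build_data_sharing_graph(TRACE)
affinity = data_affinity(graph)
for pair, score in sorted(affinity.items()):
    print(pair, score)
```

In this toy trace, `f`, `g` and `h` are linked through `ds1` and `ds2`, while `i` and `j` only share `ds3`, so a clustering step would probably keep the two groups apart.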

Look for a balanced solution

Detect, for example, a data pipeline

Minimize communication between threads

Add synchronization and initialization code
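The last two steps can be illustrated with a two-stage data pipeline. This is a sketch under stated assumptions, not the code the framework generates: `stage_one`, `stage_two`, `run_pipeline` and the arithmetic inside the stages are hypothetical stand-ins for two functions that a detected pipeline would place in separate threads. A bounded queue carries the single data stream between the threads and provides the synchronization; creating the queue and the result buffer is the initialization code.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the data stream


def stage_one(items, out_q):
    """First pipeline stage: produce values for the next stage."""
    for item in items:
        out_q.put(item * 2)  # stand-in for the real work of stage one
    out_q.put(SENTINEL)


def stage_two(in_q, results):
    """Second pipeline stage: consume values until the stream ends."""
    while True:
        item = in_q.get()  # blocks until data is available
        if item is SENTINEL:
            break
        results.append(item + 1)  # stand-in for the real work of stage two


def run_pipeline(items):
    # Initialization code: the shared channel and the result buffer.
    channel = queue.Queue(maxsize=4)  # bounded, so stage one cannot run far ahead
    results = []
    t1 = threading.Thread(target=stage_one, args=(items, channel))
    t2 = threading.Thread(target=stage_two, args=(channel, results))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return results


print(run_pipeline([1, 2, 3]))
```

The example shows the structure only. It makes no claim about speedup, which depends on the language runtime and on how much work each stage does per item.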

Elliptic node: Data structure

Rectangular node: Function
