Enforcing Textual Alignment of Collectives using Dynamic Checks

Amir Kamil and Katherine Yelick
UC Berkeley Parallel Computing Laboratory
October 10, 2009

Enforcing Textual Alignment of Collectives using Dynamic ...akamil/papers/lcpc09talk.pdf · Synchronization analysis is critical to many analyses and optimizations ... Textual Collective


Overview

Synchronization analysis is critical to many analyses and optimizations
    Race detection, lock elimination, memory model enforcement
Collective operations are used by many parallel programs for communication/synchronization
    Operations that are executed collectively by all or a subset of all threads
Alignment restrictions on collectives affect programmability and analyzability
Dynamic alignment checking increases both

Amir Kamil, LCPC'09, Newark, DE

Titanium Language

Titanium is a high-performance dialect of Java
Partitioned global address space memory model
Uses single-program, multiple-data model of parallelism

[Figure: diagram labeled Program Start, Barrier, Exchange]

Collective Operations

Collectives used for synchronization and synchronous communication
    Barrier: all threads must reach it before any can proceed
    Broadcast: one-to-all communication
    Exchange: all-to-all communication
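The three collectives can be modeled in a few lines of Python. This is only a sketch of their semantics, not Titanium code: it assumes a hypothetical encoding in which `values[i]` is the value held by thread `i`, and the function names are illustrative.

```python
# Hypothetical model: values[i] is the value contributed by thread i.

def broadcast(values, root=0):
    """One-to-all: every thread receives the root thread's value."""
    return [values[root] for _ in values]

def exchange(values):
    """All-to-all: every thread receives a copy of every thread's value."""
    return [list(values) for _ in values]

def barrier(arrived):
    """A barrier releases only once every thread has arrived."""
    return all(arrived)

ids = [0, 1, 2, 3]
after_broadcast = broadcast(ids, root=0)   # each thread now holds thread 0's value
after_exchange = exchange(ids)             # each thread now holds all four values
```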

Collective Alignment

Many parallel languages make no attempt to ensure that collectives line up
Example code that is legal but will deadlock:

    if (Ti.thisProc() % 2 == 0)
        Ti.barrier(); // even ID threads
    else
        ; // odd ID threads
    int i = broadcast Ti.thisProc() from 0;

Textual Collective Alignment

Titanium has textual collectives: all threads must execute the same textual sequence of collectives
Stronger guarantee than structural correctness; this example is illegal:

    if (Ti.thisProc() % 2 == 0)
        Ti.barrier(); // even ID threads
    else
        Ti.barrier(); // odd ID threads

Language semantics statically guarantee textual alignment
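The difference between structural and textual alignment can be illustrated by comparing per-thread traces of collectives. The sketch below assumes a hypothetical trace encoding of `(operation, source_line)` pairs; the line numbers 2 and 4 are invented for the two branches of the example above.

```python
# Each thread's trace is a list of (operation, source_line) pairs (hypothetical encoding).

def structurally_aligned(traces):
    """True if every thread executes the same sequence of collective kinds."""
    kinds = [[op for op, _ in trace] for trace in traces]
    return all(k == kinds[0] for k in kinds)

def textually_aligned(traces):
    """True if every thread executes the same collectives at the same source locations."""
    return all(trace == traces[0] for trace in traces)

# Even threads reach the barrier in the then-branch, odd threads in the else-branch.
even = [("barrier", 2)]   # line 2 is an assumed location
odd = [("barrier", 4)]    # line 4 is an assumed location
```

Both threads execute one barrier, so the traces match structurally, but the barriers come from different places in the text, so the program is not textually aligned.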

Control Flow Restrictions

A statement may have global effects if it or any substatement is a collective operation or calls a method declared as global
Textual alignment requires the following:
    If any branch of a conditional may have global effects, then all threads must take the same branch
    If the body/test of a loop may have global effects, then all threads must execute the same number of iterations
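The "may have global effects" property is a recursive check over a statement and its substatements. A minimal sketch, assuming a hypothetical `Stmt` tree (the class, its fields, and the `kind` strings are illustrative and not part of the Titanium compiler):

```python
from dataclasses import dataclass, field

@dataclass
class Stmt:
    kind: str                     # e.g. "collective", "call", "if", "assign"
    calls_global: bool = False    # True if the statement calls a method declared global
    children: list = field(default_factory=list)

def may_have_global_effects(stmt):
    """A statement may have global effects if it or any substatement does."""
    if stmt.kind == "collective" or stmt.calls_global:
        return True
    return any(may_have_global_effects(child) for child in stmt.children)

def needs_same_branch(conditional):
    """All threads must take the same branch if any branch may have global effects."""
    return any(may_have_global_effects(branch) for branch in conditional.children)

with_barrier = Stmt("if", children=[Stmt("collective"), Stmt("assign")])
purely_local = Stmt("if", children=[Stmt("assign"), Stmt("assign")])
```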

Single-Valued Expressions

A single-valued expression has coherent values on all threads when evaluated
    Example: Ti.numProcs() > 1
All threads guaranteed to take the same branch of a conditional guarded by a single-valued expression
    Only such conditionals may have collectives

    if (Ti.numProcs() > 1)
        Ti.barrier(); // multiple threads
    else
        ; // only one thread

The single Type System

Titanium’s single type system determines which expressions are single-valued
Basic rule is that single-valued expressions may only be computed from other single-valued expressions
Literals and certain constant expressions (e.g. Ti.numProcs()) are single-valued
Rules for method calls, objects/fields, and exceptions can be very complicated
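The basic rule can be sketched as a recursive check over an expression tree. This is a toy model under stated assumptions: expressions are encoded as int literals, call-name strings, or `(op, left, right)` tuples, and it deliberately ignores method calls, objects/fields, and exceptions, which the real type system must handle.

```python
SINGLE_CALLS = {"Ti.numProcs"}   # assumed to return the same value on every thread

def is_single(expr):
    """expr is an int literal, a call-name string, or a tuple (op, left, right)."""
    if isinstance(expr, int):
        return True                       # literals are single-valued
    if isinstance(expr, str):
        return expr in SINGLE_CALLS       # e.g. Ti.thisProc is not single-valued
    _op, left, right = expr
    # Single-valued only if computed from single-valued operands.
    return is_single(left) and is_single(right)

guard_ok = (">", "Ti.numProcs", 1)                # Ti.numProcs() > 1
guard_bad = ("==", ("%", "Ti.thisProc", 2), 0)    # Ti.thisProc() % 2 == 0
```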

Drawbacks to single

Type system is cumbersome and difficult to understand
    Bugs found as recently as 2006!
Annotations put a burden on programmer
Type system still has problems
    Unchecked casts to single
    single has incomplete meaning when applied to array-based container
        e.g. an object of type String single only has the same length on all threads, but not necessarily the same contents
No natural way to extend it to allow collectives on thread subsets

Alignment Checking Schemes

Comparison of different alignment schemes

Scheme           Programmer  Restrictions on    Early error  Accuracy/  Compiler/runtime  Performance  Subset
                 burden      program structure  detection    precision  complexity        reduction    support
Type system      High        High               High         High       Medium            No           No
Static analysis  Low         Medium             High         Medium     High              No           Yes
Dynamic checks   Low         High               Medium       High       Low               Yes          Yes
No checking      Low         Low                None         None       None              No           Yes

Dynamic Enforcement

A dynamic enforcement scheme can reduce programmer burden but still provide accurate results for analysis and optimization
Basic idea:
    Track control flow on all threads
    Prior to performing a collective, check that preceding control flow matches on all threads
Compiler instruments source code to perform tracking and checking

Alignment Tracking

Track conditionals and loops at runtime
    Only statements that may have global effects need be tracked
Execution history is saved on each thread
    Only in debugging mode to generate better error messages
A running hash is computed for each thread that summarizes its execution

Tracking Example

    5 if (Ti.thisProc() == 0)
    6     globalMethod();
    7 else
    8     globalMethod();
    9 Ti.barrier();

Threads 0 and 1 at line 5:

    Thread  Hash        Execution History
    0       0x0dc7637a  …*
    1       0x0dc7637a  …*

Thread 0 at line 6, thread 1 at line 8:

    Thread  Hash        Execution History
    0       0x7e8a6fa0  …*, (5, then)
    1       0x2027593c  …*, (5, else)

Threads 0 and 1 at line 9:

    Thread  Hash        Execution History
    0       0x307ea68d  …*, (5, then), …**
    1       0xfe4a0bc3  …*, (5, else), …**

* Entries prior to line 5
** Entries from call to globalMethod()
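The tracking step can be sketched in Python as a per-thread tracker that folds each tracked branch into a running hash. The slides do not name the hash function, so CRC32 from the standard `zlib` module is used here purely as a stand-in; the `AlignmentTracker` class and the `"line:branch"` entry encoding are likewise assumptions, and the resulting hash values will not match the ones shown above.

```python
import zlib

class AlignmentTracker:
    """Per-thread running hash, plus saved history in debugging mode."""

    def __init__(self, debug=False):
        self.hash = 0
        self.debug = debug
        self.history = []

    def update(self, line, branch):
        # Fold the tracked entry into the running hash (CRC32 is a stand-in).
        entry = f"{line}:{branch}".encode()
        self.hash = zlib.crc32(entry, self.hash)
        if self.debug:
            self.history.append((line, branch))

def run(proc):
    """Simulate one thread reaching the conditional at line 5."""
    tracker = AlignmentTracker(debug=True)
    tracker.update(5, "then" if proc == 0 else "else")
    return tracker

t0, t1 = run(0), run(1)   # the two threads now carry different hashes
```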

Alignment Checking

Thread alignment is compared prior to each global operation
    Hash value is broadcast from thread 0
    If any thread’s value differs, error generated
In debugging mode, saved history used to determine location in code where control flow diverged
    Saved history can be erased at each collective, since preceding control flow must match for collective to execute
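The check itself amounts to comparing every thread's hash with the value broadcast from thread 0. A minimal single-process sketch, assuming `hashes[i]` holds thread `i`'s running hash just before the collective; the `AlignmentError` class here is a local Python stand-in, and `first_failure` is a hypothetical helper for demonstration.

```python
class AlignmentError(Exception):
    """Raised when a thread's control flow does not match thread 0's."""

def check_alignment(hashes):
    reference = hashes[0]                    # value broadcast from thread 0
    for thread, value in enumerate(hashes):
        if value != reference:
            raise AlignmentError(f"collective alignment failed on processor {thread}")

def first_failure(hashes):
    """Return the error text, or None if all threads are aligned."""
    try:
        check_alignment(hashes)
    except AlignmentError as err:
        return str(err)
    return None

aligned = first_failure([0x0dc7637a, 0x0dc7637a])
misaligned = first_failure([0x307ea68d, 0xfe4a0bc3])
```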

Checking Example

    5 if (Ti.thisProc() == 0)
    6     globalMethod();
    7 else
    8     globalMethod();
    9 Ti.barrier();

Threads 0 and 1 both reach line 9:

    Thread  Hash        Hash from 0  Execution History
    0       0x307ea68d               …*, (5, then), …**
    1       0xfe4a0bc3               …*, (5, else), …**

The hash from thread 0 is broadcast:

    Thread  Hash        Hash from 0  Execution History
    0       0x307ea68d  0x307ea68d   …*, (5, then), …**
    1       0xfe4a0bc3  0x307ea68d   …*, (5, else), …**

Thread 1's hash differs from the broadcast value: ERROR, MISALIGNMENT

* Entries prior to line 5
** Entries from call to globalMethod()

Example

    5 if (Ti.thisProc() == 0) // misaligned
    6     globalMethod();
    7 else
    8     globalMethod();
    9 Ti.barrier(); // failure

Error message (debugging mode):

    ti.lang.Alignment.AlignmentError: collective alignment failed on processor 1 at foo.java:9:8
    last location: else branch at foo.java:5:12
    last location on processor 0: then branch at foo.java:5:12
    previous location: none
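In debugging mode, the point of divergence can be found by walking two saved histories in step. The sketch below is one plausible way to do this, not the Titanium runtime's actual procedure; the `divergence` function and the `(location, branch)` history entries are assumptions modeled on the error message above.

```python
def divergence(history_a, history_b):
    """Return (index, entry_a, entry_b) for the first differing entry, or None."""
    for i, (a, b) in enumerate(zip(history_a, history_b)):
        if a != b:
            return i, a, b
    if len(history_a) != len(history_b):
        # One history is a strict prefix of the other.
        i = min(len(history_a), len(history_b))
        a = history_a[i] if i < len(history_a) else None
        b = history_b[i] if i < len(history_b) else None
        return i, a, b
    return None

proc0 = [("foo.java:5:12", "then")]
proc1 = [("foo.java:5:12", "else")]
where = divergence(proc0, proc1)   # first entry differs: then vs. else
```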

Strict vs. Weak Alignment

Strict alignment guarantees alignment of statements that statically may have global effects
Weak alignment only guarantees alignment of statements that dynamically execute collective operations
    Under weak alignment, if a tracked statement does not actually execute a collective, then it must be erased from the hash and execution history after completing

Strict vs. Weak Example

The following fails under strict alignment but succeeds under weak alignment:

    if (Ti.thisProc() == 0)
        fakeBarrier();
    else
        fakeBarrier();
    Ti.barrier();

[Figure: instrumented control flow under strict alignment. The PROC == 0? test branches to update(id0) or update(id1), each followed by fakeBarrier(); check() then runs before Ti.barrier(). STRICT failure!]

[Figure: instrumented control flow under weak alignment. h = save() runs before the PROC == 0? test; each branch performs update(id0) or update(id1), then fakeBarrier(), then restore(h); check() then runs before Ti.barrier(). WEAK success!]
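The save/restore behavior can be simulated in Python. This is a sketch under assumptions: CRC32 stands in for the unspecified running hash, the byte strings are invented entry encodings, and `fakeBarrier()` is modeled as a global method that never actually executes a collective.

```python
import zlib

def hash_at_barrier(proc, weak):
    """Running hash of one thread when it reaches Ti.barrier()."""
    h = zlib.crc32(b"start")                 # entries prior to the conditional
    saved = h                                # h = save()
    branch = b"then" if proc == 0 else b"else"
    h = zlib.crc32(branch, h)                # update(id0) / update(id1)
    executed_collective = False              # fakeBarrier() performs no collective
    if weak and not executed_collective:
        h = saved                            # restore(h): erase the tracked branch
    return h

def aligned(weak, procs=(0, 1)):
    """check(): do all threads carry the same hash at the barrier?"""
    hashes = [hash_at_barrier(p, weak) for p in procs]
    return all(value == hashes[0] for value in hashes)

strict_ok = aligned(weak=False)   # branch entries remain in the hash
weak_ok = aligned(weak=True)      # branch entries were erased
```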

Strict or Weak Alignment?

Advantages of strict alignment
    Strict alignment more closely matches current semantics
    Easier to reason about whether or not an operation may have global effects than if it actually executes a collective
    Strict alignment performs fewer operations since it does not save and restore the hash
Advantages of weak alignment
    Accepts code that is rejected under strict alignment

Evaluation

Performance tested on two machines
    Eight-core (2x4) Intel Xeon E5435 2.33GHz SMP
    Cluster of dual-processor 2.2GHz Opterons with InfiniBand interconnect
Three primitive collective operations tested
    Broadcast: one-to-all communication
    Barrier: threads wait until all have reached it
    Exchange: all-to-all communication
Three NAS Parallel Benchmarks tested
    Conjugate gradient (CG)
    Fourier transform (FT)
    Multigrid (MG)

Enforcement Variants

Five enforcement variants tested
    static: use single type system
    strict: strict dynamic alignment
    strict/debug: strict dynamic alignment with alignment history saved
    weak: weak dynamic alignment
    weak/debug: weak dynamic alignment with alignment history saved

SMP Collectives Results

[Figure: "SMP Collectives Time". Bar chart of time relative to static (y-axis, 0 to 3.5) for Broadcast, Barrier, and Exchange on 2, 4, and 8 processors]

Cluster Collectives Results

[Figure: "Cluster Collectives Time". Bar chart of time relative to static (y-axis, 0 to 5) for Broadcast, Barrier, and Exchange on 2, 4, 8, 16, and 32 processors]

Cluster Application Results

[Figure: "Cluster Applications Time". Bar chart of time relative to static (y-axis, 0 to 1.2) for CG, FT, and MG on 2, 4, 8, 16, and 32 processors]

Performance Escape Hatches

Effects on application performance can be potentially reduced in multiple different ways
    None used right now
Optimizations
    Remove redundant checks
Hybrid static/dynamic analysis
    Use static inference and then only check those collectives that are not inferred to be aligned
Turn off checking in production runs
    Use checking only when debugging an application
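The hybrid and production-run ideas can be sketched as a filter over collective sites. None of this is implemented in the work described; the function name, the site strings, and the set of statically proven sites are all hypothetical.

```python
def checks_performed(collectives, statically_aligned, checking_enabled=True):
    """Return the collective sites that still need a dynamic alignment check."""
    if not checking_enabled:          # e.g. a production run with checking off
        return []
    # Hybrid scheme: skip sites that static inference already proved aligned.
    return [site for site in collectives if site not in statically_aligned]

sites = ["foo.java:9", "foo.java:20", "foo.java:31"]   # hypothetical collective sites
proved = {"foo.java:20"}                               # hypothetical static result
debug_checks = checks_performed(sites, proved)
production_checks = checks_performed(sites, proved, checking_enabled=False)
```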

Conclusions

Dynamic checking removes annotation burden from programmers
Minimal performance impact on applications
    Most applications avoid spending time in collectives
    Applications that do spend a lot of time in collectives don’t scale anyway
Multiple strategies to further reduce overhead
Dynamic checking can be applied to languages without strong type systems (e.g. UPC)