MemoDyn: Exploiting Weakly Consistent Data Structures for Dynamic Parallel Memoization

Prakash Prabhu², Stephen R. Beard¹, Sotiris Apostolakis¹, Ayal Zaks³ and David I. August¹

PACT 2018 – Session 3a – November 1, 2018

¹ Princeton University, ² Google India, ³ Intel Corporation




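Why Implicit Parallel Programming?

[Table: three approaches compared on four criteria. The ✗ marks showing which approach lacks which criterion are not recoverable from the slide text.]

Approaches: Explicit Parallel Programming, Automatic Parallelization, Implicit Parallel Programming
Criteria: Ease of Sequential Programming, Performance, Portability, Domain Knowledge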


Memoization

• Definition
  – Store intermediate results and retrieve them when needed again to avoid re-computation
• Usage
  – Mainly used to speed up convergence in combinatorial search and optimization problems
• Challenge
  – Accesses to memoization data structures impose dependences that obstruct program parallelization
• Insight
  – Programs continue to function correctly even with weakly consistent memoization data structures
• Result
  – New parallelism opportunities
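The insight above can be illustrated with a minimal sketch (not code from the talk). The memo table below deliberately forgets some entries, imitating a weakly consistent store; `LossyMemo`, `drop_every`, and `fib` are hypothetical names chosen for this example. A missed lookup only costs re-computation, so the result is unchanged.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical memo table that deliberately forgets: every `drop_every`-th
// insert is discarded (0 disables dropping).
class LossyMemo {
public:
    explicit LossyMemo(unsigned drop_every) : drop_every_(drop_every) {}

    // Returns true and fills `out` on a hit; returns false on a miss.
    bool lookup(unsigned key, std::uint64_t& out) const {
        auto it = table_.find(key);
        if (it == table_.end()) return false;
        out = it->second;
        return true;
    }

    void insert(unsigned key, std::uint64_t value) {
        ++inserts_;
        if (drop_every_ != 0 && inserts_ % drop_every_ == 0) return;  // dropped
        table_[key] = value;
    }

private:
    unsigned drop_every_;
    unsigned inserts_ = 0;
    std::unordered_map<unsigned, std::uint64_t> table_;
};

// Memoized Fibonacci; correctness never depends on a lookup hitting.
std::uint64_t fib(unsigned n, LossyMemo& memo) {
    if (n < 2) return n;
    std::uint64_t cached = 0;
    if (memo.lookup(n, cached)) return cached;
    std::uint64_t value = fib(n - 1, memo) + fib(n - 2, memo);
    memo.insert(n, value);
    return value;
}
```

Calling `fib` with `LossyMemo(0)` (never drops) and `LossyMemo(2)` (drops every second insert) yields the same value; only the amount of repeated work differs.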


Motivating example: SAT solver

[Figure: two implication graphs over variable assignments (such as -x1, -x7, -x8, -x9) and clauses c1 … c6. The first graph ends in "Conflict!"; the second, which uses l1, ends in "Satisfied!".]

Legend: x1, …, xn: Boolean variables; c1, …, cn: problem clauses; l1, …, ln: learnt clauses

• SAT propagation leads to a conflict, and the solver learns a clause: l1 = (-x4 ∨ x8 ∨ x9)
• The learnt clause data structure memoizes l1
• The learnt clause enables pruning


Operations on the learnt clause data structure in the search loop:

// SAT solver main loop
while (1) {
    // Propagation
    for (int i = 0; i < learnts.size(); ++i) {
        l = learnts[i];
        …
    }
    if (Conflict) {
        … // Learn conflict clause
        learnts.push(learnt_clause);
        … // Backtrack
    } else {
        // Reduce set of learnt clauses
        if (n >= nof_learnts) {
            for (int i = 0; i < 2*n; ++i)
                learnts.remove(i);
        }
        if (model)
            return True;
        else
            … // New variable decision
    }
}


Weaker Consistency Properties

learnts (index: clause)
  0: (-x4 ∨ x8 ∨ x9)
  1: (-x4 ∨ x8 ∨ -x6)
  2: (-x3 ∨ -x6)
  3: (-x1 ∨ -x10)

A: learnts.push((-x4 ∨ x8 ∨ x9))
B: learnts.push((-x4 ∨ x8 ∨ -x6))
C: learnts.push((-x3 ∨ -x6))
D: learnts.push((-x1 ∨ -x10))
E: learnts.remove(0)
F: for (i = 0; i < learnts.size(); ++i) { clause = learnts[i]; … }

• Out of order: the insertions (A to D) can execute out of order
• Weak deletion: the removal (E) is safe even if it is weak
• Partial view: the traversal (F) is safe even if it sees only part of the set
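One way to see why these relaxations are safe for learnt clauses: a learnt clause is derived from the problem clauses, so it is logically implied by them. Seeing it late, seeing only some of the learnt clauses, or losing one changes how much the search is pruned, not the set of satisfying assignments. The brute-force sketch below checks this on a tiny formula; it is an illustration under that assumption, not code from the talk, and `Clause`, `count_models`, and `with_view` are hypothetical names.

```cpp
#include <cstdlib>
#include <vector>

using Clause = std::vector<int>;  // literal k means x_k, -k means NOT x_k

// True if the assignment (bit k-1 of `bits` holds x_k) satisfies the clause.
bool satisfies(unsigned bits, const Clause& clause) {
    for (int lit : clause) {
        bool value = (bits >> (std::abs(lit) - 1)) & 1u;
        if ((lit > 0) == value) return true;
    }
    return false;
}

// Brute-force model count over `num_vars` variables (illustration only).
int count_models(const std::vector<Clause>& clauses, int num_vars) {
    int models = 0;
    for (unsigned bits = 0; bits < (1u << num_vars); ++bits) {
        bool ok = true;
        for (const Clause& c : clauses) {
            if (!satisfies(bits, c)) { ok = false; break; }
        }
        if (ok) ++models;
    }
    return models;
}

// Problem clauses plus whichever learnt clauses a worker can currently see.
std::vector<Clause> with_view(std::vector<Clause> problem,
                              const std::vector<Clause>& visible_learnts) {
    problem.insert(problem.end(), visible_learnts.begin(), visible_learnts.end());
    return problem;
}
```

For the problem (x1 ∨ x2), (-x1 ∨ x2), (-x2 ∨ x3), the clauses (x2) and (x3) are both implied. Passing them to `with_view` in either order, passing only one of them, or passing none leaves `count_models` unchanged.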


Comparison of MemoDyn with Related Parallelization Frameworks

Specific parallel implementation:

Programming Model                                  | Parallelization Driver | Sharing
N-way [1]                                          | Runtime                | None
Semantic Commutativity [2, 3, 4]                   | Compiler               | All
Galois [5]                                         | Runtime                | All
Concurrent Revisions [6], Cilk++ Hyperobjects [7]  | Programmer             | None
ALTER [8]                                          | Compiler               | All
This work                                          | Compiler & Runtime     | Sparse

The slide also compares the frameworks on three properties (Out of Order, Partial View, Weak Deletion); the per-framework marks are not recoverable from the slide text.

[1] Cledat et al., “Efficiently speeding up sequential computation through the n-way programming model”, OOPSLA 2011
[2] Bridges et al., “Revisiting the Sequential Programming Model for Multi-Core”, MICRO 2007
[3] Prabhu et al., “Commutative Set: A Language Extension for Implicit Parallel Programming”, PLDI 2011
[4] Vandierendonck et al., “The Paralax Infrastructure: Automatic Parallelization With a Helping Hand”, PACT 2010
[5] Kulkarni et al., “Optimistic Parallelism requires abstractions”, PLDI 2007
[6] Burckhardt et al., “Concurrent programming with revisions and isolation types”, OOPSLA 2010
[7] Frigo et al., “Reducers and Other Cilk++ Hyperobjects”, SPAA 2009
[8] Udupa et al., “ALTER: Exploiting breakable dependences for parallelization”, PLDI 2011


Concurrent implementations of weakly consistent data structures

Prior Work

• Complete Sharing
  – Single shared data structure
  – Exploits OoO
• Privatization
  – Collection of private sequential data structures
  – Exploits OoO, Partial View

This Work

• Sparse Sharing (Static, Adaptive)
  – Hybrid that fully leverages weakened semantics for optimizing the tradeoff between consistency and parallelism.
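As a rough sketch of the space between the two prior extremes, the function below picks which peers a worker exchanges data with. The ring-neighbour policy and the names `sharing_set`, `self`, and `set_size` are assumptions made for this example; the slides do not say how peers are selected. A size of 0 behaves like privatization, and a size of `num_workers - 1` makes every worker visible to every other.

```cpp
#include <vector>

// Hypothetical policy: worker `self` imports from its next `set_size`
// neighbours on a ring of `num_workers` workers.
std::vector<int> sharing_set(int self, int num_workers, int set_size) {
    std::vector<int> peers;
    if (num_workers <= 1) return peers;
    // A worker never shares with itself, so cap at the number of other workers.
    if (set_size > num_workers - 1) set_size = num_workers - 1;
    for (int step = 1; step <= set_size; ++step) {
        peers.push_back((self + step) % num_workers);
    }
    return peers;
}
```

With four workers, `sharing_set(3, 4, 2)` yields workers 0 and 1, while `sharing_set(3, 4, 0)` is empty.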


MemoDyn Framework


[Figure: MemoDyn toolchain, in four stages]

• PROGRAMMER: Sequential Code → Add MemoDyn Annotations → Implicitly Parallel Code
• MEMODYN SRC ANALYSIS: MemoDyn Frontend Passes & Runtime Specializer; Loop Profiler
• MEMODYN BACKEND: MemoDyn Parallelizer; Linker
• PARALLEL EXECUTION: Worker i and Worker j each run the Main Loop on top of the MemoDyn Runtime, with Online Adaptation; the workers Share Data


MemoDyn data structure declarations

    #pragma MemoDyn(dstype, id)
    typed obj;

    dstype := SET | MAP | ALLOCATOR

MemoDyn interface declarations

    #pragma MemoDyn(id, itype)
    typer class::fn(t1 p1, …);

    itype := INS | DEL | LU | ALLOC | SZ | DEALLOC | PROG | CMP

Example: annotating the learnt clause set in the SAT solver

    class Solver {
    protected:
        #pragma MemoDyn(SET, L)
        vec<Clause> learnts;
        …
        void attachClause(Clause cr);
    };

    template<class T>
    class vec {
    public:
        #pragma MemoDyn(L, SZ)
        int size(void) const;
        #pragma MemoDyn(L, LU)
        T& operator[](int index);
        #pragma MemoDyn(L, INS)
        void push(const T& elem);
        #pragma MemoDyn(L, DEL)
        void remove(int i);
        …
    };

    #pragma MemoDyn(L, PROG)
    double Solver::progressEstimate() const {
        return pow(F, i) * (end - beg) / nVars();
    }


MemoDyn Runtime

• Synchronization
  – Two phases

[Figure: Worker Thread i and Worker Thread j each hold private state consisting of a native version and a shadow replica. Arrows mark Phase I of synchronization and Phase II of synchronization.]
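The two phases can be sketched as follows, taking their meaning from the tuning parameters listed on the next slide: Phase I saves the native version to the local shadow replica, and Phase II copies a remote shadow replica into the native version. This is a minimal sketch, not the MemoDyn runtime: the `Worker` class, its method names, the string representation of clauses, and the choice to merge rather than overwrite in Phase II are all assumptions.

```cpp
#include <mutex>
#include <set>
#include <string>

// One worker's private state: a native set used by the search loop and a
// shadow replica that other workers are allowed to read.
class Worker {
public:
    // Search-loop insert: touches only the native version, no locking.
    void learn(const std::string& clause) { native_.insert(clause); }

    bool knows(const std::string& clause) const {
        return native_.count(clause) != 0;
    }

    // Phase I: save the native version to the local shadow replica.
    void publish() {
        std::lock_guard<std::mutex> guard(shadow_lock_);
        shadow_ = native_;
    }

    // Phase II: copy a remote shadow replica into the native version.
    // Assumption: entries are merged into the native set, not overwritten.
    void import_from(Worker& peer) {
        std::set<std::string> snapshot;
        {
            std::lock_guard<std::mutex> guard(peer.shadow_lock_);
            snapshot = peer.shadow_;
        }
        native_.insert(snapshot.begin(), snapshot.end());
    }

private:
    std::set<std::string> native_;
    std::set<std::string> shadow_;
    std::mutex shadow_lock_;  // guards shadow_ only
};
```

Because the search loop never takes the lock, a peer may import a stale shadow replica. That is a partial view, which the weakened semantics permit.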


MemoDyn Runtime

• Tuning
  – Parameters:
    1. Save native version to local shadow replica (Phase 1) period
    2. Copy remote shadow replica to native version (Phase 2) period
    3. Sharing set size
  – Goal: Find the parameter configuration that maximizes the progress made by each worker
    • Seeded by profiling
    • One tuning thread per worker
    • Gradient ascent based
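The slide says only that tuning is gradient ascent based. As one possible reading, the sketch below performs finite-difference ascent on a single integer parameter (for example a synchronization period). The functions `ascent_step` and `tune`, the step size `delta`, the bounds, and the `progress` callback are all hypothetical; in the real system the objective would come from the annotated progress estimate.

```cpp
#include <functional>

// One finite-difference ascent step on a single integer parameter.
// `progress` is a hypothetical callback returning the measured progress
// (higher is better) when the worker runs with the given parameter value.
int ascent_step(int current, int delta, int lo, int hi,
                const std::function<double(int)>& progress) {
    int down = current - delta < lo ? lo : current - delta;
    int up = current + delta > hi ? hi : current + delta;
    double slope = progress(up) - progress(down);
    if (slope > 0) return up;
    if (slope < 0) return down;
    return current;  // flat: keep the current setting
}

// Repeat steps until the parameter stops moving or the budget runs out.
// `start` plays the role of the profiling-derived seed.
int tune(int start, int delta, int lo, int hi, int max_steps,
         const std::function<double(int)>& progress) {
    int current = start;
    for (int i = 0; i < max_steps; ++i) {
        int next = ascent_step(current, delta, lo, hi, progress);
        if (next == current) break;
        current = next;
    }
    return current;
}
```

With a fixed step the search can oscillate around a peak that is not a multiple of `delta` away from the seed, which is why `max_steps` bounds the loop.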


Evaluation


Geomean speedup over 4 sequential programs

[Figure: program speedup (y-axis, 0 to 3.5) versus number of worker threads (x-axis, 2 to 8) for four schemes: MemoDyn (adaptive), MemoDyn (non-adaptive), Privatization, Complete-Sharing.]

Evaluated:
• On 4 sequential programs
  – Minisat (SAT solver)
  – Genetic algorithm based graph optimization
  – Partial maximum satisfiability solver
  – Alpha-beta search based game engine
• On dual-socket quad core machine
• 4 parallelization schemes
  – MemoDyn adaptive
  – MemoDyn static
  – Privatization
  – Complete-sharing


Conclusion

• Implicit parallel programming framework for parallelizing search loops with weakly consistent data structures
• Includes:
  – Language extensions for expressing weak semantics of data structures
  – Compiler-runtime system that leverages weak semantics for automatic parallelization and adaptive runtime optimization
• Significant performance improvements over competing techniques
  – Minimal programmer effort


Thank you
