292
January 20, 2015 1 / 121 Polyhedral Compilation without Polyhedra Sven Verdoolaege INRIA and KU Leuven [email protected] January 20, 2015

Polyhedral Compilation without Polyhedra - Inria

  • Upload
    dothuan

  • View
    235

  • Download
    1

Embed Size (px)

Citation preview

Page 1: Polyhedral Compilation without Polyhedra - Inria

January 20, 2015 1 / 121

Polyhedral Compilation without Polyhedra

Sven Verdoolaege

INRIA and KU [email protected]

January 20, 2015

Page 2: Polyhedral Compilation without Polyhedra - Inria

January 20, 2015 2 / 121

Outline

1 Polyhedral ModelIntroductionRepresentation

2 Polyhedral TransformationSchedulesAST Generation

3 DependencesSchedule ValidityDependencesStructures

4 DataflowParallelismDataflowApproximate DataflowRun-time dependent DataflowReductions

5 Aliasing6 Counting

CardinalityBoundsWeighted CountingDynamic Memory Requirement

7 Transitive Closures

8 Tools

Page 3: Polyhedral Compilation without Polyhedra - Inria

January 20, 2015 3 / 121

Part I

Polyhedral Compilation

Page 4: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model January 20, 2015 4 / 121

Outline

1 Polyhedral ModelIntroductionRepresentation

2 Polyhedral TransformationSchedulesAST Generation

3 DependencesSchedule ValidityDependencesStructures

4 DataflowParallelismDataflowApproximate DataflowRun-time dependent DataflowReductions

5 Aliasing6 Counting

CardinalityBoundsWeighted CountingDynamic Memory Requirement

7 Transitive Closures

Page 5: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Introduction January 20, 2015 5 / 121

Polyhedral Compilation

Polyhedral CompilationAnalyzing and/or transforming loop programs usingthe polyhedral model

Polyhedral ModelAbstract representation of a loop program

instance based⇒ statement instances⇒ array elements

compact representation based on polyhedra or similar objects⇒ integer points in unions of parametric polyhedra⇒ Presburger sets and relations

parametric⇒ description may depend on symbolic constants

Note: naming is “historical”

polyhedral compilation does not require polyhedra (e.g., omega(+))

other approaches also use polyhedra (e.g., abstract interpretation)

[8, 10, 14, 16]

Page 6: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Introduction January 20, 2015 5 / 121

Polyhedral Compilation

Polyhedral CompilationAnalyzing and/or transforming loop programs usingthe polyhedral model

Polyhedral ModelAbstract representation of a loop program

instance based⇒ statement instances⇒ array elements

compact representation based on polyhedra or similar objects⇒ integer points in unions of parametric polyhedra⇒ Presburger sets and relations

parametric⇒ description may depend on symbolic constants

Note: naming is “historical”

polyhedral compilation does not require polyhedra (e.g., omega(+))

other approaches also use polyhedra (e.g., abstract interpretation)

[8, 10, 14, 16]

Page 7: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Introduction January 20, 2015 6 / 121

Polyhedral Model

Main constituents of program representation

Instance Set

⇒ the set of all statement instances

Access Relations

⇒ the array elements accessed by a statement instance

Dependences

⇒ the statement instances that depend on a statement instance

Schedule

⇒ the relative execution order of statement instances

Page 8: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Introduction January 20, 2015 7 / 121

Polyhedral Model Requirements

Requirements for basic polyhedral model: SANA input

Static control⇒ control does not depend on input data

Affine⇒ all relevant expressions are (quasi-)affine

No Aliasing⇒ essentially no pointer manipulations

Note:

polyhedral model may be approximation of input that does notstrictly satisfy all requirements

many extensions are availablea small selection of these extensions will be discussed in this tutorial

Page 9: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Introduction January 20, 2015 7 / 121

Polyhedral Model Requirements

Requirements for basic polyhedral model: SANA input

Static control⇒ control does not depend on input data

Affine⇒ all relevant expressions are (quasi-)affine

No Aliasing⇒ essentially no pointer manipulations

Note:

polyhedral model may be approximation of input that does notstrictly satisfy all requirements

many extensions are availablea small selection of these extensions will be discussed in this tutorial

Page 10: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Introduction January 20, 2015 8 / 121

Illustrative ExampleR: h(A[2]);

for (int i = 0; i < 2; ++i)

for (int j = 0; j < 2; ++j)

S: A[i + j] = f(i, j);

for (int k = 0; k < 2; ++k)

T: g(A[k], A[0]);

instance based

compactrepresentation

Instance Set (set of statement instances)

I = { R(); S(0, 0); S(0, 1); S(1, 0); S(1, 1); T(0); T(1) }Access Relations (accessed array elements; W : write, R: read)

W = { S(0, 0)→ A(0); S(0, 1)→ A(1); S(1, 0)→ A(1);

S(1, 1)→ A(2) }R = { R()→ A(2); T(0)→ A(0); T(1)→ A(1); T(1)→ A(0) }

Schedule (relative execution order)

R(), S(0, 0), S(0, 1), S(1, 0), S(1, 1), T(0), T(1)

{ R(); S(i , j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2; T(k) : 0 ≤ k < 2; }

Constraints

“read off”, or

obtained through abstract interpretation

{ S(i , j)→ A(i + j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2 }{ R()→ A(2); T(k)→ A(0) : 0 ≤ k < 2; T(k)→ A(k) : 0 ≤ k < 2 }

Page 11: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Introduction January 20, 2015 8 / 121

Illustrative ExampleR: h(A[2]);

for (int i = 0; i < 2; ++i)

for (int j = 0; j < 2; ++j)

S: A[i + j] = f(i, j);

for (int k = 0; k < 2; ++k)

T: g(A[k], A[0]);

instance based

compactrepresentation

Instance Set (set of statement instances)

I = { R(); S(0, 0); S(0, 1); S(1, 0); S(1, 1); T(0); T(1) }

Access Relations (accessed array elements; W : write, R: read)

W = { S(0, 0)→ A(0); S(0, 1)→ A(1); S(1, 0)→ A(1);

S(1, 1)→ A(2) }R = { R()→ A(2); T(0)→ A(0); T(1)→ A(1); T(1)→ A(0) }

Schedule (relative execution order)

R(), S(0, 0), S(0, 1), S(1, 0), S(1, 1), T(0), T(1)

{ R(); S(i , j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2; T(k) : 0 ≤ k < 2; }

Constraints

“read off”, or

obtained through abstract interpretation

{ S(i , j)→ A(i + j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2 }{ R()→ A(2); T(k)→ A(0) : 0 ≤ k < 2; T(k)→ A(k) : 0 ≤ k < 2 }

Page 12: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Introduction January 20, 2015 8 / 121

Illustrative ExampleR: h(A[2]);

for (int i = 0; i < 2; ++i)

for (int j = 0; j < 2; ++j)

S: A[i + j] = f(i, j);

for (int k = 0; k < 2; ++k)

T: g(A[k], A[0]);

instance based

compactrepresentation

Instance Set (set of statement instances)

I = { R(); S(0, 0); S(0, 1); S(1, 0); S(1, 1); T(0); T(1) }Access Relations (accessed array elements; W : write, R: read)

W = { S(0, 0)→ A(0); S(0, 1)→ A(1); S(1, 0)→ A(1);

S(1, 1)→ A(2) }R = { R()→ A(2); T(0)→ A(0); T(1)→ A(1); T(1)→ A(0) }

Schedule (relative execution order)

R(), S(0, 0), S(0, 1), S(1, 0), S(1, 1), T(0), T(1)

{ R(); S(i , j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2; T(k) : 0 ≤ k < 2; }

Constraints

“read off”, or

obtained through abstract interpretation

{ S(i , j)→ A(i + j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2 }{ R()→ A(2); T(k)→ A(0) : 0 ≤ k < 2; T(k)→ A(k) : 0 ≤ k < 2 }

Page 13: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Introduction January 20, 2015 8 / 121

Illustrative ExampleR: h(A[2]);

for (int i = 0; i < 2; ++i)

for (int j = 0; j < 2; ++j)

S: A[i + j] = f(i, j);

for (int k = 0; k < 2; ++k)

T: g(A[k], A[0]);

instance based

compactrepresentation

Instance Set (set of statement instances)

I = { R(); S(0, 0); S(0, 1); S(1, 0); S(1, 1); T(0); T(1) }Access Relations (accessed array elements; W : write, R: read)

W = { S(0, 0)→ A(0); S(0, 1)→ A(1); S(1, 0)→ A(1);

S(1, 1)→ A(2) }R = { R()→ A(2); T(0)→ A(0); T(1)→ A(1); T(1)→ A(0) }

Schedule (relative execution order)

R(), S(0, 0), S(0, 1), S(1, 0), S(1, 1), T(0), T(1)

{ R(); S(i , j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2; T(k) : 0 ≤ k < 2; }

Constraints

“read off”, or

obtained through abstract interpretation

{ S(i , j)→ A(i + j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2 }{ R()→ A(2); T(k)→ A(0) : 0 ≤ k < 2; T(k)→ A(k) : 0 ≤ k < 2 }

Page 14: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Introduction January 20, 2015 8 / 121

Illustrative ExampleR: h(A[2]);

for (int i = 0; i < 2; ++i)

for (int j = 0; j < 2; ++j)

S: A[i + j] = f(i, j);

for (int k = 0; k < 2; ++k)

T: g(A[k], A[0]);

instance based

compactrepresentation

Instance Set (set of statement instances)

I = { R(); S(0, 0); S(0, 1); S(1, 0); S(1, 1); T(0); T(1) }Access Relations (accessed array elements; W : write, R: read)

W = { S(0, 0)→ A(0); S(0, 1)→ A(1); S(1, 0)→ A(1);

S(1, 1)→ A(2) }R = { R()→ A(2); T(0)→ A(0); T(1)→ A(1); T(1)→ A(0) }

Schedule (relative execution order)

R(), S(0, 0), S(0, 1), S(1, 0), S(1, 1), T(0), T(1)

{ R(); S(i , j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2; T(k) : 0 ≤ k < 2; }

Constraints

“read off”, or

obtained through abstract interpretation{ S(i , j)→ A(i + j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2 }{ R()→ A(2); T(k)→ A(0) : 0 ≤ k < 2; T(k)→ A(k) : 0 ≤ k < 2 }

Page 15: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Introduction January 20, 2015 8 / 121

Illustrative ExampleR: h(A[2]);

for (int i = 0; i < 2; ++i)

for (int j = 0; j < 2; ++j)

S: A[i + j] = f(i, j);

for (int k = 0; k < 2; ++k)

T: g(A[k], A[0]);

instance based

compactrepresentation

Instance Set (set of statement instances)

I = { R(); S(0, 0); S(0, 1); S(1, 0); S(1, 1); T(0); T(1) }Access Relations (accessed array elements; W : write, R: read)

W = { S(0, 0)→ A(0); S(0, 1)→ A(1); S(1, 0)→ A(1);

S(1, 1)→ A(2) }R = { R()→ A(2); T(0)→ A(0); T(1)→ A(1); T(1)→ A(0) }

Schedule (relative execution order)

R(), S(0, 0), S(0, 1), S(1, 0), S(1, 1), T(0), T(1)

{ R(); S(i , j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2; T(k) : 0 ≤ k < 2; }Constraints

“read off”, or

obtained through abstract interpretation

{ S(i , j)→ A(i + j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2 }{ R()→ A(2); T(k)→ A(0) : 0 ≤ k < 2; T(k)→ A(k) : 0 ≤ k < 2 }

Page 16: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Introduction January 20, 2015 8 / 121

Illustrative ExampleR: h(A[2]);

for (int i = 0; i < 2; ++i)

for (int j = 0; j < 2; ++j)

S: A[i + j] = f(i, j);

for (int k = 0; k < 2; ++k)

T: g(A[k], A[0]);

instance based

compactrepresentation

Instance Set (set of statement instances)

I = { R(); S(i , j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2; T(k) : 0 ≤ k < 2 }Access Relations (accessed array elements; W : write, R: read)

W = { S(0, 0)→ A(0); S(0, 1)→ A(1); S(1, 0)→ A(1);

S(1, 1)→ A(2) }R = { R()→ A(2); T(0)→ A(0); T(1)→ A(1); T(1)→ A(0) }

Schedule (relative execution order)

R(), S(0, 0), S(0, 1), S(1, 0), S(1, 1), T(0), T(1)

{ R(); S(i , j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2; T(k) : 0 ≤ k < 2; }Constraints

“read off”, or

obtained through abstract interpretation{ S(i , j)→ A(i + j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2 }{ R()→ A(2); T(k)→ A(0) : 0 ≤ k < 2; T(k)→ A(k) : 0 ≤ k < 2 }

Page 17: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Introduction January 20, 2015 8 / 121

Illustrative ExampleR: h(A[2]);

for (int i = 0; i < 2; ++i)

for (int j = 0; j < 2; ++j)

S: A[i + j] = f(i, j);

for (int k = 0; k < 2; ++k)

T: g(A[k], A[0]);

instance based

compactrepresentation

Instance Set (set of statement instances)

I = { R(); S(i , j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2; T(k) : 0 ≤ k < 2 }Access Relations (accessed array elements; W : write, R: read)

W = { S(0, 0)→ A(0); S(0, 1)→ A(1); S(1, 0)→ A(1);

S(1, 1)→ A(2) }R = { R()→ A(2); T(0)→ A(0); T(1)→ A(1); T(1)→ A(0) }

Schedule (relative execution order)

R(), S(0, 0), S(0, 1), S(1, 0), S(1, 1), T(0), T(1)

{ R(); S(i , j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2; T(k) : 0 ≤ k < 2; }Constraints

“read off”, or

obtained through abstract interpretation

{ S(i , j)→ A(i + j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2 }

{ R()→ A(2); T(k)→ A(0) : 0 ≤ k < 2; T(k)→ A(k) : 0 ≤ k < 2 }

Page 18: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Introduction January 20, 2015 8 / 121

Illustrative ExampleR: h(A[2]);

for (int i = 0; i < 2; ++i)

for (int j = 0; j < 2; ++j)

S: A[i + j] = f(i, j);

for (int k = 0; k < 2; ++k)

T: g(A[k], A[0]);

instance based

compactrepresentation

Instance Set (set of statement instances)

I = { R(); S(i , j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2; T(k) : 0 ≤ k < 2 }Access Relations (accessed array elements; W : write, R: read)

W = { S(i , j)→ A(i + j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2 }

R = { R()→ A(2); T(0)→ A(0); T(1)→ A(1); T(1)→ A(0) }Schedule (relative execution order)

R(), S(0, 0), S(0, 1), S(1, 0), S(1, 1), T(0), T(1)

{ R(); S(i , j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2; T(k) : 0 ≤ k < 2; }Constraints

“read off”, or

obtained through abstract interpretation{ S(i , j)→ A(i + j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2 }{ R()→ A(2); T(k)→ A(0) : 0 ≤ k < 2; T(k)→ A(k) : 0 ≤ k < 2 }

Page 19: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Introduction January 20, 2015 8 / 121

Illustrative ExampleR: h(A[2]);

for (int i = 0; i < 2; ++i)

for (int j = 0; j < 2; ++j)

S: A[i + j] = f(i, j);

for (int k = 0; k < 2; ++k)

T: g(A[k], A[0]);

instance based

compactrepresentation

Instance Set (set of statement instances)

I = { R(); S(i , j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2; T(k) : 0 ≤ k < 2 }Access Relations (accessed array elements; W : write, R: read)

W = { S(i , j)→ A(i + j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2 }

R = { R()→ A(2); T(0)→ A(0); T(1)→ A(1); T(1)→ A(0) }Schedule (relative execution order)

R(), S(0, 0), S(0, 1), S(1, 0), S(1, 1), T(0), T(1)

{ R(); S(i , j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2; T(k) : 0 ≤ k < 2; }Constraints

“read off”, or

obtained through abstract interpretation{ S(i , j)→ A(i + j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2 }

{ R()→ A(2); T(k)→ A(0) : 0 ≤ k < 2; T(k)→ A(k) : 0 ≤ k < 2 }

Page 20: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Introduction January 20, 2015 8 / 121

Illustrative ExampleR: h(A[2]);

for (int i = 0; i < 2; ++i)

for (int j = 0; j < 2; ++j)

S: A[i + j] = f(i, j);

for (int k = 0; k < 2; ++k)

T: g(A[k], A[0]);

instance based

compactrepresentation

Instance Set (set of statement instances)

I = { R(); S(i , j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2; T(k) : 0 ≤ k < 2 }Access Relations (accessed array elements; W : write, R: read)

W = { S(i , j)→ A(i + j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2 }

R = { R()→ A(2); T(k)→ A(0) : 0 ≤ k < 2; T(k)→ A(k) : 0 ≤ k < 2 }Schedule (relative execution order)

R(), S(0, 0), S(0, 1), S(1, 0), S(1, 1), T(0), T(1)

{ R(); S(i , j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2; T(k) : 0 ≤ k < 2; }Constraints

“read off”, or

obtained through abstract interpretation{ S(i , j)→ A(i + j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2 }{ R()→ A(2); T(k)→ A(0) : 0 ≤ k < 2; T(k)→ A(k) : 0 ≤ k < 2 }

Page 21: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Introduction January 20, 2015 9 / 121

Parametric Example: Matrix Multiplicationfor (int i = 0; i < M; i++)

for (int j = 0; j < N; j++) {

S1: C[i][j] = 0;

for (int k = 0; k < K; k++)

S2: C[i][j] = C[i][j] + A[i][k] * B[k][j];

}

Instance Set (set of statement instances)

{ S1(i , j) : 0 ≤ i < M ∧ 0 ≤ j < N;

S2(i , j , k) : 0 ≤ i < M ∧ 0 ≤ j < N ∧ 0 ≤ k < K }

Access Relations (accessed array elements; W : write, R: read)

W = { S1(i , j)→ C(i , j); S2(i , j , k)→ C(i , j) }R = { S2(i , j , k)→ C(i , j); S2(i , j , k)→ A(i , k); S2(i , j , k)→ B(k , j) }

Schedule (relative execution order)

S1(0, 0), S2(0, 0, 0), S2(0, 0, 1), . . . , S1(0, 1), S2(0, 1, 0), S2(0, 1, 1), . . . ,

Page 22: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 10 / 121

Named Presburger Sets and Relations [20]

Examples

{ R(); S(i , j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2; T(k) : 0 ≤ k < 2 }{ R()→ A(2); T(k)→ A(0) : 0 ≤ k < 2; T(k)→ A(k) : 0 ≤ k < 2 }

General form

Sets{S1(i) : f1(i); S2(i) : f2(i); . . . },

with fk Presburger formulas

⇒ set of elements of the form S1(i), one for each i satisfying f1(i), . . .

Binary relations

{ S1(i)→ T1(j) : f1(i, j); S2(i)→ T2(j) : f2(i, j); . . . }

⇒ set of pairs of elements of the form S1(i)→ T1(j)

Note: despite “→”, not necessarily (single valued) functions

Page 23: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 10 / 121

Named Presburger Sets and Relations [20]

Examples

{ R(); S(i , j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2; T(k) : 0 ≤ k < 2 }{ R()→ A(2); T(k)→ A(0) : 0 ≤ k < 2; T(k)→ A(k) : 0 ≤ k < 2 }

General form

Sets{S1(i) : f1(i); S2(i) : f2(i); . . . },

with fk Presburger formulas

⇒ set of elements of the form S1(i), one for each i satisfying f1(i), . . .

Binary relations

{ S1(i)→ T1(j) : f1(i, j); S2(i)→ T2(j) : f2(i, j); . . . }

⇒ set of pairs of elements of the form S1(i)→ T1(j)

Note: despite “→”, not necessarily (single valued) functions

Page 24: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 11 / 121

Named Presburger Sets and Relations [20]

General form

Sets{S1(i) : f1(i); S2(i) : f2(i); . . . },

where fk(i) are Presburger formulas with i as only free variables⇒ set of elements of the form S1(i), one for each i satisfying f1(i), . . .

Note: may depend on interpretation of symbolic constants

{S(i) : 0 ≤ i ≤ n() }

is equal to

∅ if n < 0

{S(0) } if n = 0

{S(0);S(1) } if n = 1

{S(0);S(1);S(2) } if n = 2

. . .

Page 25: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 11 / 121

Named Presburger Sets and Relations [20]

General form

Sets{S1(i) : f1(i); S2(i) : f2(i); . . . },

where fk(i) are Presburger formulas with i as only free variables⇒ set of elements of the form S1(i), one for each i satisfying f1(i), . . .

Note: may depend on interpretation of symbolic constants

{ S(i) : 0 ≤ i ≤ n() }

is equal to

∅ if n < 0

{ S(0) } if n = 0

{ S(0);S(1) } if n = 1

{ S(0);S(1);S(2) } if n = 2

. . .

Page 26: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 12 / 121

Quasi-Affine Expressions and Presburger Formulae

Symbolic ConstantI has unknown but fixed valueI typically used to represent size parameter

Quasi-Affine Expression

; Presburger Term

I variableI symbolic constantI integer constantI addition (+), subtraction (−)I integer division by a constant (b·/dc)

Presburger FormulaI trueI equality on terms (=)I less than or equal on terms (≤)I logical connectives (∧, ∨, ¬)I quantification (∃, ∀)

Page 27: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 12 / 121

Quasi-Affine Expressions and Presburger Formulae

Symbolic ConstantI has unknown but fixed valueI typically used to represent size parameter

Quasi-Affine Expression

; Presburger Term

I variableI symbolic constantI integer constantI addition (+), subtraction (−)I integer division by a constant (b·/dc)

Presburger FormulaI trueI equality on terms (=)I less than or equal on terms (≤)I logical connectives (∧, ∨, ¬)I quantification (∃, ∀)

Page 28: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 12 / 121

Quasi-Affine Expressions and Presburger Formulae

Symbolic ConstantI has unknown but fixed valueI typically used to represent size parameter

Quasi-Affine Expression ; Presburger TermI variableI symbolic constantI integer constantI addition (+), subtraction (−)I integer division by a constant (b·/dc)

Presburger FormulaI trueI equality on terms (=)I less than or equal on terms (≤)I logical connectives (∧, ∨, ¬)I quantification (∃, ∀)

Page 29: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 13 / 121

Syntactic Sugar

false is equal to ¬true

a⇒ b is equal to ¬a ∨ b

{ S(i) } is equal to {S(i) : true }{ S(i1, . . . in−1, g(i1, . . . , in−1), in+1, . . .) : f (i) } is equal to{ S(i1, . . . in−1, in, in+1, . . .) : f (i) ∧ in = g(i1, . . . , in−1) }Example: {S(i)→ S(i + 1) } is equal to {S(i)→ S(j) : j = i + 1 }n is equal to n()

a < b is equal to a ≤ b − 1

a ≥ b is equal to b ≤ a

a > b is equal to a ≥ b + 1

a, b ⊕ c is equal to a⊕ c ∧ b ⊕ c with ⊕ ∈ {≤, <,≥, >,= }Example: {S(i , j) : i , j ≥ 0 } is equal to {S(i , j) : i ≥ 0 ∧ j ≥ 0 }a⊕1 b ⊕2 c is equal to a⊕1 b ∧ b ⊕2 c with{⊕1,⊕2 } ⊂ {≤, <,≥, >,= }Example: {S(i) : 0 ≤ i ≤ 10 } is equal to {S(i) : 0 ≤ i ∧ i ≤ 10 }

Page 30: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 13 / 121

Syntactic Sugar

false is equal to ¬truea⇒ b is equal to ¬a ∨ b

{ S(i) } is equal to {S(i) : true }{ S(i1, . . . in−1, g(i1, . . . , in−1), in+1, . . .) : f (i) } is equal to{ S(i1, . . . in−1, in, in+1, . . .) : f (i) ∧ in = g(i1, . . . , in−1) }Example: {S(i)→ S(i + 1) } is equal to {S(i)→ S(j) : j = i + 1 }n is equal to n()

a < b is equal to a ≤ b − 1

a ≥ b is equal to b ≤ a

a > b is equal to a ≥ b + 1

a, b ⊕ c is equal to a⊕ c ∧ b ⊕ c with ⊕ ∈ {≤, <,≥, >,= }Example: {S(i , j) : i , j ≥ 0 } is equal to {S(i , j) : i ≥ 0 ∧ j ≥ 0 }a⊕1 b ⊕2 c is equal to a⊕1 b ∧ b ⊕2 c with{⊕1,⊕2 } ⊂ {≤, <,≥, >,= }Example: {S(i) : 0 ≤ i ≤ 10 } is equal to {S(i) : 0 ≤ i ∧ i ≤ 10 }

Page 31: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 13 / 121

Syntactic Sugar

false is equal to ¬truea⇒ b is equal to ¬a ∨ b

{ S(i) } is equal to { S(i) : true }

{ S(i1, . . . in−1, g(i1, . . . , in−1), in+1, . . .) : f (i) } is equal to{ S(i1, . . . in−1, in, in+1, . . .) : f (i) ∧ in = g(i1, . . . , in−1) }Example: {S(i)→ S(i + 1) } is equal to {S(i)→ S(j) : j = i + 1 }n is equal to n()

a < b is equal to a ≤ b − 1

a ≥ b is equal to b ≤ a

a > b is equal to a ≥ b + 1

a, b ⊕ c is equal to a⊕ c ∧ b ⊕ c with ⊕ ∈ {≤, <,≥, >,= }Example: {S(i , j) : i , j ≥ 0 } is equal to {S(i , j) : i ≥ 0 ∧ j ≥ 0 }a⊕1 b ⊕2 c is equal to a⊕1 b ∧ b ⊕2 c with{⊕1,⊕2 } ⊂ {≤, <,≥, >,= }Example: {S(i) : 0 ≤ i ≤ 10 } is equal to {S(i) : 0 ≤ i ∧ i ≤ 10 }

Page 32: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 13 / 121

Syntactic Sugar

false is equal to ¬truea⇒ b is equal to ¬a ∨ b

{ S(i) } is equal to { S(i) : true }{ S(i1, . . . in−1, g(i1, . . . , in−1), in+1, . . .) : f (i) } is equal to{ S(i1, . . . in−1, in, in+1, . . .) : f (i) ∧ in = g(i1, . . . , in−1) }Example: {S(i)→ S(i + 1) } is equal to {S(i)→ S(j) : j = i + 1 }

n is equal to n()

a < b is equal to a ≤ b − 1

a ≥ b is equal to b ≤ a

a > b is equal to a ≥ b + 1

a, b ⊕ c is equal to a⊕ c ∧ b ⊕ c with ⊕ ∈ {≤, <,≥, >,= }Example: {S(i , j) : i , j ≥ 0 } is equal to {S(i , j) : i ≥ 0 ∧ j ≥ 0 }a⊕1 b ⊕2 c is equal to a⊕1 b ∧ b ⊕2 c with{⊕1,⊕2 } ⊂ {≤, <,≥, >,= }Example: {S(i) : 0 ≤ i ≤ 10 } is equal to {S(i) : 0 ≤ i ∧ i ≤ 10 }

Page 33: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 13 / 121

Syntactic Sugar

false is equal to ¬truea⇒ b is equal to ¬a ∨ b

{ S(i) } is equal to { S(i) : true }{ S(i1, . . . in−1, g(i1, . . . , in−1), in+1, . . .) : f (i) } is equal to{ S(i1, . . . in−1, in, in+1, . . .) : f (i) ∧ in = g(i1, . . . , in−1) }Example: {S(i)→ S(i + 1) } is equal to {S(i)→ S(j) : j = i + 1 }n is equal to n()

a < b is equal to a ≤ b − 1

a ≥ b is equal to b ≤ a

a > b is equal to a ≥ b + 1

a, b ⊕ c is equal to a⊕ c ∧ b ⊕ c with ⊕ ∈ {≤, <,≥, >,= }Example: {S(i , j) : i , j ≥ 0 } is equal to {S(i , j) : i ≥ 0 ∧ j ≥ 0 }a⊕1 b ⊕2 c is equal to a⊕1 b ∧ b ⊕2 c with{⊕1,⊕2 } ⊂ {≤, <,≥, >,= }Example: {S(i) : 0 ≤ i ≤ 10 } is equal to {S(i) : 0 ≤ i ∧ i ≤ 10 }

Page 34: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 13 / 121

Syntactic Sugar

false is equal to ¬truea⇒ b is equal to ¬a ∨ b

{ S(i) } is equal to { S(i) : true }{ S(i1, . . . in−1, g(i1, . . . , in−1), in+1, . . .) : f (i) } is equal to{ S(i1, . . . in−1, in, in+1, . . .) : f (i) ∧ in = g(i1, . . . , in−1) }Example: {S(i)→ S(i + 1) } is equal to {S(i)→ S(j) : j = i + 1 }n is equal to n()

a < b is equal to a ≤ b − 1

a ≥ b is equal to b ≤ a

a > b is equal to a ≥ b + 1

a, b ⊕ c is equal to a⊕ c ∧ b ⊕ c with ⊕ ∈ {≤, <,≥, >,= }Example: {S(i , j) : i , j ≥ 0 } is equal to {S(i , j) : i ≥ 0 ∧ j ≥ 0 }a⊕1 b ⊕2 c is equal to a⊕1 b ∧ b ⊕2 c with{⊕1,⊕2 } ⊂ {≤, <,≥, >,= }Example: {S(i) : 0 ≤ i ≤ 10 } is equal to {S(i) : 0 ≤ i ∧ i ≤ 10 }

Page 35: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 13 / 121

Syntactic Sugar

false is equal to ¬truea⇒ b is equal to ¬a ∨ b

{ S(i) } is equal to { S(i) : true }{ S(i1, . . . in−1, g(i1, . . . , in−1), in+1, . . .) : f (i) } is equal to{ S(i1, . . . in−1, in, in+1, . . .) : f (i) ∧ in = g(i1, . . . , in−1) }Example: {S(i)→ S(i + 1) } is equal to {S(i)→ S(j) : j = i + 1 }n is equal to n()

a < b is equal to a ≤ b − 1

a ≥ b is equal to b ≤ a

a > b is equal to a ≥ b + 1

a, b ⊕ c is equal to a⊕ c ∧ b ⊕ c with ⊕ ∈ {≤, <,≥, >,= }Example: {S(i , j) : i , j ≥ 0 } is equal to {S(i , j) : i ≥ 0 ∧ j ≥ 0 }a⊕1 b ⊕2 c is equal to a⊕1 b ∧ b ⊕2 c with{⊕1,⊕2 } ⊂ {≤, <,≥, >,= }Example: {S(i) : 0 ≤ i ≤ 10 } is equal to {S(i) : 0 ≤ i ∧ i ≤ 10 }

Page 36: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 13 / 121

Syntactic Sugar

false is equal to ¬truea⇒ b is equal to ¬a ∨ b

{ S(i) } is equal to { S(i) : true }{ S(i1, . . . in−1, g(i1, . . . , in−1), in+1, . . .) : f (i) } is equal to{ S(i1, . . . in−1, in, in+1, . . .) : f (i) ∧ in = g(i1, . . . , in−1) }Example: {S(i)→ S(i + 1) } is equal to {S(i)→ S(j) : j = i + 1 }n is equal to n()

a < b is equal to a ≤ b − 1

a ≥ b is equal to b ≤ a

a > b is equal to a ≥ b + 1

a, b ⊕ c is equal to a⊕ c ∧ b ⊕ c with ⊕ ∈ {≤, <,≥, >,= }Example: {S(i , j) : i , j ≥ 0 } is equal to {S(i , j) : i ≥ 0 ∧ j ≥ 0 }a⊕1 b ⊕2 c is equal to a⊕1 b ∧ b ⊕2 c with{⊕1,⊕2 } ⊂ {≤, <,≥, >,= }Example: {S(i) : 0 ≤ i ≤ 10 } is equal to {S(i) : 0 ≤ i ∧ i ≤ 10 }

Page 37: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 13 / 121

Syntactic Sugar

false is equal to ¬truea⇒ b is equal to ¬a ∨ b

{ S(i) } is equal to { S(i) : true }{ S(i1, . . . in−1, g(i1, . . . , in−1), in+1, . . .) : f (i) } is equal to{ S(i1, . . . in−1, in, in+1, . . .) : f (i) ∧ in = g(i1, . . . , in−1) }Example: {S(i)→ S(i + 1) } is equal to {S(i)→ S(j) : j = i + 1 }n is equal to n()

a < b is equal to a ≤ b − 1

a ≥ b is equal to b ≤ a

a > b is equal to a ≥ b + 1

a, b ⊕ c is equal to a⊕ c ∧ b ⊕ c with ⊕ ∈ {≤, <,≥, >,= }Example: {S(i , j) : i , j ≥ 0 } is equal to { S(i , j) : i ≥ 0 ∧ j ≥ 0 }

a⊕1 b ⊕2 c is equal to a⊕1 b ∧ b ⊕2 c with{⊕1,⊕2 } ⊂ {≤, <,≥, >,= }Example: {S(i) : 0 ≤ i ≤ 10 } is equal to {S(i) : 0 ≤ i ∧ i ≤ 10 }

Page 38: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 13 / 121

Syntactic Sugar

false is equal to ¬truea⇒ b is equal to ¬a ∨ b

{ S(i) } is equal to { S(i) : true }{ S(i1, . . . in−1, g(i1, . . . , in−1), in+1, . . .) : f (i) } is equal to{ S(i1, . . . in−1, in, in+1, . . .) : f (i) ∧ in = g(i1, . . . , in−1) }Example: {S(i)→ S(i + 1) } is equal to {S(i)→ S(j) : j = i + 1 }n is equal to n()

a < b is equal to a ≤ b − 1

a ≥ b is equal to b ≤ a

a > b is equal to a ≥ b + 1

a, b ⊕ c is equal to a⊕ c ∧ b ⊕ c with ⊕ ∈ {≤, <,≥, >,= }Example: {S(i , j) : i , j ≥ 0 } is equal to { S(i , j) : i ≥ 0 ∧ j ≥ 0 }a⊕1 b ⊕2 c is equal to a⊕1 b ∧ b ⊕2 c with{⊕1,⊕2 } ⊂ {≤, <,≥, >,= }Example: {S(i) : 0 ≤ i ≤ 10 } is equal to { S(i) : 0 ≤ i ∧ i ≤ 10 }

Page 39: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 14 / 121

Syntactic Sugar (2)

−e is equal to 0− e

n · e is equal to e + e + · · ·+ e︸ ︷︷ ︸n times

(with n a non-negative integer constant)

a1, . . . , an ≺ b1, . . . , bn is equal to∨n

i=1

((∧i−1j=1 aj = bj

)∧ ai < bi

)Example: {S(i1, i2)→ S(j1, j2) : i1, i2 ≺ j1, j2 } is equal to{ S(i1, i2)→ S(j1, j2) : i1 < j1 ∨ (i1 = j1 ∧ i2 < j2) }a1, . . . , an 4 b1, . . . , bn is equal toa1, . . . , an ≺ b1, . . . , bn ∨ a1, . . . , an = b1, . . . , bn

a1, . . . , an � b1, . . . , bn is equal to b1, . . . , bn ≺ a1, . . . , an

a1, . . . , an < b1, . . . , bn is equal to b1, . . . , bn 4 a1, . . . , an

a mod n is equal to a− n ba/nc

Page 40: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 14 / 121

Syntactic Sugar (2)

−e is equal to 0− en · e is equal to e + e + · · ·+ e︸ ︷︷ ︸

n times

(with n a non-negative integer constant)

a1, . . . , an ≺ b1, . . . , bn is equal to∨n

i=1

((∧i−1j=1 aj = bj

)∧ ai < bi

)Example: {S(i1, i2)→ S(j1, j2) : i1, i2 ≺ j1, j2 } is equal to{ S(i1, i2)→ S(j1, j2) : i1 < j1 ∨ (i1 = j1 ∧ i2 < j2) }a1, . . . , an 4 b1, . . . , bn is equal toa1, . . . , an ≺ b1, . . . , bn ∨ a1, . . . , an = b1, . . . , bn

a1, . . . , an � b1, . . . , bn is equal to b1, . . . , bn ≺ a1, . . . , an

a1, . . . , an < b1, . . . , bn is equal to b1, . . . , bn 4 a1, . . . , an

a mod n is equal to a− n ba/nc

Page 41: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 14 / 121

Syntactic Sugar (2)

−e is equal to 0− en · e is equal to e + e + · · ·+ e︸ ︷︷ ︸

n times

(with n a non-negative integer constant)

a1, . . . , an ≺ b1, . . . , bn is equal to∨n

i=1

((∧i−1j=1 aj = bj

)∧ ai < bi

)Example: {S(i1, i2)→ S(j1, j2) : i1, i2 ≺ j1, j2 } is equal to{ S(i1, i2)→ S(j1, j2) : i1 < j1 ∨ (i1 = j1 ∧ i2 < j2) }

a1, . . . , an 4 b1, . . . , bn is equal toa1, . . . , an ≺ b1, . . . , bn ∨ a1, . . . , an = b1, . . . , bn

a1, . . . , an � b1, . . . , bn is equal to b1, . . . , bn ≺ a1, . . . , an

a1, . . . , an < b1, . . . , bn is equal to b1, . . . , bn 4 a1, . . . , an

a mod n is equal to a− n ba/nc

Page 42: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 14 / 121

Syntactic Sugar (2)

−e is equal to 0− en · e is equal to e + e + · · ·+ e︸ ︷︷ ︸

n times

(with n a non-negative integer constant)

a1, . . . , an ≺ b1, . . . , bn is equal to∨n

i=1

((∧i−1j=1 aj = bj

)∧ ai < bi

)Example: {S(i1, i2)→ S(j1, j2) : i1, i2 ≺ j1, j2 } is equal to{ S(i1, i2)→ S(j1, j2) : i1 < j1 ∨ (i1 = j1 ∧ i2 < j2) }a1, . . . , an 4 b1, . . . , bn is equal toa1, . . . , an ≺ b1, . . . , bn ∨ a1, . . . , an = b1, . . . , bn

a1, . . . , an � b1, . . . , bn is equal to b1, . . . , bn ≺ a1, . . . , an

a1, . . . , an < b1, . . . , bn is equal to b1, . . . , bn 4 a1, . . . , an

a mod n is equal to a− n ba/nc

Page 43: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 14 / 121

Syntactic Sugar (2)

−e is equal to 0− en · e is equal to e + e + · · ·+ e︸ ︷︷ ︸

n times

(with n a non-negative integer constant)

a1, . . . , an ≺ b1, . . . , bn is equal to∨n

i=1

((∧i−1j=1 aj = bj

)∧ ai < bi

)Example: {S(i1, i2)→ S(j1, j2) : i1, i2 ≺ j1, j2 } is equal to{ S(i1, i2)→ S(j1, j2) : i1 < j1 ∨ (i1 = j1 ∧ i2 < j2) }a1, . . . , an 4 b1, . . . , bn is equal toa1, . . . , an ≺ b1, . . . , bn ∨ a1, . . . , an = b1, . . . , bn

a1, . . . , an � b1, . . . , bn is equal to b1, . . . , bn ≺ a1, . . . , an

a1, . . . , an < b1, . . . , bn is equal to b1, . . . , bn 4 a1, . . . , an

a mod n is equal to a− n ba/nc

Page 44: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 14 / 121

Syntactic Sugar (2)

−e is equal to 0− en · e is equal to e + e + · · ·+ e︸ ︷︷ ︸

n times

(with n a non-negative integer constant)

a1, . . . , an ≺ b1, . . . , bn is equal to∨n

i=1

((∧i−1j=1 aj = bj

)∧ ai < bi

)Example: {S(i1, i2)→ S(j1, j2) : i1, i2 ≺ j1, j2 } is equal to{ S(i1, i2)→ S(j1, j2) : i1 < j1 ∨ (i1 = j1 ∧ i2 < j2) }a1, . . . , an 4 b1, . . . , bn is equal toa1, . . . , an ≺ b1, . . . , bn ∨ a1, . . . , an = b1, . . . , bn

a1, . . . , an � b1, . . . , bn is equal to b1, . . . , bn ≺ a1, . . . , an

a1, . . . , an < b1, . . . , bn is equal to b1, . . . , bn 4 a1, . . . , an

a mod n is equal to a− n ba/nc

Page 45: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 14 / 121

Syntactic Sugar (2)

−e is equal to 0− en · e is equal to e + e + · · ·+ e︸ ︷︷ ︸

n times

(with n a non-negative integer constant)

a1, . . . , an ≺ b1, . . . , bn is equal to∨n

i=1

((∧i−1j=1 aj = bj

)∧ ai < bi

)Example: {S(i1, i2)→ S(j1, j2) : i1, i2 ≺ j1, j2 } is equal to{ S(i1, i2)→ S(j1, j2) : i1 < j1 ∨ (i1 = j1 ∧ i2 < j2) }a1, . . . , an 4 b1, . . . , bn is equal toa1, . . . , an ≺ b1, . . . , bn ∨ a1, . . . , an = b1, . . . , bn

a1, . . . , an � b1, . . . , bn is equal to b1, . . . , bn ≺ a1, . . . , an

a1, . . . , an < b1, . . . , bn is equal to b1, . . . , bn 4 a1, . . . , an

a mod n is equal to a− n ba/nc

Page 46: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 15 / 121

Spaces

Recall general form

Sets{S1(i) : f1(i); S2(i) : f2(i); . . . },

Binary relations

{ S1(i)→ T1(j) : f1(i, j); S2(i)→ T2(j) : f2(i, j); . . . }

The identifier (e.g., S1, S2, T1, T2), together withthe dimension, i.e., number of elements in subsequent tuple (e.g., i, j),will be called a space

When we say S2(i) = T1(j), we mean

the identifiers S2 and T1 are the same

the dimensions of i and j are the same

Examples: S() 6= S(i), S(a) = S(b), S() 6= T()

Page 47: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 16 / 121

Space DecompositionGeneral form can be rewritten

{ S1(i)→ T1(j) : f1(i, j); S2(i)→ T2(j) : f2(i, j); . . . }

=⋃k

{Sk(i)→ Tk(j) : fk(i, j) }

some operations distribute with unionother operations are defined on a single (pair of) space(s)⇒ “space local” operations⇒ replace { S(i)→ T (j) : f1(i, j); S(i)→ T (j) : f2(i, j) }

by { S(i)→ T (j) : f1(i, j) ∨ f2(i, j) }

In both cases, we define

unary operator ⊕⊕⋃i

Ri :=⋃i

⊕Ri

binary operator ⊕(⋃i

Ri

)⊕(⋃

j

Sj

):=⋃i

⋃j

(Ri ⊕ Sj)

Page 48: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 17 / 121

Basic Operations

A = {S(i1)→ T (j1) : f (i1, j1) } B = {U(i2)→ V (j2) : g(i2, j2) }

Union

A ∪ B = { S(i)→ T (j) : f (i, j); U(i)→ V (j) : g(i, j) }Intersection

A ∩ B ={{ S(i)→ T (j) : f (i, j) ∧ g(i, j) } if S(i1) = U(i2) and T (j1) = V (j2)

∅ otherwise

Difference

A \ B ={{ S(i)→ T (j) : f (i, j) ∧ ¬g(i, j) } if S(i1) = U(i2) and T (j1) = V (j2)

{ S(i)→ T (j) : f (i, j) } otherwise

Page 49: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 18 / 121

Emptiness Check and Comparisons

Emptiness checkDoes a set or binary relation contain any elements?(for any value of the symbolic constants)

Is

{ (a, b, c , d) : 3d ≥ −21 + 19a− 11b − 6c ∧ 3d ≤ 21 + 17a− b − 6c ∧2b ≤ −15 + a ∧ 3d ≤ 2 + a + b ∧ 3d ≥ a + b }

empty?

⇒ No, contains (13,−1, 38, 4) (and infinitely many other elements)

ComparisonsI A ⊆ B is defined as A \ B = ∅I A ⊇ B is defined as B ⊆ AI A = B is defined as A ⊆ B ∧ A ⊇ BI A ⊂ B is defined as A ⊆ B ∧ ¬(A = B)I A ⊃ B is defined as B ⊂ A

Page 50: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 18 / 121

Emptiness Check and Comparisons

Emptiness checkDoes a set or binary relation contain any elements?(for any value of the symbolic constants)Is

{ (a, b, c , d) : 3d ≥ −21 + 19a− 11b − 6c ∧ 3d ≤ 21 + 17a− b − 6c ∧2b ≤ −15 + a ∧ 3d ≤ 2 + a + b ∧ 3d ≥ a + b }

empty?

⇒ No, contains (13,−1, 38, 4) (and infinitely many other elements)

ComparisonsI A ⊆ B is defined as A \ B = ∅I A ⊇ B is defined as B ⊆ AI A = B is defined as A ⊆ B ∧ A ⊇ BI A ⊂ B is defined as A ⊆ B ∧ ¬(A = B)I A ⊃ B is defined as B ⊂ A

Page 51: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 18 / 121

Emptiness Check and Comparisons

Emptiness checkDoes a set or binary relation contain any elements?(for any value of the symbolic constants)Is

{ (a, b, c , d) : 3d ≥ −21 + 19a− 11b − 6c ∧ 3d ≤ 21 + 17a− b − 6c ∧2b ≤ −15 + a ∧ 3d ≤ 2 + a + b ∧ 3d ≥ a + b }

empty?

⇒ No, contains (13,−1, 38, 4) (and infinitely many other elements)

ComparisonsI A ⊆ B is defined as A \ B = ∅I A ⊇ B is defined as B ⊆ AI A = B is defined as A ⊆ B ∧ A ⊇ BI A ⊂ B is defined as A ⊆ B ∧ ¬(A = B)I A ⊃ B is defined as B ⊂ A

Page 52: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 18 / 121

Emptiness Check and Comparisons

Emptiness checkDoes a set or binary relation contain any elements?(for any value of the symbolic constants)Is

{ (a, b, c , d) : 3d ≥ −21 + 19a− 11b − 6c ∧ 3d ≤ 21 + 17a− b − 6c ∧2b ≤ −15 + a ∧ 3d ≤ 2 + a + b ∧ 3d ≥ a + b }

empty?

⇒ No, contains (13,−1, 38, 4) (and infinitely many other elements)

ComparisonsI A ⊆ B is defined as A \ B = ∅I A ⊇ B is defined as B ⊆ AI A = B is defined as A ⊆ B ∧ A ⊇ BI A ⊂ B is defined as A ⊆ B ∧ ¬(A = B)I A ⊃ B is defined as B ⊂ A

Page 53: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 19 / 121

CardinalityCardinality of a set

⇒ number of elements in the set⇒ may depend on symbolic constants

S = {S(i) : f (i) }

cardS = { n : n = #i : f (i) } card

(⋃i

Si

):=∑i

cardSi

card { A(i) : 0 ≤ i ≤ n; B() } = n + 2

Cardinality of a binary relation

⇒ for each domain element, number of corresponding images

R = {S(i)→ T (j) : f (i, j) }

cardR = { S(i)→ n : n = #j : f (i, j) } card

(⋃i

Ri

):=∑i

cardRi

R = { A(i)→ C(i) : 0 ≤ i ≤ n; B()→ C(i) : 0 ≤ i ≤ n }cardR = { A(i)→ 1 : 0 ≤ i ≤ n; B()→ n + 1 }

⇒ not a Presburger formula

Page 54: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 19 / 121

CardinalityCardinality of a set

⇒ number of elements in the set⇒ may depend on symbolic constants

S = {S(i) : f (i) }

cardS = { n : n = #i : f (i) } card

(⋃i

Si

):=∑i

cardSi

card { A(i) : 0 ≤ i ≤ n; B() } = n + 2

Cardinality of a binary relation

⇒ for each domain element, number of corresponding images

R = { S(i)→ T (j) : f (i, j) }

cardR = { S(i)→ n : n = #j : f (i, j) } card

(⋃i

Ri

):=∑i

cardRi

R = { A(i)→ C(i) : 0 ≤ i ≤ n; B()→ C(i) : 0 ≤ i ≤ n }cardR = { A(i)→ 1 : 0 ≤ i ≤ n; B()→ n + 1 }

⇒ not a Presburger formula

Page 55: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 19 / 121

CardinalityCardinality of a set

⇒ number of elements in the set⇒ may depend on symbolic constants

S = {S(i) : f (i) }

cardS = { n : n = #i : f (i) } card

(⋃i

Si

):=∑i

cardSi

card { A(i) : 0 ≤ i ≤ n; B() } = n + 2

Cardinality of a binary relation

⇒ for each domain element, number of corresponding images

R = { S(i)→ T (j) : f (i, j) }

cardR = { S(i)→ n : n = #j : f (i, j) } card

(⋃i

Ri

):=∑i

cardRi

R = { A(i)→ C(i) : 0 ≤ i ≤ n; B()→ C(i) : 0 ≤ i ≤ n }cardR = { A(i)→ 1 : 0 ≤ i ≤ n; B()→ n + 1 }

⇒ not a Presburger formula

Page 56: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 20 / 121

isl and Related Libraries and Tools

LLVM imath GMP

clang isl NTL PolyLib

Polly pet barvinok

PPCG isa iscc

Licenses:BSD/MITLGPLGPL

isl: manipulates parametric affine sets and relationsbarvinok: counts elements in parametric affine sets and relationspet: extracts polyhedral model from clang ASTPPCG: Polyhedral Parallel Code Generatoriscc: interactive calculatorisa: prototype tool set including derivation of process networks andequivalence checker

Page 57: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 21 / 121

isl/iscc syntax

Relation description

() (tuple) []

+, − +, -=, ≤, <, ≥, > =, <=, <, >=, >true true

false false

∧ and

∨ or

¬ not

∃v : exists v :

∀v : not exists v : not

4, ≺, <, � (not available yet;write out explicitly)

Operations on relations

∪ +

∩ *

\ -

= =

⊂,⊆,⊃,⊇ <, <=, >, >=card card

Note: symbolic constants need to be explicitly declared

Page 58: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 22 / 121

Parametric Example: Matrix Multiplication

for (int i = 0; i < M; i++)

for (int j = 0; j < N; j++) {

S1: C[i][j] = 0;

for (int k = 0; k < K; k++)

S2: C[i][j] = C[i][j] + A[i][k] * B[k][j];

}

Number of statement instances

card { S1(i , j) : 0 ≤ i < M ∧ 0 ≤ j < N;

S2(i , j , k) : 0 ≤ i < M ∧ 0 ≤ j < N ∧ 0 ≤ k < K }

Number of array elements accessed by each instance

card { S1(i , j)→ C(i , j); S2(i , j , k)→ C(i , j);

S2(i , j , k)→ C(i , j); S2(i , j , k)→ A(i , k); S2(i , j , k)→ B(k , j) }

Page 59: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 22 / 121

Parametric Example: Matrix Multiplication

for (int i = 0; i < M; i++)

for (int j = 0; j < N; j++) {

S1: C[i][j] = 0;

for (int k = 0; k < K; k++)

S2: C[i][j] = C[i][j] + A[i][k] * B[k][j];

}

Number of statement instances

card { S1(i , j) : 0 ≤ i < M ∧ 0 ≤ j < N;

S2(i , j , k) : 0 ≤ i < M ∧ 0 ≤ j < N ∧ 0 ≤ k < K }

Number of array elements accessed by each instance

card { S1(i , j)→ C(i , j); S2(i , j , k)→ C(i , j);

S2(i , j , k)→ C(i , j); S2(i , j , k)→ A(i , k); S2(i , j , k)→ B(k , j) }

Page 60: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 23 / 121

Exerciseint f1(int m, int n, int A[const static m][n])

{

int t = 0;

for (int i = 0; i < m; ++i)

t += A[i][i];

return t;

}

void f2(int m, int n, int A[/*.*/m][n][n], int B[/*.*/m][n])

{

for (int i = 0; i < m; ++i) {

S: B[i][0] = 0;

for (int j = 0; j < n; ++j) {

if (j == i)

continue;

T: B[i][j] = f1(j, n, A[i]);

}

}

}

How many statement instances are executed by f2?

m>n ? m*n-n+m : m*n

How many array elements accessed by each instance?

S[i]->1;T[i,j]->1+j

Page 61: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 23 / 121

Exerciseint f1(int m, int n, int A[const static m][n])

{

int t = 0;

for (int i = 0; i < m; ++i)

t += A[i][i];

return t;

}

void f2(int m, int n, int A[/*.*/m][n][n], int B[/*.*/m][n])

{

for (int i = 0; i < m; ++i) {

S: B[i][0] = 0;

for (int j = 0; j < n; ++j) {

if (j == i)

continue;

T: B[i][j] = f1(j, n, A[i]);

}

}

}

How many statement instances are executed by f2?

m>n ? m*n-n+m : m*n

How many array elements accessed by each instance?

S[i]->1;T[i,j]->1+j

Page 62: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 23 / 121

Exerciseint f1(int m, int n, int A[const static m][n])

{

int t = 0;

for (int i = 0; i < m; ++i)

t += A[i][i];

return t;

}

void f2(int m, int n, int A[/*.*/m][n][n], int B[/*.*/m][n])

{

for (int i = 0; i < m; ++i) {

S: B[i][0] = 0;

for (int j = 0; j < n; ++j) {

if (j == i)

continue;

T: B[i][j] = f1(j, n, A[i]);

}

}

}

How many statement instances are executed by f2? m>n ? m*n-n+m : m*n

How many array elements accessed by each instance?

S[i]->1;T[i,j]->1+j

Page 63: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Model Representation January 20, 2015 23 / 121

Exerciseint f1(int m, int n, int A[const static m][n])

{

int t = 0;

for (int i = 0; i < m; ++i)

t += A[i][i];

return t;

}

void f2(int m, int n, int A[/*.*/m][n][n], int B[/*.*/m][n])

{

for (int i = 0; i < m; ++i) {

S: B[i][0] = 0;

for (int j = 0; j < n; ++j) {

if (j == i)

continue;

T: B[i][j] = f1(j, n, A[i]);

}

}

}

How many statement instances are executed by f2? m>n ? m*n-n+m : m*n

How many array elements accessed by each instance? S[i]->1;T[i,j]->1+j

Page 64: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Transformation January 20, 2015 24 / 121

Outline

1 Polyhedral ModelIntroductionRepresentation

2 Polyhedral TransformationSchedulesAST Generation

3 DependencesSchedule ValidityDependencesStructures

4 DataflowParallelismDataflowApproximate DataflowRun-time dependent DataflowReductions

5 Aliasing6 Counting

CardinalityBoundsWeighted CountingDynamic Memory Requirement

7 Transitive Closures

Page 65: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Transformation Schedules January 20, 2015 25 / 121

Polyhedral Transformation

Two approaches1 encode execution order in statement instance indices

⇒ transformation performed by manipulating instance set

2 keep track of execution order separately: schedule

⇒ transformation performed by manipulating initial/current schedule, or⇒ transformation performed by constructing new schedule from scratch

Schedule O keeps track of relative execution order of statement instances

⇒ for each pair of statement instances S(i) and T (j),schedule determines

I S(i) executed before T (j) O(S(i),T (j)) < 0I S(i) executed after T (j) O(S(i),T (j)) > 0I S(i) and T (j) may be executed simultaneously O(S(i),T (j)) = 0

Page 66: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Transformation Schedules January 20, 2015 25 / 121

Polyhedral Transformation

Two approaches1 encode execution order in statement instance indices

⇒ transformation performed by manipulating instance set

2 keep track of execution order separately: schedule

⇒ transformation performed by manipulating initial/current schedule, or⇒ transformation performed by constructing new schedule from scratch

Schedule O keeps track of relative execution order of statement instances

⇒ for each pair of statement instances S(i) and T (j),schedule determines

I S(i) executed before T (j) O(S(i),T (j)) < 0I S(i) executed after T (j) O(S(i),T (j)) > 0I S(i) and T (j) may be executed simultaneously O(S(i),T (j)) = 0

Page 67: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Transformation Schedules January 20, 2015 26 / 121

Schedule Representations

Types of schedule representations

Combined representationsI schedule treeI named Presburger relation

Scattered representationsI Kelly’s abstractionI “2d + 1”

Schedule Trees

Main node typesI sequence: children are executed in orderI band: instance are executed according to associated

piecewise quasi-affine partial schedule P

Deriving schedule tree from ASTI for loop ⇒ single-dimensional bandI compound statement ⇒ sequence

[11, 13, 20, 24]

Page 68: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Transformation Schedules January 20, 2015 26 / 121

Schedule Representations

Types of schedule representations

Combined representationsI schedule treeI named Presburger relation

Scattered representationsI Kelly’s abstractionI “2d + 1”

Schedule Trees

Main node typesI sequence: children are executed in orderI band: instance are executed according to associated

piecewise quasi-affine partial schedule P

Deriving schedule tree from ASTI for loop ⇒ single-dimensional bandI compound statement ⇒ sequence

[11, 13, 20, 24]

Page 69: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Transformation Schedules January 20, 2015 27 / 121

Parametric Example: Matrix Multiplicationfor (int i = 0; i < M; i++)

for (int j = 0; j < N; j++) {

S1: C[i][j] = 0;

for (int k = 0; k < K; k++)

S2: C[i][j] = C[i][j] + A[i][k] * B[k][j];

}

S1(i , j)→ (i); S2(i , j , k)→ (i)

S1(i , j)→ (j); S2(i , j , k)→ (j)

sequence

S1(i , j)

S1(i , j)→ (0)

S2(i , j , k)

S2(i , j , k)→ (k)

Page 70: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Transformation Schedules January 20, 2015 27 / 121

Parametric Example: Matrix Multiplicationfor (int i = 0; i < M; i++)

for (int j = 0; j < N; j++) {

S1: C[i][j] = 0;

for (int k = 0; k < K; k++)

S2: C[i][j] = C[i][j] + A[i][k] * B[k][j];

}

S1(i , j)→ (i); S2(i , j , k)→ (i)

S1(i , j)→ (j); S2(i , j , k)→ (j)

sequence

S1(i , j)

S1(i , j)→ (0)

S2(i , j , k)

S2(i , j , k)→ (k)

Page 71: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Transformation Schedules January 20, 2015 27 / 121

Parametric Example: Matrix Multiplicationfor (int i = 0; i < M; i++)

for (int j = 0; j < N; j++) {

S1: C[i][j] = 0;

for (int k = 0; k < K; k++)

S2: C[i][j] = C[i][j] + A[i][k] * B[k][j];

}

S1(i , j)→ (i); S2(i , j , k)→ (i)

S1(i , j)→ (j); S2(i , j , k)→ (j)

sequence

S1(i , j)

S1(i , j)→ (0)

S2(i , j , k)

S2(i , j , k)→ (k)

Page 72: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Transformation Schedules January 20, 2015 27 / 121

Parametric Example: Matrix Multiplicationfor (int i = 0; i < M; i++)

for (int j = 0; j < N; j++) {

S1: C[i][j] = 0;

for (int k = 0; k < K; k++)

S2: C[i][j] = C[i][j] + A[i][k] * B[k][j];

}

S1(i , j)→ (i); S2(i , j , k)→ (i)

S1(i , j)→ (j); S2(i , j , k)→ (j)

sequence

S1(i , j)

S1(i , j)→ (0)

S2(i , j , k)

S2(i , j , k)→ (k)

Page 73: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Transformation Schedules January 20, 2015 27 / 121

Parametric Example: Matrix Multiplicationfor (int i = 0; i < M; i++)

for (int j = 0; j < N; j++) {

S1: C[i][j] = 0;

for (int k = 0; k < K; k++)

S2: C[i][j] = C[i][j] + A[i][k] * B[k][j];

}

S1(i , j)→ (i); S2(i , j , k)→ (i)

S1(i , j)→ (j); S2(i , j , k)→ (j)

sequence

S1(i , j)

S1(i , j)→ (0)

S2(i , j , k)

S2(i , j , k)→ (k)

Page 74: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Transformation Schedules January 20, 2015 28 / 121

Schedule Trees — Execution OrderWhat is the execution order of statement instances S(i) and T (j) determined byschedule tree O?

Start at root of schedule treewhile current node is not a leaf do

if current node is sequence thenif S(i) and T (j) appear in same child then

Move to common childelse if S(i) appears in earlier child then

return O(S(i),T (j)) < 0else

return O(S(i),T (j)) > 0else

if P(S(i)) = P(T (j)) thenMove to single child

else if P(S(i)) ≺ P(T (j)) thenreturn O(S(i),T (j)) < 0

elsereturn O(S(i),T (j)) > 0

return O(S(i),T (j)) = 0

Page 75: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Transformation Schedules January 20, 2015 29 / 121

Named Presburger Relation Schedules

Schedule tree with single (band) node

Flattening a schedule tree

two nested band nodes

⇒ replace by single band node with concatenated partial schedule

sequence with as children either leaves ortrees consisting of a single band node

⇒ treat leaves as zero-dimensional band nodes⇒ pad lower-dimensional bands (e.g., with zero)⇒ construct one-dimensional band assigning increasing value to children⇒ combine one-dimensional band with children

Page 76: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Transformation Schedules January 20, 2015 29 / 121

Named Presburger Relation Schedules

Schedule tree with single (band) node

Flattening a schedule tree

two nested band nodes

⇒ replace by single band node with concatenated partial schedule

sequence with as children either leaves ortrees consisting of a single band node

⇒ treat leaves as zero-dimensional band nodes⇒ pad lower-dimensional bands (e.g., with zero)⇒ construct one-dimensional band assigning increasing value to children⇒ combine one-dimensional band with children

Page 77: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Transformation Schedules January 20, 2015 30 / 121

Parametric Example: Matrix Multiplicationfor (int i = 0; i < M; i++)

for (int j = 0; j < N; j++) {

S1: C[i][j] = 0;

for (int k = 0; k < K; k++)

S2: C[i][j] = C[i][j] + A[i][k] * B[k][j];

}

S1(i , j)→ (i); S2(i , j , k)→ (i)

S1(i , j)→ (j); S2(i , j , k)→ (j)

sequence

S1(i , j)

S1(i , j)→ (0)

S2(i , j , k)

S2(i , j , k)→ (k)

Page 78: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Transformation Schedules January 20, 2015 30 / 121

Parametric Example: Matrix Multiplicationfor (int i = 0; i < M; i++)

for (int j = 0; j < N; j++) {

S1: C[i][j] = 0;

for (int k = 0; k < K; k++)

S2: C[i][j] = C[i][j] + A[i][k] * B[k][j];

}

S1(i , j)→ (i); S2(i , j , k)→ (i)

S1(i , j)→ (j); S2(i , j , k)→ (j)

sequence

S1(i , j)

S1(i , j)→ (0)

S2(i , j , k)

S2(i , j , k)→ (k)

Page 79: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Transformation Schedules January 20, 2015 30 / 121

Parametric Example: Matrix Multiplicationfor (int i = 0; i < M; i++)

for (int j = 0; j < N; j++) {

S1: C[i][j] = 0;

for (int k = 0; k < K; k++)

S2: C[i][j] = C[i][j] + A[i][k] * B[k][j];

}

S1(i , j)→ (i); S2(i , j , k)→ (i)

S1(i , j)→ (j); S2(i , j , k)→ (j)

S1(i , j)→ (0, 0); S2(i , j , k)→ (1, k)

S1(i , j)

S1(i , j)→ (0)

S2(i , j , k)

S2(i , j , k)→ (k)

Page 80: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Transformation Schedules January 20, 2015 30 / 121

Parametric Example: Matrix Multiplicationfor (int i = 0; i < M; i++)

for (int j = 0; j < N; j++) {

S1: C[i][j] = 0;

for (int k = 0; k < K; k++)

S2: C[i][j] = C[i][j] + A[i][k] * B[k][j];

}

S1(i , j)→ (i); S2(i , j , k)→ (i)

S1(i , j)→ (j , 0, 0); S2(i , j , k)→ (j , 1, k)

S1(i , j)→ (0, 0); S2(i , j , k)→ (1, k)

S1(i , j)

S1(i , j)→ (0)

S2(i , j , k)

S2(i , j , k)→ (k)

Page 81: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Transformation Schedules January 20, 2015 30 / 121

Parametric Example: Matrix Multiplicationfor (int i = 0; i < M; i++)

for (int j = 0; j < N; j++) {

S1: C[i][j] = 0;

for (int k = 0; k < K; k++)

S2: C[i][j] = C[i][j] + A[i][k] * B[k][j];

}

S1(i , j)→ (i , j , 0, 0); S2(i , j , k)→ (i , j , 1, k)

S1(i , j)→ (j , 0, 0); S2(i , j , k)→ (j , 1, k)

S1(i , j)→ (0, 0); S2(i , j , k)→ (1, k)

S1(i , j)

S1(i , j)→ (0)

S2(i , j , k)

S2(i , j , k)→ (k)

Page 82: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Transformation AST Generation January 20, 2015 31 / 121

Domain and Range of a Relation

R = { S(i)→ T (j) : f (i, j) }

Domain (iscc: dom)

domR = { S(i) : ∃j : f (i, j) }

W = { S(i , j)→ A(i + j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2 }domW = { S(i , j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2 }⇒ statement instances writing any array element

Range (iscc: ran)

ranR = {T (j) : ∃i : f (i, j) }

W = { S(i , j)→ A(i + j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2 }ranW = { A(a) : 0 ≤ a ≤ 2 }⇒ written array elements

Page 83: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Transformation AST Generation January 20, 2015 31 / 121

Domain and Range of a Relation

R = { S(i)→ T (j) : f (i, j) }

Domain (iscc: dom)

domR = { S(i) : ∃j : f (i, j) }

W = { S(i , j)→ A(i + j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2 }domW = { S(i , j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2 }⇒ statement instances writing any array element

Range (iscc: ran)

ranR = {T (j) : ∃i : f (i, j) }

W = { S(i , j)→ A(i + j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2 }ranW = { A(a) : 0 ≤ a ≤ 2 }⇒ written array elements

Page 84: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Transformation AST Generation January 20, 2015 32 / 121

Domain/Range Restriction

A = {S1(i1) : f (i1) } B = {T1(j1) : g(j1) }C = {S2(i2)→ T2(j2) : h(i2, j2) }

Product relation (iscc: ->)

A→ B = { S1(i)→ T1(j) : f (i) ∧ g(j) }

Domain restriction (iscc: *)

R ∩dom S = R ∩ (S → (ranR))

C ∩dom A =

{{S2(i)→ T2(j) : f (i) ∧ h(i, j) } if S1(i1) = S2(i2)

∅ otherwise

Range restriction

R ∩ran S = R ∩ ((domR)→ S)

C ∩ran A =

{{ S2(i)→ T2(j) : f (j) ∧ h(i, j) } if S1(i1) = T2(j2)

∅ otherwise

Page 85: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Transformation AST Generation January 20, 2015 32 / 121

Domain/Range Restriction

A = {S1(i1) : f (i1) } B = {T1(j1) : g(j1) }C = {S2(i2)→ T2(j2) : h(i2, j2) }

Product relation (iscc: ->)

A→ B = { S1(i)→ T1(j) : f (i) ∧ g(j) }Domain restriction (iscc: *)

R ∩dom S = R ∩ (S → (ranR))

C ∩dom A =

{{ S2(i)→ T2(j) : f (i) ∧ h(i, j) } if S1(i1) = S2(i2)

∅ otherwise

Range restriction

R ∩ran S = R ∩ ((domR)→ S)

C ∩ran A =

{{ S2(i)→ T2(j) : f (j) ∧ h(i, j) } if S1(i1) = T2(j2)

∅ otherwise

Page 86: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Transformation AST Generation January 20, 2015 32 / 121

Domain/Range Restriction

A = {S1(i1) : f (i1) } B = {T1(j1) : g(j1) }C = {S2(i2)→ T2(j2) : h(i2, j2) }

Product relation (iscc: ->)

A→ B = { S1(i)→ T1(j) : f (i) ∧ g(j) }Domain restriction (iscc: *)

R ∩dom S = R ∩ (S → (ranR))

C ∩dom A =

{{ S2(i)→ T2(j) : f (i) ∧ h(i, j) } if S1(i1) = S2(i2)

∅ otherwise

Range restriction

R ∩ran S = R ∩ ((domR)→ S)

C ∩ran A =

{{ S2(i)→ T2(j) : f (j) ∧ h(i, j) } if S1(i1) = T2(j2)

∅ otherwise

Page 87: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Transformation AST Generation January 20, 2015 33 / 121

AST Generation [5, 6]

Input: instance setschedule

Output: AST that visits each domain element according to theorder specified by the schedule

Note: in case of flat schedule, schedule order is lexicographic order ofoutput space

⇒ single output space

iscc codegen operation takes as input flat schedule with instance setencoded in domain

⇒ apply * to “pure” schedule and instance set first

Page 88: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Transformation AST Generation January 20, 2015 34 / 121

Parametric Example: Matrix Multiplicationfor (int i = 0; i < M; i++)

for (int j = 0; j < N; j++) {

S1: C[i][j] = 0;

for (int k = 0; k < K; k++)

S2: C[i][j] = C[i][j] + A[i][k] * B[k][j];

}

I := [M,N,K] -> { S1[i,j] : 0 <= i < M and 0 <= j < N;

S2[i,j,k] : 0 <= i < M and 0 <= j < N and 0 <= k < K };

O := { S1[i,j] -> [i,j,0,0]; S2[i,j,k] -> [i,j,1,k] };

codegen (O * I);

for (int c0 = 0; c0 < M; c0 += 1)

for (int c1 = 0; c1 < N; c1 += 1) {

S1(c0, c1);

for (int c3 = 0; c3 < K; c3 += 1)

S2(c0, c1, c3);

}

Page 89: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Transformation AST Generation January 20, 2015 34 / 121

Parametric Example: Matrix Multiplicationfor (int i = 0; i < M; i++)

for (int j = 0; j < N; j++) {

S1: C[i][j] = 0;

for (int k = 0; k < K; k++)

S2: C[i][j] = C[i][j] + A[i][k] * B[k][j];

}

I := [M,N,K] -> { S1[i,j] : 0 <= i < M and 0 <= j < N;

S2[i,j,k] : 0 <= i < M and 0 <= j < N and 0 <= k < K };

O := { S1[i,j] -> [i,j,0,0]; S2[i,j,k] -> [i,j,1,k] };

codegen (O * I);

for (int c0 = 0; c0 < M; c0 += 1)

for (int c1 = 0; c1 < N; c1 += 1) {

S1(c0, c1);

for (int c3 = 0; c3 < K; c3 += 1)

S2(c0, c1, c3);

}

Page 90: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Transformation AST Generation January 20, 2015 34 / 121

Parametric Example: Matrix Multiplicationfor (int i = 0; i < M; i++)

for (int j = 0; j < N; j++) {

S1: C[i][j] = 0;

for (int k = 0; k < K; k++)

S2: C[i][j] = C[i][j] + A[i][k] * B[k][j];

}

I := [M,N,K] -> { S1[i,j] : 0 <= i < M and 0 <= j < N;

S2[i,j,k] : 0 <= i < M and 0 <= j < N and 0 <= k < K };

O := { S1[i,j] -> [i,j,0,0]; S2[i,j,k] -> [i,j,1,k] };

codegen (O * I);

for (int c0 = 0; c0 < M; c0 += 1)

for (int c1 = 0; c1 < N; c1 += 1) {

S1(c0, c1);

for (int c3 = 0; c3 < K; c3 += 1)

S2(c0, c1, c3);

}

Page 91: Polyhedral Compilation without Polyhedra - Inria

Polyhedral Transformation AST Generation January 20, 2015 35 / 121

Exerciseint f1(int m, int n, int A[const static m][n])

{

int t = 0;

for (int i = 0; i < m; ++i)

t += A[i][i];

return t;

}

void f2(int m, int n, int A[/*.*/m][n][n], int B[/*.*/m][n])

{

for (int i = 0; i < m; ++i) {

S: B[i][0] = 0;

for (int j = 0; j < n; ++j) {

if (j == i)

continue;

T: B[i][j] = f1(j, n, A[i]);

}

}

}

Write down schedule and generate AST codegen (O * I)

Page 92: Polyhedral Compilation without Polyhedra - Inria

Dependences January 20, 2015 36 / 121

Outline

1 Polyhedral ModelIntroductionRepresentation

2 Polyhedral TransformationSchedulesAST Generation

3 DependencesSchedule ValidityDependencesStructures

4 DataflowParallelismDataflowApproximate DataflowRun-time dependent DataflowReductions

5 Aliasing6 Counting

CardinalityBoundsWeighted CountingDynamic Memory Requirement

7 Transitive Closures

Page 93: Polyhedral Compilation without Polyhedra - Inria

Dependences Schedule Validity January 20, 2015 37 / 121

Schedule Validity [1]

Not all schedules correspond to a valid execution order

R(a) W(a) R(a) W(b) W(a) W(a)

Internal restrictions

A read of a value should not be scheduled after the write of the valueNo other write to same memory location may be scheduled in between

External restrictions (on non-temporaries)

No write may be scheduled before initial read from a memory locationNo write may be scheduled after last write to a memory location

Sufficient conditions:

Every read of a memory location is scheduled after every previouswrite to the same memory locationEvery write to a memory location is scheduled after every previousread or write to the same memory location

Page 94: Polyhedral Compilation without Polyhedra - Inria

Dependences Schedule Validity January 20, 2015 37 / 121

Schedule Validity [1]

Not all schedules correspond to a valid execution order

R(a) W(a) R(a) W(b) W(a) W(a)

Internal restrictions

A read of a value should not be scheduled after the write of the valueNo other write to same memory location may be scheduled in between

External restrictions (on non-temporaries)

No write may be scheduled before initial read from a memory locationNo write may be scheduled after last write to a memory location

Sufficient conditions:

Every read of a memory location is scheduled after every previouswrite to the same memory locationEvery write to a memory location is scheduled after every previousread or write to the same memory location

Page 95: Polyhedral Compilation without Polyhedra - Inria

Dependences Schedule Validity January 20, 2015 37 / 121

Schedule Validity [1]

Not all schedules correspond to a valid execution order

R(a) W(a) R(a) W(b) W(a) W(a)

Internal restrictions

A read of a value should not be scheduled after the write of the valueNo other write to same memory location may be scheduled in between

External restrictions (on non-temporaries)

No write may be scheduled before initial read from a memory locationNo write may be scheduled after last write to a memory location

Sufficient conditions:

Every read of a memory location is scheduled after every previouswrite to the same memory locationEvery write to a memory location is scheduled after every previousread or write to the same memory location

Page 96: Polyhedral Compilation without Polyhedra - Inria

Dependences Schedule Validity January 20, 2015 37 / 121

Schedule Validity [1]

Not all schedules correspond to a valid execution order

R(a) W(a) R(a) W(b) W(a) W(a)

Internal restrictions

A read of a value should not be scheduled after the write of the valueNo other write to same memory location may be scheduled in between

External restrictions (on non-temporaries)

No write may be scheduled before initial read from a memory locationNo write may be scheduled after last write to a memory location

Sufficient conditions:

Every read of a memory location is scheduled after every previouswrite to the same memory locationEvery write to a memory location is scheduled after every previousread or write to the same memory location

Page 97: Polyhedral Compilation without Polyhedra - Inria

Dependences Schedule Validity January 20, 2015 37 / 121

Schedule Validity [1]

Not all schedules correspond to a valid execution order

R(a) W(a) R(a) W(b) W(a) W(a)

Internal restrictions

A read of a value should not be scheduled after the write of the valueNo other write to same memory location may be scheduled in between

External restrictions (on non-temporaries)

No write may be scheduled before initial read from a memory locationNo write may be scheduled after last write to a memory location

Sufficient conditions:

Every read of a memory location is scheduled after every previouswrite to the same memory locationEvery write to a memory location is scheduled after every previousread or write to the same memory location

Page 98: Polyhedral Compilation without Polyhedra - Inria

Dependences Schedule Validity January 20, 2015 37 / 121

Schedule Validity [1]

Not all schedules correspond to a valid execution order

R(a) W(a) R(a) W(b) W(a) W(a)

Internal restrictions

A read of a value should not be scheduled after the write of the valueNo other write to same memory location may be scheduled in between

External restrictions (on non-temporaries)

No write may be scheduled before initial read from a memory locationNo write may be scheduled after last write to a memory location

Sufficient conditions:

Every read of a memory location is scheduled after every previouswrite to the same memory locationEvery write to a memory location is scheduled after every previousread or write to the same memory location

Page 99: Polyhedral Compilation without Polyhedra - Inria

Dependences Schedule Validity January 20, 2015 37 / 121

Schedule Validity [1]

Not all schedules correspond to a valid execution order

R(a) W(a) R(a) W(b) W(a) W(a)

Internal restrictions

A read of a value should not be scheduled after the write of the valueNo other write to same memory location may be scheduled in between

External restrictions (on non-temporaries)

No write may be scheduled before initial read from a memory locationNo write may be scheduled after last write to a memory location

Sufficient conditions:

Every read of a memory location is scheduled after every previouswrite to the same memory locationEvery write to a memory location is scheduled after every previousread or write to the same memory location

Page 100: Polyhedral Compilation without Polyhedra - Inria

Dependences Schedule Validity January 20, 2015 37 / 121

Schedule Validity [1]

Not all schedules correspond to a valid execution order

R(a) W(a) R(a) W(b) W(a) W(a)

Internal restrictions

A read of a value should not be scheduled after the write of the valueNo other write to same memory location may be scheduled in between

External restrictions (on non-temporaries)

No write may be scheduled before initial read from a memory locationNo write may be scheduled after last write to a memory location

Sufficient conditions:

Every read of a memory location is scheduled after every previouswrite to the same memory locationEvery write to a memory location is scheduled after every previousread or write to the same memory location

Page 101: Polyhedral Compilation without Polyhedra - Inria

Dependences Dependences January 20, 2015 38 / 121

Dependences

Sufficient conditions for schedule validity:

Every read of a memory location is scheduled after every previouswrite to the same memory location

Every write to a memory location is scheduled after every previousread or write to the same memory location

Dependence relation D: pairs of statement instances

accessing the same memory location

of which at least one is a write

with the first executed before the second

Sufficient condition:

∀S(i)→ T (j) ∈ D : O(S(i),T (j)) < 0

Page 102: Polyhedral Compilation without Polyhedra - Inria

Dependences Dependences January 20, 2015 39 / 121

Inverse Relation and Composition

A = { S1(i1)→ T1(j1) : f (i1, j1) } B = { S2(i2)→ T2(j2) : g(i2, j2) }

Inverse (iscc: ^-1)

A−1 = {T1(j)→ S1(i) : f (i, j) }

W = { S(i , j)→ A(i + j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2 }W−1 = { A(a)→ S(i , j) : a = i + j ∧ 0 ≤ i < 2 ∧ 0 ≤ j < 2 }⇒ statement instances writing array element

Composition (iscc: after)

B ◦ A =

{{S1(i)→ T2(j) : ∃k : f (i, k) ∧ g(k, j) } if T1(j1) = S2(i2)

∅ otherwise

W−1 ◦W = { S(i , j)→ S(i ′, j ′) : 0 ≤ i , i ′, j , j ′ < 2 ∧ i + j = i ′ + j ′ }⇒ pairs of statement instances that write same array element

Page 103: Polyhedral Compilation without Polyhedra - Inria

Dependences Dependences January 20, 2015 39 / 121

Inverse Relation and Composition

A = { S1(i1)→ T1(j1) : f (i1, j1) } B = { S2(i2)→ T2(j2) : g(i2, j2) }

Inverse (iscc: ^-1)

A−1 = {T1(j)→ S1(i) : f (i, j) }

W = { S(i , j)→ A(i + j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2 }W−1 = { A(a)→ S(i , j) : a = i + j ∧ 0 ≤ i < 2 ∧ 0 ≤ j < 2 }⇒ statement instances writing array element

Composition (iscc: after)

B ◦ A =

{{S1(i)→ T2(j) : ∃k : f (i, k) ∧ g(k, j) } if T1(j1) = S2(i2)

∅ otherwise

W−1 ◦W = { S(i , j)→ S(i ′, j ′) : 0 ≤ i , i ′, j , j ′ < 2 ∧ i + j = i ′ + j ′ }⇒ pairs of statement instances that write same array element

Page 104: Polyhedral Compilation without Polyhedra - Inria

Dependences Dependences January 20, 2015 40 / 121

Lexicographic Order

Sets (iscc: <<)

A = { S(i) : f (i) }B = {T (j) : g(j) }

A ≺ B =

{{ S(i)→ S(j) : f (i) ∧ g(j) ∧ i ≺ j } if S(i) = T (j)

∅ otherwise

Relations (iscc: <<)

⇒ binary relation on domains reflecting lexicographic order of images

A = {S1(i1)→ T1(j1) : f (i1, j1) }B = {S2(i2)→ T2(j2) : g(i2, j2) }A ≺ B ={S1(i1)→ S2(i2) : ∃j1, j2 : f (i1, j1) ∧ g(i2, j2) ∧ j1 ≺ j2 }

if T1(j1) = T2(j2)

∅ otherwise

Page 105: Polyhedral Compilation without Polyhedra - Inria

Dependences Dependences January 20, 2015 40 / 121

Lexicographic Order

Sets (iscc: <<)

A = { S(i) : f (i) }B = {T (j) : g(j) }

A ≺ B =

{{ S(i)→ S(j) : f (i) ∧ g(j) ∧ i ≺ j } if S(i) = T (j)

∅ otherwise

Relations (iscc: <<)

⇒ binary relation on domains reflecting lexicographic order of images

A = { S1(i1)→ T1(j1) : f (i1, j1) }B = { S2(i2)→ T2(j2) : g(i2, j2) }A ≺ B ={ S1(i1)→ S2(i2) : ∃j1, j2 : f (i1, j1) ∧ g(i2, j2) ∧ j1 ≺ j2 }

if T1(j1) = T2(j2)

∅ otherwise

Page 106: Polyhedral Compilation without Polyhedra - Inria

Dependences Dependences January 20, 2015 41 / 121

Parametric Example: Matrix Multiplicationfor (int i = 0; i < M; i++)

for (int j = 0; j < N; j++) {

S1: C[i][j] = 0;

for (int k = 0; k < K; k++)

S2: C[i][j] = C[i][j] + A[i][k] * B[k][j];

}

O := { S1[i,j] -> [i,j,0,0]; S2[i,j,k] -> [i,j,1,k] };

O << O;

{ S2[i, j, k] -> S2[i’, j’, k’] : i’ >= 1 + i;

S2[i, j, k] -> S2[i, j’, k’] : j’ >= 1 + j;

S2[i, j, k] -> S2[i, j, k’] : k’ >= 1 + k;

S1[i, j] -> S2[i’, j’, k] : i’ >= 1 + i;

S1[i, j] -> S2[i, j’, k] : j’ >= 1 + j;

S1[i, j] -> S2[i, j, k]; S2[i, j, k] -> S1[i’, j’] : i’ >= 1 + i;

S2[i, j, k] -> S1[i, j’] : j’ >= 1 + j;

S1[i, j] -> S1[i’, j’] : i’ >= 1 + i;

S1[i, j] -> S1[i, j’] : j’ >= 1 + j }

Page 107: Polyhedral Compilation without Polyhedra - Inria

Dependences Dependences January 20, 2015 41 / 121

Parametric Example: Matrix Multiplicationfor (int i = 0; i < M; i++)

for (int j = 0; j < N; j++) {

S1: C[i][j] = 0;

for (int k = 0; k < K; k++)

S2: C[i][j] = C[i][j] + A[i][k] * B[k][j];

}

O := { S1[i,j] -> [i,j,0,0]; S2[i,j,k] -> [i,j,1,k] };

O << O;

{ S2[i, j, k] -> S2[i’, j’, k’] : i’ >= 1 + i;

S2[i, j, k] -> S2[i, j’, k’] : j’ >= 1 + j;

S2[i, j, k] -> S2[i, j, k’] : k’ >= 1 + k;

S1[i, j] -> S2[i’, j’, k] : i’ >= 1 + i;

S1[i, j] -> S2[i, j’, k] : j’ >= 1 + j;

S1[i, j] -> S2[i, j, k]; S2[i, j, k] -> S1[i’, j’] : i’ >= 1 + i;

S2[i, j, k] -> S1[i, j’] : j’ >= 1 + j;

S1[i, j] -> S1[i’, j’] : i’ >= 1 + i;

S1[i, j] -> S1[i, j’] : j’ >= 1 + j }

Page 108: Polyhedral Compilation without Polyhedra - Inria

Dependences Dependences January 20, 2015 41 / 121

Parametric Example: Matrix Multiplicationfor (int i = 0; i < M; i++)

for (int j = 0; j < N; j++) {

S1: C[i][j] = 0;

for (int k = 0; k < K; k++)

S2: C[i][j] = C[i][j] + A[i][k] * B[k][j];

}

O := { S1[i,j] -> [i,j,0,0]; S2[i,j,k] -> [i,j,1,k] };

O << O;

{ S2[i, j, k] -> S2[i’, j’, k’] : i’ >= 1 + i;

S2[i, j, k] -> S2[i, j’, k’] : j’ >= 1 + j;

S2[i, j, k] -> S2[i, j, k’] : k’ >= 1 + k;

S1[i, j] -> S2[i’, j’, k] : i’ >= 1 + i;

S1[i, j] -> S2[i, j’, k] : j’ >= 1 + j;

S1[i, j] -> S2[i, j, k]; S2[i, j, k] -> S1[i’, j’] : i’ >= 1 + i;

S2[i, j, k] -> S1[i, j’] : j’ >= 1 + j;

S1[i, j] -> S1[i’, j’] : i’ >= 1 + i;

S1[i, j] -> S1[i, j’] : j’ >= 1 + j }

Page 109: Polyhedral Compilation without Polyhedra - Inria

Dependences Dependences January 20, 2015 42 / 121

Dependence AnalysisRecall: sufficient condition for schedule validity

∀S(i)→ T (j) ∈ D : O(S(i),T (j)) < 0

Dependence relation D: pairs of statement instances

accessing the same memory location

of which at least one is a write

with the first executed before the second

Computation:

D =((

W−1 ◦ R)∪(W−1 ◦W

)∪(R−1 ◦W

))∩ (O ≺ O)

W : write access relation

R: read access relation

O: original schedule

Page 110: Polyhedral Compilation without Polyhedra - Inria

Dependences Dependences January 20, 2015 42 / 121

Dependence AnalysisRecall: sufficient condition for schedule validity

∀S(i)→ T (j) ∈ D : O(S(i),T (j)) < 0

Dependence relation D: pairs of statement instances

accessing the same memory location

of which at least one is a write

with the first executed before the second

Computation:

D =((

W−1 ◦ R)∪(W−1 ◦W

)∪(R−1 ◦W

))∩ (O ≺ O)

W : write access relation

R: read access relation

O: original schedule

Page 111: Polyhedral Compilation without Polyhedra - Inria

Dependences Dependences January 20, 2015 43 / 121

Parametric Example: Matrix Multiplicationfor (int i = 0; i < M; i++)

for (int j = 0; j < N; j++) {

S1: C[i][j] = 0;

for (int k = 0; k < K; k++)

S2: C[i][j] = C[i][j] + A[i][k] * B[k][j];

}

W := { S1[i,j] -> C[i,j]; S2[i,j,k] -> C[i,j] };

R := { S2[i,j,k] -> C[i,j]; S2[i,j,k] -> A[i,k];

S2[i,j,k] -> B[k,j] };

I := [M,N,K] -> { S1[i,j] : 0 <= i < M and 0 <= j < N;

S2[i,j,k] : 0 <= i < M and 0 <= j < N and 0 <= k < K };

O := { S1[i,j] -> [i,j,0,0]; S2[i,j,k] -> [i,j,1,k] };

((R . W^-1) + (W . W^-1) + (W . R^-1)) * (O << O);

{ S2[i, j, k] -> S2[i, j, k’] : k’ >= 1 + k;

S1[i, j] -> S2[i, j, k] }

Page 112: Polyhedral Compilation without Polyhedra - Inria

Dependences Dependences January 20, 2015 43 / 121

Parametric Example: Matrix Multiplicationfor (int i = 0; i < M; i++)

for (int j = 0; j < N; j++) {

S1: C[i][j] = 0;

for (int k = 0; k < K; k++)

S2: C[i][j] = C[i][j] + A[i][k] * B[k][j];

}

W := { S1[i,j] -> C[i,j]; S2[i,j,k] -> C[i,j] };

R := { S2[i,j,k] -> C[i,j]; S2[i,j,k] -> A[i,k];

S2[i,j,k] -> B[k,j] };

I := [M,N,K] -> { S1[i,j] : 0 <= i < M and 0 <= j < N;

S2[i,j,k] : 0 <= i < M and 0 <= j < N and 0 <= k < K };

O := { S1[i,j] -> [i,j,0,0]; S2[i,j,k] -> [i,j,1,k] };

((R . W^-1) + (W . W^-1) + (W . R^-1)) * (O << O);

{ S2[i, j, k] -> S2[i, j, k’] : k’ >= 1 + k;

S1[i, j] -> S2[i, j, k] }

Page 113: Polyhedral Compilation without Polyhedra - Inria

Dependences Dependences January 20, 2015 43 / 121

Parametric Example: Matrix Multiplicationfor (int i = 0; i < M; i++)

for (int j = 0; j < N; j++) {

S1: C[i][j] = 0;

for (int k = 0; k < K; k++)

S2: C[i][j] = C[i][j] + A[i][k] * B[k][j];

}

W := { S1[i,j] -> C[i,j]; S2[i,j,k] -> C[i,j] };

R := { S2[i,j,k] -> C[i,j]; S2[i,j,k] -> A[i,k];

S2[i,j,k] -> B[k,j] };

I := [M,N,K] -> { S1[i,j] : 0 <= i < M and 0 <= j < N;

S2[i,j,k] : 0 <= i < M and 0 <= j < N and 0 <= k < K };

O := { S1[i,j] -> [i,j,0,0]; S2[i,j,k] -> [i,j,1,k] };

((R . W^-1) + (W . W^-1) + (W . R^-1)) * (O << O);

{ S2[i, j, k] -> S2[i, j, k’] : k’ >= 1 + k;

S1[i, j] -> S2[i, j, k] }

Page 114: Polyhedral Compilation without Polyhedra - Inria

Dependences Dependences January 20, 2015 44 / 121

Exerciseint f1(int m, int n, int A[const static m][n])

{

int t = 0;

for (int i = 0; i < m; ++i)

t += A[i][i];

return t;

}

void f2(int m, int n, int A[/*.*/m][n][n], int B[/*.*/m][n])

{

for (int i = 0; i < m; ++i) {

S: B[i][0] = 0;

for (int j = 0; j < n; ++j) {

if (j == i)

continue;

T: B[i][j] = f1(j, n, A[i]);

}

}

}

Compute dependence relation

{}

Page 115: Polyhedral Compilation without Polyhedra - Inria

Dependences Dependences January 20, 2015 44 / 121

Exerciseint f1(int m, int n, int A[const static m][n])

{

int t = 0;

for (int i = 0; i < m; ++i)

t += A[i][i];

return t;

}

void f2(int m, int n, int A[/*.*/m][n][n], int B[/*.*/m][n])

{

for (int i = 0; i < m; ++i) {

S: B[i][0] = 0;

for (int j = 0; j < n; ++j) {

if (j == i)

continue;

T: B[i][j] = f1(j, n, A[i]);

}

}

}

Compute dependence relation {}

Page 116: Polyhedral Compilation without Polyhedra - Inria

Dependences Structures January 20, 2015 45 / 121

Accesses to Structure Fields and Nested RelationsNo special treatment is needed for representing accesses to structure fields

⇒ structure field encoded in name

and/or structure

of target space ofaccess relations

struct s {

int a;

int b[10];

};

void f(struct s s[const static 10][10])

{

for (int i = 0; i < 10; ++i)

S: s[i][i].b[9 - i] = 0;

}

{ S(i)→ s b(i , i , 9− i) }

In pet, structure encoded in nested relation

{ S(i)→ s b(s(i , i)→ b(9− i)) }

Page 117: Polyhedral Compilation without Polyhedra - Inria

Dependences Structures January 20, 2015 45 / 121

Accesses to Structure Fields and Nested RelationsNo special treatment is needed for representing accesses to structure fields

⇒ structure field encoded in name

and/or structure

of target space ofaccess relations

struct s {

int a;

int b[10];

};

void f(struct s s[const static 10][10])

{

for (int i = 0; i < 10; ++i)

S: s[i][i].b[9 - i] = 0;

}

{ S(i)→ s b(i , i , 9− i) }

In pet, structure encoded in nested relation

{ S(i)→ s b(s(i , i)→ b(9− i)) }

Page 118: Polyhedral Compilation without Polyhedra - Inria

Dependences Structures January 20, 2015 45 / 121

Accesses to Structure Fields and Nested RelationsNo special treatment is needed for representing accesses to structure fields

⇒ structure field encoded in name and/or structure of target space ofaccess relations

struct s {

int a;

int b[10];

};

void f(struct s s[const static 10][10])

{

for (int i = 0; i < 10; ++i)

S: s[i][i].b[9 - i] = 0;

}

{ S(i)→ s b(i , i , 9− i) }

In pet, structure encoded in nested relation

{ S(i)→ s b(s(i , i)→ b(9− i)) }

Page 119: Polyhedral Compilation without Polyhedra - Inria

Dependences Structures January 20, 2015 46 / 121

Accesses to StructuresDependence analysis needs to take into account that access to structurerepresents access to all fields of structure

struct c {

float re;

float im;

};

void f(struct c A[const static 10])

{

S: A[0] = A[2];

T: A[1].re = A[0].im;

}

Write access relation: { S()→ A(0); T()→ A re(A(1)→ re()) }

Expansion: { A(a)→ A re(A(a)→ re()); A(a)→ A im(A(a)→ im()); }Expanded write access relation:{ S()→ A re(A(0)→ re()); S()→ A im(A(0)→ im());

T()→ A re(A(1)→ re()) }

Page 120: Polyhedral Compilation without Polyhedra - Inria

Dependences Structures January 20, 2015 46 / 121

Accesses to StructuresDependence analysis needs to take into account that access to structurerepresents access to all fields of structure

struct c {

float re;

float im;

};

void f(struct c A[const static 10])

{

S: A[0] = A[2];

T: A[1].re = A[0].im;

}

Write access relation: { S()→ A(0); T()→ A re(A(1)→ re()) }Expansion: { A(a)→ A re(A(a)→ re()); A(a)→ A im(A(a)→ im()); }Expanded write access relation:{ S()→ A re(A(0)→ re()); S()→ A im(A(0)→ im());

T()→ A re(A(1)→ re()) }

Page 121: Polyhedral Compilation without Polyhedra - Inria

Dataflow January 20, 2015 47 / 121

Outline

1 Polyhedral ModelIntroductionRepresentation

2 Polyhedral TransformationSchedulesAST Generation

3 DependencesSchedule ValidityDependencesStructures

4 DataflowParallelismDataflowApproximate DataflowRun-time dependent DataflowReductions

5 Aliasing6 Counting

CardinalityBoundsWeighted CountingDynamic Memory Requirement

7 Transitive Closures

Page 122: Polyhedral Compilation without Polyhedra - Inria

Dataflow Parallelism January 20, 2015 48 / 121

Schedule Optimization Criteria

Typical optimization criteria

increase parallelism

increase locality

reduce memory requirements

Parallelism:Pairs of statement instances S(i) and T (j) may be executed in parallelif they do not depend on each other:

{ S(i)→ T (j); T (j)→ S(i) } ∩ D = ∅

Page 123: Polyhedral Compilation without Polyhedra - Inria

Dataflow Parallelism January 20, 2015 48 / 121

Schedule Optimization Criteria

Typical optimization criteria

increase parallelism

increase locality

reduce memory requirements

Parallelism:Pairs of statement instances S(i) and T (j) may be executed in parallelif they do not depend on each other:

{ S(i)→ T (j); T (j)→ S(i) } ∩ D = ∅

Page 124: Polyhedral Compilation without Polyhedra - Inria

Dataflow Parallelism January 20, 2015 49 / 121

Local Parallelism

Global parallelism

{ S(i)→ T (j); T (j)→ S(i) } ∩ D = ∅

Local parallelism

{ S(i)→ T (j); T (j)→ S(i) } ∩ L = ∅

Root of schedule tree: L = D

Child of band node b with partial schedule P:

L = Lb ∩ { S(i)→ T (j) : P(S(i)) = P(T (j)) }Child of sequence node s with instance set F

L = Ls ∩ { S(i)→ T (j) : S(i) ∈ F ∧ T (j) ∈ F }

Page 125: Polyhedral Compilation without Polyhedra - Inria

Dataflow Parallelism January 20, 2015 50 / 121

Encoding Parallelism

Coarse grain parallelism (root)I placement maps statement instances to virtual processorsI schedule interpreted within each virtual processor

Fine grain parallelism (leaf)I statement instances S(i) and T (j) for which O(S(i),T (j)) = 0 may be

executed in parallel within leaf that contains both

General case (arbitrary position in schedule tree)I schedules defines total order, but,I some sequence nodes and/or some band dimensions are explicitly

marked parallel

Page 126: Polyhedral Compilation without Polyhedra - Inria

Dataflow Parallelism January 20, 2015 50 / 121

Encoding Parallelism

Coarse grain parallelism (root)I placement maps statement instances to virtual processorsI schedule interpreted within each virtual processor

Fine grain parallelism (leaf)I statement instances S(i) and T (j) for which O(S(i),T (j)) = 0 may be

executed in parallel within leaf that contains both

General case (arbitrary position in schedule tree)I schedules defines total order, but,I some sequence nodes and/or some band dimensions are explicitly

marked parallel

Page 127: Polyhedral Compilation without Polyhedra - Inria

Dataflow Parallelism January 20, 2015 50 / 121

Encoding Parallelism

Coarse grain parallelism (root)I placement maps statement instances to virtual processorsI schedule interpreted within each virtual processor

Fine grain parallelism (leaf)I statement instances S(i) and T (j) for which O(S(i),T (j)) = 0 may be

executed in parallel within leaf that contains both

General case (arbitrary position in schedule tree)I schedules defines total order, but,I some sequence nodes and/or some band dimensions are explicitly

marked parallel

Page 128: Polyhedral Compilation without Polyhedra - Inria

Dataflow Dataflow January 20, 2015 51 / 121

False Dependencesfor (int i = 0; i < n; ++i) {

S: t = f1(A[i]);

T: B[i] = f2(t);

}

Dependences

read after write (“true”): { S(i)→ T(i ′) : i ′ ≥ i }write after read (“anti”): { T(i)→ S(i ′) : i ′ > i }write after write (“output”): { S(i)→ S(i ′) : i ′ > i }

“false”

False dependences not from dataflow, but from reuse of memory location t

Possible solution: expansion/privatizationfor (int i = 0; i < n; ++i) {

S: t[i] = f1(A[i]);

T: B[i] = f2(t[i]);

}

dataflow (subset of “true” dependences): { S(i)→ T(i) }

Page 129: Polyhedral Compilation without Polyhedra - Inria

Dataflow Dataflow January 20, 2015 51 / 121

False Dependencesfor (int i = 0; i < n; ++i) {

S: t = f1(A[i]);

T: B[i] = f2(t);

}

Dependences

read after write (“true”): { S(i)→ T(i ′) : i ′ ≥ i }write after read (“anti”): { T(i)→ S(i ′) : i ′ > i }write after write (“output”): { S(i)→ S(i ′) : i ′ > i }“false”

False dependences not from dataflow, but from reuse of memory location t

Possible solution: expansion/privatizationfor (int i = 0; i < n; ++i) {

S: t[i] = f1(A[i]);

T: B[i] = f2(t[i]);

}

dataflow (subset of “true” dependences): { S(i)→ T(i) }

Page 130: Polyhedral Compilation without Polyhedra - Inria

Dataflow Dataflow January 20, 2015 51 / 121

False Dependencesfor (int i = 0; i < n; ++i) {

S: t = f1(A[i]);

T: B[i] = f2(t);

}

Dependences

read after write (“true”): { S(i)→ T(i ′) : i ′ ≥ i }write after read (“anti”): { T(i)→ S(i ′) : i ′ > i }write after write (“output”): { S(i)→ S(i ′) : i ′ > i }“false”

False dependences not from dataflow, but from reuse of memory location t

Possible solution: expansion/privatizationfor (int i = 0; i < n; ++i) {

S: t[i] = f1(A[i]);

T: B[i] = f2(t[i]);

}

dataflow (subset of “true” dependences): { S(i)→ T(i) }

Page 131: Polyhedral Compilation without Polyhedra - Inria

Dataflow Dataflow January 20, 2015 51 / 121

False Dependencesfor (int i = 0; i < n; ++i) {

S: t = f1(A[i]);

T: B[i] = f2(t);

}

Dependences

read after write (“true”): { S(i)→ T(i ′) : i ′ ≥ i }write after read (“anti”): { T(i)→ S(i ′) : i ′ > i }write after write (“output”): { S(i)→ S(i ′) : i ′ > i }“false”

False dependences not from dataflow, but from reuse of memory location t

Possible solution: expansion/privatizationfor (int i = 0; i < n; ++i) {

S: t[i] = f1(A[i]);

T: B[i] = f2(t[i]);

}

dataflow (subset of “true” dependences): { S(i)→ T(i) }

Page 132: Polyhedral Compilation without Polyhedra - Inria

Dataflow Dataflow January 20, 2015 52 / 121

Lexicographic Optimization (Space Local)

Lexicographic Minimum of Sets (iscc: lexmin)

S = {S(i) : f (i) }lexmin S = { S(i) : f (i) ∧ ∀i′ : f (i′)⇒ i 4 i′ }I = { R(); S(i , j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2; T(k) : 0 ≤ k < 2 }lexmin I = { R(); S(0, 0); T(0) }

Lexicographic Maximum of Sets (iscc: lexmax)

S = {S(i) : f (i) }lexmax S = { S(i) : f (i) ∧ ∀i′ : f (i′)⇒ i < i′ }lexmax I = { R(); S(1, 1); T(1) }Lexicographic Maximum of Relations (iscc: lexmax)

R = {S(i)→ T (j) : f (i, j) }lexmaxR = { S(i)→ T (j) : f (i, j) ∧ ∀j′ : f (i, j′)⇒ j < j′ }W−1 = { A(a)→ S(i , j) : a = i + j ∧ 0 ≤ i < 2 ∧ 0 ≤ j < 2 }lexmax(W−1) = { A(a)→ S(a, 0) : 0 ≤ a ≤ 1; A(2)→ S(1, 1) }⇒ last statement instance writing array element

Page 133: Polyhedral Compilation without Polyhedra - Inria

Dataflow Dataflow January 20, 2015 52 / 121

Lexicographic Optimization (Space Local)

Lexicographic Minimum of Sets (iscc: lexmin)

S = {S(i) : f (i) }lexmin S = { S(i) : f (i) ∧ ∀i′ : f (i′)⇒ i 4 i′ }I = { R(); S(i , j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2; T(k) : 0 ≤ k < 2 }lexmin I = { R(); S(0, 0); T(0) }Lexicographic Maximum of Sets (iscc: lexmax)

S = {S(i) : f (i) }lexmax S = { S(i) : f (i) ∧ ∀i′ : f (i′)⇒ i < i′ }lexmax I = { R(); S(1, 1); T(1) }

Lexicographic Maximum of Relations (iscc: lexmax)

R = {S(i)→ T (j) : f (i, j) }lexmaxR = { S(i)→ T (j) : f (i, j) ∧ ∀j′ : f (i, j′)⇒ j < j′ }W−1 = { A(a)→ S(i , j) : a = i + j ∧ 0 ≤ i < 2 ∧ 0 ≤ j < 2 }lexmax(W−1) = { A(a)→ S(a, 0) : 0 ≤ a ≤ 1; A(2)→ S(1, 1) }⇒ last statement instance writing array element

Page 134: Polyhedral Compilation without Polyhedra - Inria

Dataflow Dataflow January 20, 2015 52 / 121

Lexicographic Optimization (Space Local)

Lexicographic Minimum of Sets (iscc: lexmin)

S = {S(i) : f (i) }lexmin S = { S(i) : f (i) ∧ ∀i′ : f (i′)⇒ i 4 i′ }I = { R(); S(i , j) : 0 ≤ i < 2 ∧ 0 ≤ j < 2; T(k) : 0 ≤ k < 2 }lexmin I = { R(); S(0, 0); T(0) }Lexicographic Maximum of Sets (iscc: lexmax)

S = {S(i) : f (i) }lexmax S = { S(i) : f (i) ∧ ∀i′ : f (i′)⇒ i < i′ }lexmax I = { R(); S(1, 1); T(1) }Lexicographic Maximum of Relations (iscc: lexmax)

R = { S(i)→ T (j) : f (i, j) }lexmaxR = { S(i)→ T (j) : f (i, j) ∧ ∀j′ : f (i, j′)⇒ j < j′ }W−1 = { A(a)→ S(i , j) : a = i + j ∧ 0 ≤ i < 2 ∧ 0 ≤ j < 2 }lexmax(W−1) = { A(a)→ S(a, 0) : 0 ≤ a ≤ 1; A(2)→ S(1, 1) }⇒ last statement instance writing array element

Page 135: Polyhedral Compilation without Polyhedra - Inria

Dataflow Dataflow January 20, 2015 53 / 121

Array Dataflow Analysis

Given a read from an array element, what was the last write tothe same array element before the read?

for (i = 0; i < N; ++i)

for (j = 0; j < N - i; ++j)

F: a[i+j] = f(a[i+j]);

for (i = 0; i < N; ++i)

W: Write(a[i]);

F

W

a

A1

A2

Access relations:A1 = { F(i , j)→ a(i + j) : 0 ≤ i < N ∧ 0 ≤ j < N − i }A2 = { W(i)→ a(i) : 0 ≤ i < N }Map to all writes:R ′ = A1

−1 ◦ A2 = { W(i)→ F(i ′, i − i ′) : 0 ≤ i ′ ≤ i < N }Last write: R = lexmaxR ′ = { W(i)→ F(i , 0) : 0 ≤ i < N }In general: impose lexicographical order on shared branch of schedule tree

Page 136: Polyhedral Compilation without Polyhedra - Inria

Dataflow Dataflow January 20, 2015 53 / 121

Array Dataflow Analysis

Given a read from an array element, what was the last write tothe same array element before the read?

for (i = 0; i < N; ++i)

for (j = 0; j < N - i; ++j)

F: a[i+j] = f(a[i+j]);

for (i = 0; i < N; ++i)

W: Write(a[i]);

F

W

a

A1

A2

Access relations:A1 = { F(i , j)→ a(i + j) : 0 ≤ i < N ∧ 0 ≤ j < N − i }A2 = { W(i)→ a(i) : 0 ≤ i < N }Map to all writes:R ′ = A1

−1 ◦ A2 = { W(i)→ F(i ′, i − i ′) : 0 ≤ i ′ ≤ i < N }Last write: R = lexmaxR ′ = { W(i)→ F(i , 0) : 0 ≤ i < N }In general: impose lexicographical order on shared branch of schedule tree

Page 137: Polyhedral Compilation without Polyhedra - Inria

Dataflow Dataflow January 20, 2015 53 / 121

Array Dataflow Analysis

Given a read from an array element, what was the last write tothe same array element before the read?

for (i = 0; i < N; ++i)

for (j = 0; j < N - i; ++j)

F: a[i+j] = f(a[i+j]);

for (i = 0; i < N; ++i)

W: Write(a[i]);

F

W

a

A1

A2

Access relations:A1 = { F(i , j)→ a(i + j) : 0 ≤ i < N ∧ 0 ≤ j < N − i }A2 = { W(i)→ a(i) : 0 ≤ i < N }

Map to all writes:R ′ = A1

−1 ◦ A2 = { W(i)→ F(i ′, i − i ′) : 0 ≤ i ′ ≤ i < N }Last write: R = lexmaxR ′ = { W(i)→ F(i , 0) : 0 ≤ i < N }In general: impose lexicographical order on shared branch of schedule tree

Page 138: Polyhedral Compilation without Polyhedra - Inria

Dataflow Dataflow January 20, 2015 53 / 121

Array Dataflow Analysis

Given a read from an array element, what was the last write tothe same array element before the read?

for (i = 0; i < N; ++i)

for (j = 0; j < N - i; ++j)

F: a[i+j] = f(a[i+j]);

for (i = 0; i < N; ++i)

W: Write(a[i]);

F

W

a

A1

A2

Access relations:A1 = { F(i , j)→ a(i + j) : 0 ≤ i < N ∧ 0 ≤ j < N − i }A2 = { W(i)→ a(i) : 0 ≤ i < N }Map to all writes:R ′ = A1

−1 ◦ A2 = { W(i)→ F(i ′, i − i ′) : 0 ≤ i ′ ≤ i < N }

Last write: R = lexmaxR ′ = { W(i)→ F(i , 0) : 0 ≤ i < N }In general: impose lexicographical order on shared branch of schedule tree

Page 139: Polyhedral Compilation without Polyhedra - Inria

Dataflow Dataflow January 20, 2015 53 / 121

Array Dataflow Analysis

Given a read from an array element, what was the last write tothe same array element before the read?

for (i = 0; i < N; ++i)

for (j = 0; j < N - i; ++j)

F: a[i+j] = f(a[i+j]);

for (i = 0; i < N; ++i)

W: Write(a[i]);

F

W

a

A1

A2

Access relations:A1 = { F(i , j)→ a(i + j) : 0 ≤ i < N ∧ 0 ≤ j < N − i }A2 = { W(i)→ a(i) : 0 ≤ i < N }Map to all writes:R ′ = A1

−1 ◦ A2 = { W(i)→ F(i ′, i − i ′) : 0 ≤ i ′ ≤ i < N }Last write: R = lexmaxR ′ = { W(i)→ F(i , 0) : 0 ≤ i < N }

In general: impose lexicographical order on shared branch of schedule tree

Page 140: Polyhedral Compilation without Polyhedra - Inria

Dataflow Dataflow January 20, 2015 53 / 121

Array Dataflow Analysis

Given a read from an array element, what was the last write tothe same array element before the read?

for (i = 0; i < N; ++i)

for (j = 0; j < N - i; ++j)

F: a[i+j] = f(a[i+j]);

for (i = 0; i < N; ++i)

W: Write(a[i]);

F

W

a

A1

A2

Access relations:A1 = { F(i , j)→ a(i + j) : 0 ≤ i < N ∧ 0 ≤ j < N − i }A2 = { W(i)→ a(i) : 0 ≤ i < N }Map to all writes:R ′ = A1

−1 ◦ A2 = { W(i)→ F(i ′, i − i ′) : 0 ≤ i ′ ≤ i < N }Last write: R = lexmaxR ′ = { W(i)→ F(i , 0) : 0 ≤ i < N }In general: impose lexicographical order on shared branch of schedule tree

Page 141: Polyhedral Compilation without Polyhedra - Inria

Dataflow Dataflow January 20, 2015 54 / 121

ExpansionAssume:

instance sets and access relations are static and exact⇒ each read has exactly one corresponding write

single read and write per statement⇒ expanded array indexed by statement instance of write

for (int i = 0; i < n; ++i) {

S: t = f1(A[i]);

T: B[i] = f2(t);

}

Dataflow: { S(i)→ T(i) }

for (int i = 0; i < n; ++i) {

S: S[i] = f1(A[i]);

T: B[i] = f2(S[i]);

}

⇒ only remaining dependences are dataflow induced

Page 142: Polyhedral Compilation without Polyhedra - Inria

Dataflow Dataflow January 20, 2015 54 / 121

ExpansionAssume:

instance sets and access relations are static and exact⇒ each read has exactly one corresponding write

single read and write per statement⇒ expanded array indexed by statement instance of write

for (int i = 0; i < n; ++i) {

S: t = f1(A[i]);

T: B[i] = f2(t);

}

Dataflow: { S(i)→ T(i) }

for (int i = 0; i < n; ++i) {

S: S[i] = f1(A[i]);

T: B[i] = f2(S[i]);

}

⇒ only remaining dependences are dataflow induced

Page 143: Polyhedral Compilation without Polyhedra - Inria

Dataflow Dataflow January 20, 2015 54 / 121

ExpansionAssume:

instance sets and access relations are static and exact⇒ each read has exactly one corresponding write

single read and write per statement⇒ expanded array indexed by statement instance of write

for (int i = 0; i < n; ++i) {

S: t = f1(A[i]);

T: B[i] = f2(t);

}

Dataflow: { S(i)→ T(i) }

for (int i = 0; i < n; ++i) {

S: S[i] = f1(A[i]);

T: B[i] = f2(S[i]);

}

⇒ only remaining dependences are dataflow induced

Page 144: Polyhedral Compilation without Polyhedra - Inria

Dataflow Dataflow January 20, 2015 55 / 121

Tagged Access Relations

Expansion in case of multiple reads or writes per statement

⇒ statement instance not enough to identify memory access

⇒ use identifier of array reference instead

⇒ pet: embed array reference in “tagged” instance set

⇒ “ternary” access relations

I → R → A

I : statement instanceR: reference identifierO: array element

⇒ in practice: nested binary relation

(I → R)→ A

For example, { (S(i)→ R1())→ A(i) }

Page 145: Polyhedral Compilation without Polyhedra - Inria

Dataflow Dataflow January 20, 2015 55 / 121

Tagged Access Relations

Expansion in case of multiple reads or writes per statement

⇒ statement instance not enough to identify memory access

⇒ use identifier of array reference instead

⇒ pet: embed array reference in “tagged” instance set

⇒ “ternary” access relations

I → R → A

I : statement instanceR: reference identifierO: array element

⇒ in practice: nested binary relation

(I → R)→ A

For example, { (S(i)→ R1())→ A(i) }

Page 146: Polyhedral Compilation without Polyhedra - Inria

Dataflow Dataflow January 20, 2015 56 / 121

Wrapping, Unwrapping, Domain Map and Range Map

R = { S(i)→ T (j) : f (i, j) } S = { (U(i)→ V (j)) : g(i, j) }

Wrap (iscc: wrap)

WR = { (S(i)→ T (j)) : f (i, j) }

Unwrap (iscc: unwrap)

W−1S = {U(i)→ V (j) : g(i, j) }Domain map (iscc: domain_map)

dom−−→R = { (S(i)→ T (j))→ S(i) : f (i, j) }

I = { (S(i)→ R1()) : 0 ≤ i < n }dom−−→(W−1I ) = { (S(i)→ R1())→ S(i) : 0 ≤ i < n }⇒ maps tagged instance set to untagged instance set⇒ precompose with instance set based relations to obtain

tagged instance set based relations

Range map (iscc: range_map)

ran−→R = { (S(i)→ T (j))→ T (j) : f (i, j) }

Page 147: Polyhedral Compilation without Polyhedra - Inria

Dataflow Dataflow January 20, 2015 56 / 121

Wrapping, Unwrapping, Domain Map and Range Map

R = { S(i)→ T (j) : f (i, j) } S = { (U(i)→ V (j)) : g(i, j) }

Wrap (iscc: wrap)

WR = { (S(i)→ T (j)) : f (i, j) }Unwrap (iscc: unwrap)

W−1S = {U(i)→ V (j) : g(i, j) }

Domain map (iscc: domain_map)

dom−−→R = { (S(i)→ T (j))→ S(i) : f (i, j) }

I = { (S(i)→ R1()) : 0 ≤ i < n }dom−−→(W−1I ) = { (S(i)→ R1())→ S(i) : 0 ≤ i < n }⇒ maps tagged instance set to untagged instance set⇒ precompose with instance set based relations to obtain

tagged instance set based relations

Range map (iscc: range_map)

ran−→R = { (S(i)→ T (j))→ T (j) : f (i, j) }

Page 148: Polyhedral Compilation without Polyhedra - Inria

Dataflow Dataflow January 20, 2015 56 / 121

Wrapping, Unwrapping, Domain Map and Range Map

R = { S(i)→ T (j) : f (i, j) } S = { (U(i)→ V (j)) : g(i, j) }

Wrap (iscc: wrap)

WR = { (S(i)→ T (j)) : f (i, j) }Unwrap (iscc: unwrap)

W−1S = {U(i)→ V (j) : g(i, j) }Domain map (iscc: domain_map)

dom−−→R = { (S(i)→ T (j))→ S(i) : f (i, j) }

I = { (S(i)→ R1()) : 0 ≤ i < n }dom−−→(W−1I ) = { (S(i)→ R1())→ S(i) : 0 ≤ i < n }⇒ maps tagged instance set to untagged instance set⇒ precompose with instance set based relations to obtain

tagged instance set based relations

Range map (iscc: range_map)

ran−→R = { (S(i)→ T (j))→ T (j) : f (i, j) }

Page 149: Polyhedral Compilation without Polyhedra - Inria

Dataflow Dataflow January 20, 2015 56 / 121

Wrapping, Unwrapping, Domain Map and Range Map

R = { S(i)→ T (j) : f (i, j) } S = { (U(i)→ V (j)) : g(i, j) }

Wrap (iscc: wrap)

WR = { (S(i)→ T (j)) : f (i, j) }Unwrap (iscc: unwrap)

W−1S = {U(i)→ V (j) : g(i, j) }Domain map (iscc: domain_map)

dom−−→R = { (S(i)→ T (j))→ S(i) : f (i, j) }

I = { (S(i)→ R1()) : 0 ≤ i < n }dom−−→(W−1I ) = { (S(i)→ R1())→ S(i) : 0 ≤ i < n }⇒ maps tagged instance set to untagged instance set⇒ precompose with instance set based relations to obtain

tagged instance set based relations

Range map (iscc: range_map)

ran−→R = { (S(i)→ T (j))→ T (j) : f (i, j) }

Page 150: Polyhedral Compilation without Polyhedra - Inria

Dataflow Dataflow January 20, 2015 57 / 121

Parametric Example: Matrix Multiplicationfor (int i = 0; i < M; i++)

for (int j = 0; j < N; j++) {

S1: C[i][j] = 0;

for (int k = 0; k < K; k++)

S2: C[i][j] = C[i][j] + A[i][k] * B[k][j];

}

Tagged Access Relations

W = { (S1(i , j)→ R0())→ C(i , j); (S2(i , j , k)→ R1())→ C(i , j) }R = { (S2(i , j , k)→ R2())→ C(i , j); (S2(i , j , k)→ R3())→ A(i , k);

(S2(i , j , k)→ R4())→ B(k , j) }Tagged Schedule

{ (S1(i , j)→ R0())→ (i , j , 0, 0); (S2(i , j , k)→ R1())→ (i , j , 1, k)

(S2(i , j , k)→ R2())→ (i , j , 1, k); (S2(i , j , k)→ R3())→ (i , j , 1, k);

(S2(i , j , k)→ R4())→ (i , j , 1, k) }Tagged Dataflow

{ (S1(i , j)→ R0())→ (S2(i , j , 0)→ R2())

(S2(i , j , k)→ R1())→ (S2(i , j , k + 1)→ R2()) }

Page 151: Polyhedral Compilation without Polyhedra - Inria

Dataflow Dataflow January 20, 2015 58 / 121

Maximal Static Expansion [3]

for (int i = 0; i < n; ++i) {

S1: t = f1(i);

S2: A[i] = t;

S3: t = f2(i);

S4: if (f3(i))

S5: t = f4(i);

S6: B[i] = t;

}

t1[i] = f1(i);

A[i] = t1[i];

t2[i] = f2(i);

if (f3(i))

t2[i] = f4(i);

B[i] = t2[i];

Dataflow cannot be determined independently of run-time information

⇒ approximate dataflow

{ S1(i)→ S2(i); S3(i)→ S6(i); S5(i)→ S6(i) }

⇒ a read may be associated to more than one write

⇒ corresponding equivalence classes should not be expanded apart

Page 152: Polyhedral Compilation without Polyhedra - Inria

Dataflow Dataflow January 20, 2015 58 / 121

Maximal Static Expansion [3]

for (int i = 0; i < n; ++i) {

S1: t = f1(i);

S2: A[i] = t;

S3: t = f2(i);

S4: if (f3(i))

S5: t = f4(i);

S6: B[i] = t;

}

t1[i] = f1(i);

A[i] = t1[i];

t2[i] = f2(i);

if (f3(i))

t2[i] = f4(i);

B[i] = t2[i];

Dataflow cannot be determined independently of run-time information

⇒ approximate dataflow

{ S1(i)→ S2(i); S3(i)→ S6(i); S5(i)→ S6(i) }

⇒ a read may be associated to more than one write

⇒ corresponding equivalence classes should not be expanded apart

Page 153: Polyhedral Compilation without Polyhedra - Inria

Dataflow Dataflow January 20, 2015 58 / 121

Maximal Static Expansion [3]

for (int i = 0; i < n; ++i) {

S1: t = f1(i);

S2: A[i] = t;

S3: t = f2(i);

S4: if (f3(i))

S5: t = f4(i);

S6: B[i] = t;

}

t1[i] = f1(i);

A[i] = t1[i];

t2[i] = f2(i);

if (f3(i))

t2[i] = f4(i);

B[i] = t2[i];

Dataflow cannot be determined independently of run-time information

⇒ approximate dataflow

{ S1(i)→ S2(i); S3(i)→ S6(i); S5(i)→ S6(i) }⇒ a read may be associated to more than one write

⇒ corresponding equivalence classes should not be expanded apart

Page 154: Polyhedral Compilation without Polyhedra - Inria

Dataflow Dataflow January 20, 2015 58 / 121

Maximal Static Expansion [3]

for (int i = 0; i < n; ++i) {

S1: t = f1(i);

S2: A[i] = t;

S3: t = f2(i);

S4: if (f3(i))

S5: t = f4(i);

S6: B[i] = t;

}

t1[i] = f1(i);

A[i] = t1[i];

t2[i] = f2(i);

if (f3(i))

t2[i] = f4(i);

B[i] = t2[i];

Dataflow cannot be determined independently of run-time information

⇒ approximate dataflow

{ S1(i)→ S2(i); S3(i)→ S6(i); S5(i)→ S6(i) }⇒ a read may be associated to more than one write

⇒ corresponding equivalence classes should not be expanded apart

Page 155: Polyhedral Compilation without Polyhedra - Inria

Dataflow Approximate Dataflow January 20, 2015 59 / 121

Approximate Dataflow Analysis

How to compute dataflow in presence of data dependent control?

Two approaches

Direct computationI distinguish between may- and must-writes

Derived from exact run-time dependent dataflowI compute exact dataflow in terms of run-time informationI exploit properties of run-time informationI project out run-time information

Page 156: Polyhedral Compilation without Polyhedra - Inria

Dataflow Approximate Dataflow January 20, 2015 59 / 121

Approximate Dataflow Analysis

How to compute dataflow in presence of data dependent control?

Two approaches

Direct computationI distinguish between may- and must-writes

Derived from exact run-time dependent dataflowI compute exact dataflow in terms of run-time informationI exploit properties of run-time informationI project out run-time information

Page 157: Polyhedral Compilation without Polyhedra - Inria

Dataflow Approximate Dataflow January 20, 2015 59 / 121

Approximate Dataflow Analysis

How to compute dataflow in presence of data dependent control?

Two approaches

Direct computationI distinguish between may- and must-writes

Derived from exact run-time dependent dataflowI compute exact dataflow in terms of run-time informationI exploit properties of run-time informationI project out run-time information

Page 158: Polyhedral Compilation without Polyhedra - Inria

Dataflow Approximate Dataflow January 20, 2015 60 / 121

May Writes

Keep track of whether write is possible or definite

Must-writesArray elements that are definitely accessed by statement instance

May-writesArray elements that are possibly accessed by statement instance

I statement instance not necessarily executedfor (i = 0; i < n; ++i)

if (A[i] > 0)

S: B[i] = A[i];

May-write: { S(i)→ B(i) }I array element not necessarily accessed

int A[N];

/* ... */

T: A[B[0]] = 5;

May-write: { T()→ A(a) : 0 ≤ a < N }

Must-write access relation is subset of may-write access relation

Page 159: Polyhedral Compilation without Polyhedra - Inria

Dataflow Approximate Dataflow January 20, 2015 60 / 121

May Writes

Keep track of whether write is possible or definite

Must-writesArray elements that are definitely accessed by statement instance

May-writesArray elements that are possibly accessed by statement instance

I statement instance not necessarily executedfor (i = 0; i < n; ++i)

if (A[i] > 0)

S: B[i] = A[i];

May-write: { S(i)→ B(i) }

I array element not necessarily accessedint A[N];

/* ... */

T: A[B[0]] = 5;

May-write: { T()→ A(a) : 0 ≤ a < N }

Must-write access relation is subset of may-write access relation

Page 160: Polyhedral Compilation without Polyhedra - Inria

Dataflow Approximate Dataflow January 20, 2015 60 / 121

May Writes

Keep track of whether write is possible or definite

Must-writesArray elements that are definitely accessed by statement instance

May-writesArray elements that are possibly accessed by statement instance

I statement instance not necessarily executedfor (i = 0; i < n; ++i)

if (A[i] > 0)

S: B[i] = A[i];

May-write: { S(i)→ B(i) }I array element not necessarily accessed

int A[N];

/* ... */

T: A[B[0]] = 5;

May-write: { T()→ A(a) : 0 ≤ a < N }

Must-write access relation is subset of may-write access relation

Page 161: Polyhedral Compilation without Polyhedra - Inria

Dataflow Approximate Dataflow January 20, 2015 61 / 121

Approximate Dataflow — Direct Computation

Read after write dependences

I write and read access same memory locationI write executed before the read

⇒ Approximate dataflow analysis with no must-writes

Dataflow dependences

I write and read access same memory locationI write executed before the readI no intermediate write to same memory location⇒ intermediate write kills dependence

Approximate dataflow dependences

I may-write and read access same memory locationI may-write executed before the readI no intermediate must-write to same memory location⇒ intermediate must-write kills dependence

Page 162: Polyhedral Compilation without Polyhedra - Inria

Dataflow Approximate Dataflow January 20, 2015 61 / 121

Approximate Dataflow — Direct Computation

Read after write dependences

I write and read access same memory locationI write executed before the read

⇒ Approximate dataflow analysis with no must-writes

Dataflow dependences

I write and read access same memory locationI write executed before the readI no intermediate write to same memory location⇒ intermediate write kills dependence

Approximate dataflow dependences

I may-write and read access same memory locationI may-write executed before the readI no intermediate must-write to same memory location⇒ intermediate must-write kills dependence

Page 163: Polyhedral Compilation without Polyhedra - Inria

Dataflow Approximate Dataflow January 20, 2015 61 / 121

Approximate Dataflow — Direct Computation

Read after write dependences

I write and read access same memory locationI write executed before the read

⇒ Approximate dataflow analysis with no must-writes

Dataflow dependences

I write and read access same memory locationI write executed before the readI no intermediate write to same memory location⇒ intermediate write kills dependence

Approximate dataflow dependences

I may-write and read access same memory locationI may-write executed before the readI no intermediate must-write to same memory location⇒ intermediate must-write kills dependence

Page 164: Polyhedral Compilation without Polyhedra - Inria

Dataflow Approximate Dataflow January 20, 2015 61 / 121

Approximate Dataflow — Direct Computation

Read after write dependences

I write and read access same memory locationI write executed before the read

⇒ Approximate dataflow analysis with no must-writes

Dataflow dependences

I write and read access same memory locationI write executed before the readI no intermediate write to same memory location⇒ intermediate write kills dependence

Approximate dataflow dependences

I may-write and read access same memory locationI may-write executed before the readI no intermediate must-write to same memory location⇒ intermediate must-write kills dependence

Page 165: Polyhedral Compilation without Polyhedra - Inria

Dataflow Approximate Dataflow January 20, 2015 62 / 121

Performing Dataflow Analysis [9, 17]

Two possibilities1 dataflow analysis on polyhedral model

I first extract instance set, access relations and scheduleI then perform dataflow analysis

E.g., isl

2 dataflow analysis on AST before/during model extractionProposed by, e.g., Maslov (1994)

Dataflow in Parallel Programs

1 use refined execution order during dataflow analysis

2 remove spurious dependences after dataflow analysis

Note: dataflow from must-writes cannot be removed without caution

⇒ must-write may have killed other dependences⇒ other dependences may have to be added back

Page 166: Polyhedral Compilation without Polyhedra - Inria

Dataflow Approximate Dataflow January 20, 2015 62 / 121

Performing Dataflow Analysis [9, 17]

Two possibilities1 dataflow analysis on polyhedral model

I first extract instance set, access relations and scheduleI then perform dataflow analysis

E.g., isl

2 dataflow analysis on AST before/during model extractionProposed by, e.g., Maslov (1994)

Dataflow in Parallel Programs

1 use refined execution order during dataflow analysis

2 remove spurious dependences after dataflow analysis

Note: dataflow from must-writes cannot be removed without caution

⇒ must-write may have killed other dependences⇒ other dependences may have to be added back

Page 167: Polyhedral Compilation without Polyhedra - Inria

Dataflow Approximate Dataflow January 20, 2015 63 / 121

Explicit Kills

int A[N];

S: A[0] = 1;

__pencil_kill(A);

for (int i = 0; i < N; ++i)

T: A[perm[i]] = f(i);

U: f(A[0]);

Assume perm represents a permutation

⇒ there can be no dataflow from S to U

Compiler does not know all elements are written by T

⇒ may find dataflow from S to U

⇒ user can insert explicit kill⇒ explicit kill used to kill dependences, just like must-write

Page 168: Polyhedral Compilation without Polyhedra - Inria

Dataflow Approximate Dataflow January 20, 2015 63 / 121

Explicit Kills

int A[N];

S: A[0] = 1;

__pencil_kill(A);

for (int i = 0; i < N; ++i)

T: A[perm[i]] = f(i);

U: f(A[0]);

Assume perm represents a permutation

⇒ there can be no dataflow from S to U

Compiler does not know all elements are written by T

⇒ may find dataflow from S to U

⇒ user can insert explicit kill⇒ explicit kill used to kill dependences, just like must-write

Page 169: Polyhedral Compilation without Polyhedra - Inria

Dataflow Approximate Dataflow January 20, 2015 63 / 121

Explicit Kills

int A[N];

S: A[0] = 1;

__pencil_kill(A);

for (int i = 0; i < N; ++i)

T: A[perm[i]] = f(i);

U: f(A[0]);

Assume perm represents a permutation

⇒ there can be no dataflow from S to U

Compiler does not know all elements are written by T

⇒ may find dataflow from S to U

⇒ user can insert explicit kill⇒ explicit kill used to kill dependences, just like must-write

Page 170: Polyhedral Compilation without Polyhedra - Inria

Dataflow Approximate Dataflow January 20, 2015 64 / 121

Local Variables and Kills

for (int i = 0; i < N; ++i) {

int t;

/* ... */

}

⇒ there can be no dataflow on t across different iterations

⇒ pet automatically inserts killsI before declaration andI at end of block containing declaration

Page 171: Polyhedral Compilation without Polyhedra - Inria

Dataflow Approximate Dataflow January 20, 2015 65 / 121

Killing False DependencesDataflow derived from read after write dependences through killing.

Should we do the same for false dependences?

⇒ no need for validity schedule constraints

⇒ removed dependences implied by remaining dependences

Optimizing locality

typical criterion: minimize maximal dependence distance

dependence distances: {P(S(i))− P(T (j)) : S(i)→ T (j) ∈ D }D: (local) dependence relation, P: partial schedule

may also be useful for false dependences if no expansion is performed

⇒ locality of reused memory location

killing false dependences avoids critical path determined bytransitively covered dependences

⇒ allow must-writes to kill dependences, but not explicit kills

Page 172: Polyhedral Compilation without Polyhedra - Inria

Dataflow Approximate Dataflow January 20, 2015 65 / 121

Killing False DependencesDataflow derived from read after write dependences through killing.

Should we do the same for false dependences?

⇒ no need for validity schedule constraints

⇒ removed dependences implied by remaining dependences

Optimizing locality

typical criterion: minimize maximal dependence distance

dependence distances: {P(S(i))− P(T (j)) : S(i)→ T (j) ∈ D }D: (local) dependence relation, P: partial schedule

may also be useful for false dependences if no expansion is performed

⇒ locality of reused memory location

killing false dependences avoids critical path determined bytransitively covered dependences

⇒ allow must-writes to kill dependences, but not explicit kills

Page 173: Polyhedral Compilation without Polyhedra - Inria

Dataflow Approximate Dataflow January 20, 2015 65 / 121

Killing False DependencesDataflow derived from read after write dependences through killing.

Should we do the same for false dependences?

⇒ no need for validity schedule constraints

⇒ removed dependences implied by remaining dependences

Optimizing locality

typical criterion: minimize maximal dependence distance

dependence distances: {P(S(i))− P(T (j)) : S(i)→ T (j) ∈ D }D: (local) dependence relation, P: partial schedule

may also be useful for false dependences if no expansion is performed

⇒ locality of reused memory location

killing false dependences avoids critical path determined bytransitively covered dependences

⇒ allow must-writes to kill dependences, but not explicit kills

Page 174: Polyhedral Compilation without Polyhedra - Inria

Dataflow Approximate Dataflow January 20, 2015 65 / 121

Killing False DependencesDataflow derived from read after write dependences through killing.

Should we do the same for false dependences?

⇒ no need for validity schedule constraints

⇒ removed dependences implied by remaining dependences

Optimizing locality

typical criterion: minimize maximal dependence distance

dependence distances: {P(S(i))− P(T (j)) : S(i)→ T (j) ∈ D }D: (local) dependence relation, P: partial schedule

may also be useful for false dependences if no expansion is performed

⇒ locality of reused memory location

killing false dependences avoids critical path determined bytransitively covered dependences

⇒ allow must-writes to kill dependences, but not explicit kills

Page 175: Polyhedral Compilation without Polyhedra - Inria

Dataflow Run-time dependent Dataflow January 20, 2015 66 / 121

Approximate Dataflow Analysis

How to compute dataflow in presence of data dependent control?

Two approaches

Direct computationI distinguish between may- and must-writes

Derived from exact run-time dependent dataflowI compute exact dataflow in terms of run-time informationI exploit properties of run-time informationI project out run-time information

Page 176: Polyhedral Compilation without Polyhedra - Inria

Dataflow Run-time dependent Dataflow January 20, 2015 66 / 121

Approximate Dataflow Analysis

How to compute dataflow in presence of data dependent control?

Two approaches

Direct computationI distinguish between may- and must-writes

Derived from exact run-time dependent dataflowI compute exact dataflow in terms of run-time informationI exploit properties of run-time informationI project out run-time information

Page 177: Polyhedral Compilation without Polyhedra - Inria

Dataflow Run-time dependent Dataflow January 20, 2015 67 / 121

Run-time Dependent Dataflow Analysis [4, 25]

Approaches

“fuzzy array dataflow analysis”

“on-demand-parametric array dataflow analysis”

for (int i = 0; i < n; ++i) {

S1: t = f1(i);

S2: A[i] = t;

S3: t = f2(i);

S4: if (f3(i))

S5: t = f4(i);

S6: B[i] = t;

}

Run-time dependent dataflow

{ S1(i)→ S2(i); S3(i)→ S6(i) : βS5S6 = 0; S5(i)→ S6(i) : βS5S6 = 1 }βPC : any potential source instance P is executed for sink C

λPC : last potential source instance P executed for sink C

Approximate dataflow (project out β and λ)

{ S1(i)→ S2(i); S3(i)→ S6(i); S5(i)→ S6(i) }

Page 178: Polyhedral Compilation without Polyhedra - Inria

Dataflow Run-time dependent Dataflow January 20, 2015 67 / 121

Run-time Dependent Dataflow Analysis [4, 25]

Approaches

“fuzzy array dataflow analysis”

“on-demand-parametric array dataflow analysis”

for (int i = 0; i < n; ++i) {

S1: t = f1(i);

S2: A[i] = t;

S3: t = f2(i);

S4: if (f3(i))

S5: t = f4(i);

S6: B[i] = t;

}

Run-time dependent dataflow

{ S1(i)→ S2(i); S3(i)→ S6(i) : βS5S6 = 0; S5(i)→ S6(i) : βS5S6 = 1 }βPC : any potential source instance P is executed for sink C

λPC : last potential source instance P executed for sink C

Approximate dataflow (project out β and λ)

{ S1(i)→ S2(i); S3(i)→ S6(i); S5(i)→ S6(i) }

Page 179: Polyhedral Compilation without Polyhedra - Inria

Dataflow Run-time dependent Dataflow January 20, 2015 67 / 121

Run-time Dependent Dataflow Analysis [4, 25]

Approaches

“fuzzy array dataflow analysis”

“on-demand-parametric array dataflow analysis”

for (int i = 0; i < n; ++i) {

S1: t = f1(i);

S2: A[i] = t;

S3: t = f2(i);

S4: if (f3(i))

S5: t = f4(i);

S6: B[i] = t;

}

Run-time dependent dataflow

{ S1(i)→ S2(i); S3(i)→ S6(i) : βS5S6 = 0; S5(i)→ S6(i) : βS5S6 = 1 }βPC : any potential source instance P is executed for sink C

λPC : last potential source instance P executed for sink C

Approximate dataflow (project out β and λ)

{ S1(i)→ S2(i); S3(i)→ S6(i); S5(i)→ S6(i) }

Page 180: Polyhedral Compilation without Polyhedra - Inria

Dataflow Run-time dependent Dataflow January 20, 2015 67 / 121

Run-time Dependent Dataflow Analysis [4, 25]

Approaches

“fuzzy array dataflow analysis”

“on-demand-parametric array dataflow analysis”

for (int i = 0; i < n; ++i) {

S1: t = f1(i);

S2: A[i] = t;

S3: t = f2(i);

S4: if (f3(i))

S5: t = f4(i);

S6: B[i] = t;

}

Run-time dependent dataflow

{ S1(i)→ S2(i); S3(i)→ S6(i) : βS5S6 = 0; S5(i)→ S6(i) : βS5S6 = 1 }βPC : any potential source instance P is executed for sink C

λPC : last potential source instance P executed for sink C

Approximate dataflow (project out β and λ)

{ S1(i)→ S2(i); S3(i)→ S6(i); S5(i)→ S6(i) }

Page 181: Polyhedral Compilation without Polyhedra - Inria

Dataflow Run-time dependent Dataflow January 20, 2015 67 / 121

Run-time Dependent Dataflow Analysis [4, 25]

Approaches

“fuzzy array dataflow analysis”

“on-demand-parametric array dataflow analysis”

for (int i = 0; i < n; ++i) {

S1: t = f1(i);

S2: A[i] = t;

S3: t = f2(i);

S4: if (f3(i))

S5: t = f4(i);

S6: B[i] = t;

}

Run-time dependent dataflow

{ S1(i)→ S2(i); S3(i)→ S6(i) : βS5S6 = 0; S5(i)→ S6(i) : βS5S6 = 1 }βPC : any potential source instance P is executed for sink C

λPC : last potential source instance P executed for sink C

Approximate dataflow (project out β and λ)

{ S1(i)→ S2(i); S3(i)→ S6(i); S5(i)→ S6(i) }

Page 182: Polyhedral Compilation without Polyhedra - Inria

Dataflow Run-time dependent Dataflow January 20, 2015 68 / 121

Representing Dynamic ConditionsN1: n = f();

for (int k = 0; k < 100; ++k) {

M: m = g();

for (int i = 0; i < m; ++i)

for (int j = 0; j < n; ++j)

A: a[j][i] = g();

N2: n = f();

}

What is instance set (restricted to A statement)?

{ A(k , i , j) : 0 ≤ k < 100 ∧ 0 ≤ i < m ∧ 0 ≤ j < n }?⇒ no, m and n cannot be treated as symbolic constants

(they are modified inside k-loop)

{A(k , i , j) : 0 ≤ k < 100∧0 ≤ i < valueOf m(k)∧0 ≤ j < valueOf n(k)}?⇒ requires uninterpreted functions (of arity > 0)

Alternative: use overapproximation of instance set and keep track ofwhich elements are executed

Instance set: { A(k , i , j) : 0 ≤ k < 100 ∧ 0 ≤ i ∧ 0 ≤ j }Filter:

I Filter access relations: reader → (writer → array element)F F A

1 ={A(k, i , j)→ (M(k)→ m())

}F F A

2 ={A(0, i , j)→ (N1()→ n()); A(k, i , j)→ (N2(k − 1)→ n()) : k ≥ 1

}I Filter value relation:

V A = { A(k, i , j)→ (m, n) : 0 ≤ k ≤ 99 ∧ 0 ≤ i < m ∧ 0 ≤ j < n }Statement instance is executed iff values written by corresponding writeaccesses (through filter access relations) satisfy filter value relation

Page 183: Polyhedral Compilation without Polyhedra - Inria

Dataflow Run-time dependent Dataflow January 20, 2015 68 / 121

Representing Dynamic ConditionsN1: n = f();

for (int k = 0; k < 100; ++k) {

M: m = g();

for (int i = 0; i < m; ++i)

for (int j = 0; j < n; ++j)

A: a[j][i] = g();

N2: n = f();

}

What is instance set (restricted to A statement)?{ A(k , i , j) : 0 ≤ k < 100 ∧ 0 ≤ i < m ∧ 0 ≤ j < n }?

⇒ no, m and n cannot be treated as symbolic constants(they are modified inside k-loop)

{A(k , i , j) : 0 ≤ k < 100∧0 ≤ i < valueOf m(k)∧0 ≤ j < valueOf n(k)}?⇒ requires uninterpreted functions (of arity > 0)

Alternative: use overapproximation of instance set and keep track ofwhich elements are executed

Instance set: { A(k , i , j) : 0 ≤ k < 100 ∧ 0 ≤ i ∧ 0 ≤ j }Filter:

I Filter access relations: reader → (writer → array element)F F A

1 ={A(k, i , j)→ (M(k)→ m())

}F F A

2 ={A(0, i , j)→ (N1()→ n()); A(k, i , j)→ (N2(k − 1)→ n()) : k ≥ 1

}I Filter value relation:

V A = { A(k, i , j)→ (m, n) : 0 ≤ k ≤ 99 ∧ 0 ≤ i < m ∧ 0 ≤ j < n }Statement instance is executed iff values written by corresponding writeaccesses (through filter access relations) satisfy filter value relation

Page 184: Polyhedral Compilation without Polyhedra - Inria

Dataflow Run-time dependent Dataflow January 20, 2015 68 / 121

Representing Dynamic ConditionsN1: n = f();

for (int k = 0; k < 100; ++k) {

M: m = g();

for (int i = 0; i < m; ++i)

for (int j = 0; j < n; ++j)

A: a[j][i] = g();

N2: n = f();

}

What is instance set (restricted to A statement)?{ A(k , i , j) : 0 ≤ k < 100 ∧ 0 ≤ i < m ∧ 0 ≤ j < n }?⇒ no, m and n cannot be treated as symbolic constants

(they are modified inside k-loop)

{A(k , i , j) : 0 ≤ k < 100∧0 ≤ i < valueOf m(k)∧0 ≤ j < valueOf n(k)}?⇒ requires uninterpreted functions (of arity > 0)

Alternative: use overapproximation of instance set and keep track ofwhich elements are executed

Instance set: { A(k , i , j) : 0 ≤ k < 100 ∧ 0 ≤ i ∧ 0 ≤ j }Filter:

I Filter access relations: reader → (writer → array element)F F A

1 ={A(k, i , j)→ (M(k)→ m())

}F F A

2 ={A(0, i , j)→ (N1()→ n()); A(k, i , j)→ (N2(k − 1)→ n()) : k ≥ 1

}I Filter value relation:

V A = { A(k, i , j)→ (m, n) : 0 ≤ k ≤ 99 ∧ 0 ≤ i < m ∧ 0 ≤ j < n }Statement instance is executed iff values written by corresponding writeaccesses (through filter access relations) satisfy filter value relation

Page 185: Polyhedral Compilation without Polyhedra - Inria

Dataflow Run-time dependent Dataflow January 20, 2015 68 / 121

Representing Dynamic ConditionsN1: n = f();

for (int k = 0; k < 100; ++k) {

M: m = g();

for (int i = 0; i < m; ++i)

for (int j = 0; j < n; ++j)

A: a[j][i] = g();

N2: n = f();

}

What is instance set (restricted to A statement)?{ A(k , i , j) : 0 ≤ k < 100 ∧ 0 ≤ i < m ∧ 0 ≤ j < n }?⇒ no, m and n cannot be treated as symbolic constants

(they are modified inside k-loop)

{A(k , i , j) : 0 ≤ k < 100∧0 ≤ i < valueOf m(k)∧0 ≤ j < valueOf n(k)}?

⇒ requires uninterpreted functions (of arity > 0)

Alternative: use overapproximation of instance set and keep track ofwhich elements are executed

Instance set: { A(k , i , j) : 0 ≤ k < 100 ∧ 0 ≤ i ∧ 0 ≤ j }Filter:

I Filter access relations: reader → (writer → array element)F F A

1 ={A(k, i , j)→ (M(k)→ m())

}F F A

2 ={A(0, i , j)→ (N1()→ n()); A(k, i , j)→ (N2(k − 1)→ n()) : k ≥ 1

}I Filter value relation:

V A = { A(k, i , j)→ (m, n) : 0 ≤ k ≤ 99 ∧ 0 ≤ i < m ∧ 0 ≤ j < n }Statement instance is executed iff values written by corresponding writeaccesses (through filter access relations) satisfy filter value relation

Page 186: Polyhedral Compilation without Polyhedra - Inria

Dataflow Run-time dependent Dataflow January 20, 2015 68 / 121

Representing Dynamic ConditionsN1: n = f();

for (int k = 0; k < 100; ++k) {

M: m = g();

for (int i = 0; i < m; ++i)

for (int j = 0; j < n; ++j)

A: a[j][i] = g();

N2: n = f();

}

What is instance set (restricted to A statement)?{ A(k , i , j) : 0 ≤ k < 100 ∧ 0 ≤ i < m ∧ 0 ≤ j < n }?⇒ no, m and n cannot be treated as symbolic constants

(they are modified inside k-loop)

{A(k , i , j) : 0 ≤ k < 100∧0 ≤ i < valueOf m(k)∧0 ≤ j < valueOf n(k)}?⇒ requires uninterpreted functions (of arity > 0)

Alternative: use overapproximation of instance set and keep track ofwhich elements are executed

Instance set: { A(k , i , j) : 0 ≤ k < 100 ∧ 0 ≤ i ∧ 0 ≤ j }Filter:

I Filter access relations: reader → (writer → array element)F F A

1 ={A(k, i , j)→ (M(k)→ m())

}F F A

2 ={A(0, i , j)→ (N1()→ n()); A(k, i , j)→ (N2(k − 1)→ n()) : k ≥ 1

}I Filter value relation:

V A = { A(k, i , j)→ (m, n) : 0 ≤ k ≤ 99 ∧ 0 ≤ i < m ∧ 0 ≤ j < n }Statement instance is executed iff values written by corresponding writeaccesses (through filter access relations) satisfy filter value relation

Page 187: Polyhedral Compilation without Polyhedra - Inria

Dataflow Run-time dependent Dataflow January 20, 2015 68 / 121

Representing Dynamic ConditionsN1: n = f();

for (int k = 0; k < 100; ++k) {

M: m = g();

for (int i = 0; i < m; ++i)

for (int j = 0; j < n; ++j)

A: a[j][i] = g();

N2: n = f();

}

What is instance set (restricted to A statement)?{ A(k , i , j) : 0 ≤ k < 100 ∧ 0 ≤ i < m ∧ 0 ≤ j < n }?⇒ no, m and n cannot be treated as symbolic constants

(they are modified inside k-loop)

{A(k , i , j) : 0 ≤ k < 100∧0 ≤ i < valueOf m(k)∧0 ≤ j < valueOf n(k)}?⇒ requires uninterpreted functions (of arity > 0)

Alternative: use overapproximation of instance set and keep track ofwhich elements are executed

Instance set: { A(k , i , j) : 0 ≤ k < 100 ∧ 0 ≤ i ∧ 0 ≤ j }Filter:

I Filter access relations: reader → (writer → array element)F F A

1 ={A(k, i , j)→ (M(k)→ m())

}F F A

2 ={A(0, i , j)→ (N1()→ n()); A(k, i , j)→ (N2(k − 1)→ n()) : k ≥ 1

}I Filter value relation:

V A = { A(k, i , j)→ (m, n) : 0 ≤ k ≤ 99 ∧ 0 ≤ i < m ∧ 0 ≤ j < n }Statement instance is executed iff values written by corresponding writeaccesses (through filter access relations) satisfy filter value relation

Page 188: Polyhedral Compilation without Polyhedra - Inria

Dataflow Run-time dependent Dataflow January 20, 2015 68 / 121

Representing Dynamic ConditionsN1: n = f();

for (int k = 0; k < 100; ++k) {

M: m = g();

for (int i = 0; i < m; ++i)

for (int j = 0; j < n; ++j)

A: a[j][i] = g();

N2: n = f();

}

Instance set: { A(k , i , j) : 0 ≤ k < 100 ∧ 0 ≤ i ∧ 0 ≤ j }Filter:

I Filter access relations: reader → (writer → array element)F F A

1 ={A(k, i , j)→ (M(k)→ m())

}F F A

2 ={A(0, i , j)→ (N1()→ n()); A(k, i , j)→ (N2(k − 1)→ n()) : k ≥ 1

}I Filter value relation:

V A = { A(k, i , j)→ (m, n) : 0 ≤ k ≤ 99 ∧ 0 ≤ i < m ∧ 0 ≤ j < n }Statement instance is executed iff values written by corresponding writeaccesses (through filter access relations) satisfy filter value relation

Page 189: Polyhedral Compilation without Polyhedra - Inria

Dataflow Run-time dependent Dataflow January 20, 2015 68 / 121

Representing Dynamic ConditionsN1: n = f();

for (int k = 0; k < 100; ++k) {

M: m = g();

for (int i = 0; i < m; ++i)

for (int j = 0; j < n; ++j)

A: a[j][i] = g();

N2: n = f();

}

Instance set: { A(k , i , j) : 0 ≤ k < 100 ∧ 0 ≤ i ∧ 0 ≤ j }Filter:

I Filter access relations: reader → (writer → array element)F F A

1 ={A(k, i , j)→ (M(k)→ m())

}F F A

2 ={A(0, i , j)→ (N1()→ n()); A(k, i , j)→ (N2(k − 1)→ n()) : k ≥ 1

}I Filter value relation:

V A = { A(k, i , j)→ (m, n) : 0 ≤ k ≤ 99 ∧ 0 ≤ i < m ∧ 0 ≤ j < n }Statement instance is executed iff values written by corresponding writeaccesses (through filter access relations) satisfy filter value relation

Page 190: Polyhedral Compilation without Polyhedra - Inria

Dataflow Run-time dependent Dataflow January 20, 2015 69 / 121

Parametric Array Dataflow Analysis

while (1) {

N: n = f();

a = g();

if (n < 100)

H: a = h();

if (n > 200)

T: t(a);

}

I = { H(i) : i ≥ 0; T(i) : i ≥ 0 }F H = { H(i)→ (N(i)→ n()) }V H = { H(i)→ (n) : i ≥ 0 ∧ n < 100 }F T = { T(i)→ (N(i)→ n()) }V T = { T(i)→ (n) : i ≥ 0 ∧ n > 200 }

Is there any dataflow between potential source and sink at inner level?

M = { T(i)→ H(i) }F H ◦M ⊆ F T

⇒ filter elements accessed by any potential source instance associated tosink instance forms subset of filter elements accessed by sink instance

⇒ constraints on filter values at sink also apply at corresponding potentialsource: V T ◦M−1 = { H(i)→ (n) : i ≥ 0 ∧ n > 200 }

(V T ◦M−1

)∩ V H = ∅

⇒ there can be no dataflow at inner level

potential source

sink

Page 191: Polyhedral Compilation without Polyhedra - Inria

Dataflow Run-time dependent Dataflow January 20, 2015 69 / 121

Parametric Array Dataflow Analysis

while (1) {

N: n = f();

a = g();

if (n < 100)

H: a = h();

if (n > 200)

T: t(a);

}

I = { H(i) : i ≥ 0; T(i) : i ≥ 0 }F H = { H(i)→ (N(i)→ n()) }V H = { H(i)→ (n) : i ≥ 0 ∧ n < 100 }F T = { T(i)→ (N(i)→ n()) }V T = { T(i)→ (n) : i ≥ 0 ∧ n > 200 }

Is there any dataflow between potential source and sink at inner level?

M = { T(i)→ H(i) }F H ◦M ⊆ F T

⇒ filter elements accessed by any potential source instance associated tosink instance forms subset of filter elements accessed by sink instance

⇒ constraints on filter values at sink also apply at corresponding potentialsource: V T ◦M−1 = { H(i)→ (n) : i ≥ 0 ∧ n > 200 }

(V T ◦M−1

)∩ V H = ∅

⇒ there can be no dataflow at inner level

potential source

sink

Page 192: Polyhedral Compilation without Polyhedra - Inria

Dataflow Run-time dependent Dataflow January 20, 2015 69 / 121

Parametric Array Dataflow Analysis

while (1) {

N: n = f();

a = g();

if (n < 100)

H: a = h();

if (n > 200)

T: t(a);

}

I = { H(i) : i ≥ 0; T(i) : i ≥ 0 }F H = { H(i)→ (N(i)→ n()) }V H = { H(i)→ (n) : i ≥ 0 ∧ n < 100 }F T = { T(i)→ (N(i)→ n()) }V T = { T(i)→ (n) : i ≥ 0 ∧ n > 200 }

Is there any dataflow between potential source and sink at inner level?

M = { T(i)→ H(i) }

F H ◦M ⊆ F T

⇒ filter elements accessed by any potential source instance associated tosink instance forms subset of filter elements accessed by sink instance

⇒ constraints on filter values at sink also apply at corresponding potentialsource: V T ◦M−1 = { H(i)→ (n) : i ≥ 0 ∧ n > 200 }

(V T ◦M−1

)∩ V H = ∅

⇒ there can be no dataflow at inner level

potential source

sink

Page 193: Polyhedral Compilation without Polyhedra - Inria

Dataflow Run-time dependent Dataflow January 20, 2015 69 / 121

Parametric Array Dataflow Analysis

while (1) {

N: n = f();

a = g();

if (n < 100)

H: a = h();

if (n > 200)

T: t(a);

}

I = { H(i) : i ≥ 0; T(i) : i ≥ 0 }F H = { H(i)→ (N(i)→ n()) }V H = { H(i)→ (n) : i ≥ 0 ∧ n < 100 }F T = { T(i)→ (N(i)→ n()) }V T = { T(i)→ (n) : i ≥ 0 ∧ n > 200 }

Is there any dataflow between potential source and sink at inner level?

M = { T(i)→ H(i) }F H ◦M ⊆ F T

⇒ filter elements accessed by any potential source instance associated tosink instance forms subset of filter elements accessed by sink instance

⇒ constraints on filter values at sink also apply at corresponding potentialsource: V T ◦M−1 = { H(i)→ (n) : i ≥ 0 ∧ n > 200 }(

V T ◦M−1)∩ V H = ∅

⇒ there can be no dataflow at inner level

potential source

sink

Page 194: Polyhedral Compilation without Polyhedra - Inria

Dataflow Run-time dependent Dataflow January 20, 2015 69 / 121

Parametric Array Dataflow Analysis

while (1) {

N: n = f();

a = g();

if (n < 100)

H: a = h();

if (n > 200)

T: t(a);

}

I = { H(i) : i ≥ 0; T(i) : i ≥ 0 }F H = { H(i)→ (N(i)→ n()) }V H = { H(i)→ (n) : i ≥ 0 ∧ n < 100 }F T = { T(i)→ (N(i)→ n()) }V T = { T(i)→ (n) : i ≥ 0 ∧ n > 200 }

Is there any dataflow between potential source and sink at inner level?

M = { T(i)→ H(i) }F H ◦M ⊆ F T

⇒ filter elements accessed by any potential source instance associated tosink instance forms subset of filter elements accessed by sink instance

⇒ constraints on filter values at sink also apply at corresponding potentialsource: V T ◦M−1 = { H(i)→ (n) : i ≥ 0 ∧ n > 200 }

(V T ◦M−1

)∩ V H = ∅

⇒ there can be no dataflow at inner level

potential source

sink

Page 195: Polyhedral Compilation without Polyhedra - Inria

Dataflow Run-time dependent Dataflow January 20, 2015 69 / 121

Parametric Array Dataflow Analysis

while (1) {

N: n = f();

a = g();

if (n < 100)

H: a = h();

if (n > 200)

T: t(a);

}

I = { H(i) : i ≥ 0; T(i) : i ≥ 0 }F H = { H(i)→ (N(i)→ n()) }V H = { H(i)→ (n) : i ≥ 0 ∧ n < 100 }F T = { T(i)→ (N(i)→ n()) }V T = { T(i)→ (n) : i ≥ 0 ∧ n > 200 }

Is there any dataflow between potential source and sink at inner level?

M = { T(i)→ H(i) }F H ◦M ⊆ F T

⇒ filter elements accessed by any potential source instance associated tosink instance forms subset of filter elements accessed by sink instance

⇒ constraints on filter values at sink also apply at corresponding potentialsource: V T ◦M−1 = { H(i)→ (n) : i ≥ 0 ∧ n > 200 }(

V T ◦M−1)∩ V H = ∅

⇒ there can be no dataflow at inner level

potential source

sink

Page 196: Polyhedral Compilation without Polyhedra - Inria

Dataflow Run-time dependent Dataflow January 20, 2015 70 / 121

Polyhedral Process Networks [19]

Main purpose: extract task level parallelism from dataflow graph

statement → processflow dependence → communication channel

⇒ requires dataflow analysis

Processes are mapped to parallel hardware (e.g., FPGA)

Example:

for (int i = 0; i < n; ++i) {

S: t = f1(A[i]);

T: B[i] = f2(t);

}

for (int i = 0; i < n; ++i)

write(fifo , f1(A[i]));

for (int i = 0; i < n; ++i)

B[i] = f2(read(fifo ));

Page 197: Polyhedral Compilation without Polyhedra - Inria

Dataflow Run-time dependent Dataflow January 20, 2015 70 / 121

Polyhedral Process Networks [19]

Main purpose: extract task level parallelism from dataflow graph

statement → processflow dependence → communication channel

⇒ requires dataflow analysis

Processes are mapped to parallel hardware (e.g., FPGA)

Example:

for (int i = 0; i < n; ++i) {

S: t = f1(A[i]);

T: B[i] = f2(t);

}

for (int i = 0; i < n; ++i)

write(fifo , f1(A[i]));

for (int i = 0; i < n; ++i)

B[i] = f2(read(fifo ));

Page 198: Polyhedral Compilation without Polyhedra - Inria

Dataflow Run-time dependent Dataflow January 20, 2015 70 / 121

Polyhedral Process Networks [19]

Main purpose: extract task level parallelism from dataflow graph

statement → processflow dependence → communication channel

⇒ requires dataflow analysis

Processes are mapped to parallel hardware (e.g., FPGA)

Example:

for (int i = 0; i < n; ++i) {

S: t = f1(A[i]);

T: B[i] = f2(t);

}

for (int i = 0; i < n; ++i)

write(fifo , f1(A[i]));

for (int i = 0; i < n; ++i)

B[i] = f2(read(fifo ));

Page 199: Polyhedral Compilation without Polyhedra - Inria

Dataflow Run-time dependent Dataflow January 20, 2015 71 / 121

Process Networks with Dynamic Control

for (int i = 0; i < n; ++i) {

S1: t = f1(i);

S2: A[i] = t;

S3: t = f2(i);

S4: if (f3(i))

S5: t = f4(i);

S6: B[i] = t;

}

Run-time dependent dataflow:

{ S1(i)→ S2(i); S3(i)→ S6(i) : βS5S6 = 0;

S5(i)→ S6(i) : βS5S6 = 1; S4(i)→ S5(i) }

f1

out_1ND_0

in_0ND_1

ED_1

f2

out_1ND_2

in_2ND_5

ED_2

f3

out_1ND_3

in_0ND_4

ED_0

f4

out_2ND_4dc0_ND_4_b

in_0ND_5

ED_3dc0_ND_5_b

CED_4

in_0ND_5

Page 200: Polyhedral Compilation without Polyhedra - Inria

Dataflow Run-time dependent Dataflow January 20, 2015 71 / 121

Process Networks with Dynamic Control

for (int i = 0; i < n; ++i) {

S1: t = f1(i);

S2: A[i] = t;

S3: t = f2(i);

S4: if (f3(i))

S5: t = f4(i);

S6: B[i] = t;

}

Run-time dependent dataflow:

{ S1(i)→ S2(i); S3(i)→ S6(i) : βS5S6 = 0;

S5(i)→ S6(i) : βS5S6 = 1; S4(i)→ S5(i) }

f1

out_1ND_0

in_0ND_1

ED_1

f2

out_1ND_2

in_2ND_5

ED_2

f3

out_1ND_3

in_0ND_4

ED_0

f4

out_2ND_4dc0_ND_4_b

in_0ND_5

ED_3dc0_ND_5_b

CED_4

in_0ND_5

Page 201: Polyhedral Compilation without Polyhedra - Inria

Dataflow Reductions January 20, 2015 72 / 121

Reductions

A: s = 0;

for (int i = 0; i < n; ++i)

B: s += A[i];

C: B[0] = s;

Dataflow:

{ A()→ B(0); B(i)→ B(i + 1) : 0 ≤ i < n − 1; B(n − 1)→ C() }⇒ fixes order of reduction

Allow reordering of updates:

read in C should depend on all updates in B

⇒ updates should not kill dependences⇒ remove updates from must-writes

updates should not depend on each other

⇒ remove false dependences between updates that flow to the same read,provided the read does not also depend on intermediate writes

Dataflow: { A()→ B(i) : 0 ≤ i < n; B(i)→ C() : 0 ≤ i < n }

Page 202: Polyhedral Compilation without Polyhedra - Inria

Dataflow Reductions January 20, 2015 72 / 121

Reductions

A: s = 0;

for (int i = 0; i < n; ++i)

B: s += A[i];

C: B[0] = s;

Dataflow:

{ A()→ B(0); B(i)→ B(i + 1) : 0 ≤ i < n − 1; B(n − 1)→ C() }⇒ fixes order of reduction

Allow reordering of updates:

read in C should depend on all updates in B

⇒ updates should not kill dependences⇒ remove updates from must-writes

updates should not depend on each other

⇒ remove false dependences between updates that flow to the same read,provided the read does not also depend on intermediate writes

Dataflow: { A()→ B(i) : 0 ≤ i < n; B(i)→ C() : 0 ≤ i < n }

Page 203: Polyhedral Compilation without Polyhedra - Inria

Dataflow Reductions January 20, 2015 72 / 121

Reductions

A: s = 0;

for (int i = 0; i < n; ++i)

B: s += A[i];

C: B[0] = s;

Dataflow:

{ A()→ B(0); B(i)→ B(i + 1) : 0 ≤ i < n − 1; B(n − 1)→ C() }⇒ fixes order of reduction

Allow reordering of updates:

read in C should depend on all updates in B

⇒ updates should not kill dependences⇒ remove updates from must-writes

updates should not depend on each other

⇒ remove false dependences between updates that flow to the same read,provided the read does not also depend on intermediate writes

Dataflow: { A()→ B(i) : 0 ≤ i < n; B(i)→ C() : 0 ≤ i < n }

Page 204: Polyhedral Compilation without Polyhedra - Inria

Aliasing January 20, 2015 73 / 121

Outline

1 Polyhedral ModelIntroductionRepresentation

2 Polyhedral TransformationSchedulesAST Generation

3 DependencesSchedule ValidityDependencesStructures

4 DataflowParallelismDataflowApproximate DataflowRun-time dependent DataflowReductions

5 Aliasing6 Counting

CardinalityBoundsWeighted CountingDynamic Memory Requirement

7 Transitive Closures

Page 205: Polyhedral Compilation without Polyhedra - Inria

Aliasing January 20, 2015 74 / 121

Aliasing

Some possible ways of handling aliasing:

use an input language that does not permit aliasing

pretend the problem does not exist

require user to ensure absence of aliasing⇒ e.g., use restrict keyword

handle as may-write⇒ may lead to too many dependences

check aliasing at run-time⇒ use original code in case of aliasing

Page 206: Polyhedral Compilation without Polyhedra - Inria

Counting January 20, 2015 75 / 121

Outline

1 Polyhedral ModelIntroductionRepresentation

2 Polyhedral TransformationSchedulesAST Generation

3 DependencesSchedule ValidityDependencesStructures

4 DataflowParallelismDataflowApproximate DataflowRun-time dependent DataflowReductions

5 Aliasing6 Counting

CardinalityBoundsWeighted CountingDynamic Memory Requirement

7 Transitive Closures

Page 207: Polyhedral Compilation without Polyhedra - Inria

Counting Cardinality January 20, 2015 76 / 121

CardinalityCardinality of a set

⇒ number of elements in the set⇒ may depend on symbolic constants

S = {S(i) : f (i) }

cardS = { n : n = #i : f (i) } card

(⋃i

Si

):=∑i

cardSi

card { A(i) : 0 ≤ i ≤ n; B() } = n + 2

Cardinality of a binary relation

⇒ for each domain element, number of corresponding images

R = { S(i)→ T (j) : f (i, j) }

cardR = { S(i)→ n : n = #j : f (i, j) } card

(⋃i

Ri

):=∑i

cardRi

R = { A(i)→ C(i) : 0 ≤ i ≤ n; B()→ C(i) : 0 ≤ i ≤ n }cardR = { A(i)→ 1 : 0 ≤ i ≤ n; B()→ n + 1 }

⇒ not a Presburger formula

Page 208: Polyhedral Compilation without Polyhedra - Inria

Counting Cardinality January 20, 2015 77 / 121

Cardinality Examples

for (i = 0; i < N; ++i)

for (j = 0; j < N - i; ++j)

a[i+j] = f(a[i+j]);

How many times is the statement executed?

card{ (i , j) : 0 ≤ i < N ∧ 0 ≤ j < N − i }⇒ { N+N2

2 : N ≥ 1 }

How many times is a given array element written?

card ({ (i , j)→ a(i + j) : 0 ≤ i < N ∧ 0 ≤ j < N − i })−1

⇒ { a(a)→ 1 + a : 0 ≤ a < N }How many array elements are written?

card (ran { (i , j)→ a(i + j) : 0 ≤ i < N ∧ 0 ≤ j < N − i })⇒ {N : N ≥ 1 }

Page 209: Polyhedral Compilation without Polyhedra - Inria

Counting Cardinality January 20, 2015 77 / 121

Cardinality Examples

for (i = 0; i < N; ++i)

for (j = 0; j < N - i; ++j)

a[i+j] = f(a[i+j]);

How many times is the statement executed?

card{ (i , j) : 0 ≤ i < N ∧ 0 ≤ j < N − i }⇒ { N+N2

2 : N ≥ 1 }How many times is a given array element written?

card ({ (i , j)→ a(i + j) : 0 ≤ i < N ∧ 0 ≤ j < N − i })−1

⇒ { a(a)→ 1 + a : 0 ≤ a < N }

How many array elements are written?

card (ran { (i , j)→ a(i + j) : 0 ≤ i < N ∧ 0 ≤ j < N − i })⇒ {N : N ≥ 1 }

Page 210: Polyhedral Compilation without Polyhedra - Inria

Counting Cardinality January 20, 2015 77 / 121

Cardinality Examples

for (i = 0; i < N; ++i)

for (j = 0; j < N - i; ++j)

a[i+j] = f(a[i+j]);

How many times is the statement executed?

card{ (i , j) : 0 ≤ i < N ∧ 0 ≤ j < N − i }⇒ { N+N2

2 : N ≥ 1 }How many times is a given array element written?

card ({ (i , j)→ a(i + j) : 0 ≤ i < N ∧ 0 ≤ j < N − i })−1

⇒ { a(a)→ 1 + a : 0 ≤ a < N }How many array elements are written?

card (ran { (i , j)→ a(i + j) : 0 ≤ i < N ∧ 0 ≤ j < N − i })⇒ {N : N ≥ 1 }

Page 211: Polyhedral Compilation without Polyhedra - Inria

Counting Cardinality January 20, 2015 78 / 121

Cardinality Examples (2)How many times is S1 executed ?

for (i = max(0,N-M); i <= N-M+3; i++)

for (j = 0; j <= N-2*i; j++)

S1;

card{ (i , j) : 0,N −M ≤ i ≤ N −M + 3 ∧ 0 ≤ j ≤ N − 2i }−4N + 8M − 8 if M ≤ N ≤ 2M − 6

MN − 2N −M2 + 6M − 8 if N ≤ M ≤ N + 3 ∧ N ≤ 2M − 6N2

4 + 34 N + 1

2

⌊N2

⌋+ 1 if 0 ≤ N ≤ M ∧ 2M ≤ N + 6

N2

4 −MN − 54 N + M2 + 2M + 1

2

⌊N2

⌋+ 1 if M ≤ N ≤ 2M ≤ N + 6

N

M

Page 212: Polyhedral Compilation without Polyhedra - Inria

Counting Cardinality January 20, 2015 78 / 121

Cardinality Examples (2)How many times is S1 executed ?

for (i = max(0,N-M); i <= N-M+3; i++)

for (j = 0; j <= N-2*i; j++)

S1;

card{ (i , j) : 0,N −M ≤ i ≤ N −M + 3 ∧ 0 ≤ j ≤ N − 2i }−4N + 8M − 8 if M ≤ N ≤ 2M − 6

MN − 2N −M2 + 6M − 8 if N ≤ M ≤ N + 3 ∧ N ≤ 2M − 6N2

4 + 34 N + 1

2

⌊N2

⌋+ 1 if 0 ≤ N ≤ M ∧ 2M ≤ N + 6

N2

4 −MN − 54 N + M2 + 2M + 1

2

⌊N2

⌋+ 1 if M ≤ N ≤ 2M ≤ N + 6

N

M

Page 213: Polyhedral Compilation without Polyhedra - Inria

Counting Cardinality January 20, 2015 79 / 121

Cardinality RepresentationInteger quasi affine expression bx/2c+ 3N⇒ Presburger term

That is, a term constructed from variables, symbolic constants,integer constants, addition (+), subtraction (−) andinteger division by a constant (b·/dc)

Rational polynomial expression x2 − N/2⇒ a term constructed from variables, symbolic constants,

rational constants, addition (+), subtraction (−) and multiplication (·)Quasi polynomial expression (bx/2c+ 3N)2 − N/2⇒ a rational polynomial expression with variables replaced by

integer quasi affine expressionsquasi affine/polynomial expression⇒ a list of pairs of pairs of Presburger sets and quasi affine/polynomial

expressions E = (Si , ei )i , with Si disjoint

E (j) =

{ei (j) if j ∈ Si

⊥/0 otherwise

Note: in practice, cardinality result does not contain nested integerdivisions

Page 214: Polyhedral Compilation without Polyhedra - Inria

Counting Cardinality January 20, 2015 79 / 121

Cardinality RepresentationInteger quasi affine expression bx/2c+ 3N⇒ Presburger term

That is, a term constructed from variables, symbolic constants,integer constants, addition (+), subtraction (−) andinteger division by a constant (b·/dc)

Rational polynomial expression x2 − N/2⇒ a term constructed from variables, symbolic constants,

rational constants, addition (+), subtraction (−) and multiplication (·)

Quasi polynomial expression (bx/2c+ 3N)2 − N/2⇒ a rational polynomial expression with variables replaced by

integer quasi affine expressionsquasi affine/polynomial expression⇒ a list of pairs of pairs of Presburger sets and quasi affine/polynomial

expressions E = (Si , ei )i , with Si disjoint

E (j) =

{ei (j) if j ∈ Si

⊥/0 otherwise

Note: in practice, cardinality result does not contain nested integerdivisions

Page 215: Polyhedral Compilation without Polyhedra - Inria

Counting Cardinality January 20, 2015 79 / 121

Cardinality RepresentationInteger quasi affine expression bx/2c+ 3N⇒ Presburger term

That is, a term constructed from variables, symbolic constants,integer constants, addition (+), subtraction (−) andinteger division by a constant (b·/dc)

Rational polynomial expression x2 − N/2⇒ a term constructed from variables, symbolic constants,

rational constants, addition (+), subtraction (−) and multiplication (·)Quasi polynomial expression (bx/2c+ 3N)2 − N/2⇒ a rational polynomial expression with variables replaced by

integer quasi affine expressions

quasi affine/polynomial expression⇒ a list of pairs of pairs of Presburger sets and quasi affine/polynomial

expressions E = (Si , ei )i , with Si disjoint

E (j) =

{ei (j) if j ∈ Si

⊥/0 otherwise

Note: in practice, cardinality result does not contain nested integerdivisions

Page 216: Polyhedral Compilation without Polyhedra - Inria

Counting Cardinality January 20, 2015 79 / 121

Cardinality RepresentationInteger quasi affine expression bx/2c+ 3N⇒ Presburger term

That is, a term constructed from variables, symbolic constants,integer constants, addition (+), subtraction (−) andinteger division by a constant (b·/dc)

Rational polynomial expression x2 − N/2⇒ a term constructed from variables, symbolic constants,

rational constants, addition (+), subtraction (−) and multiplication (·)Quasi polynomial expression (bx/2c+ 3N)2 − N/2⇒ a rational polynomial expression with variables replaced by

integer quasi affine expressionsPiecewise quasi affine/polynomial expression⇒ a list of pairs of pairs of Presburger sets and quasi affine/polynomial

expressions E = (Si , ei )i , with Si disjoint

E (j) =

{ei (j) if j ∈ Si

⊥/0 otherwise

Note: in practice, cardinality result does not contain nested integerdivisions

Page 217: Polyhedral Compilation without Polyhedra - Inria

Counting Cardinality January 20, 2015 79 / 121

Cardinality RepresentationInteger quasi affine expression bx/2c+ 3N⇒ Presburger term

That is, a term constructed from variables, symbolic constants,integer constants, addition (+), subtraction (−) andinteger division by a constant (b·/dc)

Rational polynomial expression x2 − N/2⇒ a term constructed from variables, symbolic constants,

rational constants, addition (+), subtraction (−) and multiplication (·)Quasi polynomial expression (bx/2c+ 3N)2 − N/2⇒ a rational polynomial expression with variables replaced by

integer quasi affine expressionsPiecewise quasi affine/polynomial expression⇒ a list of pairs of pairs of Presburger sets and quasi affine/polynomial

expressions E = (Si , ei )i , with Si disjoint

E (j) =

{ei (j) if j ∈ Si

⊥/0 otherwise

Note: in practice, cardinality result does not contain nested integerdivisions

Page 218: Polyhedral Compilation without Polyhedra - Inria

Counting Cardinality January 20, 2015 80 / 121

Basic Operations on Piecewise Expressions

Piecewise (rational) quasi affine expressionsI addition (+)I subtraction (−)I negation (−)I minimum (min), maximum (max)I multiplication by constant (·d)I division by constant (/d)I remainder on integer division by constant (modd)I floor (b·c)I ceiling (d·e)

Piecewise quasi polynomial expressionsI addition (+)I subtraction (−)I negation (−)I multiplication (·)I exponentiation by positive integer constant (·d)

Page 219: Polyhedral Compilation without Polyhedra - Inria

Counting Cardinality January 20, 2015 80 / 121

Basic Operations on Piecewise Expressions

Piecewise (rational) quasi affine expressionsI addition (+)I subtraction (−)I negation (−)I minimum (min), maximum (max)I multiplication by constant (·d)I division by constant (/d)I remainder on integer division by constant (modd)I floor (b·c)I ceiling (d·e)

Piecewise quasi polynomial expressionsI addition (+)I subtraction (−)I negation (−)I multiplication (·)I exponentiation by positive integer constant (·d)

Page 220: Polyhedral Compilation without Polyhedra - Inria

Counting Bounds January 20, 2015 81 / 121

Bounds on Piecewise Quasi Polynomials

05 0

5

0

10

xy4

+x

+y−

(x−

2)2

m(N) = max(x ,y):x ,y≥0∧x+y≤N

4+x +y−(x−2)2

≤ u(N) = max(3N, 5N − N2)

Question

Can exact maximum be computed in general?

Upper bound u(N) ≥ m(N) can be computed

Page 221: Polyhedral Compilation without Polyhedra - Inria

Counting Bounds January 20, 2015 81 / 121

Bounds on Piecewise Quasi Polynomials

05 0

5

0

10

xy4

+x

+y−

(x−

2)2

m(N) = max(x ,y):x ,y≥0∧x+y≤N

4+x +y−(x−2)2

≤ u(N) = max(3N, 5N − N2)

Question

Can exact maximum be computed in general?

Upper bound u(N) ≥ m(N) can be computed

Page 222: Polyhedral Compilation without Polyhedra - Inria

Counting Bounds January 20, 2015 81 / 121

Bounds on Piecewise Quasi Polynomials

05 0

5

0

10

xy4

+x

+y−

(x−

2)2

m(N) = max(x ,y):x ,y≥0∧x+y≤N

4+x +y−(x−2)2

≤ u(N) = max(3N, 5N − N2)

Question

Can exact maximum be computed in general?

Upper bound u(N) ≥ m(N) can be computed

Page 223: Polyhedral Compilation without Polyhedra - Inria

Counting Bounds January 20, 2015 81 / 121

Bounds on Piecewise Quasi Polynomials

05 0

5

0

10

xy4

+x

+y−

(x−

2)2

N

m(N) = max(x ,y):x ,y≥0∧x+y≤N

4+x +y−(x−2)2

≤ u(N) = max(3N, 5N − N2)

Question

Can exact maximum be computed in general?

Upper bound u(N) ≥ m(N) can be computed

Page 224: Polyhedral Compilation without Polyhedra - Inria

Counting Bounds January 20, 2015 81 / 121

Bounds on Piecewise Quasi Polynomials

05 0

5

0

10

xy4

+x

+y−

(x−

2)2

N

m(N) = max(x ,y):x ,y≥0∧x+y≤N

4+x +y−(x−2)2

≤ u(N) = max(3N, 5N − N2)

Question

Can exact maximum be computed in general?

Upper bound u(N) ≥ m(N) can be computed

Page 225: Polyhedral Compilation without Polyhedra - Inria

Counting Bounds January 20, 2015 81 / 121

Bounds on Piecewise Quasi Polynomials

05 0

5

0

10

xy4

+x

+y−

(x−

2)2

N

m(N) = max(x ,y):x ,y≥0∧x+y≤N

4+x +y−(x−2)2

≤ u(N) = max(3N, 5N − N2)

Question

Can exact maximum be computed in general?

Upper bound u(N) ≥ m(N) can be computed

Page 226: Polyhedral Compilation without Polyhedra - Inria

Counting Bounds January 20, 2015 81 / 121

Bounds on Piecewise Quasi Polynomials

05 0

5

0

10

xy4

+x

+y−

(x−

2)2

N

m(N) = max(x ,y):x ,y≥0∧x+y≤N

4+x +y−(x−2)2 ≤ u(N) = max(3N, 5N − N2)

Question

Can exact maximum be computed in general?

Upper bound u(N) ≥ m(N) can be computed

Page 227: Polyhedral Compilation without Polyhedra - Inria

Counting Bounds January 20, 2015 82 / 121

Bounds on Piecewise Quasi Polynomials — Example

for (i = 0; i < N; ++i)

for (j = i; j < N; ++j) {

p = malloc(i * j + i - N + 1);

/* ... */

free(p);

}

How much memory is needed?

ub{

i j + i − N + 1 if 0 ≤ i < N ∧ i ≤ j < N

Result: {max(1− 2N + N2) if N ≥ 1

(exact maximum)

Page 228: Polyhedral Compilation without Polyhedra - Inria

Counting Bounds January 20, 2015 82 / 121

Bounds on Piecewise Quasi Polynomials — Example

for (i = 0; i < N; ++i)

for (j = i; j < N; ++j) {

p = malloc(i * j + i - N + 1);

/* ... */

free(p);

}

How much memory is needed?

ub{

i j + i − N + 1 if 0 ≤ i < N ∧ i ≤ j < N

Result: {max(1− 2N + N2) if N ≥ 1

(exact maximum)

Page 229: Polyhedral Compilation without Polyhedra - Inria

Counting Bounds January 20, 2015 82 / 121

Bounds on Piecewise Quasi Polynomials — Example

for (i = 0; i < N; ++i)

for (j = i; j < N; ++j) {

p = malloc(i * j + i - N + 1);

/* ... */

free(p);

}

How much memory is needed?

ub{

i j + i − N + 1 if 0 ≤ i < N ∧ i ≤ j < N

Result: {max(1− 2N + N2) if N ≥ 1

(exact maximum)

Page 230: Polyhedral Compilation without Polyhedra - Inria

Counting Bounds January 20, 2015 83 / 121

Maximal Number of Live Memory elements

Assume each statement instance writes to at most one array element

⇒ Each live element can be identified by write instance

Compute dataflow relation D

For each write instance compute last readL = O−1 ◦ lexmax(O ◦ D)

For each statement instance i, count write instances that precede isuch that corresponding last read follows i⇒ Number of live elements at i

N = card(((O�O) ∩ran (dom L)) ∩

(L−1 ◦ (O 4O)

))Compute upper boundU = ubN

Page 231: Polyhedral Compilation without Polyhedra - Inria

Counting Bounds January 20, 2015 84 / 121

Maximal Number of Live Memory elements — Example

for (i = 0; i < N; ++i)

S1: t[i] = f(a[i]);

for (i = 0; i < N; ++i)

S2: b[i] = g(t[N-i -1]);

I = { S1(i) : 0 ≤ i < N;S2(i) : 0 ≤ i < N }O = { S1(i)→ (0, i);S2(i)→ (1, i) }D = { S1(i)→ S2(N − 1− i) : 0 ≤ i < N }

L = O−1 ◦ lexmax(O ◦ D)

= {S1(i)→ S2(N − 1− i) : 0 ≤ i < N }N = card

(((O�O) ∩ran (dom L)) ∩

(L−1 ◦ (O 4O)

))= { S1(i)→ i : 1 ≤ i < N;S2(i)→ N − i : 0 ≤ i < N }

U = ubN

= {max(N) : N ≥ 1 }

Page 232: Polyhedral Compilation without Polyhedra - Inria

Counting Bounds January 20, 2015 84 / 121

Maximal Number of Live Memory elements — Example

for (i = 0; i < N; ++i)

S1: t[i] = f(a[i]);

for (i = 0; i < N; ++i)

S2: b[i] = g(t[N-i -1]);

I = { S1(i) : 0 ≤ i < N;S2(i) : 0 ≤ i < N }O = { S1(i)→ (0, i);S2(i)→ (1, i) }D = { S1(i)→ S2(N − 1− i) : 0 ≤ i < N }L = O−1 ◦ lexmax(O ◦ D)

= { S1(i)→ S2(N − 1− i) : 0 ≤ i < N }N = card

(((O�O) ∩ran (dom L)) ∩

(L−1 ◦ (O 4O)

))= { S1(i)→ i : 1 ≤ i < N;S2(i)→ N − i : 0 ≤ i < N }

U = ubN

= {max(N) : N ≥ 1 }

Page 233: Polyhedral Compilation without Polyhedra - Inria

Counting Weighted Counting January 20, 2015 85 / 121

Weighted Counting

G = F ◦ R

=

{(x , y)→ x2 + y 2

4: 1 ≤ x , y ≤ 2

}◦ { (x)→ (x , y) }

=

{(x)→ 5 + 2x2

2: 1 ≤ x ≤ 2

}

with F a piecewise quasi polynomial and R a Presburger relationis a piecewise quasi polynomial G such that

G (i) =∑

j:R(i,j)

F (j)

y

x2+y2

4

xx

5+2x2

4

R = { (x)→ (x , y) }

Page 234: Polyhedral Compilation without Polyhedra - Inria

Counting Weighted Counting January 20, 2015 85 / 121

Weighted Counting

G = F ◦ R =

{(x , y)→ x2 + y 2

4: 1 ≤ x , y ≤ 2

}◦ { (x)→ (x , y) }

=

{(x)→ 5 + 2x2

2: 1 ≤ x ≤ 2

}

with F a piecewise quasi polynomial and R a Presburger relationis a piecewise quasi polynomial G such that

G (i) =∑

j:R(i,j)

F (j)

y

x2+y2

4

xx

5+2x2

4

R = { (x)→ (x , y) }

Page 235: Polyhedral Compilation without Polyhedra - Inria

Counting Weighted Counting January 20, 2015 85 / 121

Weighted Counting

G = F ◦ R =

{(x , y)→ x2 + y 2

4: 1 ≤ x , y ≤ 2

}◦ { (x)→ (x , y) }

=

{(x)→ 5 + 2x2

2: 1 ≤ x ≤ 2

}

with F a piecewise quasi polynomial and R a Presburger relationis a piecewise quasi polynomial G such that

G (i) =∑

j:R(i,j)

F (j)

y

x2+y2

4

xx

5+2x2

4

R = { (x)→ (x , y) }

Page 236: Polyhedral Compilation without Polyhedra - Inria

Counting Weighted Counting January 20, 2015 85 / 121

Weighted Counting

G = F ◦ R =

{(x , y)→ x2 + y 2

4: 1 ≤ x , y ≤ 2

}◦ { (x)→ (x , y) }

=

{(x)→ 5 + 2x2

2: 1 ≤ x ≤ 2

}with F a piecewise quasi polynomial and R a Presburger relationis a piecewise quasi polynomial G such that

G (i) =∑

j:R(i,j)

F (j)

y

x2+y2

4

xx

5+2x2

4 R = { (x)→ (x , y) }

Page 237: Polyhedral Compilation without Polyhedra - Inria

Counting Weighted Counting January 20, 2015 86 / 121

Compositions with Piecewise (Folds of) Quasi polynomials

F ◦ R

or F (S)

R: D1 → D2 is a Presburger relation

S ⊆ D2 is a Presburger set

F : D2 → Q may beI piecewise quasi polynomial

(result of counting problem)

⇒ take sum over (ranR) ∩ (domF )

or S ∩ (domF )

I piecewise fold of quasi polynomials(result of upper bound computation)

⇒ compute bound over (ranR) ∩ (domF )

or S ∩ (domF )

(F ◦ R): D1 → Q of same type as F

F (S): Q of same type as F

if R is single-valued, then sum/bound is computed over a single point

Page 238: Polyhedral Compilation without Polyhedra - Inria

Counting Weighted Counting January 20, 2015 86 / 121

Compositions with Piecewise (Folds of) Quasi polynomials

F ◦ R or F (S)

R: D1 → D2 is a Presburger relation

S ⊆ D2 is a Presburger set

F : D2 → Q may beI piecewise quasi polynomial

(result of counting problem)

⇒ take sum over (ranR) ∩ (domF ) or S ∩ (domF )

I piecewise fold of quasi polynomials(result of upper bound computation)

⇒ compute bound over (ranR) ∩ (domF ) or S ∩ (domF )

(F ◦ R): D1 → Q of same type as F

F (S): Q of same type as F

if R is single-valued, then sum/bound is computed over a single point

Page 239: Polyhedral Compilation without Polyhedra - Inria

Counting Weighted Counting January 20, 2015 87 / 121

Example: Total Memory Allocation

for (i = 0; i < N; ++i)

for (j = i; j < N; ++j)

p[i][j] = malloc(i * j + i - N + 1);

/* ... */

for (i = 0; i < N; ++i)

for (j = i; j < N; ++j)

free(p[i][j]);

How much memory allocated in total?

F = { (i , j)→ i j + i − N + 1 }I = { (i , j) : 0 ≤ i < N ∧ i ≤ j < N }

F (I ) =

{5

12N − 1

8N2 − 5

12N3 +

1

8N4 : N ≥ 1

}

Page 240: Polyhedral Compilation without Polyhedra - Inria

Counting Weighted Counting January 20, 2015 87 / 121

Example: Total Memory Allocation

for (i = 0; i < N; ++i)

for (j = i; j < N; ++j)

p[i][j] = malloc(i * j + i - N + 1);

/* ... */

for (i = 0; i < N; ++i)

for (j = i; j < N; ++j)

free(p[i][j]);

How much memory allocated in total?

F = { (i , j)→ i j + i − N + 1 }I = { (i , j) : 0 ≤ i < N ∧ i ≤ j < N }

F (I ) =

{5

12N − 1

8N2 − 5

12N3 +

1

8N4 : N ≥ 1

}

Page 241: Polyhedral Compilation without Polyhedra - Inria

Counting Dynamic Memory Requirement January 20, 2015 88 / 121

Dynamic Memory Requirement Estimation

How much memory is needed to execute the following program?

void m0(int m) {

for (c = 0; c < m; c++) {

m1(c); /*S1*/

B[] m2Arr = m2(2*m-c);/*S2*/

}

}

void m1(int k) {

for (i = 1; i <= k; i++) {

A a = new A(); /*S3*/

B[] dummyArr = m2(i); /*S4*/

}

}

B[] m2(int n) {

B[] arrB = new B[n]; /*S5*/

for (j = 1; j <= n; j++)

B b = new B(); /*S6*/

return arrB;

}

I = { m0(m)→ S1(c) : 0 ≤ c < m;

m0(m)→ S2(c) : 0 ≤ c < m;

m1(k)→ S3(i) : 1 ≤ i ≤ k;

m1(k)→ S4(i) : 1 ≤ i ≤ k;

m2(n)→ S5();

m2(n)→ S6(j) : 1 ≤ j ≤ n }

Bm0 = { (m0(m)→ S1(c))→ m1(c);

(m0(m)→ S2(c))→ m2(2m − c) }Bm1 ={ (m1(k)→ S4(i))→ m2(i) }

[7]

Page 242: Polyhedral Compilation without Polyhedra - Inria

Counting Dynamic Memory Requirement January 20, 2015 88 / 121

Dynamic Memory Requirement Estimation

How much memory is needed to execute the following program?

void m0(int m) {

for (c = 0; c < m; c++) {

m1(c); /*S1*/

B[] m2Arr = m2(2*m-c);/*S2*/

}

}

void m1(int k) {

for (i = 1; i <= k; i++) {

A a = new A(); /*S3*/

B[] dummyArr = m2(i); /*S4*/

}

}

B[] m2(int n) {

B[] arrB = new B[n]; /*S5*/

for (j = 1; j <= n; j++)

B b = new B(); /*S6*/

return arrB;

}

I = { m0(m)→ S1(c) : 0 ≤ c < m;

m0(m)→ S2(c) : 0 ≤ c < m;

m1(k)→ S3(i) : 1 ≤ i ≤ k;

m1(k)→ S4(i) : 1 ≤ i ≤ k;

m2(n)→ S5();

m2(n)→ S6(j) : 1 ≤ j ≤ n }

Bm0 = { (m0(m)→ S1(c))→ m1(c);

(m0(m)→ S2(c))→ m2(2m − c) }Bm1 ={ (m1(k)→ S4(i))→ m2(i) }

[7]

Page 243: Polyhedral Compilation without Polyhedra - Inria

Counting Dynamic Memory Requirement January 20, 2015 89 / 121

Dynamic Memory Requirement EstimationHow much (scoped) memory is needed?

⇒ compute for each methodretm size of memory returned by m

capm size of memory “captured” (not returned) by m

memRqm total memory requirements of m

retm + capm =∑

p called by m

retp

memRqm = capm + maxp called by m

memRqp

⇒ summarize over iteration domain, i.e., compose with M = (dom−−→ I )−1

M = { m0(m)→ (m0(m)→ S1(c)) : 0 ≤ c < m;

m0(m)→ (m0(m)→ S2(c)) : 0 ≤ c < m;

m1(k)→ (m1(k)→ S3(i)) : 1 ≤ i ≤ k ;

m1(k)→ (m1(k)→ S4(i)) : 1 ≤ i ≤ k ;

m2(n)→ (m2(n)→ S5()); m2(n)→ (m2(n)→ S6(j)) : 1 ≤ j ≤ n}

[7]

Page 244: Polyhedral Compilation without Polyhedra - Inria

Counting Dynamic Memory Requirement January 20, 2015 89 / 121

Dynamic Memory Requirement EstimationHow much (scoped) memory is needed?

⇒ compute for each methodretm size of memory returned by m

capm size of memory “captured” (not returned) by m

memRqm total memory requirements of m

retm + capm =∑

p called by m

retp

memRqm = capm + maxp called by m

memRqp

⇒ summarize over iteration domain, i.e., compose with M = (dom−−→ I )−1

M = { m0(m)→ (m0(m)→ S1(c)) : 0 ≤ c < m;

m0(m)→ (m0(m)→ S2(c)) : 0 ≤ c < m;

m1(k)→ (m1(k)→ S3(i)) : 1 ≤ i ≤ k ;

m1(k)→ (m1(k)→ S4(i)) : 1 ≤ i ≤ k ;

m2(n)→ (m2(n)→ S5()); m2(n)→ (m2(n)→ S6(j)) : 1 ≤ j ≤ n}

[7]

Page 245: Polyhedral Compilation without Polyhedra - Inria

Counting Dynamic Memory Requirement January 20, 2015 90 / 121

Dynamic Memory Requirement Estimation

retm + capm =∑

p called by m

retp

memRqm = capm + maxp called by m

memRqp

B[] m2(int n) {

B[] arrB = new B[n]; /*S5*/

for (j = 1; j <= n; j++)

B b = new B(); /*S6*/

return arrB;

}

retm2 = { (m2(n)→ S5())→ n : n ≥ 0 } ◦M

= { m2(n)→ n : n ≥ 0 }

capm2 = { (m2(n)→ S6(j))→ 1 } ◦M

= { m2(n)→ n : n ≥ 1 }

memRqm2 = capm2 + { m2(n)→ max(0) }

= { m2(n)→ max(n) : n ≥ 1 }

[7]

Page 246: Polyhedral Compilation without Polyhedra - Inria

Counting Dynamic Memory Requirement January 20, 2015 90 / 121

Dynamic Memory Requirement Estimation

retm + capm =∑

p called by m

retp

memRqm = capm + maxp called by m

memRqp

B[] m2(int n) {

B[] arrB = new B[n]; /*S5*/

for (j = 1; j <= n; j++)

B b = new B(); /*S6*/

return arrB;

}

retm2 = { (m2(n)→ S5())→ n : n ≥ 0 } ◦M

= { m2(n)→ n : n ≥ 0 }

capm2 = { (m2(n)→ S6(j))→ 1 } ◦M

= { m2(n)→ n : n ≥ 1 }

memRqm2 = capm2 + { m2(n)→ max(0) }

= { m2(n)→ max(n) : n ≥ 1 }

[7]

Page 247: Polyhedral Compilation without Polyhedra - Inria

Counting Dynamic Memory Requirement January 20, 2015 90 / 121

Dynamic Memory Requirement Estimation

retm + capm =∑

p called by m

retp

memRqm = capm + maxp called by m

memRqp

B[] m2(int n) {

B[] arrB = new B[n]; /*S5*/

for (j = 1; j <= n; j++)

B b = new B(); /*S6*/

return arrB;

}

retm2 = { (m2(n)→ S5())→ n : n ≥ 0 } ◦M

= { m2(n)→ n : n ≥ 0 }

capm2 = { (m2(n)→ S6(j))→ 1 } ◦M

= { m2(n)→ n : n ≥ 1 }

memRqm2 = capm2 + { m2(n)→ max(0) }

= { m2(n)→ max(n) : n ≥ 1 }

[7]

Page 248: Polyhedral Compilation without Polyhedra - Inria

Counting Dynamic Memory Requirement January 20, 2015 90 / 121

Dynamic Memory Requirement Estimation

retm + capm =∑

p called by m

retp

memRqm = capm + maxp called by m

memRqp

B[] m2(int n) {

B[] arrB = new B[n]; /*S5*/

for (j = 1; j <= n; j++)

B b = new B(); /*S6*/

return arrB;

}

retm2 = { (m2(n)→ S5())→ n : n ≥ 0 } ◦M = { m2(n)→ n : n ≥ 0 }capm2 = { (m2(n)→ S6(j))→ 1 } ◦M = { m2(n)→ n : n ≥ 1 }

memRqm2 = capm2 + { m2(n)→ max(0) } = { m2(n)→ max(n) : n ≥ 1 }

[7]

Page 249: Polyhedral Compilation without Polyhedra - Inria

Counting Dynamic Memory Requirement January 20, 2015 91 / 121

Dynamic Memory Requirement Estimation

void m1(int k) {

for (i = 1; i <= k; i++) {

A a = new A(); /* S3 */

B[] dummyArr = m2(i); /* S4 */

}

}

capm1(k) =∑

1≤i≤k(1 + retm2(i))

retm2 is a function of the arguments of m2We want to use it as a function of the arguments and local variables of m1

⇒ define parameter binding

retm1 = { m1(k)→ 0 }capm1 =

({ (m1(k)→ S3(i))→ 1 }+ retm2 ◦ Bm1

)◦M

=

{m1(k)→ 3

2k +

1

2k2 : k ≥ 1

}

memRqm1 = capm1 +(memRqm2 ◦ Bm1 ◦M

)

=

{m1(k)→ max

(5

2k +

1

2k2

): k ≥ 1

}

[7]

Page 250: Polyhedral Compilation without Polyhedra - Inria

Counting Dynamic Memory Requirement January 20, 2015 91 / 121

Dynamic Memory Requirement Estimation

void m1(int k) {

for (i = 1; i <= k; i++) {

A a = new A(); /* S3 */

B[] dummyArr = m2(i); /* S4 */

}

}

capm1(k) =∑

1≤i≤k(1 + retm2(i))

retm2 is a function of the arguments of m2We want to use it as a function of the arguments and local variables of m1

⇒ define parameter binding

retm1 = { m1(k)→ 0 }capm1 =

({ (m1(k)→ S3(i))→ 1 }+ retm2 ◦ Bm1

)◦M

=

{m1(k)→ 3

2k +

1

2k2 : k ≥ 1

}

memRqm1 = capm1 +(memRqm2 ◦ Bm1 ◦M

)

=

{m1(k)→ max

(5

2k +

1

2k2

): k ≥ 1

}

[7]

Page 251: Polyhedral Compilation without Polyhedra - Inria

Counting Dynamic Memory Requirement January 20, 2015 92 / 121

Dynamic Memory Requirement Estimation

How much memory is needed to execute the following program?

void m0(int m) {

for (c = 0; c < m; c++) {

m1(c); /*S1*/

B[] m2Arr = m2(2*m-c);/*S2*/

}

}

void m1(int k) {

for (i = 1; i <= k; i++) {

A a = new A(); /*S3*/

B[] dummyArr = m2(i); /*S4*/

}

}

B[] m2(int n) {

B[] arrB = new B[n]; /*S5*/

for (j = 1; j <= n; j++)

B b = new B(); /*S6*/

return arrB;

}

I = { m0(m)→ S1(c) : 0 ≤ c < m;

m0(m)→ S2(c) : 0 ≤ c < m;

m1(k)→ S3(i) : 1 ≤ i ≤ k;

m1(k)→ S4(i) : 1 ≤ i ≤ k;

m2(n)→ S5();

m2(n)→ S6(j) : 1 ≤ j ≤ n }

Bm0 = { (m0(m)→ S1(c))→ m1(c);

(m0(m)→ S2(c))→ m2(2m − c) }Bm1 ={ (m1(k)→ S4(i))→ m2(i) }

[7]

Page 252: Polyhedral Compilation without Polyhedra - Inria

Counting Dynamic Memory Requirement January 20, 2015 92 / 121

Dynamic Memory Requirement Estimation

How much memory is needed to execute the following program?

void m0(int m) {

for (c = 0; c < m; c++) {

m1(c); /*S1*/

B[] m2Arr = m2(2*m-c);/*S2*/

}

}

void m1(int k) {

for (i = 1; i <= k; i++) {

A a = new A(); /*S3*/

B[] dummyArr = m2(i); /*S4*/

}

}

B[] m2(int n) {

B[] arrB = new B[n]; /*S5*/

for (j = 1; j <= n; j++)

B b = new B(); /*S6*/

return arrB;

}

I = { m0(m)→ S1(c) : 0 ≤ c < m;

m0(m)→ S2(c) : 0 ≤ c < m;

m1(k)→ S3(i) : 1 ≤ i ≤ k;

m1(k)→ S4(i) : 1 ≤ i ≤ k;

m2(n)→ S5();

m2(n)→ S6(j) : 1 ≤ j ≤ n }

Bm0 = { (m0(m)→ S1(c))→ m1(c);

(m0(m)→ S2(c))→ m2(2m − c) }Bm1 ={ (m1(k)→ S4(i))→ m2(i) }

[7]

Page 253: Polyhedral Compilation without Polyhedra - Inria

Counting Dynamic Memory Requirement January 20, 2015 93 / 121

Dynamic Memory Requirement Estimation

void m1(int k) {

for (i = 1; i <= k; i++) {

A a = new A(); /* S3 */

B[] dummyArr = m2(i); /* S4 */

}

}

capm1(k) =∑

1≤i≤k(1 + retm2(i))

retm2 is a function of the arguments of m2We want to use it as a function of the arguments and local variables of m1

⇒ define parameter binding

retm1 = { m1(k)→ 0 }capm1 =

({ (m1(k)→ S3(i))→ 1 }+ retm2 ◦ Bm1

)◦M

=

{m1(k)→ 3

2k +

1

2k2 : k ≥ 1

}

memRqm1 = capm1 +(memRqm2 ◦ Bm1 ◦M

)

=

{m1(k)→ max

(5

2k +

1

2k2

): k ≥ 1

}

[7]

Page 254: Polyhedral Compilation without Polyhedra - Inria

Counting Dynamic Memory Requirement January 20, 2015 93 / 121

Dynamic Memory Requirement Estimation

void m1(int k) {

for (i = 1; i <= k; i++) {

A a = new A(); /* S3 */

B[] dummyArr = m2(i); /* S4 */

}

}

capm1(k) =∑

1≤i≤k(1 + retm2(i))

retm1 = { m1(k)→ 0 }capm1 =

({ (m1(k)→ S3(i))→ 1 }+ retm2 ◦ Bm1

)◦M

=

{m1(k)→ 3

2k +

1

2k2 : k ≥ 1

}

memRqm1 = capm1 +(memRqm2 ◦ Bm1 ◦M

)

=

{m1(k)→ max

(5

2k +

1

2k2

): k ≥ 1

}

[7]

Page 255: Polyhedral Compilation without Polyhedra - Inria

Counting Dynamic Memory Requirement January 20, 2015 93 / 121

Dynamic Memory Requirement Estimation

void m1(int k) {

for (i = 1; i <= k; i++) {

A a = new A(); /* S3 */

B[] dummyArr = m2(i); /* S4 */

}

}

capm1(k) =∑

1≤i≤k(1 + retm2(i))

retm1 = { m1(k)→ 0 }capm1 =

({ (m1(k)→ S3(i))→ 1 }+ retm2 ◦ Bm1

)◦M

=

{m1(k)→ 3

2k +

1

2k2 : k ≥ 1

}memRqm1 = capm1 +

(memRqm2 ◦ Bm1 ◦M

)=

{m1(k)→ max

(5

2k +

1

2k2

): k ≥ 1

}

[7]

Page 256: Polyhedral Compilation without Polyhedra - Inria

Counting Dynamic Memory Requirement January 20, 2015 94 / 121

Dynamic Memory Requirement Estimation

void m0(int m) {

for (c = 0; c < m; c++) {

m1(c); /* S1 */

B[] m2Arr = m2(2 * m - c); /* S2 */

}

}

Bm0 = { (m0(k)→ S1(c))→ m1(c); (m0(k)→ S2(c))→ m2(2m − c) }retm0 = { m0(m)→ 0 }capm0 = (retm1 + retm2) ◦ Bm0 ◦M

=

{m0(m)→ 1

2m +

3

2m2 : m ≥ 1

}

memRqm0 = capm0 +((memRqm1 + memRqm2) ◦ Bm0 ◦M

)

=

{m0(m)→ max

(−2 + 2m + 2m2,

5

2m +

3

2m2

): m ≥ 1

}

[7]

Page 257: Polyhedral Compilation without Polyhedra - Inria

Counting Dynamic Memory Requirement January 20, 2015 94 / 121

Dynamic Memory Requirement Estimation

void m0(int m) {

for (c = 0; c < m; c++) {

m1(c); /* S1 */

B[] m2Arr = m2(2 * m - c); /* S2 */

}

}

Bm0 = { (m0(k)→ S1(c))→ m1(c); (m0(k)→ S2(c))→ m2(2m − c) }retm0 = { m0(m)→ 0 }capm0 = (retm1 + retm2) ◦ Bm0 ◦M

=

{m0(m)→ 1

2m +

3

2m2 : m ≥ 1

}memRqm0 = capm0 +

((memRqm1 + memRqm2) ◦ Bm0 ◦M

)=

{m0(m)→ max

(−2 + 2m + 2m2,

5

2m +

3

2m2

): m ≥ 1

}

[7]

Page 258: Polyhedral Compilation without Polyhedra - Inria

Transitive Closures January 20, 2015 95 / 121

Outline

1 Polyhedral ModelIntroductionRepresentation

2 Polyhedral TransformationSchedulesAST Generation

3 DependencesSchedule ValidityDependencesStructures

4 DataflowParallelismDataflowApproximate DataflowRun-time dependent DataflowReductions

5 Aliasing6 Counting

CardinalityBoundsWeighted CountingDynamic Memory Requirement

7 Transitive Closures

Page 259: Polyhedral Compilation without Polyhedra - Inria

Transitive Closures January 20, 2015 96 / 121

Positive Powers

Definition (Power of a Relation)

Let R be a Presburger relation and k a positive integer, then power k ofrelation R is defined as

Rk :=

{R if k = 1

R ◦ Rk−1 if k ≥ 2.

Example

R = { (x)→ (x + 1) }Rk = { (x)→ (x + k) : k ≥ 1 }

Page 260: Polyhedral Compilation without Polyhedra - Inria

Transitive Closures January 20, 2015 96 / 121

Positive Powers

Definition (Power of a Relation)

Let R be a Presburger relation and k a positive integer, then power k ofrelation R is defined as

Rk :=

{R if k = 1

R ◦ Rk−1 if k ≥ 2.

Example

R = { (x)→ (x + 1) }Rk = { (x)→ (x + k) : k ≥ 1 }

Page 261: Polyhedral Compilation without Polyhedra - Inria

Transitive Closures January 20, 2015 97 / 121

Transitive Closures

Definition (Transitive Closure of a Relation)

Let R be a Presburger relation, then the transitive closure R+ of R is theunion of all positive powers of R,

R+ :=⋃k≥1

Rk .

Example

R = { (x)→ (x + 1) }Rk = { (x)→ (x + k) : k ≥ 1 }R+ = { (x)→ (y) : ∃k ≥ 1 : y = x + k } = { (x)→ (y) : y ≥ x + 1 }

Definition (Transitive Closure of a Relation, Alternative)

Inductive definition:R+ := R ∪

(R ◦ R+

)

Page 262: Polyhedral Compilation without Polyhedra - Inria

Transitive Closures January 20, 2015 97 / 121

Transitive Closures

Definition (Transitive Closure of a Relation)

Let R be a Presburger relation, then the transitive closure R+ of R is theunion of all positive powers of R,

R+ :=⋃k≥1

Rk .

Example

R = { (x)→ (x + 1) }Rk = { (x)→ (x + k) : k ≥ 1 }R+ = { (x)→ (y) : ∃k ≥ 1 : y = x + k } = { (x)→ (y) : y ≥ x + 1 }

Definition (Transitive Closure of a Relation, Alternative)

Inductive definition:R+ := R ∪

(R ◦ R+

)

Page 263: Polyhedral Compilation without Polyhedra - Inria

Transitive Closures January 20, 2015 97 / 121

Transitive Closures

Definition (Transitive Closure of a Relation)

Let R be a Presburger relation, then the transitive closure R+ of R is theunion of all positive powers of R,

R+ :=⋃k≥1

Rk .

Example

R = { (x)→ (x + 1) }Rk = { (x)→ (x + k) : k ≥ 1 }R+ = { (x)→ (y) : ∃k ≥ 1 : y = x + k } = { (x)→ (y) : y ≥ x + 1 }

Definition (Transitive Closure of a Relation, Alternative)

Inductive definition:R+ := R ∪

(R ◦ R+

)

Page 264: Polyhedral Compilation without Polyhedra - Inria

Transitive Closures January 20, 2015 98 / 121

Transitive Closures — Approximation

Fact

Given a Presburger relation R, the power Rk (with k a parameter) andthe transitive closure R+ may not be Presburger relations.

Example

R = { (x)→ (2 x) }Rk = { (x)→ (2k x) }

⇒ need for approximationI overapproximation R+

I underapproximation R+

Note

Do not use transitive closures if there is an alternative.

[15, 22]

Page 265: Polyhedral Compilation without Polyhedra - Inria

Transitive Closures January 20, 2015 98 / 121

Transitive Closures — Approximation

Fact

Given a Presburger relation R, the power Rk (with k a parameter) andthe transitive closure R+ may not be Presburger relations.

Example

R = { (x)→ (2 x) }Rk = { (x)→ (2k x) }

⇒ need for approximationI overapproximation R+

I underapproximation R+

Note

Do not use transitive closures if there is an alternative.

[15, 22]

Page 266: Polyhedral Compilation without Polyhedra - Inria

Transitive Closures January 20, 2015 98 / 121

Transitive Closures — Approximation

Fact

Given a Presburger relation R, the power Rk (with k a parameter) andthe transitive closure R+ may not be Presburger relations.

Example

R = { (x)→ (2 x) }Rk = { (x)→ (2k x) }

⇒ need for approximationI overapproximation R+

I underapproximation R+

Note

Do not use transitive closures if there is an alternative.

[15, 22]

Page 267: Polyhedral Compilation without Polyhedra - Inria

January 20, 2015 99 / 121

Part II

Tools

Page 268: Polyhedral Compilation without Polyhedra - Inria

Tools January 20, 2015 100 / 121

Outline

8 Tools

Page 269: Polyhedral Compilation without Polyhedra - Inria

Tools January 20, 2015 101 / 121

Availability — Representation [20]

{A(i) : 0 ≤ i ≤ n; B(i , j) : ∃α : i = 2α }

Named (and nested) spaces: isl

[n] -> { A[i]: 0 <= i <= n; B[i,j]: exists a: i = 2 a }

In omega(+):

symbolic n;

{ [0, i, 0]: 0 <= i <= n } union { [1, i,j]: exists a: i = 2 a }

A Bpadding

Presburger sets and relations: isl, omega(+)In PolyLib:2

4 6

0 -1 0 0 0 0

0 0 0 -1 0 0

1 0 1 0 0 0

1 0 -1 0 1 0

2 7

0 -1 0 0 0 0 -1

0 0 -1 0 2 0 0

Moreover: PolyLib deals with rational sets (polyhedra)

equality/inequality n

Page 270: Polyhedral Compilation without Polyhedra - Inria

Tools January 20, 2015 101 / 121

Availability — Representation [20]

{A(i) : 0 ≤ i ≤ n; B(i , j) : ∃α : i = 2α }

Named (and nested) spaces: isl

[n] -> { A[i]: 0 <= i <= n; B[i,j]: exists a: i = 2 a }

In omega(+):symbolic n;

{ [0, i, 0]: 0 <= i <= n } union { [1, i,j]: exists a: i = 2 a }

A Bpadding

Presburger sets and relations: isl, omega(+)

In PolyLib:2

4 6

0 -1 0 0 0 0

0 0 0 -1 0 0

1 0 1 0 0 0

1 0 -1 0 1 0

2 7

0 -1 0 0 0 0 -1

0 0 -1 0 2 0 0

Moreover: PolyLib deals with rational sets (polyhedra)

equality/inequality n

Page 271: Polyhedral Compilation without Polyhedra - Inria

Tools January 20, 2015 101 / 121

Availability — Representation [20]

{A(i) : 0 ≤ i ≤ n; B(i , j) : ∃α : i = 2α }

Named (and nested) spaces: isl

[n] -> { A[i]: 0 <= i <= n; B[i,j]: exists a: i = 2 a }

In omega(+):symbolic n;

{ [0, i, 0]: 0 <= i <= n } union { [1, i,j]: exists a: i = 2 a }

A Bpadding

Presburger sets and relations: isl, omega(+)

In PolyLib:2

4 6

0 -1 0 0 0 0

0 0 0 -1 0 0

1 0 1 0 0 0

1 0 -1 0 1 0

2 7

0 -1 0 0 0 0 -1

0 0 -1 0 2 0 0

Moreover: PolyLib deals with rational sets (polyhedra)

equality/inequality n

Page 272: Polyhedral Compilation without Polyhedra - Inria

Tools January 20, 2015 101 / 121

Availability — Representation [20]

{A(i) : 0 ≤ i ≤ n; B(i , j) : ∃α : i = 2α }

Named (and nested) spaces: isl

[n] -> { A[i]: 0 <= i <= n; B[i,j]: exists a: i = 2 a }

In omega(+):symbolic n;

{ [0, i, 0]: 0 <= i <= n } union { [1, i,j]: exists a: i = 2 a }

A Bpadding

Presburger sets and relations: isl, omega(+)In PolyLib:2

4 6

0 -1 0 0 0 0

0 0 0 -1 0 0

1 0 1 0 0 0

1 0 -1 0 1 0

2 7

0 -1 0 0 0 0 -1

0 0 -1 0 2 0 0

Moreover: PolyLib deals with rational sets (polyhedra)

equality/inequality n

Page 273: Polyhedral Compilation without Polyhedra - Inria

Tools January 20, 2015 102 / 121

Availability — Representation (2)

Uninterpreted functions: omega(+)

⇒ arity can be greater than 0⇒ not available in isl (yet)

Note: support in omega(+) for uninterpreted functions is veryrestrictive

I arguments need to be prefix of input/output dimensions

⇒ essentially symbolic constants

Page 274: Polyhedral Compilation without Polyhedra - Inria

Tools January 20, 2015 103 / 121

Availability — Operations

domainrange

inverseunion intersection difference emptiness application

PolyLib � � in ZPPL � � �PIP lexminomega(+) � � � � �isl � � � � �barvinok cardbernstein

LattE card

Page 275: Polyhedral Compilation without Polyhedra - Inria

Tools January 20, 2015 104 / 121

Availability — Operations (2)

bounds on weighted transitivelexmin cardinality polynomials counting closure

PolyLib �PPL �PIP �omega(+) Presburger “R+”

isl � � R+

barvinok partial � �bernstein �LattE � �

Page 276: Polyhedral Compilation without Polyhedra - Inria

Tools January 20, 2015 105 / 121

isl and Related Libraries and Tools

LLVM imath GMP

clang isl NTL PolyLib

Polly pet barvinok

PPCG isa iscc

Licenses:BSD/MITLGPLGPL

isl: manipulates parametric affine sets and relationsbarvinok: counts elements in parametric affine sets and relationspet: extracts polyhedral model from clang ASTPPCG: Polyhedral Parallel Code Generatoriscc: interactive calculatorisa: prototype tool set including derivation of process networks andequivalence checker

Page 277: Polyhedral Compilation without Polyhedra - Inria

Tools January 20, 2015 106 / 121

Overview of islisl is a thread-safe C library for manipulating integer sets and relations

bounded by affine constraints

involving symbolic constants and

existentially quantified variables

and quasi-affine and quasi-polynomial functions on such domains

Supported operations by core library include

intersectionunionset differenceinteger projectioncoalescingclosed convex hull

sampling, scanninginteger affine hulllexicographic optimizationtransitive closure (approx.)parametric vertex enumerationbounds on quasipolynomials

Polyhedral compilation libraryschedule trees

dataflow analysis

scheduling

AST generation

Page 278: Polyhedral Compilation without Polyhedra - Inria

Tools January 20, 2015 107 / 121

PPCGPPCG (http://ppcg.gforge.inria.fr/)

Input: C code

Output: CUDA or OpenCL code for GPGPUs

Steps:

extract polyhedral model from C code (pet)

dependence analysis (isl)

schedulingI expose parallelism and tiling opportunities (isl)I perform tiling (isl)I separate into parts mapped to host, GPU blocks and GPU threads

memory managementI add transfers of data to/from GPUI detect array reference groupsI allocate groups to registers and shared memory

generate AST (isl)

Page 279: Polyhedral Compilation without Polyhedra - Inria

Tools January 20, 2015 107 / 121

PPCGPPCG (http://ppcg.gforge.inria.fr/)

Input: C code

Output: CUDA or OpenCL code for GPGPUs

Steps:

extract polyhedral model from C code (pet)

dependence analysis (isl)

schedulingI expose parallelism and tiling opportunities (isl)I perform tiling (isl)I separate into parts mapped to host, GPU blocks and GPU threads

memory managementI add transfers of data to/from GPUI detect array reference groupsI allocate groups to registers and shared memory

generate AST (isl)

Page 280: Polyhedral Compilation without Polyhedra - Inria

Tools January 20, 2015 108 / 121

PPCG Example — Input

void matmul(int M, int N, int K,

float A[static const restrict M][K],

float B[static const restrict K][N],

float C[static const restrict M][N])

{

for (int i = 0; i < M; i++)

for (int j = 0; j < N; j++) {

S1: C[i][j] = 0;

for (int k = 0; k < K; k++)

S2: C[i][j] = C[i][j] + A[i][k] * B[k][j];

}

}

Options:

--ctx="[M,N,K] -> { : M = N = K = 256 }"

--sizes="{ kernel[i] -> tile [16 ,16 ,16];

kernel[i] -> block [8,16] }"

--pet -autodetect

Page 281: Polyhedral Compilation without Polyhedra - Inria

Tools January 20, 2015 109 / 121

PPCG Example — Outputlong b0 = blockIdx.y, b1 = blockIdx.x;

long t0 = threadIdx.y, t1 = threadIdx.x;

__shared__ float s_A [16][16];

float p_C [2][1];

__shared__ float s_B [16][16];

for (long g9 = 0; g9 <= 15; g9 += 1) {

for (long c0 = t0; c0 <= 15; c0 += 8)

s_B[c0][t1] = B[(16 * g9 + c0) * (256) + 16 * b1 + t1];

for (long c0 = t0; c0 <= 15; c0 += 8)

s_A[c0][t1] = A[(16 * b0 + c0) * (256) + t1 + 16 * g9];

__syncthreads ();

if (g9 == 0) {

p_C [0][0] = (0);

p_C [1][0] = (0);

}

for (long c3 = 0; c3 <= 15; c3 += 1) {

p_C [0][0] = (p_C [0][0] + (s_A[t0][c3] * s_B[c3][t1]));

p_C [1][0] = (p_C [1][0] + (s_A[t0 + 8][c3] * s_B[c3][t1]));

}

__syncthreads ();

}

C[(16 * b0 + t0) * (256) + 16 * b1 + t1] = p_C [0][0];

C[(16 * b0 + t0 + 8) * (256) + 16 * b1 + t1] = p_C [1][0];

Page 282: Polyhedral Compilation without Polyhedra - Inria

Tools January 20, 2015 110 / 121

CARP ProjectDesign tools and techniques to aid Correct and Efficient AcceleratorProgramming

Key areas:

High level programming models

Advanced compilation techniques

Formal verification

Partners:

Imperial College London (UK)

ENS (FR)

ARM (UK)

Realeyes (ES)

RWTH Aachen University (DE)

Monoidics (UK)

University of Twente (NL)

Rightware (FI)

Page 283: Polyhedral Compilation without Polyhedra - Inria

Tools January 20, 2015 111 / 121

CARP Approach

Page 284: Polyhedral Compilation without Polyhedra - Inria

Tools January 20, 2015 112 / 121

References I

[1] R. Baghdadi, A. Cohen, S. Verdoolaege, and K. Trifunovic.Improved loop tiling based on the removal of spurious falsedependences.TACO, 9(4):52, 2013.

[2] R. Bagnara, P. M. Hill, and E. Zaffanella.The Parma Polyhedra Library: Toward a complete set of numericalabstractions for the analysis and verification of hardware and softwaresystems.Science of Computer Programming, 72(1–2):3–21, 2008.

[3] D. Barthou, A. Cohen, and J.-F. Collard.Maximal static expansion.In POPL ’98: Proceedings of the 25th ACM SIGPLAN-SIGACTsymposium on Principles of programming languages, pages 98–106,New York, NY, USA, 1998. ACM.

Page 285: Polyhedral Compilation without Polyhedra - Inria

Tools January 20, 2015 113 / 121

References II

[4] D. Barthou, J.-F. Collard, and P. Feautrier.Fuzzy array dataflow analysis.J. Parallel Distrib. Comput., 40(2):210–226, 1997.

[5] C. Bastoul.Code generation in the polyhedral model is easier than you think.In PACT ’04: Proceedings of the 13th International Conference onParallel Architectures and Compilation Techniques, pages 7–16,Washington, DC, USA, 2004. IEEE Computer Society.

[6] C. Chen.Polyhedra scanning revisited.SIGPLAN Not., 47(6):499–508, June 2012.

Page 286: Polyhedral Compilation without Polyhedra - Inria

Tools January 20, 2015 114 / 121

References III

[7] P. Clauss, F. J. Fernandez, D. Garbervetsky, and S. Verdoolaege.Symbolic polynomial maximization over convex sets and itsapplication to memory requirement estimation.IEEE Transactions on VLSI Systems, 17(8):983–996, Aug. 2009.

[8] P. Cousot and N. Halbwachs.Automatic discovery of linear restraints among variables of aprogram.In Conference Record of the Fifth Annual ACM Symposium onPrinciples of Programming Languages, pages 84–96, Tucson, Arizona,1978. ACM Press.

[9] P. Feautrier.Dataflow analysis of array and scalar references.International Journal of Parallel Programming, 20(1):23–53, 1991.

Page 287: Polyhedral Compilation without Polyhedra - Inria

Tools January 20, 2015 115 / 121

References IV

[10] P. Feautrier and C. Lengauer.The polyhedron model.In D. Padua, editor, Encyclopedia of Parallel Computing, pages1581–1592. Springer, 2011.

[11] S. Girbal, N. Vasilache, C. Bastoul, A. Cohen, D. Parello, M. Sigler,and O. Temam.Semi-automatic composition of loop transformations for deepparallelism and memory hierarchies.Int. J. Parallel Program., 34(3):261–317, June 2006.

[12] T. Grosser, A. Groesslinger, and C. Lengauer.Polly - performing polyhedral optimizations on a low-levelintermediate representation.Parallel Processing Letters, 22(04), 2012.

Page 288: Polyhedral Compilation without Polyhedra - Inria

Tools January 20, 2015 116 / 121

References V

[13] W. Kelly.Optimization within a unified transformation framework.Technical Report CS-TR-3725, Dept. of CS, Univ. of Maryland,College Park, 1996.

[14] W. Kelly, V. Maslov, W. Pugh, E. Rosser, T. Shpeisman, andD. Wonnacott.The Omega library.Technical report, University of Maryland, Nov. 1996.

[15] W. Kelly, W. Pugh, E. Rosser, and T. Shpeisman.Transitive closure of infinite graphs and its applications.Int. J. Parallel Program., 24(6):579–598, 1996.

Page 289: Polyhedral Compilation without Polyhedra - Inria

Tools January 20, 2015 117 / 121

References VI

[16] V. Loechner and D. K. Wilde.Parameterized polyhedra and their vertices.International Journal of Parallel Programming, 25(6):525–549, Dec.1997.

[17] V. Maslov.Lazy array data-flow dependence analysis.In H.-J. Boehm, B. Lang, and D. M. Yellin, editors, POPL, pages311–325. ACM Press, 1994.

[18] S. Verdoolaege.isl: An integer set library for the polyhedral model.In K. Fukuda, J. Hoeven, M. Joswig, and N. Takayama, editors,Mathematical Software - ICMS 2010, volume 6327 of Lecture Notesin Computer Science, pages 299–302. Springer, 2010.

Page 290: Polyhedral Compilation without Polyhedra - Inria

Tools January 20, 2015 118 / 121

References VII

[19] S. Verdoolaege.Polyhedral process networks.Springer, 2010.

[20] S. Verdoolaege.Counting affine calculator and applications.In First International Workshop on Polyhedral CompilationTechniques (IMPACT’11), Chamonix, France, Apr. 2011.

[21] S. Verdoolaege, J. Carlos Juega, A. Cohen, J. Ignacio Gomez,C. Tenllado, and F. Catthoor.Polyhedral parallel code generation for CUDA.ACM TACO, 9(4):54, 2013.

Page 291: Polyhedral Compilation without Polyhedra - Inria

Tools January 20, 2015 119 / 121

References VIII

[22] S. Verdoolaege, A. Cohen, and A. Beletska.Transitive closures of affine integer tuple relations and theiroverapproximations.In Proceedings of the 18th International Conference on StaticAnalysis, SAS’11, pages 216–232, Berlin, Heidelberg, 2011.Springer-Verlag.

[23] S. Verdoolaege and T. Grosser.Polyhedral extraction tool.In Second International Workshop on Polyhedral CompilationTechniques (IMPACT’12), Paris, France, Jan. 2012.

[24] S. Verdoolaege, S. Guelton, T. Grosser, and A. Cohen.Schedule trees.In Proceedings of the 4th International Workshop on PolyhedralCompilation Techniques, Vienna, Austria, Jan. 2014.

Page 292: Polyhedral Compilation without Polyhedra - Inria

Tools January 20, 2015 120 / 121

References IX

[25] S. Verdoolaege, H. Nikolov, and T. Stefanov.On demand parametric array dataflow analysis.In Third International Workshop on Polyhedral CompilationTechniques (IMPACT’12), Berlin, Germany, Jan. 2013.

[26] S. Verdoolaege, R. Seghir, K. Beyls, V. Loechner, andM. Bruynooghe.Counting integer points in parametric polytopes using Barvinok’srational functions.Algorithmica, 48(1):37–66, June 2007.

[27] D. K. Wilde.A library for doing polyhedral operations.Technical Report 785, IRISA, Rennes, France, 1993.