6.172 Performance Engineering of Software Systems

© 2008-2018 by the MIT 6.172 Lecturers

LECTURE 15 Cache-Oblivious Algorithms

Julian Shun

SIMULATION OF HEAT DIFFUSION

Heat Diffusion

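2D heat equation

$$\frac{\partial u}{\partial t} = \alpha \left( \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} \right)$$

Let u(t, x, y) = temperature at time t of point (x, y).

α is the thermal diffusivity.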

Acknowledgment: Some of the slides in this presentation were inspired by originals due to Matteo Frigo.

2D Heat-Diffusion Simulation

[Figure: temperature distribution before and after the simulation]

1D Heat Equation

$$\frac{\partial u}{\partial t} = \alpha \frac{\partial^2 u}{\partial x^2}$$

Finite-Difference Approximation

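The 1D heat equation thus reduces to

$$\frac{u(t+\Delta t, x) - u(t, x)}{\Delta t} = \alpha \, \frac{u(t, x+\Delta x) - 2u(t, x) + u(t, x-\Delta x)}{(\Delta x)^2}$$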

3-Point Stencil

A stencil computation updates each point in an array by a fixed pattern, called a stencil.

[Figure: iteration space of points in the (x, t) plane, with the boundary marked]

Update rule:
u[t+1][x] = u[t][x] + ALPHA * (u[t][x+1] - 2*u[t][x] + u[t][x-1]);
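As a worked step, solving the finite-difference form of the 1D heat equation for the value at the next time step gives the shape of the update rule. Identifying ALPHA with αΔt/(Δx)² is an inference from that shape, not something the slides state.

```latex
\[
u(t+\Delta t, x) = u(t, x)
  + \underbrace{\frac{\alpha \, \Delta t}{(\Delta x)^2}}_{\texttt{ALPHA}}
    \bigl( u(t, x+\Delta x) - 2u(t, x) + u(t, x-\Delta x) \bigr)
\]
```

With one array index per Δx and one time step per Δt, this reads as u[t+1][x] = u[t][x] + ALPHA * (u[t][x+1] - 2*u[t][x] + u[t][x-1]).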

CACHE-OBLIVIOUS STENCIL COMPUTATIONS

Recall: Ideal-Cache Model

Parameters
• Two-level hierarchy.
• Cache size of M bytes.
• Cache-line length (block size) of B bytes.
• Fully associative.
• Optimal omniscient replacement, or LRU.

Performance Measures
• work W (ordinary running time)
• cache misses Q

[Figure: processor P attached to a cache of M/B cache lines, each B bytes long, backed by memory]

Cache Behavior of Looping

double u[2][N]; // even-odd trick

static inline double kernel(double * w) {
  return w[0] + ALPHA * (w[-1] - 2*w[0] + w[1]);
}

for (size_t t = 1; t < T-1; ++t) { // time loop
  for (size_t x = 1; x < N-1; ++x) // space loop
    u[(t+1)%2][x] = kernel( &u[t%2][x] );
}

[Figure: the array u in the (x, t) plane, swept one full row per time step]

Assuming LRU, if N > M, then Q = Θ(NT/B).
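The looping bound follows from a short counting argument, sketched here under the ideal-cache model with LRU replacement. It assumes N and M are measured in the same units, and the constant factors are not meant to be exact.

```latex
\begin{align*}
\text{misses per time step} &= \Theta(N/B)
  && \text{(each row is scanned once, one miss per cache line)} \\
Q &= T \cdot \Theta(N/B) = \Theta(NT/B)
  && \text{(since } N > M \text{, LRU evicts every line before the next step reuses it)}
\end{align*}
```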

Cache-Oblivious 3-Point Stencil

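Recursively traverse trapezoidal regions of space-time points (t, x) such that

t0 ≤ t < t1
x0 + dx0(t - t0) ≤ x < x1 + dx1(t - t0)
dx0, dx1 ∈ {-1, 0, 1}

[Figure: a trapezoid in the (x, t) plane between times t0 and t1, with bottom edge from x0 to x1; its height and width are labeled]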

Base Case

If height = 1, compute all space-time points in the trapezoid. Any order of computation is valid, since no point depends on another.

Space Cut

If width ≥ 2·height, cut the trapezoid with a line of slope -1 through the center. Traverse the trapezoid on the left first, and then the one on the right.

Time Cut

If width < 2·height, cut the trapezoid with a horizontal line through the center. Traverse the bottom trapezoid first, and then the top one.

C Implementation

void trapezoid(int64_t t0, int64_t t1, int64_t x0, int64_t dx0,
               int64_t x1, int64_t dx1)
{
  int64_t lt = t1 - t0;
  if (lt == 1) { // base case
    for (int64_t x = x0; x < x1; x++)
      u[t1%2][x] = kernel( &u[t0%2][x] );
  } else if (lt > 1) {
    if (2 * (x1 - x0) + (dx1 - dx0) * lt >= 4 * lt) { // space cut
      int64_t xm = (2 * (x0 + x1) + (2 + dx0 + dx1) * lt) / 4;
      trapezoid(t0, t1, x0, dx0, xm, -1);
      trapezoid(t0, t1, xm, -1, x1, dx1);
    } else { // time cut
      int64_t halflt = lt / 2;
      trapezoid(t0, t0 + halflt, x0, dx0, x1, dx1);
      trapezoid(t0 + halflt, t1, x0 + dx0 * halflt, dx0,
                x1 + dx1 * halflt, dx1);
    }
  }
}
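To see that the trapezoidal traversal computes the same values as the plain loops, the sketch below runs both from the same initial state and compares the final rows. The grid size N, step count T, the value of ALPHA, and the helpers init_grid, looping, and compare_traversals are assumptions made for this illustration; only trapezoid and kernel follow the slides.

```c
#include <assert.h>
#include <stdint.h>

#define N 64        /* number of space points (assumed for this demo) */
#define T 40        /* number of time steps (assumed for this demo) */
#define ALPHA 0.1   /* assumed coefficient, small enough to stay stable */

static double u[2][N];   /* even-odd trick: row t%2 holds time step t */

static inline double kernel(const double *w) {
  return w[0] + ALPHA * (w[-1] - 2 * w[0] + w[1]);
}

/* Hypothetical helper: hot spot in the middle, boundaries fixed at zero. */
static void init_grid(void) {
  for (int r = 0; r < 2; r++)
    for (int64_t x = 0; x < N; x++)
      u[r][x] = 0.0;
  u[0][N / 2] = 100.0;
}

static void trapezoid(int64_t t0, int64_t t1, int64_t x0, int64_t dx0,
                      int64_t x1, int64_t dx1) {
  int64_t lt = t1 - t0;
  if (lt == 1) {                                           /* base case */
    for (int64_t x = x0; x < x1; x++)
      u[t1 % 2][x] = kernel(&u[t0 % 2][x]);
  } else if (lt > 1) {
    if (2 * (x1 - x0) + (dx1 - dx0) * lt >= 4 * lt) {      /* space cut */
      int64_t xm = (2 * (x0 + x1) + (2 + dx0 + dx1) * lt) / 4;
      trapezoid(t0, t1, x0, dx0, xm, -1);
      trapezoid(t0, t1, xm, -1, x1, dx1);
    } else {                                               /* time cut */
      int64_t halflt = lt / 2;
      trapezoid(t0, t0 + halflt, x0, dx0, x1, dx1);
      trapezoid(t0 + halflt, t1, x0 + dx0 * halflt, dx0,
                x1 + dx1 * halflt, dx1);
    }
  }
}

/* Looping baseline: advances all interior points through T time steps. */
static void looping(void) {
  for (int64_t t = 0; t < T; t++)
    for (int64_t x = 1; x < N - 1; x++)
      u[(t + 1) % 2][x] = kernel(&u[t % 2][x]);
}

/* Runs both traversals from the same initial state and returns the largest
   absolute difference between their final rows. */
static double compare_traversals(void) {
  assert(T > 0 && N > 2);
  double ref[N];
  init_grid();
  looping();
  for (int64_t x = 0; x < N; x++) ref[x] = u[T % 2][x];
  init_grid();
  trapezoid(0, T, 1, 0, N - 1, 0);   /* interior points only */
  double worst = 0.0;
  for (int64_t x = 0; x < N; x++) {
    double d = u[T % 2][x] - ref[x];
    if (d < 0) d = -d;
    if (d > worst) worst = d;
  }
  return worst;
}
```

Calling compare_traversals() should return a value at or very near zero, since both traversals apply the same kernel to the same inputs and differ only in the order in which points are visited.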

Cache Analysis

[Figure: recursion tree whose internal nodes each cost Θ(1) and whose leaves are trapezoids]

• Each leaf represents Θ(hw) points, where h = Θ(w).
• Each leaf incurs Θ(w/B) misses, where w = Θ(M).
• Θ(NT/hw) leaves.
• #internal nodes = #leaves - 1, so internal nodes do not contribute substantially to Q.
• Q = Θ(NT/hw)·Θ(w/B) = Θ(NT/M²)·Θ(M/B) = Θ(NT/MB).
• For d dimensions, Q = Θ(NT/(M^(1/d) B)).

Simulation: 3-Point Stencil

• Rectangular region
    N = 95
    T = 87
• Fully associative LRU cache
    B = 4 points
    M = 32 points
• Cache-hit latency = 1 cycle
• Cache-miss latency = 10 cycles

[Animation: looping traversal versus trapezoid traversal]

Looping v. Trapezoid on Heat

Impact on Performance

Q. How can the cache-oblivious trapezoidal decomposition have so many fewer cache misses, but the advantage gained over the looping version be so marginal?

A. Prefetching and a good memory architecture. One core cannot saturate the memory bandwidth.

[Figure: multicore memory hierarchy with processors (P), caches ($), network, memory, and I/O; there is plenty of bandwidth for 1 core]

CACHING AND PARALLELISM

Cilk and Caching

Theorem. Let Q_P be the number of cache misses in a deterministic Cilk computation when run on P processors, each with a private cache, and let S_P be the number of successful steals during the computation. In the ideal-cache model, we have

Q_P = Q_1 + O(S_P M/B),

where M is the cache size and B is the size of a cache block.

Proof. After a worker steals a continuation, its cache is completely cold in the worst case. But after M/B (cold) cache misses, its cache is identical to that in the serial execution. The same is true when a worker resumes a stolen subcomputation after a cilk_sync. The number of times these two situations can occur is at most 2S_P. ∎

MORAL: Minimizing cache misses in the serial elision essentially minimizes them in parallel executions.

Does this work in parallel?

Space cut: If width ≥ 2·height, cut the trapezoid with a line of slope -1 through the center. Traverse the trapezoid on the left first, and then the one on the right.

Parallel Space Cuts

A parallel space cut produces two black trapezoids that can be executed in parallel and a third gray trapezoid that executes in series with the black trapezoids.

[Figure: upright trapezoid, where the two black trapezoids (labeled 1) run first and the gray trapezoid between them (labeled 2) runs afterward]

[Figure: inverted trapezoid, where the gray trapezoid in the middle (labeled 1) runs first and the two black trapezoids (labeled 2) run afterward]

Parallel Looping v. Parallel Trap.

Performance Comparison

Heat equation on a 3000×3000 grid for 1000 time steps (4 processor cores with 8MB LLC)

Code                     Time
Serial looping           128.95s
Parallel looping          66.97s
Serial trapezoidal        66.76s
Parallel trapezoidal      16.86s

Parallel speedup over the corresponding serial code: 1.93x for looping, 3.96x for trapezoidal.

The parallel looping code achieves less than half the potential speedup, even though it has far more parallelism.
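The speedup figures can be checked directly from the timings in the table. Treating 4x (one per core) as the potential speedup is an assumption made for this arithmetic.

```latex
\begin{align*}
\text{looping:} \quad & \frac{128.95\,\text{s}}{66.97\,\text{s}} \approx 1.93
  && (\approx 48\% \text{ of an ideal } 4\times) \\
\text{trapezoidal:} \quad & \frac{66.76\,\text{s}}{16.86\,\text{s}} \approx 3.96
  && (\approx 99\% \text{ of an ideal } 4\times)
\end{align*}
```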

Memory Bandwidth

[Figure: multicore memory hierarchy with processors (P), caches ($), network, memory, and I/O; the path to memory is marked as a potential bottleneck]

Impediments to Speedup

□ Insufficient parallelism
□ Scheduling overhead
□ Lack of memory bandwidth
□ Contention (locking and true/false sharing)

Cilkscale can diagnose the first two problems.

Q. How can we diagnose the third?
A. Run P identical copies of the serial code in parallel, if you have enough memory.

Tools exist to detect lock contention in an execution, but not the potential for lock contention. Potential for true and false sharing is even harder to detect.

CACHE-OBLIVIOUS SORTING

OUTLINE

• Simulation of Heat Diffusion
• Cache-Oblivious Stencil Computations
• Caching and Parallelism
• Cache-Oblivious Sorting

Merging Two Sorted Arrays

void merge(int64_t *C, int64_t *A, int64_t na, int64_t *B, int64_t nb) {
  while (na>0 && nb>0) {
    if (*A <= *B) {
      *C++ = *A++; na--;
    } else {
      *C++ = *B++; nb--;
    }
  }
  while (na>0) {
    *C++ = *A++; na--;
  }
  while (nb>0) {
    *C++ = *B++; nb--;
  }
}

Example: merging 3 12 19 46 with 4 14 21 23 yields 3 4 12 14 19 21 23 46.

Time to merge n elements = Θ(n).

Number of cache misses = Θ(n/B).

Merge Sort

void merge_sort(int64_t *B, int64_t *A, int64_t n) {
  if (n==1) {
    B[0] = A[0];
  } else {
    int64_t C[n];
    cilk_spawn merge_sort(C, A, n/2);
    merge_sort(C+n/2, A+n/2, n-n/2);
    cilk_sync;
    merge(B, C, n/2, C+n/2, n-n/2);
  }
}

Example: merge sort on the array 19 3 12 46 33 4 21 14. Once the recursion bottoms out, sorted runs are merged pairwise:

merge:  3 19 | 12 46 | 4 33 | 14 21
merge:  3 12 19 46 | 4 14 21 33
merge:  3 4 12 14 19 21 33 46

Work of Merge Sort

void merge_sort(int64_t *B, int64_t *A, int64_t n) {
  if (n==1) {
    B[0] = A[0];
  } else {
    int64_t C[n];
    merge_sort(C, A, n/2);
    merge_sort(C+n/2, A+n/2, n-n/2);
    merge(B, C, n/2, C+n/2, n-n/2);
  }
}

Work: W(n) = 2W(n/2) + Θ(n)
           = Θ(n lg n)

CASE 2 of the master method: n^(log_b a) = n^(log_2 2) = n, and f(n) = Θ(n^(log_b a) lg^0 n).
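For readers who want to run the serial merge sort, here is a minimal self-contained sketch. It follows the structure of the slide code, with two deviations that are assumptions of this example: the temporary array C is heap-allocated rather than placed on the stack, and is_sorted is a hypothetical helper added for checking.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Merge sorted arrays A (na elements) and B (nb elements) into C. */
static void merge(int64_t *C, const int64_t *A, int64_t na,
                  const int64_t *B, int64_t nb) {
  while (na > 0 && nb > 0) {
    if (*A <= *B) { *C++ = *A++; na--; }
    else          { *C++ = *B++; nb--; }
  }
  while (na > 0) { *C++ = *A++; na--; }
  while (nb > 0) { *C++ = *B++; nb--; }
}

/* Sort the n >= 1 elements of A into B. */
static void merge_sort(int64_t *B, const int64_t *A, int64_t n) {
  assert(n >= 1);
  if (n == 1) {
    B[0] = A[0];
  } else {
    /* Assumption: malloc instead of the slide's stack array, so that
       large inputs do not overflow the stack. */
    int64_t *C = malloc((size_t)n * sizeof *C);
    assert(C != NULL);
    merge_sort(C, A, n / 2);
    merge_sort(C + n / 2, A + n / 2, n - n / 2);
    merge(B, C, n / 2, C + n / 2, n - n / 2);
    free(C);
  }
}

/* Hypothetical helper: returns 1 if A[0..n-1] is in nondecreasing order. */
static int is_sorted(const int64_t *A, int64_t n) {
  for (int64_t i = 1; i < n; i++)
    if (A[i - 1] > A[i]) return 0;
  return 1;
}
```

Each call does Θ(n) work in merge and makes two recursive calls on halves, which is exactly the recurrence W(n) = 2W(n/2) + Θ(n).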

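Recursion Tree

Solve W(n) = 2W(n/2) + Θ(n).

Level 0:   n                          level total: n
Level 1:   n/2  n/2                   level total: n
Level 2:   n/4  n/4  n/4  n/4         level total: n
  ...
Leaves:    n leaves of Θ(1) each      level total: Θ(n)

Height h = lg n; #leaves = n.

W(n) = Θ(n lg n)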

Now with Caching

Merge subroutine

Q(n) = Θ(n/B).

Merge sort

Q(n) = Θ(n/B)              if n ≤ cM, constant c ≤ 1;
       2Q(n/2) + Θ(n/B)    otherwise.

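Cache Analysis of Merge Sort

Q(n) = Θ(n/B)              if n ≤ cM, constant c ≤ 1;
       2Q(n/2) + Θ(n/B)    otherwise.

Recursion tree

Level 0:   n/B                              level total: n/B
Level 1:   n/2B  n/2B                       level total: n/B
Level 2:   n/4B  n/4B  n/4B  n/4B           level total: n/B
  ...
Leaves:    n/cM leaves of Θ(M/B) each       level total: Θ(n/B)

Height h = lg(n/cM); #leaves = n/cM.

Q(n) = Θ((n/B) lg(n/M))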

Bottom Line for Merge Sort

Q(n) = Θ(n/B)              if n ≤ cM, constant c ≤ 1;
       2Q(n/2) + Θ(n/B)    otherwise;
     = Θ((n/B) lg(n/M)).

∙ For n ≫ M, we have lg(n/M) ≈ lg n, and thus W(n)/Q(n) ≈ Θ(B).
∙ For n ≈ M, we have lg(n/M) ≈ Θ(1), and thus W(n)/Q(n) ≈ Θ(B lg n).

Multiway Merging

IDEA: Merge R < n subarrays with a tournament.

• Tournament takes Θ(R) work to produce the first output.
• Subsequent outputs cost Θ(lg R) per element.
• Total work merging = Θ(R + n lg R) = Θ(n lg R).

[Figure: tournament tree of height lg R over R sorted subarrays of n/R elements each; the winner at the root is emitted as the next output element]
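The costs listed for the tournament can be seen in a small R-way merge. This sketch uses a binary min-heap as a stand-in for the tournament tree, which is a substitution made for this example, not the slides' data structure: building the heap takes Θ(R) work and each output costs Θ(lg R), matching the bounds above. The names entry_t, sift_down, and multiway_merge are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* One heap entry: the current head element of a run. */
typedef struct {
  int64_t key;   /* value at the head of the run */
  int64_t run;   /* index of the run it came from */
} entry_t;

/* Restore the min-heap property below position i. */
static void sift_down(entry_t *h, int64_t size, int64_t i) {
  for (;;) {
    int64_t l = 2 * i + 1, r = l + 1, m = i;
    if (l < size && h[l].key < h[m].key) m = l;
    if (r < size && h[r].key < h[m].key) m = r;
    if (m == i) return;
    entry_t tmp = h[i]; h[i] = h[m]; h[m] = tmp;
    i = m;
  }
}

/* Merge R sorted runs into out. runs[i] points at run i, which holds
   len[i] elements; out must have room for all of them. */
static void multiway_merge(int64_t *out, const int64_t *const *runs,
                           const int64_t *len, int64_t R) {
  assert(R > 0);
  entry_t *heap = malloc((size_t)R * sizeof *heap);
  int64_t *pos = calloc((size_t)R, sizeof *pos);   /* next index per run */
  assert(heap != NULL && pos != NULL);

  int64_t size = 0;
  for (int64_t i = 0; i < R; i++) {
    if (len[i] > 0) {
      heap[size].key = runs[i][0];
      heap[size].run = i;
      size++;
      pos[i] = 1;
    }
  }
  /* Theta(R) work before the first output can be produced. */
  for (int64_t i = size / 2 - 1; i >= 0; i--) sift_down(heap, size, i);

  while (size > 0) {
    int64_t r = heap[0].run;
    *out++ = heap[0].key;
    if (pos[r] < len[r]) heap[0].key = runs[r][pos[r]++];  /* refill from run r */
    else                 heap[0] = heap[--size];           /* run r is exhausted */
    sift_down(heap, size, 0);                  /* Theta(lg R) per output */
  }
  free(heap);
  free(pos);
}
```

A true tournament (winner) tree replays only the path from the refilled leaf to the root; the heap here gives the same asymptotic cost per output with less code.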

Work of Multiway Merge Sort

W(n) = Θ(1)                     if n = 1;
       R·W(n/R) + Θ(n lg R)     otherwise.

Recursion tree (branching factor R)

Level 0:   1 node of n lg R             level total: n lg R
Level 1:   R nodes of (n/R) lg R        level total: n lg R
Level 2:   R² nodes of (n/R²) lg R      level total: n lg R
  ...
Leaves:    n leaves of Θ(1) each        level total: Θ(n)

Height = log_R n; #leaves = n.

W(n) = Θ((n lg R) log_R n + n)
     = Θ((n lg R)(lg n)/lg R + n)
     = Θ(n lg n)

Same as binary merge sort.

Caching Recurrence

Assume that we have R < cM/B for a sufficiently small constant c ≤ 1.

Consider the R-way merging of contiguous arrays of total size n. If R < cM/B, the entire tournament plus 1 block from each array can fit in cache.
⇒ Q(n) ≤ Θ(n/B) for merging.

R-way merge sort

Q(n) ≤ Θ(n/B)                if n < cM;
       R·Q(n/R) + Θ(n/B)     otherwise.

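Cache Analysis

Q(n) ≤ Θ(n/B)                if n < cM;
       R·Q(n/R) + Θ(n/B)     otherwise.

Recursion tree (branching factor R)

Level 0:   1 node of n/B                    level total: n/B
Level 1:   R nodes of n/RB                  level total: n/B
Level 2:   R² nodes of n/R²B                level total: n/B
  ...
Leaves:    n/cM leaves of Θ(M/B) each       level total: Θ(n/B)

Height = log_R(n/cM); #leaves = n/cM.

Q(n) = Θ((n/B) log_R(n/M))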

Tuning the Voodoo Parameter

We have

Q(n) = Θ((n/B) log_R(n/M)),

which decreases as R ≤ cM/B increases. Choosing R as big as possible yields

R = Θ(M/B).

By the tall-cache assumption and the fact that log_M(n/M) = Θ((lg n)/lg M), we have

Q(n) = Θ((n/B) log_{M/B}(n/M))
     = Θ((n/B) log_M(n/M))
     = Θ((n lg n)/(B lg M)).

Hence, we have W(n)/Q(n) ≈ Θ(B lg M).

Multiway versus Binary Merge Sort

We have

Q_multiway(n) = Θ((n lg n)/(B lg M))

versus

Q_binary(n) = Θ((n/B) lg(n/M))
            = Θ((n lg n)/B),

as long as n ≫ M, because then lg(n/M) ≈ lg n. Thus, multiway merge sort saves a factor of Θ(lg M) in cache misses.

Example (ignoring constants)
∙ L1-cache: M = 2^15 ⇒ 15× savings.
∙ L2-cache: M = 2^18 ⇒ 18× savings.
∙ L3-cache: M = 2^23 ⇒ 23× savings.
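The savings factor in these examples is the ratio of the two miss bounds, written out below for n ≫ M. Hidden constants are ignored, as in the examples themselves.

```latex
\[
\frac{Q_{\text{binary}}(n)}{Q_{\text{multiway}}(n)}
  = \frac{\Theta\bigl((n \lg n)/B\bigr)}{\Theta\bigl((n \lg n)/(B \lg M)\bigr)}
  = \Theta(\lg M),
\qquad \text{so } M = 2^{15} \text{ gives } \lg M = 15 .
\]
```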

Optimal Cache-Oblivious Sorting

Funnelsort [FLPR99]

1. Recursively sort n^(1/3) groups of n^(2/3) items.

2. Merge the sorted groups with an n^(1/3)-funnel.

A k-funnel merges k³ items in k sorted lists, incurring at most

O(k + (k³/B)(1 + log_M k))

cache misses. Thus, funnelsort incurs

O(1 + (n/B)(1 + log_M n))

cache misses, which is asymptotically optimal [AV88].

Construction of a k-funnel

[Figure: a k-funnel with k inputs, internal buffers of size k^(3/2), and an output of k³ items]

• Subfunnels in contiguous storage.
• Buffers in contiguous storage.
• Refill buffers on demand.
• Space = O(k²).
• Cache misses = O(k + (k³/B)(1 + log_M k)).

Tall-cache assumption: M = Ω(B²).

Other C-O Algorithms

Matrix Transposition/Addition
Straightforward recursive algorithm.
Θ(1 + mn/B)

Fast Fourier Transform
Variant of Cooley-Tukey [CT65] using cache-oblivious matrix transpose.
Θ(1 + (n/B)(1 + log_M n))

LUP-Decomposition
Recursive algorithm due to Sivan Toledo [T97].
Θ(1 + n²/B + n³/(B·M^(1/2)))

Strassen's Algorithm
Straightforward recursive algorithm.
Θ(n + n²/B + n^(lg 7)/(B·M^((lg 7)/2 - 1)))
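As an illustration of the "straightforward recursive algorithm" for matrix transposition listed above, here is a minimal sketch that repeatedly halves the larger dimension until a block is small. The slides give no code for this, so the function names, the row-major layout, and the BASE cutoff are all assumptions of this example.

```c
#include <assert.h>
#include <stddef.h>

#define BASE 8   /* assumed cutoff below which plain loops are used */

/* Transpose the rows x cols block of A with top-left corner (r0, c0)
   into B. A is row-major with leading dimension lda; B is row-major
   with leading dimension ldb. */
static void transpose_rec(const double *A, double *B,
                          size_t r0, size_t c0, size_t rows, size_t cols,
                          size_t lda, size_t ldb) {
  if (rows <= BASE && cols <= BASE) {
    for (size_t i = r0; i < r0 + rows; i++)
      for (size_t j = c0; j < c0 + cols; j++)
        B[j * ldb + i] = A[i * lda + j];
  } else if (rows >= cols) {
    size_t half = rows / 2;           /* split the taller dimension */
    transpose_rec(A, B, r0, c0, half, cols, lda, ldb);
    transpose_rec(A, B, r0 + half, c0, rows - half, cols, lda, ldb);
  } else {
    size_t half = cols / 2;           /* split the wider dimension */
    transpose_rec(A, B, r0, c0, rows, half, lda, ldb);
    transpose_rec(A, B, r0, c0 + half, rows, cols - half, lda, ldb);
  }
}

/* B (n x m) becomes the transpose of A (m x n); A and B must not alias. */
static void transpose(const double *A, double *B, size_t m, size_t n) {
  assert(A != B);
  transpose_rec(A, B, 0, 0, m, n, n, m);
}
```

The code never mentions M or B: the recursion eventually reaches blocks that fit in whatever cache is present, which is the sense in which the algorithm is cache-oblivious.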

C-O Data Structures

Ordered-File Maintenance
INSERT/DELETE anywhere in file while maintaining O(1)-sized gaps. Amortized bound [BDFC00], later improved in [BCDFC02].
O(1 + (lg² n)/B)

B-Trees
Solution [BDFC00] with later simplifications [BDIW02], [BFJ02].
INSERT/DELETE: O(1 + log_{B+1} n + (lg² n)/B)
SEARCH: O(1 + log_{B+1} n)
TRAVERSE: O(1 + k/B)

Priority Queues
Funnel-based solution [BF02]. General scheme based on buffer trees [ABDHMM02] supports INSERT/DELETE.
O(1 + (1/B) log_{M/B}(n/B))

MIT OpenCourseWare https://ocw.mit.edu

6.172 Performance Engineering of Software Systems
Fall 2018

For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms.