Dynamic Scheduling and Dynamic Percolation Elkin Garcia CAPSL – UD Based on a presentation made by Rishi Khan ET International 1


Dynamic Scheduling and Dynamic Percolation

Elkin Garcia
CAPSL – UD

Based on a presentation made by Rishi Khan
ET International


Motivation

• Static Scheduling cannot achieve the maximum performance on a many-core architecture (C64).

• This holds even for regular applications like matrix multiplication.


Issues of Static Scheduling

• Blocks are not necessarily multiples of the Optimal Tile Size (OTS).

• Extra overhead for processing non-optimal sized tiles.

• It is worse when many processors share a small, fixed amount of on-chip memory.
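As a rough illustration of the first two bullets, the sketch below counts how many tiles of a block have the optimal size and how many are leftovers. It uses the 6x6 optimal tile and the 192x192 problem size quoted on later slides; the 20x20 block and the function name `tile_breakdown` are hypothetical, chosen only to show a block that is not a multiple of the OTS.

```python
import math

def tile_breakdown(block_rows, block_cols, ots=6):
    """Count optimal-size and leftover tiles when a block is cut into ots x ots tiles."""
    # Tiles that fit completely inside the block have the optimal size.
    full = (block_rows // ots) * (block_cols // ots)
    # Every remaining tile on the right/bottom edge is smaller than optimal.
    total = math.ceil(block_rows / ots) * math.ceil(block_cols / ots)
    return full, total - full

# The whole 192x192 matrix divides evenly into 6x6 tiles.
whole_matrix = tile_breakdown(192, 192)
# A hypothetical 20x20 block handed to one thread does not.
static_block = tile_breakdown(20, 20)
print(whole_matrix, static_block)
```

Under these assumptions the full matrix yields 1024 optimal tiles and no leftovers, while the 20x20 block yields 9 optimal tiles and 7 smaller ones.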


Issues of Static Scheduling (2)

• SS does not consider stalls due to arbitration of shared resources.

• SS assumes that TUs doing the same amount of work will complete at the same time.

• Many-core architectures have plenty of shared resources:
  – FPUs, crossbar, memory, and I-Caches that can produce unexpected stalls.
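The effect of an unexpected stall can be sketched with a toy model. The task durations below are made up (one task is delayed, as if it had lost arbitration for a shared resource), and both function names are hypothetical. The static variant fixes equal-count blocks up front; the dynamic variant hands each task to whichever worker is free first.

```python
import heapq

def static_makespan(durations, workers):
    """Contiguous equal-count blocks, fixed before execution."""
    per_worker = len(durations) // workers  # assumes an even split
    return max(
        sum(durations[w * per_worker:(w + 1) * per_worker])
        for w in range(workers)
    )

def dynamic_makespan(durations, workers):
    """Each task goes to whichever worker becomes free first."""
    free_at = [0] * workers          # time at which each worker is idle again
    heapq.heapify(free_at)
    for d in durations:
        heapq.heappush(free_at, heapq.heappop(free_at) + d)
    return max(free_at)

# Eight nominally equal tasks; the fifth one stalls (hypothetical numbers).
durations = [1, 1, 1, 1, 5, 1, 1, 1]
static_time = static_makespan(durations, workers=2)
dynamic_time = dynamic_makespan(durations, workers=2)
print(static_time, dynamic_time)
```

In this toy case the static split finishes at time 8 because one worker absorbs the whole stall, while the dynamic assignment finishes at time 7.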


Dynamic Scheduling (DS)

Balances the load optimally in the presence of shared resources, with higher efficiency.

- Matrix C is partitioned only into tiles.

- Atomic operations are used to keep the overhead low.
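A minimal sketch of the idea, assuming tiles of C are numbered 0..N-1 and each thread claims the next unclaimed tile from a shared counter. Python has no hardware fetch-and-add, so the hypothetical `TileCounter` class uses a lock as a stand-in for the atomic operation; on the real machine this would be a single atomic instruction.

```python
import threading

class TileCounter:
    """Stand-in for an atomic fetch-and-add (a lock is used here instead)."""
    def __init__(self):
        self._next = 0
        self._lock = threading.Lock()

    def fetch_and_add(self, n=1):
        with self._lock:
            old = self._next
            self._next += n
            return old

def run(num_tiles, num_threads):
    counter = TileCounter()
    done_by = [[] for _ in range(num_threads)]  # tiles claimed by each thread

    def worker(tid):
        while True:
            tile = counter.fetch_and_add()      # claim the next tile of C
            if tile >= num_tiles:
                return                          # no work left
            done_by[tid].append(tile)           # placeholder for computing the tile

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return done_by

claimed = run(num_tiles=1024, num_threads=4)
all_tiles = sorted(t for tiles in claimed for t in tiles)
print(len(all_tiles))
```

Every tile is claimed exactly once, but how many tiles each thread ends up with depends on how fast it runs, which is the point of scheduling at runtime.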

[Figure: matrix product A × B = C.]

Results on Cyclops64


Dynamic Percolation (DP)

• Assign tasks (codelets) to TUs at runtime with low overhead using a lock-free queue:
  – Computation tasks: compute optimum tiles of 6x6.
  – Data movement tasks: move inputs and outputs between SRAM and DRAM using double buffering.
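The overlap of the two task kinds can be sketched as follows. This is an illustration only: a plain Python list plays the role of DRAM, slices play the role of SRAM buffers, and a standard-library thread pool stands in for the runtime's task queue (it is not lock-free). The names `load_block`, `compute`, and `double_buffered` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def load_block(dram, k, size):
    """Data movement task: copy block k from 'DRAM' into a fresh 'SRAM' buffer."""
    return dram[k * size:(k + 1) * size]

def compute(buf):
    """Computation task: placeholder for the tile computation on one buffer."""
    return sum(x * x for x in buf)

def double_buffered(dram, size):
    blocks = len(dram) // size
    results = []
    with ThreadPoolExecutor(max_workers=1) as mover:
        pending = mover.submit(load_block, dram, 0, size)
        for k in range(blocks):
            current = pending.result()        # wait until block k is resident
            if k + 1 < blocks:                # start moving block k+1 ...
                pending = mover.submit(load_block, dram, k + 1, size)
            results.append(compute(current))  # ... while block k is computed
    return results

results = double_buffered(list(range(8)), size=4)
print(results)
```

At any moment at most two buffers are alive: the one being computed on and the one being filled, which is what keeps the on-chip memory footprint fixed.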


Codelets

• A non-preemptive piece of code that can run to completion once its dependencies/events are met.

• Dependencies/Events:
  – Data dependencies
  – Resource constraints (threads/bandwidth)
  – Desired behavior (power)
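A minimal sketch of this firing rule, modelling data dependencies only (resource and power events are left out). The `Codelet` class, its fields, and the four example codelets are hypothetical names for illustration, not the runtime's actual interface.

```python
from collections import deque

class Codelet:
    def __init__(self, name, action, deps=0):
        self.name = name
        self.action = action
        self.pending = deps      # number of unmet dependencies/events
        self.successors = []     # codelets to signal on completion

def run(codelets):
    """Fire each codelet once all of its dependencies have been signalled."""
    ready = deque(c for c in codelets if c.pending == 0)
    order = []
    while ready:
        c = ready.popleft()
        c.action()               # runs to completion, never preempted
        order.append(c.name)
        for s in c.successors:
            s.pending -= 1       # signal one dependency as met
            if s.pending == 0:
                ready.append(s)
    return order

load_a = Codelet("load_a", lambda: None)
load_b = Codelet("load_b", lambda: None)
multiply = Codelet("multiply", lambda: None, deps=2)  # needs both inputs
store = Codelet("store", lambda: None, deps=1)
load_a.successors.append(multiply)
load_b.successors.append(multiply)
multiply.successors.append(store)

order = run([load_a, load_b, multiply, store])
print(order)
```

The `multiply` codelet becomes ready only after both loads have signalled it, and `store` only after `multiply` has finished.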


Matrix Multiply in SRAM using Petri nets

[Figure: Petri net for the matrix multiply A × B = C in SRAM, size 192x192. Node labels: INIT, Comp1, Clean, 1024, T, Low. Legend: place, transition, tokens.]

Matrix Multiply in DRAM

[Figure: Petri net for the matrix multiply in DRAM. Node labels: INIT, Comp1, Clean; INIT, Copy1, Clean; Done; 1024, 8, T, F, Low.]

Double Buffer Computation Example

Rules:
• Always take the highest-priority task first.
• If two tasks have the same priority, take the task that was enabled first.
• Otherwise, choose arbitrarily.

[Figure: double-buffer Petri net. Node labels: INIT, Comp1, Clean; INIT, Copy1, Clean; INIT, Comp2, Clean; INIT, Copy2, Clean; Done; Start; ΔT; 1024, 8, T, F, Low.]
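The selection rules can be sketched with a binary heap keyed on priority and enable order. The `ReadyPool` class is a hypothetical name, and the priorities given to the example tasks are illustrative only; the task names are borrowed from the figure labels.

```python
import heapq
import itertools

class ReadyPool:
    """Enabled tasks, ordered by priority and then by enable order."""
    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # enable order, used for tie-breaking

    def enable(self, name, priority):
        # Larger priority value = more urgent; heapq is a min-heap, so negate.
        heapq.heappush(self._heap, (-priority, next(self._seq), name))

    def take(self):
        _, _, name = heapq.heappop(self._heap)
        return name

HIGH, LOW = 1, 0  # illustrative priority levels

pool = ReadyPool()
pool.enable("Comp1", LOW)
pool.enable("Init Set 2", HIGH)
pool.enable("Comp2", LOW)
order = [pool.take() for _ in range(3)]
print(order)
```

Because every task receives a unique sequence number, no ties remain after the first two rules in this sketch; a runtime that does not record enable order exactly would fall back to the third rule and pick arbitrarily.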

Double Buffer Computation Example (continued)

[Figure: same double-buffer Petri net and scheduling rules as the first frame of this example. Frame-specific labels: High, Init Set 1, ΔT, 1, Low.]

Double Buffer Computation Example (continued)

[Figure: same double-buffer Petri net and scheduling rules as the first frame of this example. Frame-specific labels: Init Set 2, Comp1(1024), 1, 1024.]

Double Buffer Computation Example (continued)

[Figure: same double-buffer Petri net and scheduling rules as the first frame of this example. Frame-specific labels: Comp1(1020), 1020, Comp2(1024), 1024.]

Double Buffer Computation Example (continued)

[Figure: same double-buffer Petri net and scheduling rules as the first frame of this example. Frame-specific labels: Comp2(1024), 1024, Clean.]

Double Buffer Computation Example (continued)

[Figure: same double-buffer Petri net and scheduling rules as the first frame of this example. Frame-specific labels: Comp2(1020), 1020, Init Copy Set, 1.]

Double Buffer Computation Example (continued)

[Figure: same double-buffer Petri net and scheduling rules as the first frame of this example. Frame-specific labels: Comp2(1000), 1000, Copy (8), 8.]

Double Buffer Computation Example (continued)

[Figure: same double-buffer Petri net and scheduling rules as the first frame of this example. Frame-specific labels: Comp2(500), 500, Clean.]

Double Buffer Computation Example (continued)

[Figure: same double-buffer Petri net and scheduling rules as the first frame of this example. Frame-specific labels: Comp2(495), 495, Done, 1.]

Double Buffer Computation Example (continued)

[Figure: same double-buffer Petri net and scheduling rules as the first frame of this example. Frame-specific labels: Comp2(490), 490, Init Set 1, 1.]

Double Buffer Computation Example (continued)

[Figure: same double-buffer Petri net and scheduling rules as the first frame of this example. Frame-specific labels: Comp2(480), 480, Comp1(1024), 1024.]

Double Buffer Computation Example (continued)

[Figure: same double-buffer Petri net and scheduling rules as the first frame of this example. Frame-specific labels: Comp1(1024), 1024, Clean.]

Double Buffer Computation Example (continued)

[Figure: same double-buffer Petri net and scheduling rules as the first frame of this example. Frame-specific labels: Comp1(1020), 1020, Init Copy Set 2, 1.]


The Cool Demo on MxM

• Due to Rishi Khan (ETI)

04/21/23 UHPC-Portland-Meeting-06-2009


Final Results


Scalability


Summary

• Static optimizations increase performance substantially.

• Dynamic Scheduling and Dynamic Percolation mitigate the unpredictable effects of resource sharing.

• The optimizations implemented are also power efficient.


04/21/23 Tutorial Project Part 2

Acknowledgements

• Professor Guang Gao

• ETI and CAPSL people who have helped on this project (Rishi Khan, Daniel Orozco, Kelly Livingston, Ioannis Venetis)

• Members of CAPSL