20
March 4, 2008 http://csg.csail.mit.edu/ arvind DF3-1 Dynamic Dataflow Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology

March 4, 2008 Dynamic Dataflow Arvind Computer Science Artificial Intelligence Lab Massachusetts Institute of Technology

Embed Size (px)

DESCRIPTION

March 4, 2008 DF3-3http://csg.csail.mit.edu/arvind/ Dynamic Dataflow Architectures Allocate instruction templates, i.e., a frame, dynamically to support each loop iteration and procedure call termination detection needed to deallocate frames The code can be shared if we separate the code and the operand storage frame pointer instruction pointer a token

Citation preview

Page 1: March 4, 2008  Dynamic Dataflow Arvind Computer Science  Artificial Intelligence Lab Massachusetts Institute of Technology

March 4, 2008 http://csg.csail.mit.edu/arvind DF3-1

Dynamic Dataflow

ArvindComputer Science & Artificial Intelligence LabMassachusetts Institute of Technology

Page 2: March 4, 2008  Dynamic Dataflow Arvind Computer Science  Artificial Intelligence Lab Massachusetts Institute of Technology

March 4, 2008 DF3-2http://csg.csail.mit.edu/arvind/

Static Dataflow:Problems/Limitations

Mismatch between the model and the implementation

The model requires unbounded FIFO token queues per arc but the architecture provides storage for one token per arc

The architecture does not ensure FIFO order in the reuse of an operand slot

The merge operator has a unique firing ruleThe static model does not support

Function calls Data Structures

- No easy solution in the static framework- Dynamic dataflow provided a framework for solutions

Page 3: March 4, 2008  Dynamic Dataflow Arvind Computer Science  Artificial Intelligence Lab Massachusetts Institute of Technology

March 4, 2008 DF3-3http://csg.csail.mit.edu/arvind/

Dynamic Dataflow Architectures

Allocate instruction templates, i.e., a frame, dynamically to support each loop iteration and procedure call

termination detection needed to deallocate frames

The code can be shared if we separate the code and the operand storage

<fp, ip, port, data>

frame pointer

instruction pointer

a token

Page 4: March 4, 2008  Dynamic Dataflow Arvind Computer Science  Artificial Intelligence Lab Massachusetts Institute of Technology

March 4, 2008 DF3-4http://csg.csail.mit.edu/arvind/

A Frame in Dynamic DataflowProgram1

2345

+

*-+

3

1

2

4

5

3L, 4L3R, 4R

5L5Rout*

a b

+ *7

- +

*

yx

1 2

3 4

5

Need to provide storage for only one operand/operator

<fp, ip, p , v>

12

45

73

Frame

Page 5: March 4, 2008  Dynamic Dataflow Arvind Computer Science  Artificial Intelligence Lab Massachusetts Institute of Technology

March 4, 2008 DF3-5http://csg.csail.mit.edu/arvind/

Data Structures in Dataflow

. . . . PP

MemoryData structures reside in a structure store tokens carry pointers

I-structures: Write-once, Read multiple times or allocate, write, read, ..., read,

deallocate No problem if a reader arrives

before the writer at the memory location

I-fetch

a

I-store

a v

Page 6: March 4, 2008  Dynamic Dataflow Arvind Computer Science  Artificial Intelligence Lab Massachusetts Institute of Technology

March 4, 2008 DF3-6http://csg.csail.mit.edu/arvind/

I-Structure Storage: Split-phase operations & Presence bits

I-Fetch

t

<s, fp, a >s

1234a

aaa

v2fp.ip

I-structureMemory

• Need to deal with multiple deferred reads • other operations: fetch/store, take/put, clear

v1

address to be read

t

I-Fetch <a, Read, (t,fp)>s

split phase

forwarding address

Page 7: March 4, 2008  Dynamic Dataflow Arvind Computer Science  Artificial Intelligence Lab Massachusetts Institute of Technology

March 4, 2008 DF3-7http://csg.csail.mit.edu/arvind/

Dynamic Dataflow Graphs

Page 8: March 4, 2008  Dynamic Dataflow Arvind Computer Science  Artificial Intelligence Lab Massachusetts Institute of Technology

March 4, 2008 DF3-8http://csg.csail.mit.edu/arvind/

Firing Rules

Input tokens must have identical tags

+

<c•s,x>L <c•s,y>R

s:

t1:

L R

t2:

+S:

t2:t1:

<c•t1,x+y>p1 <c•t2,x+y>p2

p1 p2

context (fp)inst no (ip)port

Page 9: March 4, 2008  Dynamic Dataflow Arvind Computer Science  Artificial Intelligence Lab Massachusetts Institute of Technology

March 4, 2008 DF3-9http://csg.csail.mit.edu/arvind/

Well Behaved Graphs• • •

• • • • • •

• • •c • s1 c • sn

c • t1 c • tm

1. One-in-one-out property.2. Self Cleaning: Initially the graph contains no tokens; when all the output tokens have been produced no tokens remain in the graph.

Page 10: March 4, 2008  Dynamic Dataflow Arvind Computer Science  Artificial Intelligence Lab Massachusetts Institute of Technology

March 4, 2008 DF3-10http://csg.csail.mit.edu/arvind/

Consequences of TaggingNo need of merge in well behaved dataflow graphs.

Even if f(x1) completes after g(x2) the output tokens will carry the "correct" tag.

X

F T

x2x1

T F

f g

Page 11: March 4, 2008  Dynamic Dataflow Arvind Computer Science  Artificial Intelligence Lab Massachusetts Institute of Technology

March 4, 2008 DF3-11http://csg.csail.mit.edu/arvind/

Making Graphs Well BehavedSometimes extra inputs and outputs have to be provided to make graphs well behaved.

constants

3 3trigger input.

Page 12: March 4, 2008  Dynamic Dataflow Arvind Computer Science  Artificial Intelligence Lab Massachusetts Institute of Technology

March 4, 2008 DF3-12http://csg.csail.mit.edu/arvind/

Making Conditionals

Well BehavedAll unconnected outputs of switches are collected to generate a signal.

X Yb

p f

T F T F

X

signal

X

S

Page 13: March 4, 2008  Dynamic Dataflow Arvind Computer Science  Artificial Intelligence Lab Massachusetts Institute of Technology

March 4, 2008 DF3-13http://csg.csail.mit.edu/arvind/

I-Structure Operations

I-store I-store

vs

t p

s

t pI-structure Storage

< , "write", v>

signal

I-store: <c•s, >1 x <c•s,v>2 < ,"write",v> x <c•t,_>p

address value signal

I-store won't be well behaved without a signal output

Page 14: March 4, 2008  Dynamic Dataflow Arvind Computer Science  Artificial Intelligence Lab Massachusetts Institute of Technology

March 4, 2008 DF3-14http://csg.csail.mit.edu/arvind/

Well Behaved Blocks

All the signals in a block or a procedure are collected via a tree of synchronizing gates to generate a single signal.

signal

S S

S

signal inputs

Page 15: March 4, 2008  Dynamic Dataflow Arvind Computer Science  Artificial Intelligence Lab Massachusetts Institute of Technology

March 4, 2008 DF3-15http://csg.csail.mit.edu/arvind/

Procedure Linkage Operatorsf

get frame extract tag

change Tag 0

change Tag 0

Graph for f

change Tag 1

a1

1:

change Tag n

an

n:

...

change Tag 1

Fork

token in frame 0token in frame 1

Like standard call/return but caller & callee can be active simultaneously

Page 16: March 4, 2008  Dynamic Dataflow Arvind Computer Science  Artificial Intelligence Lab Massachusetts Institute of Technology

March 4, 2008 DF3-16http://csg.csail.mit.edu/arvind/

Unbounded Loop Schema

loop body

X

X

change tag

release context

change tag

• • •

• • •

• • •

T F

p

parentcontext

getcontext

X

••

Page 17: March 4, 2008  Dynamic Dataflow Arvind Computer Science  Artificial Intelligence Lab Massachusetts Institute of Technology

March 4, 2008 DF3-17http://csg.csail.mit.edu/arvind/

Reusing ContextsCan we tie the allocation of context in iteration i with the release of context in iteration i-k?

Start with k contexts, c0, c1, ... ck-1 Bound the loop so that no more than k iterations can

execute concurrently. At the end of iteration i, use the context from

iteration (i+1) mod k. Release the contexts when the loop terminates

. . . . . ck-1c0 c1

Page 18: March 4, 2008  Dynamic Dataflow Arvind Computer Science  Artificial Intelligence Lab Massachusetts Institute of Technology

March 4, 2008 DF3-18http://csg.csail.mit.edu/arvind/

Bounded Loops

The parent allocates k frames and stores the linking information and the loop constants.The name of the previous frame and the parent frame is stored in all frames but the next frame slot of the last frame is left empty.When an iteration begins it empties the next slot and when it completes it fills the next slot of the previous frame.

. . .

nextpreviousparentloopconstants

0 1 k-1

Page 19: March 4, 2008  Dynamic Dataflow Arvind Computer Science  Artificial Intelligence Lab Massachusetts Institute of Technology

March 4, 2008 DF3-19http://csg.csail.mit.edu/arvind/

Unbounded Loop Schema

loop body

X

X

change tag

release context

change tag

• • •

• • •

• • •

T F

p

parentcontext

getcontext

X

••

Bounded

get “next”

extract “context” put “previous”

Page 20: March 4, 2008  Dynamic Dataflow Arvind Computer Science  Artificial Intelligence Lab Massachusetts Institute of Technology

March 4, 2008 DF3-20http://csg.csail.mit.edu/arvind/

Next: Id/pH to Dataflow Graphs