A hardware-software co-design approach with separated verification/synthesis between computation and communication

Masahiro Fujita

VLSI Design and Education Center, The University of Tokyo


State-of-the-art SoC design

System on a chip (SoC)
C-based design description down to implementation

[Figure: A C description, e.g. void main() { a = read(); b = read(); c = func(a, b); write(c); }, is mapped onto an SoC in which CPU, DSP, HW1, HW2, Mem1, and Mem2 share an interconnect. Each block is taken from an IP library as an IP core (IP core1 to IP core6) with its own bus interface (Bus IF).]

Design reuse is extremely important in SoC designs

IP (Intellectual Property) core reuse
  Existing designs have been verified
  Interface may or may not match

[Figure: Ex: MPEG video system. IPs with the required functionality are selected from an IP library. Blocks shown: CPU, Memory, Controller, MPEG, Analog I/F, Power Source, Bus, Memory Controller1, Memory Controller2, Bus1, Bus2, CPU1, CPU2.]

Need protocol transducers…

Functionality is satisfied, but the interface does not match
Communication on the interconnect is based on different protocols
  Different on-chip bus protocols: Protocol A, Protocol B

Solution
  Insert a “Protocol Transducer” for conversions
  The protocol transducer should be generated automatically

[Figure: The required blocks (CPU, MPEG, RAM, custom HW, DMAC) are mapped to CPU (IP), RAM (IP), and DMAC (IP) on an interconnect (bus). Transducers are inserted where the protocol of an IP differs from that of the interconnect.]

Basic ways of thinking and our proposal

We would like to come up with a methodology for large and complicated system designs
  Design reuse is a key
Separation of concerns is essential
  Computation and communication (control and datapath) must be clearly separated in some way
What we propose
  New way to design communication protocols
  (Special mechanisms for rectification after manufacturing)

[Figure: An SoC (SDRAM, CPU, H/W, DSP, Mem blocks on a bus); multiple of these are combined with PCB, analog, and mechanical parts.]

Today’s topic

Propose design methods for communication interface design with clear separation between computation and communication
How the separation helps design efficiency

[Figure: The SoC of IP cores with bus interfaces on an interconnect. Each block is drawn as computation surrounded by interface/communication, with request (Req.) and response (Res.) automata that start from an initial state i.]

Our ongoing related research

Propose various rectification methods for computation and communication
  Designs can be debugged after manufacturing
  Propose different mechanisms for comp. and comm.

[Figure: An IP core with an in-field programmable bus protocol IF, and an original circuit augmented with programmable elements (LUTs).]

Outline

Motivation
Background: State-of-the-art design methodology
  C-based design
Proposed method and its application to interface designs for computing elements
  Key technology for IP reuse
  Separation of concerns: computation and communication (control and datapath)
Application to dynamically reconfigurable computing (if time allows)

[Figure: CPU, MPEG, RAM, custom HW, and DMAC mapped to CPU (IP), RAM (IP), and DMAC (IP) on a bus, with a transducer between Protocol A, Protocol A1, and Protocol B.]

For improving design productivity

Start the design at higher abstraction levels
  C-language-based HW descriptions are 100 to 10,000 times more compact than gate-level descriptions
  The number of lines that one designer can describe per day is limited
Reuse of existing designs
  So-called IP reuse in LSI designs
  Key is to separate computation and communication

[Figure: Abstraction levels from high-level C/C++ through RTL down to gates; computation surrounded by interface/communication.]

Design issues for large complicated systems

Starting from C/C++ designs/specifications
Extraction of parallelism
Partition of HW and SW
  Based on profiling: performance-critical parts (mostly loops) are assigned to HW

[Figure: A C description, void main() { a = read(); b = read(); c = func(a, b); write(c); }, is implemented on an SoC (SDRAM, CPU, H/W, DSP, Mem blocks on a bus) through compilation, high-level synthesis, and IP reuse design with an IP library.]

C/C++ based design and specification languages for SoC designs

SystemC and SpecC are most common
Based on C/C++
  Structural hierarchy: behavior, module
  Parallelism with event-based synchronization: par, wait, notify
  channels
  Others …
Support hardware-software co-designs

[Figure: Two behaviors, b1 and b2, run in parallel under par. Each contains C code; b1 executes notify and b2 executes wait. They communicate through channels or through shared variables.]

Claim in this talk: Separation of concerns

Separation of computation and communication
  This is not sufficient!
Even inside the interface, control and datapath should be clearly separated
A protocol is a collection of sequences
  Each sequence can operate independently

[Figure: Computation surrounded by interface/communication, where both are further split into control and datapath. A protocol consists of a hardware definition (port, signal names, etc.) and sequences: Sequence1 (read), Sequence2 (write), Sequence3 (4-burst read), Sequence4 (4-burst write), and so on. Each sequence is described by Automaton1 (for request, or for blocking) and Automaton2 (for response), with transitions such as (stb==1) ack<=0, ack<=1, ack<=0 from the initial state i. All sequences share the initial state.]
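The hierarchy on this slide can be sketched as a small data structure. This is only an illustration in Python: the class names, the guard/action strings, and the `handshake` automaton are hypothetical and are not the tool's actual input format (which, according to a later slide, is XML).

```python
from dataclasses import dataclass, field


@dataclass
class Automaton:
    """Control-only automaton; every automaton starts in the shared state 'i'."""
    initial: str = "i"
    # transitions[state] = list of (guard, action, next_state) tuples
    transitions: dict = field(default_factory=dict)

    def add(self, src, guard, action, dst):
        self.transitions.setdefault(src, []).append((guard, action, dst))


@dataclass
class ProtocolSequence:
    """One sequence (e.g. read, write) with a request and a response automaton."""
    name: str
    request: Automaton
    response: Automaton


@dataclass
class Protocol:
    """Hardware definition (ports) plus a collection of independent sequences."""
    ports: dict                      # signal name -> bit width (assumed layout)
    sequences: dict = field(default_factory=dict)


def handshake():
    """Hypothetical strobe/acknowledge handshake, loosely modeled on the figure."""
    a = Automaton()
    a.add("i", "stb==0", "ack<=0", "i")
    a.add("i", "stb==1", "ack<=1", "s1")
    a.add("s1", "true", "ack<=0", "i")
    return a


protocol = Protocol(ports={"stb": 1, "ack": 1})
for name in ("read", "write"):
    protocol.sequences[name] = ProtocolSequence(name, handshake(), handshake())
```

Because each sequence owns its automata and only the initial state is shared, a synthesis step can process one sequence at a time.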

Goal: automatic generation of “correct” bus interfaces

Communication protocols can be arbitrarily complicated
  Blocking, non-blocking, out-of-order, tags, etc.
Deal only with state-of-the-art bus protocols
  Specification documents are over 200 pages
  Mostly subsets of them are actually used
Formally verify the definition of the protocol as automata and automatically generate interface circuits from them
If necessary, change their functionality in the field
Assuming C-based designs

State-of-the-art on-chip bus protocol example: OCP (Open Core Protocol)

Interface protocol proposed by OCP-IP
Configurable interface protocol
  Data/address width, burst/out-of-order features, …
  In the basic configuration, the interface has 8 signals (including clock and reset)
Full specification document is over 200 pages
  More than 30 different transactions/sequences

[Figure: The OCP master drives MCmd, MAddr, and MData to the OCP slave; the OCP slave returns SCmdAccept, SResp, and SData.]

What protocol transducer does

Change from protocol A to B
Protocols can be very complicated
  Over 30 different commands defined in the protocols
  Manuals over 200 pages
  Transactions (sequences) such as burst, out-of-order modes, …
Each transaction (sequence) is sent/received one at a time
Protocols can be defined with FSM/automaton
  State-of-the-art protocols may need extremely large and complicated FSM/automaton

[Figure: CPU (IP), RAM (IP), and DMAC (IP) together with MPEG, RAM, custom HW, and DMAC blocks; a transducer converts between Protocol A and Protocol B.]

Our scenario

Start with C/C++ based descriptions for SoC
Apply control/data flow analysis for computation
Use protocol converter generator in communication interface synthesis
  Convert the descriptions’ protocol to the target protocol
Protocols themselves are formally verified with model checkers
  Much smaller number of states to be checked than in actual designs

[Figure: Flow. Original designs in C/C++ go through HW/SW synthesis (manual/automatic), with verification through SDG traversal, to generated (scheduled/allocated) HW/SW. Protocol extraction yields the protocol in the design; the protocol converter generator combines it with the target protocol from the protocol library to produce a protocol converter in HW/SW. Model checking is applied to the protocol definitions.]

[1] K. Tanabe, S. Sasaki, and M. Fujita, “Program Slicing for System Level Designs in SpecC”, In Proc. of the IASTED, pp. 252-258, Nov. 2004
[2] S. Sasaki, M. Fujita, et al. FSEN 05.

Example: Non-blocking protocol conversion

MASTER: OCP (Single Read, Non-Posted Write)
SLAVE: OCP (Single Read, Single Write)

[Figure: Each side has a request part and a response part. Single Read on the master maps to Single Read on the slave; Non-Posted Write on the master maps to Single Write on the slave.]

Conversion example

[Figure: The transducer between master and slave consists of an FSM for request (FIFO-ready), an FSM for response (FIFO-ready), and a FIFO (2 bit x 4), with RST and CLK inputs. Master-side signals: M_MCmd, M_MAddr, M_MData, M_SCmdAccept, M_SResp, M_SData. Slave-side signals: S_MCmd, S_MAddr, S_MData, S_SCmdAccept, S_SResp, S_SData. FIFO signals: WData, RData, PUSH, D. The request FSM maps Single Read Request to Single Read Request and Non-Posted Write Request to Single Write Request. The response FSM maps Single Read Response to Single Read Response and generates the Non-Posted Write Response.]

How protocol transducer is realized

Intuitive understanding of the problem
  Follow the two protocols ⇒ compute the product of the two FSM/automata and follow it

[Figure: The definitions of Protocol A and Protocol B (automata with clock-wise behavior such as (stb==1) ack<=0, ack<=1, ack<=0) are explored (exploration [1] + ours) to obtain the target: a protocol transducer, as an FSM/automaton, that sits between the Protocol A master and the Protocol B slave and handles both requests and responses.]

[1] R. Passerone, J. A. Rowson, A. Sangiovanni-Vincentelli, “Automatic Transducer Synthesis of Interfaces between Incompatible Protocols”, DAC’98, pp. 8-13

Simple computation of product

Follow the two automata
  Compute the product of the two
Eliminate dependency-violating nodes/paths

[Figure [1]: Protocol A (master, states A, B, C) and Protocol B (slave, states D, E, F) are each connected to the transducer by a ctrl signal and an 8-bit data line (DI, DO). Master-side transitions: {Ctrl=0}, {Ctrl=1, DI:=data1}, {Ctrl=0, DI:=data2}. Slave-side transitions: {Ctrl=0}, {Ctrl=1, Rcvd1:=DO}, {Ctrl=0, Rcvd2:=DO}. The product is expanded level by level from AD: AD; AE, BD, BE; BF, CD, CE, CF; CE, CF; CF, using A→(A or B), D→(D or E) for the transducer and B→(B or C), E→F. States marked “dependency violation” are those where data would be sent before it has been received. The minimum latency path is highlighted.]
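The product computation with elimination of dependency violations can be sketched as a breadth-first exploration. The model below is an assumption made for illustration: each master state is annotated with the number of data words the transducer has received, each slave state with the number it has sent, and a product state is a violation when more has been sent than received. The state names follow the figure; the counting model and the function names are hypothetical.

```python
from collections import deque

# Assumed model: words received from the master / sent to the slave per state.
RECEIVED = {"A": 0, "B": 1, "C": 2}
SENT = {"D": 0, "E": 1, "F": 2}
# Each automaton may stay in its state or advance by one step per clock.
NEXT_A = {"A": ["A", "B"], "B": ["B", "C"], "C": ["C"]}
NEXT_B = {"D": ["D", "E"], "E": ["E", "F"], "F": ["F"]}


def violates(a, b):
    """Data not yet received but already sent."""
    return SENT[b] > RECEIVED[a]


def explore(start=("A", "D"), goal=("C", "F")):
    """Build the product breadth-first, skipping dependency violations.

    Returns the set of legal reachable product states and one
    minimum-latency path from start to goal.
    """
    parent = {start: None}
    queue = deque([start])
    while queue:
        a, b = queue.popleft()
        for na in NEXT_A[a]:
            for nb in NEXT_B[b]:
                nxt = (na, nb)
                if nxt in parent or violates(na, nb):
                    continue
                parent[nxt] = (a, b)
                queue.append(nxt)
    path, node = [], goal
    while node is not None:          # walk back from the goal to the start
        path.append(node)
        node = parent[node]
    return set(parent), path[::-1]


reachable, min_latency_path = explore()
```

Under these assumptions AE and BF are pruned, and breadth-first order guarantees that the first path found to CF is a shortest one.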

Need separation between control and datapath

If there is a loop in the automata, product computation never terminates
  Data values are different each time going through the loop

[Figure: The same Protocol A master (states A, B, C) and Protocol B slave (states D, E, F) as before, with transitions {Ctrl=0}, {Ctrl=1, DI:=data1}, {Ctrl=0, DI:=data2} and {Ctrl=0}, {Ctrl=1, Rcvd1:=DO}, {Ctrl=0, Rcvd2:=DO}. The product path AD, BD, CE, CF returns to AD, but the data values are different, so the two AD states are not the same state and the product needs to be expanded more and more.]
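A toy exploration makes the termination problem concrete. The sketch assumes a three-state control loop A → B → C → A in which every lap carries a fresh data value; with the value kept in the state the explorer never revisits a state, while replacing the value with a fixed id reaches a fixed point after one lap. The function and its parameters are hypothetical.

```python
def explore_loop(abstract_data, max_steps):
    """Explore a 3-state control loop and count distinct (control, data) states.

    Returns (number of distinct states seen, whether a fixed point was reached).
    """
    successor = {"A": "B", "B": "C", "C": "A"}
    state = ("A", 0)
    seen = {state}
    for _ in range(max_steps):
        ctrl, data = state
        nxt_ctrl = successor[ctrl]
        if abstract_data:
            nxt_data = 0                      # data value abstracted to a fixed id
        else:
            # a new data value is transferred each time the loop restarts
            nxt_data = data + 1 if nxt_ctrl == "A" else data
        state = (nxt_ctrl, nxt_data)
        if state in seen:
            return len(seen), True            # nothing new: exploration terminates
        seen.add(state)
    return len(seen), False                   # step budget exhausted


with_data = explore_loop(abstract_data=False, max_steps=30)
control_only = explore_loop(abstract_data=True, max_steps=30)
```

With concrete data the state count grows with the step budget; with data abstracted away only the three control states remain.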

Protocols can be very complicated

State-of-the-art protocols introduce many features for higher throughput

[Figure: Timing diagrams (time axis t) between protocol master and protocol slave, with request (address/data) and response (data):
  Blocking (low throughput): Req1 → Res1, then Req2 → Res2
  Split transaction (non-blocking): Req1, Req2, Req3 are issued while Res1, Res2 are still returning
  Out-of-order transaction: Req1, Req2, Req3 are answered as Res1, Res3, Res2
  Burst transaction: Addr1 to Addr4 with Data1 to Data4
  Single-address burst transaction: Addr1 followed by Data1 to Data4]

Problems and solutions

Simple product computation has essential problems for realistic on-chip bus protocols
  If there is a loop in control, no termination
  If automata become large, may not terminate in practice
  Protocol must be represented in an automaton
    Cannot deal with non-blocking type protocols
The above problems come from the non-separation between control and datapath
Solutions: With separation of computation and communication (control and datapath), the following can be realized
  Hiding loops
  Protocols are represented hierarchically with automata

Separation of communication and computation (data transfer)

Data values are abstracted away
  Only the data id is watched in the communication part
  Actual data transfer is realized by the computation part
  Id matching is guaranteed by agreement between computation and communication
A new request is accepted only after the previous request has been accepted
  If necessary, a FIFO (buffer) is inserted to keep not-yet-serviced sequences
  Multiple simultaneous responses may arrive before the current response has finished

Separation of computation and communication inside protocol transducers

In protocol definition, control and data are separately specified
Introduce two FSMs, for request and response, to describe complicated protocols uniformly
FIFO can be made arbitrarily complicated if we like
  Even arithmetic computation is possible

[Figure: Top: a single FSM inside the transducer handles both requests and responses between the Protocol A master and the Protocol B slave. Bottom: the transducer is split into a Req. FSM on the request path and a Res. FSM on the response path.]
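The split into a request FSM and a response FSM that share a FIFO of ids can be sketched as follows. This is a behavioral illustration only, not the generated RTL: the class, its method names, and the dictionary standing in for the datapath are assumptions, and the default depth of 4 is taken from the FIFO shown in the earlier conversion example.

```python
from collections import deque


class Transducer:
    """Control side watches only request ids; data values live in the datapath."""

    def __init__(self, depth=4):
        self.depth = depth
        self.pending = deque()   # FIFO of ids whose responses are outstanding
        self.payload = {}        # stand-in for the datapath: id -> data value

    def request(self, req_id, data=None):
        """Req. FSM: accept a request only when the FIFO has room."""
        if len(self.pending) >= self.depth:
            return False                     # the master has to wait
        self.pending.append(req_id)
        self.payload[req_id] = data          # data is moved by the datapath
        return True

    def response(self, data=None):
        """Res. FSM: pair an in-order response with the oldest pending id."""
        req_id = self.pending.popleft()
        self.payload.pop(req_id)
        return req_id, data


t = Transducer(depth=2)
accepted = [t.request(1, "write-data"), t.request(2), t.request(3)]
first = t.response("read-data")
```

Here the third request is refused until a response frees a FIFO entry, which mirrors the rule that not-yet-serviced sequences are kept in the buffer.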


For more complicated protocols…

Pros: Can deal with more complicated protocols
Cons: More latency due to multiple FIFOs

[Figure: The protocol definitions of Protocol A and Protocol B each have a Req. and a Res. automaton. Inside the transducer between the Protocol A master and the Protocol B slave, the request path uses a Req. FSM (Req X Req), and the response path is split into a Recv FSM (Res X FIFO WR) and a Send FSM (FIFO RD X Res) around a newly introduced FIFO. A control automaton for the FIFO handles read and write.]

Now we can resolve it

Elimination of loops (to initial states)
  Introduction of ending state
  Exploration
  Elimination of ending states
Elimination of intermediate loops
  SS = Loops are replaced with super states
  Exploration
Concentrating on controls only
  Data parts are processed separately!

[Figure [2]: Automata i → A → B → i and i → C → D → i have their return edges redirected to an ending state e before exploration; the ending states are eliminated afterwards. An automaton over states i, X, Y, Z, U, W with an intermediate loop has the loop replaced by a super state before exploration.]

[2] S. Watanabe, K. Seto, Y. Ishikawa, S. Komatsu, M. Fujita, “Protocol Transducer Synthesis using Divide and Conquer approach,” Proc. of the 12th Asia and South Pacific Design Automation Conference, pp. 280-285, 2007.
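The loop-elimination steps can be illustrated on a control-only automaton stored as an adjacency map. This is a simplified sketch under stated assumptions: only self-loops are treated as intermediate loops, and the function names are hypothetical rather than taken from [2].

```python
def cut_loops_to_initial(transitions, initial="i", end="e"):
    """Introduce an ending state: edges back to the initial state go to 'e'."""
    acyclic = {s: [end if d == initial else d for d in dsts]
               for s, dsts in transitions.items()}
    acyclic[end] = []
    return acyclic


def collapse_self_loops(transitions):
    """Replace each self-looping state with a super state (self-edge removed)."""
    supers = {s for s, dsts in transitions.items() if s in dsts}
    collapsed = {s: [d for d in dsts if d != s]
                 for s, dsts in transitions.items()}
    return collapsed, supers


def restore_initial(transitions, initial="i", end="e"):
    """Eliminate the ending state again after exploration."""
    return {s: [initial if d == end else d for d in dsts]
            for s, dsts in transitions.items() if s != end}


# i -> A -> B, where B repeats (e.g. a data-transfer loop) and then returns to i
sequence = {"i": ["A"], "A": ["B"], "B": ["B", "i"]}
cut = cut_loops_to_initial(sequence)
flat, super_states = collapse_self_loops(cut)
restored = restore_initial(flat)
```

After both steps the automaton is acyclic, so a product exploration over it terminates; the repeat count of each super state is left to the data part.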

How to deal with multiple complicated transactions

A protocol is a collection of sequences
  Each sequence can operate independently
  True for state-of-the-art protocols with separation between computation and communication

[Figure [2]: A protocol consists of a hardware definition (port, signal names, etc.) and sequences: Sequence1 (read), Sequence2 (write), Sequence3 (4-burst read), Sequence4 (4-burst write), and so on. Each sequence is described by Automaton1 (for request, or for blocking) and Automaton2 (for response), with transitions such as (stb==1) ack<=0, ack<=1, ack<=0 from the initial state i. All sequences share the initial state.]

Hierarchical synthesis owing to comp. and comm. separation

Sequence-level synthesis followed by a merge process
Merge the generated FSMs, which share the same initial state

[Figure [2]: Sequence A1 of Protocol A and Sequence B1 of Protocol B are combined by graph exploration into partial transducer 1; Sequence A2 and Sequence B2 are combined into partial transducer 2. The partial transducers, each starting from the initial state i, are merged (+) into one transducer with a single initial state i.]
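The merge step can be sketched as a union of transition tables that share only the initial state. The representation (a map from state to guard to next state), the state-renaming scheme, and the conflict check are assumptions made for this illustration.

```python
def merge_fsms(partials, initial="i"):
    """Merge partial transducers that all start from the same initial state.

    Each FSM is a dict: state -> {guard: next_state}. Non-initial states are
    prefixed with the index of their partial transducer so they stay distinct.
    """
    def rename(idx, state):
        return state if state == initial else f"t{idx}.{state}"

    merged = {initial: {}}
    for idx, fsm in enumerate(partials):
        for src, edges in fsm.items():
            out = merged.setdefault(rename(idx, src), {})
            for guard, dst in edges.items():
                target = rename(idx, dst)
                if guard in out and out[guard] != target:
                    # two sequences would leave the same state on the same guard
                    raise ValueError(f"ambiguous guard {guard!r}")
                out[guard] = target
    return merged


read_fsm = {"i": {"cmd==RD": "r1"}, "r1": {"resp": "i"}}
write_fsm = {"i": {"cmd==WR": "w1"}, "w1": {"accept": "i"}}
transducer = merge_fsms([read_fsm, write_fsm])
```

The merge only works when the guards leaving the initial state distinguish the sequences, which is why the sketch rejects overlapping guards.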

The protocol transducer synthesis (1)

Transducers including blocking protocols
  Generate a blocking automaton by composition

[Figure: A single FSM in the transducer between the Protocol A master and the Protocol B slave handles requests and responses. The blocking protocol is given as one automaton from the initial state i. The non-blocking (out-of-order) protocol is given as separate Req. and Res. automata, which are composed into a blocking automaton, together with the automata for other sequences, before automaton-level synthesis.]

The protocol transducer synthesis (2)

Out-of-order to out-of-order → Out-of-order processing
  Tags are sent out as they are
Non-blocking and out-of-order → In-order processing
  Transducer generates tags and reorders responses

[Figure: The Req. and Res. automata of Protocol A and Protocol B, each starting from the initial state i, are combined pairwise by automaton-level synthesis into the Req. and Res. automata of the protocol transducer. In the transducer between the Protocol A master and the Protocol B slave, the Req. FSM and Res. FSM share a FIFO that memorizes the sequences whose responses are not yet received.]
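Tag generation and response reordering can be sketched with a small reorder buffer. This is a behavioral illustration under assumptions: tags are consecutive integers, and the class and method names are hypothetical rather than part of the synthesized transducer.

```python
from collections import deque


class ReorderingTransducer:
    """In-order master side, out-of-order slave side."""

    def __init__(self):
        self.next_tag = 0
        self.order = deque()   # tags in the order the master issued requests
        self.done = {}         # tag -> response data that arrived early

    def issue(self):
        """Attach a fresh tag to a request forwarded to the slave."""
        tag = self.next_tag
        self.next_tag += 1
        self.order.append(tag)
        return tag

    def complete(self, tag, data):
        """Accept a tagged response; return the responses deliverable in order."""
        self.done[tag] = data
        ready = []
        while self.order and self.order[0] in self.done:
            ready.append(self.done.pop(self.order.popleft()))
        return ready


t = ReorderingTransducer()
tags = [t.issue(), t.issue(), t.issue()]
delivered = [t.complete(2, "res3"), t.complete(0, "res1"), t.complete(1, "res2")]
```

A response that arrives early is held until every older request has been answered, so the master still sees responses in request order.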

The protocol transducer synthesis (3)

In state-of-the-art on-chip bus protocols:
  All masters have waiting mechanisms for requests
  Some slaves do not have waiting mechanisms for responses
  Ex: OCP
    OCP request (read sequence): wait with the SCmdAccept signal
    OCP response (read sequence): finishes in exactly one cycle (no waiting mechanism)
Restrictions on protocol definitions for automaton-level synthesis:
  Master does not have waiting mechanisms but slave has (request)
  Slave does not have waiting mechanisms but master has (response)
Next transaction may start before the transducer returns to its initial state
  → Some requests/responses may not be processed

The protocol transducer synthesis (4)

Responses are guaranteed to be processed with a FIFO
Pros: Can deal with more complicated protocols
Cons: More latency due to multiple FIFOs

[Figure: The protocol definitions of Protocol A and Protocol B each have a Req. and a Res. automaton. The Req. FSM of the transducer is obtained by automaton-level synthesis (Req X Req). The response path is split around a newly introduced FIFO: the Recv FSM (Res X FIFO WR) and the Send FSM (FIFO RD X Res) are obtained by automaton-level synthesis with the FIFO control automaton (no waiting). The automaton controlling the FIFO handles read and write.]

Tool implementation

Planned to be distributed freely from OCP-IP
Currently under evaluation at Toshiba

Experimental results

Athlon 64 2 GHz + 1 GB RAM
Implemented as over 12,000 LOC in C++
  Input: Hierarchical automaton descriptions in XML
  Output: RTL synthesizable Verilog
Logic synthesis: Xilinx ISE
RTL simulator: ModelSim XE

Master's protocol   Slave's protocol   Type        Sequences   Synth. time   Gate count
OCP                 AHB                (NB,BK)     4           1.1 s         2,352
AHB                 OCP                (BK,NB)     4           1.3 s         1,843
OCP                 OCP                (NB,NB)     2           1.9 s         1,568
OCP                 Tagged OCP         (NB,OoO)    2           2.2 s         3,514
Tagged OCP          AXI                (OoO,OoO)   2           4.8 s         1,377
AXI                 OCP                (OoO,NB)    2           4.9 s         1,731
OCP                 AXI                (NB,OoO)    26          257.8 s       61,205

NB: non-blocking, BK: blocking, OoO: out-of-order

No one has ever synthesized!

Rectification after manufacturing

Transducer FSM can be implemented with programmable devices
  Makes run-time change of protocols possible

[Figure: Processing element 1 (Protocol A) and processing element 2 (Protocol B) are connected through a transducer built from a register file that holds the FSM for sequence A, the FSM for sequence B, the FSM for sequence C, and a default output value. The protocol definition consists of a hardware definition (port, signal names, etc.) and sequences: Sequence1 (read), Sequence2 (write), Sequence3 (4-burst read), Sequence4 (4-burst write), and so on.]

Conclusion

The following have been shown through an example: protocol transducer synthesis
  Separation of concerns is essential
    Hierarchical definition of protocol
    Complete separation between computation and communication
  State-of-the-art protocols can be processed efficiently
  Even formal verification becomes possible
  Rectification after manufacturing can be handled

Future issues

Bit-width conversion
  Ex: 16-bit write → two 8-bit writes
  Need ways to compose multiple sequences
Dynamic change of transfer times in burst mode
  Use super states

[Figure: Two “Write 8bit” sequences are composed into a “Write 8bit*2” sequence, which is then matched against the “Write 16bit” sequence with existing methods. For bursts, the data transfer part is separated as a super state (SS), the loop count is determined, and the super state is repeated by the loop count.]
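The bit-width example can be made concrete with a small data-side helper. This is a sketch only: byte addressing, low-byte-first ordering, and the function name are assumptions, and the slide itself leaves the composition of the control sequences as future work.

```python
def split_write(addr, data, src_bits=16, dst_bits=8):
    """Split one wide write into narrow writes (assumed byte-addressed, low part first)."""
    if src_bits % dst_bits or dst_bits % 8:
        raise ValueError("widths must be byte multiples and divide evenly")
    mask = (1 << dst_bits) - 1
    step = dst_bits // 8                  # address increment per narrow write
    count = src_bits // dst_bits
    return [(addr + i * step, (data >> (i * dst_bits)) & mask)
            for i in range(count)]


writes = split_write(0x100, 0xABCD)
```

One 16-bit write becomes two 8-bit writes at consecutive addresses; on the control side this corresponds to running the 8-bit write sequence twice.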

Application to dynamically reconfigurable processors/protocol transducers

Hardware OS

Portions to be reconfigured in dynamically reconfigurable architectures
  Load and unload functional blocks dynamically
  Schedule functional blocks dynamically
  Communicate among functional blocks
Hardware OS (operating system)
  Self (partial) reconfiguration on FPGA
  Load and unload circuit blocks (hardware tasks)
    Just like “processes” in multitask software
  Provide ways to communicate among hardware tasks

Example of hardware OS

Walder et al.
  Task slot: Rectangular areas to load hardware tasks
  Interconnect: Shared bus for communications
  OS module: Scheduling and loading hardware tasks

[Figure: A dynamically reconfigurable device with an OS module and four task slots on an interconnect. A hardware task (circuit block) is loaded into a task slot.]

Herbert Walder, Marco Platzner, “Reconfigurable Hardware Operating Systems: From Design Concepts to Realizations,” Proceedings of ERSA’03, pp. 284-287, 2003

Interconnect

Various topologies have been proposed
  a) Shared bus  b) Mesh network  c) Tree network
Assuming all functional blocks use the same protocols
  Not true in general, so protocol transducers are needed

[Figure: Functional blocks (FB) connected by a) a shared bus, b) a mesh of switch boxes (SWBOX), c) a tree of switches (S).]

How to build protocol transducer

Proposal: Dynamically reconfigurable protocol transducers
Optimizing protocol transducers
  A universal protocol transducer for ({A, D} ⇔ {B, C}) is simply too complicated and consumes too many hardware resources
  Load minimum protocol transducers dynamically
    Save hardware resources

[Figure: IP1 (Protocol A) talks to IP2 (Protocol B) through an “A to B” transducer. After reconfiguration, IP1 (Protocol A) talks to IP3 (Protocol C) through “A to C”. After another reconfiguration, IP4 (Protocol D) talks to IP3 (Protocol C) through “D to C”.]

Basic idea: Use our protocol transducer synthesis method

Selecting partial protocol transducers dynamically

[Figure: In the design phase, our synthesis method generates a library of partial transducers (1 to 5). At run time, the partial transducers needed between IP1 (Protocol A) and IP2 (Protocol B) are selected from the library, e.g. partial transducers 1 and 3 for A to B and partial transducers 2 and 4 for A to C. In the case of static architectures, they are composed into a single protocol transducer.]

Architecture of dynamically reconfigurable protocol transducer

Place functional blocks and partial protocol transducers in task slots
  Like hardware OS
Partial protocol transducers are dynamically loaded and unloaded

[Figure: Task slots on a shared bus hold functional blocks and partial transducers. Functional blocks communicate directly when their protocols match and through a protocol transducer when they do not. Partial transducers are dynamically loaded when necessary.]

Dynamic reconfiguration of protocol transducers

When loading functional blocks, their partial protocol transducers are also loaded
Unload partial protocol transducers that are no longer in use

[Figure: A functional block is searched for in the functional block library, loaded, and placed in a task slot. The required conversions (A→C: Write, A→C: Read) are looked up in the partial transducer library, and the matching partial transducers (e.g. A→C: Read) are loaded and placed alongside the functional blocks.]
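The load/unload policy can be sketched as reference-counted bookkeeping over task slots. Everything here is hypothetical and for illustration: the class name, the (source protocol, target protocol, sequence) key, and the library contents are assumptions, and actual partial reconfiguration of the device is not modeled.

```python
class TransducerLoader:
    """Reference-counted loading of partial transducers into free task slots."""

    def __init__(self, library, free_slots):
        self.library = library        # (src, dst, sequence) -> partial transducer name
        self.free_slots = free_slots
        self.in_use = {}              # key -> number of functional blocks using it

    def load(self, required):
        """Called when a functional block is loaded; 'required' is a set of keys."""
        missing = [k for k in required if k not in self.library]
        if missing:
            raise KeyError(missing[0])
        new = {k for k in required if k not in self.in_use}
        if len(new) > self.free_slots:
            raise RuntimeError("not enough free task slots")
        for key in required:
            self.in_use[key] = self.in_use.get(key, 0) + 1
        self.free_slots -= len(new)

    def unload(self, required):
        """Called when a functional block is unloaded; frees unused transducers."""
        for key in required:
            self.in_use[key] -= 1
            if self.in_use[key] == 0:
                del self.in_use[key]
                self.free_slots += 1


READ, WRITE = ("A", "C", "Read"), ("A", "C", "Write")
loader = TransducerLoader({READ: "a_to_c_read", WRITE: "a_to_c_write"}, free_slots=2)
loader.load({READ})              # first block needs only the read conversion
loader.load({READ, WRITE})       # second block shares READ and adds WRITE
```

A partial transducer shared by two functional blocks occupies one task slot and is unloaded only when the last block using it is gone.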