technische universiteit eindhoven

‘Nothing is built on stone; all is built on sand, but we must build as if the sand were stone.’
Jorge Luis Borges (Argentine writer, 1899-1986)

Department of Electrical Engineering
Electronic Systems

Modeling of Architectures
Platform-based Design 5KK70
Henk Corporaal, Bart Mesman, Hamed Fatemi
2010

Outline

• We will look at models for Area, Delay and Energy

• Processor structure

• Register files - Register cell

• Model (area, power, delay)

• Details for several register file configurations

• Apply this to the Imagine architecture

• Stream register file

• Network

Processor

• Single processor
  • Instruction Memory (IM)
  • Controller
  • Processing Element (PE)
    • Register File (RF)
    • ALU
    • Data Memory (DM)

• SIMD
  • Multiple PEs

• VLIW
  • Multiple ALUs

• Multi-Processor
  • Several processors
  • Connected by a bus or network

[Figure: processor block diagram with IM, Controller, PE (RF, ALU, DM) and Network]

Register File (RF) Area model

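• Assume:
  • p = number of ports
  • For a large RF, the row decoder is small compared to the cell area
  • 1-bit cell area without ports = w*h (tracks)

• Each port adds one wire track to the cell in each dimension:

$A_{cell}(p) = (w+p)(h+p)$

• If p is large: $A(p) \sim p^2$

[Figure: schematic of one 1-bit register cell]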
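As a rough numerical illustration of the cell-area model, the Python sketch below evaluates $A_{cell}(p) = (w+p)(h+p)$ and compares it with $p^2$. The base cell dimensions `w=4` and `h=3` tracks are assumed values chosen only for illustration; they do not come from the slides.

```python
def cell_area(p, w=4, h=3):
    """Area of one register cell in tracks^2.

    Each of the p ports adds one wire track in both dimensions.
    The defaults for w and h are illustrative assumptions.
    """
    return (w + p) * (h + p)


def ratio_to_p_squared(p, w=4, h=3):
    """Ratio A_cell(p) / p^2; it approaches 1 as p grows."""
    return cell_area(p, w, h) / (p * p)


if __name__ == "__main__":
    for p in (1, 3, 12, 48, 192):
        print(p, cell_area(p), round(ratio_to_p_squared(p), 3))
```

For small p the fixed w*h part still matters; the ratio to $p^2$ only settles near 1 once p is much larger than w and h.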

Register file (RF) Delay model

Register file, assuming a square layout, with R registers of b bits

Delay (d):

• Wire propagation delay

• Fan-in/out delay

• Wire propagation dominates the delay with a large number of ports

• R = number of registers

$d \propto (w+p)(bR)^{1/2} + (h+p)(bR)^{1/2} \propto p\,R^{1/2}$

Note: for N FUs (ALUs), $p \sim 3N$, $R \sim N$ → $d \sim N^{3/2}$
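The $N^{3/2}$ trend can be checked numerically. The sketch below reads the delay as the sum of the word-line length $(w+p)(bR)^{1/2}$ and the bit-line length $(h+p)(bR)^{1/2}$ in a square layout. The values `b=32`, `w=4`, `h=3` and `r=8` registers per ALU are assumptions for illustration, and the result is a relative delay, not a time in seconds.

```python
import math


def rf_delay(p, R, b=32, w=4, h=3):
    """Relative wire delay of a square register file.

    sqrt(b*R) cells per side; a word line spans cells of width (w+p),
    a bit line spans cells of height (h+p).
    """
    side = math.sqrt(b * R)
    return (w + p) * side + (h + p) * side


def central_rf_delay(N, r=8):
    """Central RF for N ALUs: p = 3N ports, R = r*N registers (r assumed)."""
    return rf_delay(p=3 * N, R=r * N)


def scaling_exponent(f, n1, n2):
    """Empirical exponent k such that f(N) ~ N^k between n1 and n2."""
    return math.log(f(n2) / f(n1)) / math.log(n2 / n1)


if __name__ == "__main__":
    print(scaling_exponent(central_rf_delay, 256, 512))
```

The measured exponent approaches 1.5 from below as N grows, because the constant w and h terms fade relative to 3N.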

Register file (RF) Power model

• Power (P):
  • Proportional to the capacitance that must be switched for each access
  • In each access, every bit line and one word line switch; the model counts the bit-line capacitance
  • Each port drives $(bR)^{1/2}$ bit lines
  • Each bit line has length $(h+p)(bR)^{1/2}$

$P_{1\,port} \propto bR\,(h+p)\,C_w$ ($C_w$: wire capacitance per unit length)

If p is large: power is dominated by wire capacitance

$P_{p\,ports} \propto p^2 R$

Note: for N FUs (ALUs), $p \sim 3N$, $R \sim N$ → $P \sim N^3$
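A small Python sketch of the power model, under the reading that one port switches $(bR)^{1/2}$ bit lines of length $(h+p)(bR)^{1/2}$ each and that all p ports are active. The parameter values (`b=32`, `h=3`, `r=8`, `c_w=1.0`) are illustrative assumptions, so the output is a relative switched capacitance rather than power in watts.

```python
import math


def port_power(p, R, b=32, h=3, c_w=1.0):
    """Relative switched capacitance for one access on one port."""
    n_bit_lines = math.sqrt(b * R)
    bit_line_length = (h + p) * math.sqrt(b * R)
    return n_bit_lines * bit_line_length * c_w


def rf_power(p, R, **kw):
    """All p ports accessed at once: p times the per-port value."""
    return p * port_power(p, R, **kw)


def central_rf_power(N, r=8):
    """Central RF for N ALUs: p = 3N ports, R = r*N registers (r assumed)."""
    return rf_power(p=3 * N, R=r * N)


if __name__ == "__main__":
    # Doubling N should multiply power by close to 2^3 = 8 for large N.
    print(central_rf_power(1024) / central_rf_power(512))
```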

Register File organization

• Processor with a one-level register file

Central (shared) register file:

[Figure: ALU 1 ... ALU N]

DRF (distributed register file):

[Figure: ALU 1 ... ALU N]

Comparing Area model of Central and Distributed RF

Central (shared) RF:

• 2 read ports and 1 write port per ALU

• R = rN: number of registers of b bits

• r: number of registers per ALU

• N: number of ALUs

$A \propto rNb\,[(w+3N)(h+3N)] \sim N^3$

DRF:

• Only 2 ports: one read, one write

• This would give A(1 RF) ~ N

• Area of switch has same area cost complexity

• Square layout & organization of the DRF, including 2N*N crossbar

$A \sim N^2$
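To make the $N^3$ versus $N^2$ comparison concrete, the sketch below evaluates both area models for growing N. It makes several assumptions not stated on the slide: one two-ported RF per ALU in the DRF, a crossbar area of `(2*N*b) * (N*b)` tracks (one track per bus wire in each direction), and the illustrative constants `r=8`, `b=32`, `w=4`, `h=3`.

```python
def central_rf_area(N, r=8, b=32, w=4, h=3):
    """Central RF: r*N registers of b bits, p = 3N ports per cell."""
    p = 3 * N
    return r * N * b * (w + p) * (h + p)


def drf_area(N, r=8, b=32, w=4, h=3):
    """DRF: N small RFs with 2 ports each, plus a 2N x N crossbar."""
    per_rf = r * b * (w + 2) * (h + 2)   # constant area per RF
    rf_total = N * per_rf                # grows linearly with N
    crossbar = (2 * N * b) * (N * b)     # assumed crossbar model, ~ N^2
    return rf_total + crossbar


if __name__ == "__main__":
    for N in (1, 4, 16, 64, 256):
        print(N, central_rf_area(N), drf_area(N))
```

Under these assumptions, doubling N multiplies the central RF area by nearly 8 and the DRF area by nearly 4 once N is large.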

Delay and Power models of central versus distributed RF

Assume N ALUs

• Central RF:
  • #registers R = rN
  • #ports p = 3N
  • For large N: $d \sim N^{3/2}$, $P \sim N^3$

• DRF:
  • Constant #registers per ALU
  • #ports p = 2 (also constant!)
  • DRF has a fixed delay and power (per RF)
  • Wire propagation determines delay and power (for large N)
  • For large N: $d \sim N$, $P \sim N^2$
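The central-RF exponents follow from substituting $p = 3N$ and $R = rN$ into the single-RF delay and power models. A short worked derivation, treating $r$, $b$, $w$, $h$ and the wire capacitance $C_w$ as constants:

```latex
\begin{align*}
d &\propto (w+p)(bR)^{1/2} + (h+p)(bR)^{1/2}
   = (w + h + 6N)\,(brN)^{1/2} \;\sim\; N^{3/2}, \\
P &\propto p \cdot bR\,(h+p)\,C_w
   = 3N \cdot brN\,(h+3N)\,C_w \;\sim\; N^{3}.
\end{align*}
```

For the DRF, $p = 2$ and the registers per RF are constant, so each RF contributes a fixed delay and power; the growth with N then comes from the wires that connect the ALUs and RFs.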

Register File

Register (memory) storage and communication between ALUs are critical for area, energy and performance in media processors.

Hierarchical register storage

Two-level register files (hierarchical)

Central:

• RF1 serves the ALUs, while RF2 is used to cover the memory latency

• The overall area trend is the same as with a one-level RF

[Figure: ALU 1 ... ALU N, RF1 (level 1), RF2 (level 2)]

DRF:

[Figure: ALU 1 ... ALU N, RF1 (level 1), RF2 (level 2)]

Register Files

• Processor with stream register files:
  • Replace each port into the memory staging RF with a stream buffer
  • All stream buffers share a single port into the memory staging RF, allowing that single physical port to act as many logical ports

Central:

[Figure: ALU 1 ... ALU N]
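The idea of one physical port serving many logical ports can be illustrated with a toy simulation. The slides only state that the stream buffers share a single port; the round-robin arbitration, the block transfer size, the FIFO capacity and all names in this Python sketch are assumptions made for illustration.

```python
from collections import deque


class StreamBuffer:
    """Small FIFO between one logical port and the shared physical port."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.fifo = deque()

    def free_slots(self):
        return self.capacity - len(self.fifo)


def simulate(stream_data, capacity=4, block=4, cycles=8):
    """Each cycle one buffer (round-robin) may use the single physical port
    to fetch up to `block` words; then every client pops one word."""
    pending = [deque(words) for words in stream_data]
    buffers = [StreamBuffer(capacity) for _ in stream_data]
    consumed = [[] for _ in stream_data]
    port_uses = []
    turn = 0
    for _cycle in range(cycles):
        # Single physical port: at most one buffer is refilled per cycle.
        buf, src = buffers[turn], pending[turn]
        n = min(block, buf.free_slots(), len(src))
        for _k in range(n):
            buf.fifo.append(src.popleft())
        port_uses.append(1 if n > 0 else 0)
        turn = (turn + 1) % len(buffers)
        # Logical ports: each client reads one word from its own buffer.
        for i, b in enumerate(buffers):
            if b.fifo:
                consumed[i].append(b.fifo.popleft())
    return consumed, port_uses


if __name__ == "__main__":
    print(simulate([[1, 2, 3, 4], [10, 20, 30, 40]], cycles=6))
```

Because each refill moves a block of words while clients drain one word per cycle, the wide single port is busy only part of the time even though every client sees its own stream.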

Register Files

DRF:

• The payoff of the transformation into a stream architecture is that we can achieve an area proportional to N^2, since RF2 (memory storage) only needs 1 port. We also have to add in the area of the stream buffers, which grows as N^2 with a very small constant.

[Figure: ALU 1 ... ALU N]

Results

[Figure: area per ALU (normalized to 1 ALU)]

Results

[Figure: local delay]

Results

[Figure: power overhead]

Imagine Architecture

[Figures: die photo of Imagine; cell placement of Imagine]

Imagine Floorplan

• 22 million transistors

• 500 MHz

• Area, Energy, Delay models

• Clusters, Micro-controller, SRF, Network Interface

[Figure: Imagine floorplan with Micro-Controller, ALU Clusters 0-7, SRF, Memory System, Stream Controller and Network Interface; die dimensions labeled 7.8 mm and 7.6 mm]

Stream register File

Network:

• Area of the network grows (like the DRF switch) as:

$A \sim C^2$, where $C$ = number of clusters

$A_{total} = C\,A_{SRF} + A_{micro} + C\,A_{cluster} + A_{comm}$

$E_{total} = C\,E_{SRF} + E_{micro} + C\,E_{cluster} + E_{comm}$

More details in the Khailany paper [2003]
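A minimal sketch of how the total-area model could be evaluated, reading the totals as $A_{total} = C\,A_{SRF} + A_{micro} + C\,A_{cluster} + A_{comm}$ and taking $A_{comm}$ as the network area that grows as $C^2$. The function names and all numeric values below are hypothetical placeholders in arbitrary units, not figures from Imagine or from the Khailany paper.

```python
def total_area(C, a_srf, a_micro, a_cluster, a_comm_per_c2):
    """A_total = C*A_SRF + A_micro + C*A_cluster + A_comm, with A_comm ~ C^2.

    a_srf and a_cluster are per-cluster areas, a_micro is the
    micro-controller area, a_comm_per_c2 scales the network term.
    """
    a_comm = a_comm_per_c2 * C * C
    return C * a_srf + a_micro + C * a_cluster + a_comm


def area_per_cluster(C, **params):
    """Total area divided by the number of clusters."""
    return total_area(C, **params) / C


if __name__ == "__main__":
    # Hypothetical values, chosen only to show the trend.
    params = dict(a_srf=1.0, a_micro=4.0, a_cluster=5.0, a_comm_per_c2=0.05)
    for C in (1, 2, 4, 8, 16, 32, 64):
        print(C, round(area_per_cluster(C, **params), 3))
```

With these placeholder values the area per cluster first drops, as the micro-controller is amortized over more clusters, and then rises again once the $C^2$ network term takes over.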

Exploration

Intra-cluster scaling

Exploration

Inter-cluster scaling

end

• More details:
  • Scott Rixner, William J. Dally, Brucek Khailany, Peter Mattson, Ujval J. Kapasi, and John D. Owens. Register Organization for Media Processing. In Proceedings of the 6th International Symposium on High-Performance Computer Architecture (HPCA), pages 375–386, Toulouse, France, January 2000. IEEE Computer Society.
  • Brucek Khailany, William Dally, Scott Rixner, Ujval Kapasi, John Owens, and Brian Towles. Exploring the VLSI Scalability of Stream Processors. In Proceedings of the Ninth Symposium on High Performance Computer Architecture (HPCA), pages 153–164, Anaheim, California, USA, February 2003. IEEE Computer Society.