elementary µprocessor Nabil Chouba

Elementary µprocessor tutorial


Page 1: Elementary µprocessor tutorial

Elementary µprocessor
Nabil Chouba

Page 2: Elementary µprocessor tutorial

The Beginning : 1969

- Scientists felt that IC technology was not yet ready to support a computer on a chip.
- Busicom (Japan) and Intel: design of a calculator built from 12 chips.
- Ted Hoff decided that Intel could build it around a single-chip CPU, together with ROM and RAM.
- Ted Hoff designed the architecture.
- Federico Faggin did the logic and physical design (he later founded Zilog).
- Stan Mazor wrote the application programs.

=> The first commercial microprocessor : Intel 4004

4-bit CPU, 8-bit instructions (some take two words), 12-bit PC, about 2300 transistors

Page 3: Elementary µprocessor tutorial

Instruction set

Our elementary processor :

Instruction register IR (16 bits):

  bits 15..8 : Opcode
  bits 7..0  : Address

Assembly language   Machine coding               Functionality description
ADD Address         00000001 a7a6a5a4a3a2a1a0    ACC ← ACC + Mem[Address]
XOR Address         00000010 a7a6a5a4a3a2a1a0    ACC ← ACC xor Mem[Address]
STORE Address       00000011 a7a6a5a4a3a2a1a0    Mem[Address] ← ACC
LOAD Address        00000100 a7a6a5a4a3a2a1a0    ACC ← Mem[Address]
JUMP Address        00000101 a7a6a5a4a3a2a1a0    PC ← Address
JUMPZ Address       00000110 a7a6a5a4a3a2a1a0    If ACC = 0 then PC ← Address
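The instruction format on this slide can be illustrated with a small Python sketch that packs and unpacks a 16-bit instruction word. The opcode values are read off the machine-coding column in listing order, which is an assumption about how the flattened slide pairs codes with mnemonics; the helper names `encode` and `decode` are hypothetical.

```python
# Opcode values: assumed pairing of the machine-coding column with the mnemonics.
OPCODES = {"ADD": 0x01, "XOR": 0x02, "STORE": 0x03,
           "LOAD": 0x04, "JUMP": 0x05, "JUMPZ": 0x06}
MNEMONICS = {code: name for name, code in OPCODES.items()}


def encode(mnemonic, address):
    """Pack the opcode into IR[15:8] and the address into IR[7:0]."""
    if not 0 <= address <= 0xFF:
        raise ValueError("address must fit in 8 bits")
    return (OPCODES[mnemonic] << 8) | address


def decode(word):
    """Split a 16-bit instruction word into (mnemonic, address)."""
    return MNEMONICS[(word >> 8) & 0xFF], word & 0xFF


print(format(encode("ADD", 5), "016b"))  # 0000000100000101
print(decode(0x0607))                    # ('JUMPZ', 7)
```

With 8 address bits, an instruction can reach 256 memory words, which matches the 256-word RAM instantiated later in the deck.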

Page 4: Elementary µprocessor tutorial

Datapath : Main Components

[Block diagram: registers PC, MAR, MDR, ACC and IR, the ALU, and the Memory with ports A (address), D (dataIn) and Q (dataOut)]

Page 5: Elementary µprocessor tutorial

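Datapath : JUMP, Next inst

[Block diagram: PC, MAR, MDR, ACC, IR, ALU and Memory with ports A (address), D (dataIn) and Q (dataOut); a mux in front of PC selects between PC + 1 and the address field of IR]

IR format : Opcode in bits 15..8, Address in bits 7..0

JUMP : PC ← Address
Next inst : PC ← PC + 1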

Page 6: Elementary µprocessor tutorial

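Datapath : ADD, XOR, LOAD

[Block diagram: PC, MAR, MDR, ACC, IR, ALU and Memory with ports A (address), D (dataIn) and Q (dataOut); MDR = Mem[Adrs]; a mux in front of ACC selects between the ALU result and MDR]

IR format : Opcode in bits 15..8, Address in bits 7..0

ADD : ACC ← ACC + Mem[Address]
XOR : ACC ← ACC xor Mem[Address]
LOAD : ACC ← Mem[Address]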

Page 7: Elementary µprocessor tutorial

Datapath : STORE

[Block diagram: PC, MAR, MDR, ACC, IR, ALU and Memory with ports A (address), D (dataIn) and Q (dataOut)]

STORE : Mem[Address] ← ACC

Page 8: Elementary µprocessor tutorial

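Datapath Fetch inst : Get instruction from the RAM

Von Neumann architecture : the RAM holds both data and instructions

[Block diagram: PC, MAR, MDR, ACC, IR, ALU and memory ports A (address), D and Q (dataOut), annotated with the three fetch steps]

RAM contents :
  0: ADD 5
  …
  5: 520

Fetch steps (PC = 0) :
  1. MAR <= PC          (MAR = 0)
  2. MDR <= Mem[MAR]    (MDR = ADD 5)
  3. IR <= MDR          (IR = ADD 5)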
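The three fetch steps can be traced in a few lines of Python. This is only a behavioral sketch: registers are held in a dict, the memory is a dict keyed by address, and an instruction is stored as a `(mnemonic, address)` tuple instead of a 16-bit word. The function name `fetch_instruction` is hypothetical, and the PC increment is left out because the slide lists only these three transfers.

```python
def fetch_instruction(regs, mem):
    """Apply the three fetch transfers in order and return the registers."""
    regs["MAR"] = regs["PC"]          # step 1: MAR <= PC
    regs["MDR"] = mem[regs["MAR"]]    # step 2: MDR <= Mem[MAR]
    regs["IR"] = regs["MDR"]          # step 3: IR <= MDR
    return regs


# RAM contents from the slide: an instruction at address 0, data at address 5.
mem = {0: ("ADD", 5), 5: 520}
regs = fetch_instruction({"PC": 0, "MAR": None, "MDR": None, "IR": None}, mem)
print(regs["IR"])  # ('ADD', 5)
```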

Page 9: Elementary µprocessor tutorial

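Datapath Fetch Data : Get data from the RAM

[Block diagram: PC, MAR, MDR, ACC, IR, ALU and memory ports A (address), D and Q (dataOut); a second mux in front of MAR selects between PC and the address field of IR]

IR holds ADD 5 : Opcode (bits 15..8) = ADD, Address (bits 7..0) = 5

RAM contents :
  0: ADD 5
  …
  5: 520

Fetch data steps :
  1. MAR <= IR          (address field, MAR = 5)
  2. MDR <= Mem[MAR]    (MDR = 520)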
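The operand fetch can be sketched the same way. As before, this is a simplified Python model with assumed data structures (a dict of registers, a dict for the RAM, and IR held as a `(mnemonic, address)` tuple); `fetch_operand` is a hypothetical name.

```python
def fetch_operand(regs, mem):
    """Load the word addressed by the IR address field into MDR."""
    _, address = regs["IR"]
    regs["MAR"] = address             # step 1: MAR <= IR (address field)
    regs["MDR"] = mem[regs["MAR"]]    # step 2: MDR <= Mem[MAR]
    return regs


mem = {0: ("ADD", 5), 5: 520}
regs = fetch_operand({"PC": 1, "MAR": 0, "MDR": None, "IR": ("ADD", 5)}, mem)
print(regs["MAR"], regs["MDR"])  # 5 520
```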

Page 10: Elementary µprocessor tutorial

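Datapath : Control Unit

[Block diagram: the datapath (PC, MAR, MDR, ACC, IR, ALU, +1 incrementer and muxes), the Memory with ports A (address), D (dataIn), Q (dataOut) and R/W, and the control unit CTR]

IR format : Opcode in bits 15..8, Address in bits 7..0

CTR inputs :
- Opcode (from IR)
- Zero flag Z

CTR outputs :
- loadPC, loadMAR, loadMDR, loadIR, loadACC
- muxPC, muxMAR, muxACC
- opALU
- R/W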

Page 11: Elementary µprocessor tutorial

ALU

[Figure: ALU with inputs a and b, an adder (+), an xor block and a mux that selects Result; shown as top-level RTL view and as netlist (gate view)]

The ALU can be implemented using standard cells or a hard macro.
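Functionally, the ALU on this slide is an adder and an xor block behind a 2-to-1 mux. A minimal Python model is sketched below, assuming 16-bit operands and the select encoding used by the control signals later in the deck (`opALU = 1` for add, `opALU = 0` for xor); the function name `alu` and the mask constant are illustrative.

```python
MASK16 = 0xFFFF  # 16-bit data path


def alu(a, b, op_alu):
    """Return a + b when op_alu is 1, a xor b otherwise, truncated to 16 bits."""
    result = a + b if op_alu == 1 else a ^ b
    return result & MASK16


print(alu(2, 1, 1))        # 3
print(alu(8, 8, 0))        # 0
print(alu(0xFFFF, 1, 1))   # 0 (the carry out of bit 15 is dropped)
```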

Page 12: Elementary µprocessor tutorial

Memory : SRAM

Pinout :
- A : address
- D : data input
- Q : data output
- R/W : read / write

Features :
- Row address decoder
- Sense amplifiers
- Column decoder/mux

Page 13: Elementary µprocessor tutorial

CTR: state machine (rst enters Fetch_1)

State         Operation                               Control signals
Fetch_1       MAR ← PC, PC ← PC+1                     muxMAR ← '0', muxPC ← '0', loadPC ← '1', loadMAR ← '1'
Fetch_2       MDR ← Mem[MAR]                          R/W ← '0', loadMDR ← '1'
Fetch_3       IR ← MDR                                loadIR ← '1'
Decode        decode(IR[15:8]), MAR ← IR (address)    muxMAR ← '1', loadMAR ← '1'
ExecADD_1     MDR ← Mem[MAR]                          R/W ← '0', loadMDR ← '1'
ExecADD_2     ACC ← ACC + MDR                         loadACC ← '1', muxACC ← '0', opALU ← '1'
ExecOR_1      MDR ← Mem[MAR]                          R/W ← '0', loadMDR ← '1'
ExecOR_2      ACC ← ACC xor MDR                       loadACC ← '1', muxACC ← '0', opALU ← '0'
ExecLoad_1    MDR ← Mem[MAR]                          R/W ← '0', loadMDR ← '1'
ExecLoad_2    ACC ← MDR                               loadACC ← '1', muxACC ← '1'
ExecStore_1   Mem[MAR] ← ACC                          R/W ← '1'
ExecJump      PC ← IR (address)                       muxPC ← '1', loadPC ← '1'

Transitions :
- Fetch_1 → Fetch_2 → Fetch_3 → Decode
- Decode → ExecADD_1 when Opcode = ADD
- Decode → ExecOR_1 when Opcode = XOR
- Decode → ExecLoad_1 when Opcode = LOAD
- Decode → ExecStore_1 when Opcode = STORE
- Decode → ExecJump when Opcode = JUMP, or Opcode = JUMPZ and Z
- Decode → Fetch_1 when Opcode = JUMPZ and not Z
- ExecADD_1 → ExecADD_2, ExecOR_1 → ExecOR_2, ExecLoad_1 → ExecLoad_2
- ExecADD_2, ExecOR_2, ExecLoad_2, ExecStore_1 and ExecJump → Fetch_1

Page 14: Elementary µprocessor tutorial

CTR: state machine (control signals per state)

rst → Fetch_1

Fetch_1     : muxMAR ← '0', muxPC ← '0', loadPC ← '1', loadMAR ← '1'
Fetch_2     : R/W ← '0', loadMDR ← '1'
Fetch_3     : loadIR ← '1'
Decode      : muxMAR ← '1', loadMAR ← '1'
ExecADD_1   : R/W ← '0', loadMDR ← '1'
ExecADD_2   : loadACC ← '1', muxACC ← '0', opALU ← '1'
ExecOR_1    : R/W ← '0', loadMDR ← '1'
ExecOR_2    : loadACC ← '1', muxACC ← '0', opALU ← '0'
ExecLoad_1  : R/W ← '0', loadMDR ← '1'
ExecLoad_2  : loadACC ← '1', muxACC ← '1'
ExecStore_1 : R/W ← '1'
ExecJump    : muxPC ← '1', loadPC ← '1'

Branches out of Decode :
- Opcode = ADD → ExecADD_1
- Opcode = XOR → ExecOR_1
- Opcode = LOAD → ExecLoad_1
- Opcode = STORE → ExecStore_1
- Opcode = JUMP, or JUMPZ and Z → ExecJump
- Opcode = JUMPZ and not Z → Fetch_1

Page 15: Elementary µprocessor tutorial

VHDL – FSM: Next State Processing

--next state processing
combinatory_FSM_next : process(state_reg, opcode, zflag)
begin
  state_next <= state_reg;  -- default: stay in the current state
  case state_reg is
    when Fetch_1 => state_next <= Fetch_2;
    when Fetch_2 => state_next <= Fetch_3;
    when Fetch_3 => state_next <= decode;

    when decode =>
      case opcode is
        when op_add   => state_next <= ExecADD_1;
        when op_or    => state_next <= ExecOR_1;
        when op_load  => state_next <= ExecLoad_1;
        when op_store => state_next <= ExecStore_1;
        when op_jump  => state_next <= ExecJump;
        when op_jumpz =>
          state_next <= Fetch_1;      -- not taken: fetch the next instruction
          if zflag = '1' then
            state_next <= ExecJump;   -- taken when ACC = 0
          end if;
        when others =>
      end case;

    when ExecADD_1   => state_next <= ExecADD_2;
    when ExecADD_2   => state_next <= Fetch_1;
    when ExecOR_1    => state_next <= ExecOR_2;
    when ExecOR_2    => state_next <= Fetch_1;
    when ExecLoad_1  => state_next <= ExecLoad_2;
    when ExecLoad_2  => state_next <= Fetch_1;
    when ExecStore_1 => state_next <= Fetch_1;
    when ExecJump    => state_next <= Fetch_1;
    when others =>
  end case;
end process;
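To check how many clock cycles each instruction takes, the next-state logic can be mirrored in Python. The sketch below is a table-driven model of the VHDL process, not the author's code: state and opcode names are plain strings, and the helper `trace` (a hypothetical name) walks the machine from `Fetch_1` until it is about to return there.

```python
# Unconditional transitions, taken from the VHDL case statement.
SEQUENTIAL = {
    "Fetch_1": "Fetch_2", "Fetch_2": "Fetch_3", "Fetch_3": "Decode",
    "ExecADD_1": "ExecADD_2", "ExecADD_2": "Fetch_1",
    "ExecOR_1": "ExecOR_2", "ExecOR_2": "Fetch_1",
    "ExecLoad_1": "ExecLoad_2", "ExecLoad_2": "Fetch_1",
    "ExecStore_1": "Fetch_1", "ExecJump": "Fetch_1",
}
# Opcode dispatch in the Decode state (XOR maps to the states named ExecOR_*).
DISPATCH = {"ADD": "ExecADD_1", "XOR": "ExecOR_1", "LOAD": "ExecLoad_1",
            "STORE": "ExecStore_1", "JUMP": "ExecJump"}


def next_state(state, opcode, zflag):
    if state != "Decode":
        return SEQUENTIAL[state]
    if opcode == "JUMPZ":
        return "ExecJump" if zflag else "Fetch_1"
    return DISPATCH.get(opcode, "Decode")  # unknown opcode: hold, like the VHDL default


def trace(opcode, zflag=False):
    """List the states visited while executing one instruction."""
    states, state = [], "Fetch_1"
    while True:
        states.append(state)
        state = next_state(state, opcode, zflag)
        if state == "Fetch_1":
            return states


print(trace("ADD"))          # ['Fetch_1', 'Fetch_2', 'Fetch_3', 'Decode', 'ExecADD_1', 'ExecADD_2']
print(len(trace("STORE")))   # 5
print(len(trace("JUMPZ")))   # 4 (branch not taken)
```

Under this model ADD, XOR and LOAD take six cycles, STORE and a taken jump take five, and a JUMPZ that is not taken takes four.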

Page 16: Elementary µprocessor tutorial

VHDL – FSM : Output Processing :

--output processing
combinatory_outproc : process(state_reg)
begin
  --default value
  muxPC   <= '0';
  muxMAR  <= '0';
  muxACC  <= '0';
  loadMAR <= '0';
  loadPC  <= '0';
  loadACC <= '0';
  loadMDR <= '0';
  loadIR  <= '0';
  opALU   <= '0';
  MemRW   <= '0';

  case state_reg is
    when Fetch_1 =>
      muxMAR  <= '0';
      muxPC   <= '0';
      loadPC  <= '1';
      loadMAR <= '1';
    when Fetch_2 =>
      MemRW   <= '0';
      loadMDR <= '1';
    when Fetch_3 =>
      loadIR  <= '1';
    when decode =>
      muxMAR  <= '1';
      loadMAR <= '1';
    when ExecADD_1 =>
      MemRW   <= '0';
      loadMDR <= '1';
    when ExecADD_2 =>
      loadACC <= '1';
      muxACC  <= '0';
      opALU   <= '1';
    when ExecOR_1 =>
      MemRW   <= '0';
      loadMDR <= '1';
    when ExecOR_2 =>
      loadACC <= '1';
      muxACC  <= '0';
      opALU   <= '0';
    when ExecLoad_1 =>
      MemRW   <= '0';
      loadMDR <= '1';
    when ExecLoad_2 =>
      loadACC <= '1';
      muxACC  <= '1';
    when ExecStore_1 =>
      MemRW   <= '1';
    when ExecJump =>
      muxPC   <= '1';
      loadPC  <= '1';
    when others =>
  end case;
end process;
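The output process is a Moore-style decoder: every signal gets a default of '0' and each state overrides only the signals it asserts. The same idea can be sketched in Python as a defaults dict plus a per-state override table. The table contents come from the VHDL above; the names `DEFAULTS`, `ACTIVE` and `outputs` are hypothetical.

```python
DEFAULTS = dict(muxPC=0, muxMAR=0, muxACC=0, loadMAR=0, loadPC=0,
                loadACC=0, loadMDR=0, loadIR=0, opALU=0, MemRW=0)

# Only the signals driven to '1' in each state are listed.
ACTIVE = {
    "Fetch_1": dict(loadPC=1, loadMAR=1),
    "Fetch_2": dict(loadMDR=1),
    "Fetch_3": dict(loadIR=1),
    "Decode": dict(muxMAR=1, loadMAR=1),
    "ExecADD_1": dict(loadMDR=1),
    "ExecADD_2": dict(loadACC=1, opALU=1),
    "ExecOR_1": dict(loadMDR=1),
    "ExecOR_2": dict(loadACC=1),
    "ExecLoad_1": dict(loadMDR=1),
    "ExecLoad_2": dict(loadACC=1, muxACC=1),
    "ExecStore_1": dict(MemRW=1),
    "ExecJump": dict(muxPC=1, loadPC=1),
}


def outputs(state):
    """Return the full control word for a state (defaults plus overrides)."""
    signals = dict(DEFAULTS)
    signals.update(ACTIVE.get(state, {}))
    return signals


asserted = sorted(name for name, value in outputs("ExecJump").items() if value)
print(asserted)  # ['loadPC', 'muxPC']
```

Assigning defaults first is what keeps the VHDL process free of inferred latches: every signal has a value on every path through the case statement.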

Page 17: Elementary µprocessor tutorial

Synthesis– CTR – Gate view

* synthesis done by Synplify tool

Page 18: Elementary µprocessor tutorial

VHDL – Datapath

u_alu : alu port map (
  A     => MDR_reg,
  B     => ACC_reg,
  opALU => opALU,
  Rout  => ALU_out);

MDR_next <= MemQ when loadMDR = '1' else
            MDR_reg;

IR_next <= MDR_reg when loadIR = '1' else
           IR_reg;

ACC_next <= MDR_reg when loadACC = '1' and muxACC = '1' else
            ALU_out when loadACC = '1' and muxACC = '0' else
            ACC_reg;

Page 19: Elementary µprocessor tutorial

VHDL – Datapath

-- IR layout as on the instruction-set slide: opcode in bits 15..8, address in bits 7..0
PC_next <= IR_reg(7 downto 0) when loadPC = '1' and muxPC = '1' else
           PC_reg + 1         when loadPC = '1' and muxPC = '0' else
           PC_reg;

MAR_next <= IR_reg(7 downto 0) when loadMAR = '1' and muxMAR = '1' else
            PC_reg             when loadMAR = '1' and muxMAR = '0' else
            MAR_reg;

MemAddr <= MAR_reg;
MemD    <= ACC_reg;
opCode  <= IR_reg(15 downto 8);

zflag <= '1' when ACC_reg = "0000000000000000" else '0';
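Each datapath register follows the same pattern: a load enable decides whether the register changes, and a mux select decides where the new value comes from. The Python sketch below models one rising clock edge under two stated assumptions: the address field is taken as the low byte of IR (as on the instruction-format slide), and PC and MAR are 8 bits wide. The helper names `signals`, `address_field` and `clock_edge` are hypothetical.

```python
def signals(**asserted):
    """Control word with every signal at 0 unless named in the call."""
    base = dict(muxPC=0, muxMAR=0, muxACC=0, loadMAR=0, loadPC=0,
                loadACC=0, loadMDR=0, loadIR=0, opALU=0)
    base.update(asserted)
    return base


def address_field(ir):
    # Assumption: address in IR[7:0]; change this if your IR layout differs.
    return ir & 0xFF


def clock_edge(regs, sig, mem_q):
    """Return the register values after one rising clock edge."""
    nxt = dict(regs)
    addr = address_field(regs["IR"])
    alu_out = (regs["MDR"] + regs["ACC"] if sig["opALU"]
               else regs["MDR"] ^ regs["ACC"]) & 0xFFFF
    if sig["loadMDR"]:
        nxt["MDR"] = mem_q
    if sig["loadIR"]:
        nxt["IR"] = regs["MDR"]
    if sig["loadACC"]:
        nxt["ACC"] = regs["MDR"] if sig["muxACC"] else alu_out
    if sig["loadPC"]:
        nxt["PC"] = addr if sig["muxPC"] else (regs["PC"] + 1) & 0xFF
    if sig["loadMAR"]:
        nxt["MAR"] = addr if sig["muxMAR"] else regs["PC"]
    return nxt


regs = dict(PC=0, MAR=0, MDR=0, IR=0, ACC=0)
regs = clock_edge(regs, signals(loadPC=1, loadMAR=1), mem_q=0)  # Fetch_1
print(regs["PC"], regs["MAR"])  # 1 0
```

All next values are computed from the old register contents, so MAR receives the PC value from before the increment, as in the clocked hardware.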

Page 20: Elementary µprocessor tutorial

Synthesis– Datapath – RTL view

* synthesis done by Synplify tool

Page 21: Elementary µprocessor tutorial

ACC_reg & alu

MDR_reg & alu

zflag

MAR_reg

PC_reg (+1) IR_reg

Synthesis – Datapath – Gate view

* synthesis done by Synplify tool

Page 22: Elementary µprocessor tutorial

VHDL – µpros : Top level

u_ctr : ctr port map (
  clk     => clk,
  rst     => rst,
  zflag   => zflag,
  opCode  => opCode,
  muxPC   => muxPC,
  muxMAR  => muxMAR,
  muxACC  => muxACC,
  loadMAR => loadMAR,
  loadPC  => loadPC,
  loadACC => loadACC,
  loadMDR => loadMDR,
  loadIR  => loadIR,
  opALU   => opALU,
  MemRW   => MemRW);

U_Memory : ram
  generic map (
    d_width    => 16,
    addr_width => 8,
    mem_depth  => 256)
  port map (
    clk  => clk,
    we   => MemRW,
    d    => MemD,
    q    => MemQ,
    addr => unsigned(MemAddr));

u_datapath : datapath port map (
  clk     => clk,
  rst     => rst,
  opCode  => opCode,
  muxPC   => muxPC,
  muxMAR  => muxMAR,
  muxACC  => muxACC,
  loadMAR => loadMAR,
  loadPC  => loadPC,
  loadACC => loadACC,
  loadMDR => loadMDR,
  loadIR  => loadIR,
  zflag   => zflag,
  opALU   => opALU,
  MemAddr => MemAddr,
  MemD    => MemD,
  MemQ    => MemQ);

Page 23: Elementary µprocessor tutorial

Synthesis – µpros : Top level

* synthesis done by Synplify tool

Page 24: Elementary µprocessor tutorial

VHDL – ALU

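library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
-- assumed package: provides "+" on STD_LOGIC_VECTOR operands
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity alu is
  Port ( A     : in  STD_LOGIC_VECTOR (15 downto 0);
         B     : in  STD_LOGIC_VECTOR (15 downto 0);
         opALU : in  STD_LOGIC;
         Rout  : out STD_LOGIC_VECTOR (15 downto 0));
end alu;

architecture RTL of alu is
begin
  -- opALU = '1' selects the adder, opALU = '0' selects xor
  Rout <= A + B when opALU = '1' else
          A xor B;
end RTL;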

Page 25: Elementary µprocessor tutorial

Synthesis – ALU – RTL view

Page 26: Elementary µprocessor tutorial

VHDL – RAM – coding style

entity ram is
  generic (
    d_width    : integer;
    addr_width : integer;
    mem_depth  : integer);
  port (
    clk  : in  STD_LOGIC;
    we   : in  STD_LOGIC;
    d    : in  STD_LOGIC_VECTOR(d_width - 1 downto 0);
    q    : out STD_LOGIC_VECTOR(d_width - 1 downto 0);
    addr : in  unsigned(addr_width - 1 downto 0));
end ram;

architecture RAM_arch of ram is
  type mem_type is array (mem_depth - 1 downto 0) of STD_LOGIC_VECTOR (d_width - 1 downto 0);
  signal mem : mem_type;
begin
  -- synchronous write port
  wr_port : process (clk)
  begin
    if (clk'event and clk = '1') then
      if (we = '1') then
        mem(conv_integer(addr)) <= d;
      end if;
    end if;
  end process wr_port;

  -- asynchronous read
  q <= mem(conv_integer(addr));
end RAM_arch;

ASIC (using a specific memory generator) :
- .db file for synthesis
- .hdl file for simulation

FPGA :
- Already supported by the FPGA provider
- The synthesis tool recognizes a specific coding style
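The RAM coding style on this slide has a write that happens only on a rising clock edge when `we` is '1', and a read that follows the address combinationally. A minimal Python model of that behavior is sketched below; the class name `Ram` and its method names are hypothetical, and the default depth and width are taken from the generics used in the top-level instantiation.

```python
class Ram:
    """Single-port RAM: synchronous write, asynchronous read."""

    def __init__(self, depth=256, width=16):
        self.mask = (1 << width) - 1
        self.mem = [0] * depth

    def read(self, addr):
        # q <= mem(addr): available without waiting for a clock edge
        return self.mem[addr]

    def clock(self, we, addr, d):
        # Rising clock edge: store d only when write enable is set
        if we:
            self.mem[addr] = d & self.mask


ram = Ram()
ram.clock(we=0, addr=13, d=3)   # write enable low: nothing is stored
print(ram.read(13))             # 0
ram.clock(we=1, addr=13, d=3)
print(ram.read(13))             # 3
```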

Page 27: Elementary µprocessor tutorial

Synthesis– RAM – Gate view

* synthesis done by Synplify tool

Page 28: Elementary µprocessor tutorial

Synthesis – Resource Usage Report

Mapping to part: xc3s1500fg456-4

Cell usage:
- FDC       14 uses
- FDCE      84 uses
- FDP       1 use
- GND       2 uses
- MULT_AND  15 uses
- MUXCY_L   22 uses
- MUXF5     16 uses
- RAM64X1S  64 uses
- XORCY     23 uses
- LUT2      1 use
- LUT3      107 uses
- LUT4      20 uses

I/O ports: 27

RAM/ROM usage summary
Single Port Rams (RAM64X1S): 64

Global Clock Buffers: 1 of 8 (12%)

Mapping Summary:
Total LUTs: 384 (1%)

* synthesis done by Synplify tool

Page 29: Elementary µprocessor tutorial

Synthesis – Worst Slack

Starting Clock    Requested Frequency   Estimated Frequency   Requested Period   Estimated Period   Slack
micorPross|clk    201.7 MHz             171.4 MHz             4.958              5.833              -0.875

Starting Points with Worst Slack
- u_ctr.state_reg_fast[6]
- u_datapath.ACC_reg_fast[0]
- u_datapath.ACC_reg[1]
- u_datapath.MDR_reg[1]
- u_datapath.MDR_reg_fast[0]
- u_datapath.ACC_reg[2]
- u_datapath.MDR_reg[2]
- u_datapath.ACC_reg[3]
- u_datapath.MDR_reg[3]
- u_datapath.ACC_reg[4]

Ending Points with Worst Slack
- u_datapath.ACC_reg[15]
- u_datapath.ACC_reg[14]
- u_datapath.ACC_reg[13]
- u_datapath.ACC_reg[12]
- u_datapath.ACC_reg[11]
- u_ctr.state_reg[11]
- u_datapath.ACC_reg[10]
- u_datapath.ACC_reg[9]
- u_datapath.PC_reg[0]
- u_datapath.PC_reg[1]
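The slack in the clock report is the requested period minus the estimated period, so a negative value means the design misses the requested clock. The short Python check below reproduces the reported numbers; it assumes the periods are in nanoseconds (the report excerpt does not state the unit), and the function names are illustrative.

```python
def period_ns(freq_mhz):
    """Clock period in ns for a frequency in MHz."""
    return 1000.0 / freq_mhz


def slack_ns(requested_period, estimated_period):
    """Positive slack: timing met. Negative slack: timing violated."""
    return requested_period - estimated_period


print(round(period_ns(201.7), 3))        # 4.958
print(round(slack_ns(4.958, 5.833), 3))  # -0.875
```

A slack of -0.875 means the estimated critical path is 0.875 longer than the requested period, so the design is expected to run at about 171.4 MHz instead of the requested 201.7 MHz.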

* synthesis done by Synplify tool

Page 30: Elementary µprocessor tutorial

Synthesis – Worst Path

Instance / Net Name              Type     Pin Name   Pin Dir   Delay   Arrival Time   Fan Out(s)
-------------------------------------------------------------------------------------------------
u_ctr.state_reg_fast[6]          FDC      Q          Out       0.720   0.720          -
opALU_fast                       Net      -          -         0.790   -              8
u_datapath.ACC_next_0_axb_0      LUT3     I0         In        -       1.510          -
u_datapath.ACC_next_0_axb_0      LUT3     O          Out       0.579   2.089          -
u_datapath.ACC_next_0_cry_0_0    MUX      S          In        -       2.089          -
u_datapath.ACC_next_0_cry_0_0    MUX      LO         Out       0.480   2.570          -
ACC_next_0_cry_0                 Net      -          -         0.000   -              2
... ACC_cry 1..2..3..4..5..6..7..8..9..10......15 …
u_datapath.ACC_next_0_s_15       XORCY    CI         In        -       3.410          -
u_datapath.ACC_next_0_s_15       XORCY    O          Out       0.883   4.293          -
u_datapath.ACC_next_0[15]        LUT3_L   I2         In        -       5.003          -
u_datapath.ACC_next_0[15]        LUT3_L   LO         Out       0.579   5.582          -
ACC_next[15]                     Net      -          -         0.000   -              1
u_datapath.ACC_reg[15]           FDCE     D          In        -       5.582          -
=================================================================================================

Total path delay (propagation time + setup) of 5.693 is 4.193 (73.6%) logic and 1.500 (26.4%) route.

* synthesis done by Synplify tool

Page 31: Elementary µprocessor tutorial

Address   Assembly language   Functionality description
0:        LOAD 11             ACC ← Mem[11]
1:        ADD 12              ACC ← ACC + Mem[12]
2:        STORE 13            Mem[13] ← ACC
3:        XOR 10              ACC ← ACC xor Mem[10]
4:        JUMPZ 7             If ACC = 0 then PC ← 7
5:        LOAD 13             ACC ← Mem[13]
6:        JUMP 1              PC ← 1
7:        JUMP 7              PC ← 7

10:       8                   Data
11:       2                   Data
12:       1                   Data

Machine coding :
0: 00000011 00001011
1: 00000001 00001100
2: 00000100 00001101
3: 00000010 00001100

Equivalent C code :
for (i=2; i!=8; i++);
for (;;);
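The program on this slide can be checked with a small instruction-level simulator. The Python sketch below is a simplification, not a model of the hardware: it keeps the program and the data in two separate dicts (the real design uses one RAM for both), works on `(mnemonic, address)` tuples instead of encoded words, and stops when it reaches a JUMP to its own address, which plays the role of `for (;;);`. The function name `run` and the `max_steps` guard are hypothetical.

```python
def run(program, data, max_steps=1000):
    """Execute the program and return (final data memory, values stored)."""
    mem = dict(data)
    acc, pc = 0, 0
    stored = []
    for _ in range(max_steps):
        op, addr = program[pc]
        if op == "JUMP" and addr == pc:
            break                      # idle loop reached
        pc += 1                        # default: next instruction
        if op == "LOAD":
            acc = mem[addr]
        elif op == "ADD":
            acc = (acc + mem[addr]) & 0xFFFF
        elif op == "XOR":
            acc ^= mem[addr]
        elif op == "STORE":
            mem[addr] = acc
            stored.append(acc)
        elif op == "JUMP":
            pc = addr
        elif op == "JUMPZ" and acc == 0:
            pc = addr
    return mem, stored


PROGRAM = {0: ("LOAD", 11), 1: ("ADD", 12), 2: ("STORE", 13), 3: ("XOR", 10),
           4: ("JUMPZ", 7), 5: ("LOAD", 13), 6: ("JUMP", 1), 7: ("JUMP", 7)}
DATA = {10: 8, 11: 2, 12: 1}

mem, stored = run(PROGRAM, DATA)
print(stored)    # [3, 4, 5, 6, 7, 8]
print(mem[13])   # 8
```

The XOR against Mem[10] = 8 acts as the loop test: the result is zero only when the counter equals 8, at which point JUMPZ 7 leaves the loop.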

Page 32: Elementary µprocessor tutorial

Overall Simulation

[Simulation waveform: i = 2, 3, 4, 5, 6, 7, 8, then hold]

for (i=2; i!=8; i++);
for (;;);

Page 33: Elementary µprocessor tutorial

Loop : 1

[Simulation waveform: RST, then LOAD 11, ADD 12, STORE 13, XOR 10, JUMPZ 7, LOAD 13, JUMP 1]

Page 34: Elementary µprocessor tutorial

LOAD 11

Page 35: Elementary µprocessor tutorial

ADD 12

Page 36: Elementary µprocessor tutorial

STORE 13

Page 37: Elementary µprocessor tutorial

XOR 10

Page 38: Elementary µprocessor tutorial

JUMPZ 7

Page 39: Elementary µprocessor tutorial

LOAD 13

Page 40: Elementary µprocessor tutorial

JUMP 1

Page 41: Elementary µprocessor tutorial

JUMPZ 7 & loop on JUMP 7