
North Bridge / South Bridge Architecture



Homework

• Reading
  – None (finish all previous reading assignments)

• Machine Projects
  – Continue with MP5

• Labs
  – Finish lab reports by deadline posted in lab


Hierarchy for 80286 Memory and I/O

• IBM PC-AT (“Advanced Technology” in 1984)

• DOS 3.0 Operating System

• PC-AT bus evolved into the Industry Standard Architecture (ISA) bus

• Many manufacturers built ISA-based PCs/cards

• ISA Bus
  – Slow: 6 MHz, evolved to 8 MHz (125 nsecs/cycle)
  – Address Bus: 20 bits
  – Data Bus: 16 bits


IBM PC-AT

Reference: http://www.vintage-computer.com/ibmpcat.shtml


Big Picture (80286)

[Figure: block diagram of the 80286 system.
Bus: ISA Bus, 20/16 bits, 8 MHz (125 nsecs/cycle).
Blocks: 80286, RAM Memory, ROM Memory, RTC, Keyboard, Serial Port, Parallel Port, Floppy Disk, Hard Disk.]


Hierarchy for 80486 Memory and I/O

• CPU Clock: 66 MHz

• Local Bus or CPU Bus
  – “Fast”: 33 MHz / 32 bits wide

• Expansion Bus Controller (CPU-ISA Bridge)

• ISA Bus (Legacy)
  – “Slow”: 8 MHz (125 nsecs/cycle)
  – Address Bus: 20 bits
  – Data Bus: 16 bits
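
The cycle times quoted for these buses follow directly from the clock rates. The sketch below reproduces that arithmetic for the two buses on this slide. The function names are hypothetical, and the throughput figure is only an upper bound that assumes one full-width transfer on every clock cycle, which a real ISA bus does not achieve.

```python
# Hypothetical helper names; clock rates and widths are the ones on the slide.

def cycle_time_ns(clock_mhz: float) -> float:
    """Length of one bus clock cycle in nanoseconds."""
    return 1000.0 / clock_mhz


def peak_mbytes_per_sec(clock_mhz: float, width_bits: int) -> float:
    """Upper bound on throughput, ASSUMING one transfer per clock cycle."""
    return clock_mhz * width_bits / 8


buses = {
    "ISA bus": (8.0, 16),          # 8 MHz, 16-bit data bus
    "486 local bus": (33.0, 32),   # 33 MHz, 32 bits wide
}

for name, (mhz, bits) in buses.items():
    print(f"{name}: {cycle_time_ns(mhz):.1f} ns/cycle, "
          f"at most {peak_mbytes_per_sec(mhz, bits):.0f} MB/s")
```

The gap between the two results is the reason the expansion bus controller exists: it lets the CPU and memory run at local-bus speed while legacy cards stay on the slow bus.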


Big Picture (80486)

[Figure: block diagram of the 80486 system.
Buses: local bus or CPU bus, fast (33 MHz, 32 bits) [30 nsec./cycle]; ISA bus, slow (8 MHz, 8/16 bits) [125 nsec./cycle].
Blocks: CPU, Memory, Cache, Video Adapter, Disk, Expansion Bus Controller, RTC, Keyboard, Serial Port, Parallel Port, Floppy Disk, System ROM.]


Competition for ISA replacement

• Many vendors proposed buses to replace ISA as the technology improved
  – IBM: Micro Channel Architecture (MCA)
  – Extended Industry Standard Architecture (EISA)
  – VESA Local Bus
  – Intel: Peripheral Component Interconnect (PCI)

• PCI had won the commercial battle by the mid-1990s

• For a while, PCs had a mix of ISA and PCI slots


Hierarchy for Pentium 4 Memory and I/O

• CPU Clock Speed
  – “Fast”: 2 – 2.5 GHz

• CPU “Front Side” Bus Speed
  – “Fast”: 533 MHz / 64 bits wide, evolved to 800 MHz

• CPU-PCI Bridge (“North Bridge”)

• PCI Bus (most prevalent peripheral bus after ISA)
  – “Medium Speed”: 33 or 66 MHz / 32 or 64 bits wide

• PCI-ISA Bridge (“South Bridge”)

• ISA Bus (most prevalent “Legacy” peripheral bus)
  – “Slow Speed”: 8 MHz / 20-bit address and 16-bit data


The Big Picture (Pentium)

[Figure: block diagram of the Pentium system.
Buses: CPU bus, fast (100 MHz, 64 bits) [10 nsec./cycle]; PCI bus, fast (33 MHz, 32/64 bits) [30 nsec./cycle]; ISA bus, slow (8 MHz, 8/16 bits) [125 nsec./cycle].
Blocks: Pentium CPU, Memory, Cache, Video Adapter, System ROM, PCI Controller, Disk, Expansion Bus Controller, RTC, Keyboard, Serial Port, Parallel Port, Floppy Disk.
Bridge labels: “North Bridge”, “South Bridge”.]


Motherboard Chipsets

• The motherboard chip set provides the core logic and manages the motherboard's functions.

• Several companies (including ATI, Intel, and nVidia) make motherboard chip sets, most of which offer the same basic features.

• Variants of nVidia's nForce4 chip set were the most widely used, though Intel's 975X Express has become increasingly popular on Intel-based motherboards.


North Bridge

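[Figure: Expansion Bus Controller (CPU-PCI), the “North Bridge”, sitting between the CPU Bus and the PCI Bus.]

• CPU Bus signals: M/IO#, D/C#, W/R#, AEN#, A31-A3, BE7# - BE0#, CLK, BRDY#, D31-D0

• PCI Bus signals: AD[31:0], C/BE#[3:0], FRAME#, TRDY#, IRDY#, STOP#, REQ#, GNT#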


South Bridge

[Figure: Expansion Bus Controller (PCI-ISA), the “South Bridge”, sitting between the PCI Bus and the ISA Bus.]

• PCI Bus signals: AD[31:0], C/BE#[3:0], FRAME#, TRDY#, IRDY#, STOP#, REQ#, GNT#

• ISA Bus signals: CLK, MEMR#, MEMW#, IOR#, IOW#, INTA#, A23-A0, D23-D0


Pentium 4 CPU Specifications

• The Pentium 4 Processor
  – Introduced: May 6, 2002
  – 512 KB level-two cache
  – Operating at 533 MHz “front side bus” speed
  – Available now at 2.53 GHz, 2.4 GHz and 2.26 GHz, priced at $637, $562 and $423, respectively, in 1,000-unit quantities
  – Benchmarks: SPECint*_base2000 score of 882; SPECfp*_base2000 score of 860
  – "Springdale" will have a FSB speed of 800 MHz


Pentium CPU Block Diagram


Enhancing Performance

• “Pipelining is an implementation technique in which multiple instructions are overlapped in execution” (Patterson and Hennessy, “Computer Organization and Design”, p. 436)

Sequential Execution:

[Figure: three Load Word instructions executed one after another. Each passes through Instruction Fetch, Reg Read, ALU Operation, Data Memory, and Reg Write; the interval marked between successive instructions is 8 ns.]

Pipelined Execution:

[Figure: the same three Load Word instructions overlapped in time. Each passes through the same five stages; the interval marked between successive instructions is 2 ns.]


Pipeline Example (Cont’d)

• Pipelining improves overall performance by increasing instruction throughput per unit time, not by decreasing the execution time of an individual instruction

• Ideal speedup is the number of stages in the pipeline

• Do we achieve this? Sometimes / Not always
  – Notice the idle time in the pipe at certain times
  – Flushing the pipeline during conditional jumps
  – Data being calculated by the previous instruction may be needed too early in the next instruction
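
The throughput argument can be checked with a little arithmetic. The sketch below compares total run time with and without pipelining. The function names are hypothetical, and the individual stage times are an assumption chosen to match the 8 ns instruction and 2 ns pipeline step in the Load Word example. It also assumes no stalls or flushes.

```python
# ASSUMED stage times in ns (IF, Reg Read, ALU, Data Memory, Reg Write).
# They sum to 8 ns, and the slowest stage sets the 2 ns pipeline step.
STAGES_NS = [2, 1, 2, 2, 1]


def sequential_time_ns(n_instructions, stage_times_ns):
    """Each instruction finishes all stages before the next one starts."""
    return n_instructions * sum(stage_times_ns)


def pipelined_time_ns(n_instructions, stage_times_ns):
    """Every stage takes one pipeline step; the step is the slowest stage.

    The first instruction needs len(stages) steps; each later instruction
    completes one step after the previous one (no stalls assumed).
    """
    step = max(stage_times_ns)
    return (len(stage_times_ns) + n_instructions - 1) * step


for n in (3, 1000):
    seq = sequential_time_ns(n, STAGES_NS)
    pipe = pipelined_time_ns(n, STAGES_NS)
    print(f"{n} instructions: {seq} ns sequential, {pipe} ns pipelined, "
          f"speedup {seq / pipe:.2f}")
```

With these assumed stage times the speedup for long instruction streams approaches 4 rather than 5, because the stages are not equally long. The ideal speedup equals the number of stages only when the stages are balanced.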


Superscalar Processors

[Figure: timing diagram of ten instructions, each passing through the four stages IF, OF, EX, OS, plotted against time in base cycles (0 to 9).]

• More than one execution pipeline executing in parallel

• Note: Possible coordination problems must be resolved
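
As a rough illustration of what extra pipelines buy, the sketch below counts base cycles for a four-stage pipeline at different issue widths. The function name and the `issue_width` parameter are hypothetical, and the model assumes that instructions never depend on each other, which is exactly the coordination problem the note above warns about.

```python
import math


def cycles_needed(n_instructions, n_stages, issue_width=1):
    """Base cycles to finish all instructions, ASSUMING no dependencies.

    issue_width is the number of parallel pipelines; instructions enter
    in groups of that size, one group per cycle.
    """
    groups = math.ceil(n_instructions / issue_width)
    return n_stages + groups - 1


# Ten instructions through four stages (IF, OF, EX, OS)
for width in (1, 2, 3):
    print(f"{width} pipeline(s): {cycles_needed(10, 4, width)} cycles")
```

Real superscalar processors fall short of these counts whenever two instructions in the same group need the same resource or one needs the other's result.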


Custom Digital Signal Processor

• Application is a hard real-time system: processing analog modem waveforms

• Computations based on complex arithmetic

• Rotation of a vector (e.g. carrier frequency)
  – Complex multiply

• Filtering of a sequence of signal samples
  – Loop doing complex multiply and addition
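
The two computations named above can be sketched in a few lines using Python's built-in complex type. This is only an illustration of the arithmetic, not of the custom DSP itself; the function names and sample values are made up for the example.

```python
import cmath


def rotate(sample: complex, angle_rad: float) -> complex:
    """Rotate a vector by one complex multiply with a unit-length phasor."""
    return sample * cmath.exp(1j * angle_rad)


def filter_output(coeffs, samples) -> complex:
    """One filter output: a loop of complex multiply and addition."""
    acc = 0j
    for c, x in zip(coeffs, samples):
        acc += c * x   # multiply-accumulate step
    return acc


# Rotating 1 + 0i by a quarter turn gives (approximately) 0 + 1i
print(rotate(1 + 0j, cmath.pi / 2))

# Two-tap example with made-up coefficients and samples
print(filter_output([1 + 0j, 1j], [2 + 0j, 3 + 0j]))
```

The inner loop is a multiply followed by an add on every sample, which is why the hardware is built around multiplier-accumulators.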


Digital Signal Processor Architecture

[Figure: block diagram of the digital signal processor.
Blocks: Signal Processing Controller (SPC), which fetches and executes instructions; SPC Program Memory; Address/Modulo Registers (R0, R1, N0, N1); Multiplier-Accumulator-RAM (Real); Multiplier-Accumulator-RAM (Imaginary); two ALU blocks and two Memory blocks; Controlling Host Processor (68000); Analog/Digital Converter (from phone line); Digital/Analog Converter (to phone line).
Buses and signals: Data Bus, Address Bus, Real MAR Chip Select, Imaginary MAR Chip Select.]


Addresses and Complex Data

• Addresses stored in SPC Registers
  – Pointers to complex data in MAR memories
  – Register post increment modes:
      Increment by one
      Decrement by one
      Increment by one modulo specified N register
      Decrement by one modulo specified N register

• Complex numbers stored in MAR memories:
  – Real part in one MAR (Real MAR)
  – Imaginary part in other MAR (Imaginary MAR)
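
The post-increment modes can be pictured with a small sketch. The function name is hypothetical, and the wrap-around rule is an assumption: the slides do not spell out the exact hardware behavior, so the modulo modes are modeled here as keeping the address in the range 0 to N-1, the way a circular buffer pointer would.

```python
def post_increment(addr, step=1, modulo=None):
    """Return the register value after an access.

    step is +1 or -1. ASSUMPTION: when a modulo N is given, the address
    wraps within 0..N-1; the real hardware may define the wrap differently.
    """
    nxt = addr + step
    return nxt % modulo if modulo else nxt


# Complex samples split across two memories, one address for both parts
real_mar = [1.0, 2.0, 3.0, 4.0]   # real parts
imag_mar = [0.5, 0.0, -0.5, 0.0]  # imaginary parts

r0 = 2
for _ in range(4):                 # walk a 4-entry circular buffer
    print(r0, complex(real_mar[r0], imag_mar[r0]))
    r0 = post_increment(r0, step=1, modulo=4)
```

Because one address register points at the same location in both MARs, a single post-increment advances to the next complex number.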


Multiplier-Accumulator-RAM

[Figure: block diagram of one Multiplier-Accumulator-RAM (MAR).
Blocks: X-Register, Y-Register, Multiplier, Adder, Accumulator, 256 Words of RAM Memory, two Selectors.
Signals: MPY, MAC, Address from SPC, Chip Select from SPC, To Other MAR, From Other MAR.]


SPC/MAR Assembly Programming

• Complex Multiply: (RR0 * RR1 - IR0 * IR1) + i (RR0 * IR1 + RR1 * IR0)

YPP.P  MP.R1    Load both Y from own memory at R1 address
MPY.P  MR.R0    Multiply both with real memory at R0 address
YMM.R  MI.R1    Load minus real Y from imag memory at R1 address
YPP.I  MR.R1    Load imag Y from real memory at R1 address
MAC.P  MI.R0    Multiply/add both with imag memory at R0 address
NOP             Result not ready yet: NOP or housekeeping
NOP             Result not ready yet: NOP or housekeeping
STA.P  MP.R0    Store each acc. in own memory at R0 address
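
To see why this instruction sequence yields the complex product, the listing can be traced in a small software model. Everything in the sketch below is an assumption inferred from the comments in the listing: the `Mar` class, the function name, and the exact meaning of each instruction are illustrative, not a description of the real hardware.

```python
class Mar:
    """Toy model of one Multiplier-Accumulator-RAM (names are hypothetical)."""

    def __init__(self):
        self.mem = [0.0] * 256   # 256 words of RAM
        self.y = 0.0             # Y register
        self.acc = 0.0           # accumulator


def complex_multiply(real, imag, r0, r1):
    """Trace the listing; the product overwrites the operand at address r0."""
    # YPP.P MP.R1: each MAR loads Y from its own memory at R1
    real.y, imag.y = real.mem[r1], imag.mem[r1]
    # MPY.P MR.R0: both multiply Y by the real-memory word at R0
    x = real.mem[r0]
    real.acc, imag.acc = real.y * x, imag.y * x
    # YMM.R MI.R1: real MAR loads minus the imaginary-memory word at R1
    real.y = -imag.mem[r1]
    # YPP.I MR.R1: imaginary MAR loads the real-memory word at R1
    imag.y = real.mem[r1]
    # MAC.P MI.R0: both multiply Y by the imaginary-memory word at R0
    # and add the product to their accumulators
    x = imag.mem[r0]
    real.acc += real.y * x
    imag.acc += imag.y * x
    # NOP, NOP: wait for the result on real hardware; nothing to model
    # STA.P MP.R0: each MAR stores its accumulator in its own memory at R0
    real.mem[r0], imag.mem[r0] = real.acc, imag.acc


real, imag = Mar(), Mar()
real.mem[0], imag.mem[0] = 1.0, 2.0   # operand at R0: 1 + 2i
real.mem[1], imag.mem[1] = 3.0, 4.0   # operand at R1: 3 + 4i
complex_multiply(real, imag, r0=0, r1=1)
print(real.mem[0], imag.mem[0])       # real and imaginary parts of the product
```

The real MAR ends up holding RR0 * RR1 - IR0 * IR1 and the imaginary MAR holds RR0 * IR1 + RR1 * IR0, which matches the formula at the top of the slide. The cross terms are why each MAR needs a path to the other MAR's memory.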