Manual for advanced microprocessors in engineering
Advanced Microprocessor
Experiment No. 01:
Study of internal components of CPU cabinet
AIM: To study the Internal Components of CPU Cabinet.
EQUIPMENT:
P IV 2 GHz, 512 MB RAM, 40 GB HDD,
15" Dell color monitor, optical mouse,
dot matrix printer (EPSON FX-2175).
THEORY:
Package Type:
DIP (Dual In-line Package):
(8086/88, Z80, 68000, 68010)
QFP (Quad Flat Package):
(Intel NG80386, PowerPC 601)
PGA (Pin Grid Array):
(Intel 386 DX, Cyrix Cx486DLC, AMD 486 DX, Intel 486 SX)
LCC (Leadless Chip Carrier):
(AMD R80186, Intel R80286-6, Siemens SAB 80188-R)
Department of computer Engineering, SIES Graduate School of Technology 1
PLCC (Plastic Leaded Chip Carrier):
(AMD N80C186, Harris CS80C286-16, Cyrix CX-83S87-16-JP)
Slot Packages:
(Intel Pentium III (Slot 1), Intel Celeron (Slot 1))
Motherboard
The motherboard is the main unit inside the cabinet; all other components are either mounted
on it or connected to it. It is mainly described by the processor slot/socket available on it.
Motherboards come in several form factors, such as AT and ATX.
Processor slots/sockets:
Socket / Slot | Pin count | Type | Supported Processors
Socket 1 | 169 | LIF/ZIF, PGA | Intel i486; AMD Am5x86 133 (w/ voltage adaptor); Cyrix Cx5x86 100/120 (w/ voltage adaptor)
Socket 2 | 238 | LIF/ZIF, PGA | Intel i486; Intel Pentium; AMD Am5x86 133 (w/ voltage adaptor); Cyrix 5x86 100/120 (w/ voltage adaptor)
Socket 3 | 237 | LIF/ZIF, PGA | Intel i486; Intel Pentium; AMD Am5x86 133; Cyrix 5x86 100/120
Socket 4 | 273 | LIF/ZIF, PGA | Intel Pentium P5 60/66; Intel Pentium OverDrive 120/133
Socket 5 | 296/320 | LIF/ZIF, SPGA | Intel Pentium P54C 75-133; Intel Pentium MMX P55C 166-233; AMD K5 PR75-133; AMD K6 166-300; Cyrix 6x86L PR120-166 (w/ voltage adaptor); Cyrix 6x86MX PR166-233 (w/ voltage adaptor); IDT Winchip
Socket 6 (uncommon) | 235 | ZIF, PGA | Intel i486 DX4 75-120
Socket 7 / Super 7 | 321 | ZIF, SPGA | Intel Pentium P54C; Intel Pentium MMX P55C; AMD K5 PR75-200; AMD K6; Cyrix 6x86; IDT Winchip
Socket 8 | 387 | LIF/ZIF, PGA/SPGA dual pattern | Intel Pentium Pro 150-200; Intel Pentium II
Slot 1 | 242 | SECC, SECC2, SEPP | Intel Celeron; Intel Pentium Pro; Intel Pentium II; Intel Pentium III
Slot 2 | 330 | SECC | Intel Pentium II Xeon 400/450 (Drake); Intel Pentium III Xeon 500/550 (Tanner); Intel Pentium III Xeon 600-1GHz (Cascades)
Socket 370 | 370 | ZIF, SPGA | Intel Celeron; Intel Pentium III; Cyrix III 533-667 (Samuel)
Slot A | 242 | SECC | AMD Athlon 500-700 (K7); AMD Athlon 550-1GHz (K75); AMD Athlon 700-1GHz (Thunderbird)
Socket A | 462 | ZIF, SPGA | AMD Duron 600-950; AMD Athlon; AMD Sempron
Socket 423 | 423 | ZIF, SPGA | Intel Pentium 4 1.3GHz; Intel Celeron 1.7GHz-1.8GHz
Socket 478 | 478 | ZIF, µPGA | Intel Celeron; Intel Pentium 4; Intel Pentium
Socket T | 775 | LGA | Intel Celeron D 325J (Prescott); Intel Pentium 4; Intel Pentium D; Intel Pentium Extreme
Socket 603/604 | 603/604 | ZIF, µPGA | Intel Xeon
PAC418 | 418 | VLIF | Intel Itanium 733-800MHz (Merced)
PAC611 | 611 | VLIF | Intel Itanium 2
Socket 754 | 754 | ZIF | AMD Athlon 64; AMD Sempron 2600+ to 3300+
Socket 940 | 940 | ZIF | AMD Athlon 64 FX-51 to FX-53 (Sledgehammer); AMD Opteron 140-150 (Sledgehammer)
Socket 939 | 939 | ZIF | AMD Athlon 64
Bus Slots
The various bus slots on the motherboard are:
ISA (Industry Standard Architecture)
PCI (Peripheral Component Interconnect)
AGP (Accelerated Graphics Port)
AMR (Audio Modem Riser)
The motherboard also carries the external connections for the onboard sound card, USB ports,
serial and parallel ports, PS/2 ports for the keyboard and mouse, and network and FireWire
connections.
RAM Slots
Several varieties of RAM module can be mounted on the motherboard:
SIMM (Single Inline Memory Module)
Supports EDO RAM
DIMM (Dual Inline Memory Module)
Supports SDRAM and DDR RAM
RIMM (Rambus Inline Memory Module)
Supports RDRAM
Cache Memory
Cache is an intermediate, or buffer, memory. The idea behind cache is that it should
function as a “near store” of fast RAM from which the CPU can always be supplied.
In practice there are always at least two such stores, called Level 1, Level 2 and (if
applicable) Level 3 cache.
Level 1 cache is built into the processor core itself. It is a piece of RAM, typically 8, 16, 20,
32, 64 or 128 KB, which operates at the same clock frequency as the rest of the CPU, so the
L1 cache can be regarded as part of the processor. L1 cache is normally divided into
two sections, one for data and one for instructions. For example, an Athlon processor may
have a 32 KB data cache and a 32 KB instruction cache. If a single cache holds both data
and instructions, it is called a unified cache.
The Level 2 cache is normally much bigger (and unified), such as 256, 512 or 1024 KB. Its
purpose is to constantly read in slightly larger quantities of data from RAM, so that these
are available to the L1 cache. The L2 cache is now integrated within the processor, which
makes it work much better with the L1 cache and the processor core.
The Level 2 cache takes up a large part of the chip’s die, as millions of transistors are needed
to make a large cache. The integrated cache is made of SRAM (static RAM), as opposed to
normal RAM, which is dynamic (DRAM).
Buses
Bus | Description
PC-XT, from 1981 | Synchronous 8-bit bus which followed the CPU clock frequency of 4.77 or 6 MHz. Bandwidth: 4-6 MB/sec.
ISA (PC-AT), from 1984 | Simple, cheap I/O bus. Synchronous with the CPU. Bandwidth: 8 MB/sec.
MCA, from 1987 | Advanced I/O bus from IBM (patented). Asynchronous, 32-bit, at 10 MHz. Bandwidth: 40 MB/sec.
EISA, from 1988 | Simple, high-speed I/O bus. 32-bit, synchronized with the CPU’s clock frequency: 33, 40, 50 MHz. Bandwidth: up to 160 MB/sec.
PCI, from 1993 | Advanced, general, high-speed I/O bus. 32-bit, asynchronous, at 33 MHz. Bandwidth: 133 MB/sec.
USB and FireWire, from 1998 | Serial buses for external equipment.
PCI Express, from 2004 | A serial bus for I/O cards with very high speed. Replaces PCI and AGP. 500 MB/sec. per channel.
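The bandwidth figures quoted for the parallel buses follow from simple arithmetic: bytes per transfer multiplied by transfers per second. The Java sketch below illustrates this under the assumption of one transfer per clock cycle and a PCI clock of 33.33 MHz; the class and method names are illustrative and do not come from this manual.

```java
final class BusBandwidthSketch {
    /**
     * Peak transfer rate in MB/sec, assuming one transfer per clock:
     * (bus width in bytes) x (clock frequency in MHz).
     */
    static double peakMBps(int widthBits, double clockMHz) {
        return widthBits / 8.0 * clockMHz;
    }

    public static void main(String[] args) {
        // 32-bit MCA at 10 MHz
        System.out.println("MCA: " + peakMBps(32, 10.0) + " MB/sec");
        // 32-bit PCI at an assumed 33.33 MHz clock
        System.out.println("PCI: " + Math.round(peakMBps(32, 33.33)) + " MB/sec");
    }
}
```

A bus that needs more than one clock cycle per transfer stays below this peak value.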
CONCLUSION:
Experiment No. 02:
Simulation of pipeline processing
AIM: Write a Program in Java to simulate a pipeline processing.
EQUIPMENT:
Internet, Books, PC.
THEORY:
Pipelining is the technique of prefetching the next task while the current task is executing. In a pipeline, a task is divided into subtasks, and each subtask is executed in one stage of the pipeline. In an instruction pipeline, the next instruction is prefetched while the current instruction is executing. A high-level language can be used to simulate this behaviour.
Algorithm:
Start
Display of vertical lines
Display of Instruction stages in pipelines
Movement of instructions one by one
End
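The algorithm above can be sketched in Java as a text-mode space-time diagram. This is only a minimal sketch: the five stage names (IF, ID, EX, MEM, WB), the ideal one-stage-per-cycle timing without stalls, and the class and method names are all assumptions made for illustration.

```java
final class PipelineSketch {
    // Assumed classic five-stage pipeline; the manual does not name the stages.
    static final String[] STAGES = {"IF", "ID", "EX", "MEM", "WB"};

    /** Cycles needed by n instructions in an ideal k-stage pipeline: k + n - 1. */
    static int totalCycles(int instructions, int stages) {
        return instructions == 0 ? 0 : stages + instructions - 1;
    }

    /** Stage occupied by instruction i (0-based) in cycle c (0-based), or "" if none. */
    static String stageAt(int instruction, int cycle) {
        int s = cycle - instruction;   // instruction i enters the pipeline in cycle i
        return (s >= 0 && s < STAGES.length) ? STAGES[s] : "";
    }

    /** One row per instruction, one column per cycle, separated by vertical lines. */
    static String render(int instructions) {
        StringBuilder sb = new StringBuilder();
        int cycles = totalCycles(instructions, STAGES.length);
        for (int i = 0; i < instructions; i++) {
            sb.append(String.format("I%-3d", i + 1));
            for (int c = 0; c < cycles; c++) {
                sb.append(String.format("|%-4s", stageAt(i, c)));
            }
            sb.append("|\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(render(4));   // instructions move one stage to the right per cycle
    }
}
```

Each row is shifted one column to the right of the row above it, which shows the instructions moving through the stages one by one.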
CONCLUSION:
Experiment No. 03:
Super Pipeline
AIM: Write a program to simulate Superscalar and Super Pipeline.
EQUIPMENT:
Internet, Books
PC.
THEORY:
In a superscalar architecture, the pipeline implementation exploits parallelism so that more
than one instruction is executed at a time. A two-issue superscalar pipeline issues two
instructions into the pipeline at a time, and a three-issue superscalar pipeline issues three.
This type of pipelining increases the throughput of the processor. Nowadays, 8-issue
superscalar structures have been developed. A superscalar processor fetches multiple
instructions at a time and attempts to find nearby instructions that are independent of
one another and can be executed in parallel. The essence of the superscalar approach is the
ability to execute instructions independently in different pipelines. In a super pipeline, many
pipeline stages need less than half a clock cycle, so doubling the internal clock speed allows
two tasks to be performed per external clock cycle.
Algorithm
Start
Display of vertical lines
Display of Instruction stages in pipeline
Movement of two instructions at a time (2-issue superscalar)
In the super pipeline, each stage of an instruction takes less than one cycle (each
stage completes in half a cycle)
End
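The timing behind this algorithm can be sketched in Java. The sketch assumes an ideal k-stage pipeline with no stalls; for the super pipeline it assumes that every stage takes 1/degree of a base clock cycle and that one instruction is issued per shortened cycle. The class and method names are hypothetical.

```java
final class IssueSketch {
    /** Cycle (0-based) in which instruction i (0-based) is issued when `width` issue per cycle. */
    static int issueCycle(int i, int width) {
        return i / width;
    }

    /** Cycles for n instructions on a k-stage pipeline issuing `width` instructions per cycle. */
    static int superscalarCycles(int n, int k, int width) {
        if (n == 0) return 0;
        int groups = (n + width - 1) / width;   // ceil(n / width) issue groups
        return k + groups - 1;
    }

    /**
     * Base clock cycles for n instructions on a k-stage super pipeline whose stages
     * each take 1/degree of a base cycle (assumed model, see lead-in).
     */
    static double superpipelineBaseCycles(int n, int k, int degree) {
        if (n == 0) return 0.0;
        return (k + n - 1) / (double) degree;
    }

    public static void main(String[] args) {
        int n = 6, k = 5;
        System.out.println("scalar:          " + superscalarCycles(n, k, 1));
        System.out.println("2-issue:         " + superscalarCycles(n, k, 2));
        System.out.println("super pipeline:  " + superpipelineBaseCycles(n, k, 2));
    }
}
```

A display routine can use `issueCycle` to decide which column each instruction starts in: with a width of 2, instructions 1 and 2 start together, then 3 and 4, and so on.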
CONCLUSION:
Experiment No. 04:
Data dependency hazards
AIM: Write a program to detect data dependency hazards.
EQUIPMENT:
Internet, Books
PC.
THEORY:
Dependencies among instructions must be removed in order to implement instruction-level parallelism (ILP). There are three types of data dependency, which must be identified and eliminated from the sequential flow of instructions:
True data dependency hazard (flow dependency / RAW hazard)
Eg:
R1 := R2 + R3
R4 := R1 - R5
Antidependency (WAR hazard)
Eg:
R1 := R2 + R3
R2 := 6
Output dependency hazard (WAW hazard)
Eg:
R1 := R2 + R3
R1 := R5
Algorithm
Start
Accept No of Instructions
Accept Source and destination for each instruction
For checking flow dependency, compare the destination of each instruction with the sources
of the later instructions sequentially.
For checking antidependency, compare the sources of each instruction with the destination of
the later instructions sequentially.
For checking output dependency, compare the destination of each instruction with the
destination of the later instructions.
Display the flow-dependent, antidependent and output-dependent instructions.
Display of Instruction stages in pipeline
Movement of two instructions at a time.(2-issue superscalar)
Three data dependency hazards are to be simulated
End
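The three comparisons in the algorithm can be sketched in Java as follows. The `Instr` record layout, the report strings and the class name are assumptions made for illustration. The sketch compares every pair of instructions and ignores intervening writes to the same register, so it may report more dependencies than a real pipeline would stall on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class HazardSketch {
    /** One instruction: a destination register and zero or more source registers. */
    static final class Instr {
        final String dest;
        final List<String> srcs;

        Instr(String dest, String... srcs) {
            this.dest = dest;
            this.srcs = Arrays.asList(srcs);
        }
    }

    /** Compares each instruction with every later one and reports RAW, WAR and WAW pairs. */
    static List<String> detect(List<Instr> program) {
        List<String> found = new ArrayList<>();
        for (int i = 0; i < program.size(); i++) {
            for (int j = i + 1; j < program.size(); j++) {
                Instr a = program.get(i);
                Instr b = program.get(j);
                String pair = " I" + (i + 1) + "->I" + (j + 1) + " on ";
                if (b.srcs.contains(a.dest)) found.add("RAW" + pair + a.dest);  // flow
                if (a.srcs.contains(b.dest)) found.add("WAR" + pair + b.dest);  // anti
                if (a.dest.equals(b.dest))   found.add("WAW" + pair + a.dest);  // output
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // R1 := R2 + R3 ; R4 := R1 - R5
        List<Instr> program = Arrays.asList(
                new Instr("R1", "R2", "R3"),
                new Instr("R4", "R1", "R5"));
        detect(program).forEach(System.out::println);
    }
}
```

The immediate value 6 in `R2 := 6` is not a register, so that instruction is modelled as `new Instr("R2")` with no sources.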
Output
CONCLUSION:
Experiment No. 05:
Simulation of Branch Prediction logic.
AIM: Write a program to simulate Branch Prediction logic.
EQUIPMENT:
Internet, Books PC
THEORY:
Branch prediction logic is used to minimize the penalty incurred by branch instructions: it reduces the time lost when the instruction queue must be flushed and refilled after a branch. The following diagram depicts the need for branch prediction logic.
BTB (Branch Target Buffer) is a lookup table with 256 entries (2^8 = 256, 2-way set-associative cache). Each entry holds:
Valid bit | Source Address | History bits | Target Address
The history bits can be in one of four states, on which the prediction is based:
00 ~ Strongly taken
01 ~ Weakly taken
10 ~ Weakly not taken
11 ~ Strongly not taken
Algorithm:
1. Look up the source address of the instruction in the lookup table.
a. if (Source Addr not found) // instruction encountered for the first time
{
Prediction is NO JUMP
if ( branch ) insert record into BTB with history bits ‘00’
else do nothing.
}
b. if (Source Addr found)
{
Prediction is JUMP / NO JUMP // based on history bits
if ( branch ) history bits are upgraded
else history bits are degraded
}
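The lookup-and-update steps can be sketched in Java as follows. This is a simplified illustration, not the program that produced the sample output: it uses an unbounded map instead of a 256-entry 2-way set-associative table, it omits the valid bit and target address, and it assumes that both "taken" states (00 and 01) predict JUMP. All class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class BtbSketch {
    // History encoding from the text: 0 = strongly taken, 1 = weakly taken,
    // 2 = weakly not taken, 3 = strongly not taken.
    private final Map<Integer, Integer> history = new HashMap<>();

    /**
     * Predicts the branch at sourceAddr, then updates the history bits with the
     * actual outcome. Returns the prediction (true = JUMP, false = NO JUMP).
     */
    boolean predictAndUpdate(int sourceAddr, boolean taken) {
        Integer h = history.get(sourceAddr);
        if (h == null) {                        // first encounter: predict NO JUMP
            if (taken) {
                history.put(sourceAddr, 0);     // insert with history bits 00
            }
            return false;
        }
        boolean prediction = h <= 1;            // assumed: 00 and 01 predict JUMP
        int next = taken ? Math.max(0, h - 1)   // upgrade towards strongly taken
                         : Math.min(3, h + 1);  // degrade towards strongly not taken
        history.put(sourceAddr, next);
        return prediction;
    }

    /** Current history bits for sourceAddr, or -1 if the address is not in the table. */
    int historyBits(int sourceAddr) {
        return history.getOrDefault(sourceAddr, -1);
    }

    public static void main(String[] args) {
        BtbSketch btb = new BtbSketch();
        boolean[] outcomes = {true, false, false, true};
        for (boolean taken : outcomes) {
            boolean predicted = btb.predictAndUpdate(0x100, taken);
            System.out.println((predicted ? "JUMP" : "NO JUMP") + " predicted, "
                    + (taken ? "taken" : "not taken") + ", history = " + btb.historyBits(0x100));
        }
    }
}
```

A program that predicts JUMP only in the strongly-taken state would differ from this sketch whenever the history bits are 01.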
Output :
Instructions in program are :
cmp x1,x2
Jump if x1 < x2
Enter x1 , x2 value : 35 45
Prediction is No JUMP
Branch taken Incorrect Prediction ……. History bits are strongly taken
Enter x1 , x2 value : 31 11
Prediction is JUMP
Branch not taken Incorrect Prediction ………History bits are weakly taken
Enter x1 , x2 value : 63 10
Prediction is NOJUMP
Branch not taken Correct Prediction………. History bits are weakly not taken
Enter x1, x2 value: 74 95
Prediction is NO JUMP
Branch taken Incorrect Prediction………. History bits are weakly taken
CONCLUSION:
Experiment No. 06:
Implementation of Page replacement algorithm
AIM: Write a Program to implement Page replacement algorithm.
EQUIPMENT:
Internet, Books PC
THEORY:
Whenever a page is required, it is first searched for in the cache. If it is not present, it is brought into the cache. If there is no free space in the cache, a resident page is replaced by the new page. Various techniques are used to choose the page to replace, such as FIFO, LRU, optimal and clock.
FIFO: in this technique the page that entered first is replaced.
Eg:
LRU: in this technique the least recently used page is replaced.
Eg:
Optimal: in this technique the page that will not be used for the longest period of time is replaced. It has the lowest page-fault rate of all algorithms and never suffers from Belady’s anomaly. 4 frames example:
1,2,3,4,1,2,5,1,2,3,4,5
How do you know which page this is? The algorithm is difficult to implement, as it requires prior knowledge of the reference string (like SJF in CPU scheduling). It is mainly used for comparison studies, to measure how well another algorithm performs.
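The three policies can be compared by counting page faults on the reference string 1,2,3,4,1,2,5,1,2,3,4,5. The Java sketch below is one possible way to do this; the class and method names are illustrative, and the code assumes at least one frame.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;

final class PageReplacementSketch {
    /** FIFO: evict the page that entered first. */
    static int fifoFaults(int[] refs, int frames) {
        Deque<Integer> queue = new ArrayDeque<>();
        int faults = 0;
        for (int page : refs) {
            if (queue.contains(page)) continue;          // hit
            faults++;
            if (queue.size() == frames) queue.removeFirst();
            queue.addLast(page);
        }
        return faults;
    }

    /** LRU: evict the least recently used page (set iterates from least to most recent). */
    static int lruFaults(int[] refs, int frames) {
        LinkedHashSet<Integer> recent = new LinkedHashSet<>();
        int faults = 0;
        for (int page : refs) {
            if (recent.remove(page)) {                   // hit: move to most-recent position
                recent.add(page);
                continue;
            }
            faults++;
            if (recent.size() == frames) {
                Integer oldest = recent.iterator().next();
                recent.remove(oldest);
            }
            recent.add(page);
        }
        return faults;
    }

    /** Optimal: evict the page whose next use lies farthest in the future. */
    static int optimalFaults(int[] refs, int frames) {
        List<Integer> resident = new ArrayList<>();
        int faults = 0;
        for (int i = 0; i < refs.length; i++) {
            if (resident.contains(refs[i])) continue;    // hit
            faults++;
            if (resident.size() == frames) {
                int victim = 0, farthest = -1;
                for (int r = 0; r < resident.size(); r++) {
                    int next = nextUse(refs, i + 1, resident.get(r));
                    if (next > farthest) { farthest = next; victim = r; }
                }
                resident.remove(victim);                 // removes by index
            }
            resident.add(refs[i]);
        }
        return faults;
    }

    /** Index of the next reference to page at or after `from`, or MAX_VALUE if never used again. */
    private static int nextUse(int[] refs, int from, int page) {
        for (int j = from; j < refs.length; j++) {
            if (refs[j] == page) return j;
        }
        return Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        int[] refs = {1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5};
        System.out.println("FIFO:    " + fifoFaults(refs, 4));
        System.out.println("LRU:     " + lruFaults(refs, 4));
        System.out.println("Optimal: " + optimalFaults(refs, 4));
    }
}
```

Running `fifoFaults` on the same string with 3 frames and then 4 frames shows Belady’s anomaly: the fault count rises when a frame is added.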
CONCLUSION:
Experiment No. 07:
PENTIUM Processors.
AIM: Study of PENTIUM processors.
EQUIPMENT:
Internet
Books
THEORY:
FEATURES OF P5:
64-bit data bus, which permits 8 bytes (4 words) to be transferred in a single bus cycle.
8 KB data cache and 8 KB instruction cache.
It has a two-issue superscalar architecture.
Parallel integer execution units consisting of the U pipeline and the V pipeline.
It has a branch target buffer and branch prediction logic.
It has an operating speed from 60 MHz to 200 MHz.
FEATURES OF P6:
It is a 32-bit Intel microprocessor.
Implements a dynamic execution micro-architecture with speculative and out-of-order
execution.
It has a 3-way superscalar architecture allowing execution of 3 instructions per clock
cycle.
It has two on-chip 8 KB L1 caches and a 256 KB L2 cache.
Its dynamic execution comprises micro data flow analysis, out-of-order execution,
superior branch prediction and speculative execution.
FEATURES OF PENTIUM 4:
It has a processing speed of 1.4 GHz.
It has a 20-stage hyper-pipelined technology.
It supports Hyper-Threading technology.
It expands the floating-point registers to a full 128 bits and adds an additional register
for data movement, which helps improve performance in both floating-point and
multimedia applications.
It has a built-in self-test to check whether all the attributes of the processor are working
properly.
13 new instructions in SSE3 are primarily designed to improve thread synchronization
and specific application areas such as media and gaming.
ARCHITECTURE OF PENTIUM 4:
COMPARISON BETWEEN PENTIUM PROCESSORS:
Sr no | Features | Pentium (P5) | Pentium Pro | Pentium II | Pentium 4
1 | Year Introduced | 1993 | 1995 | 1997 | 2000
2 | Processor Size | 32 bits | 32 bits | 32 bits | 32 bits
3 | Speed | 60 to 66 MHz (later up to 200 MHz) | 60 MHz (later up to 200 MHz) | 166 MHz (later up to 300 MHz) | 400 MHz (later up to 2.26 GHz)
4 | MIPS | 100 to 112 | 200 | 350 | 3000
5 | Address Bus Size | 32 | 32 | 32 | 32
6 | Data Bus Size | 64 | 64 | 64 | 64
7 | No. of transistors | 3.1 million | 5.5 million | 7.5 million | 77 million
8 | Addressable Memory | 4 GB | 64 GB | 64 GB | 64 GB
9 | Virtual memory | 64 TB | 64 TB | 64 TB | 64 TB
10 | L1 Cache | 16KB Split | 16KB Split | 32KB Split | 12K µ-opcodes + 8KB data
11 | L2 Cache | Off chip, not specified | 256KB to 1MB Unified | 512KB | On chip, 1MB ATC
12 | MMX instruction set | No | No | Yes | Yes
13 | Hyper threading support | No | No | No | Yes
14 | Architecture Family | P5 | P6 | P6 | NetBurst
15 | SMP (multiprocessor) support | No | No | No | Yes
16 | Integer pipeline stages | 5 | 14 | 5 | 20
17 | Floating point pipeline stages | 8 | - | 8 | 20
18 | Brief Description | Superscalar Architecture | Intel’s first true server/workstation chip | Dual independent bus, dynamic execution, MMX technology | Data transfer rate is 4.2 GB
CONCLUSION:
Experiment No. 08:
SPARC Architecture
AIM: Study of SPARC Architecture (V8).
EQUIPMENT:
Internet Books SPARC Manual
THEORY:
Scalable Processor ARChitecture, or SPARC
ATTRIBUTES
SPARC is a CPU instruction set architecture (ISA), derived from a reduced instruction set computer (RISC) lineage. As an architecture, SPARC allows for a spectrum of chip and system implementations at a variety of price/performance points for a range of applications, including scientific/engineering, programming, real-time, and commercial.
DESIGN GOALS
SPARC was designed as a target for optimizing compilers and easily pipelined hardware implementations. SPARC implementations provide exceptionally high execution rates and short time-to-market development schedules.
REGISTER WINDOWS
SPARC, formulated at Sun Microsystems in 1985, is based on the RISC I & II designs engineered at the University of California at Berkeley from 1980 through 1982. The SPARC “register window” architecture, pioneered in UC Berkeley designs, allows for straightforward, high-performance compilers and a significant reduction in memory load/store instructions over other RISCs, particularly for large application programs. For languages such as C++, where object-oriented programming is dominant, register windows result in an even greater reduction in instructions executed. Note that supervisor software, not user programs, manages the register windows. A supervisor can save a minimum number of registers (approximately 24) at the time of a context switch, thereby optimizing context switch latency.
One difference between SPARC and the Berkeley RISC I & II is that SPARC provides greater flexibility to a compiler in its assignment of registers to program variables. SPARC is more flexible because register window management is not tied to procedure call and return (CALL and JMPL) instructions, as it is on the Berkeley machines. Instead, separate instructions (SAVE and RESTORE) provide register window management.
SPARC System Components
The architecture allows for a spectrum of input/output (I/O), memory management unit (MMU), and cache system sub-architectures. SPARC assumes that these elements are optimally defined by the specific requirements of particular systems. Note that they are invisible to nearly all user application programs and the interfaces to them can be limited to localized modules in an associated operating system.
SPARC includes the following principal features:
A linear, 32-bit address space.
Few and simple instruction formats: all instructions are 32 bits wide, and are aligned on 32-bit boundaries in memory. There are only three basic instruction formats, and they feature uniform placement of opcode and register address fields. Only load and store instructions access memory and I/O.
Few addressing modes: a memory address is given by either “register + register” or “register + immediate.”
Triadic register addresses: most instructions operate on two register operands (or one register and a constant), and place the result in a third register.
A large “windowed” register file: at any one instant, a program sees 8 global integer registers plus a 24-register window into a larger register file. The windowed registers can be described as a cache of procedure arguments, local values, and return addresses.
A separate floating-point register file: configurable by software into 32 single-precision (32-bit), 16 double-precision (64-bit), or 8 quad-precision (128-bit) registers, or a mixture thereof.
Delayed control transfer: the processor always fetches the next instruction after a delayed control-transfer instruction. It either executes it or not, depending on the control-transfer instruction’s “annul” bit.
Fast trap handlers: traps are vectored through a table, and cause allocation of a fresh register window in the register file.
Tagged instructions: the tagged add/subtract instructions assume that the two least-significant bits of the operands are tag bits.
Multiprocessor synchronization instructions: one instruction performs an atomic read-then-set-memory operation; another performs an atomic exchange-register-with-memory operation.
Coprocessor: the architecture defines a straightforward coprocessor instruction set, in addition to the floating-point instruction set.
The SPARC architecture also describes the following concepts: the instruction set, addressing modes, pipeline processing, the FPU, interrupts, bus cycles, the programming model, etc.
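The tagged add/subtract feature listed above can be illustrated with a small Java sketch. It assumes the SPARC V8 rule that a tagged add signals overflow when either operand has a nonzero tag (one of its two least-significant bits set) or when the 32-bit addition itself overflows. The class and method names are hypothetical, and the sketch models only the overflow condition, not the full instruction.

```java
final class TaggedAddSketch {
    /** True if the two least-significant (tag) bits of either operand are nonzero. */
    static boolean hasTag(int a, int b) {
        return ((a | b) & 0x3) != 0;
    }

    /** True if a + b overflows as a signed 32-bit addition. */
    static boolean arithmeticOverflow(int a, int b) {
        int sum = a + b;
        // Overflow occurs when both operands have the same sign and the sum's sign differs.
        return ((a ^ sum) & (b ^ sum)) < 0;
    }

    /** Overflow condition of a tagged add under the assumed V8 rule. */
    static boolean tagOverflow(int a, int b) {
        return hasTag(a, b) || arithmeticOverflow(a, b);
    }

    public static void main(String[] args) {
        System.out.println(tagOverflow(4, 8));   // both operands untagged, no overflow
        System.out.println(tagOverflow(5, 8));   // first operand carries tag bits 01
    }
}
```

Language runtimes can use such tag bits to mark values as small integers, so that an ordinary add on a non-integer value is caught by the overflow check rather than by an explicit type test.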
CONCLUSION: