collin-bryant
Introduction
Computer Architecture
Modern Computers
• Advancing at a rapid pace
  – Integral part of daily life
    • Almost every aspect of human civilization now depends on them!
  – Started out performing mathematical calculations
    • Reduced human errors
    • Increased speed
  – Evolved into complex, versatile machines
    • Store, retrieve, & process voluminous data
      – At very high speeds
    • Intuitive human-machine interface
      – Used for communication & entertainment as well
Managing Complexity
• Increased complexity poses many challenges
  – Complicates design, manufacturing, usage, and maintenance
• Solution: Multi-faceted Abstraction
  – Abstraction of data
    • Store & process digital rather than analog data
      – Better control over the electronics/physics of the machine
  – Abstraction of layers
    • Multi-tiered design
    • Each tier provides a more simplified view of the underlying physics
  – Abstraction of components
    • System built as a collection of sub-systems with well-defined behavior and functionality
    • Special interconnections between sub-systems
  – Abstraction of control
    • In the form of software
[Figure: layers of abstraction. Labels: Transistors, IC Chips, Cards & Devices, PC]
A peek under the hood of a PC
• Major components of a PC
  – Main board aka Motherboard
    • Main boards are Printed Circuit Boards (PCBs)
      – A PCB is a multi-layered, rigid plastic sheet with copper wires etched directly on it; ICs and other electronic components are mounted and soldered onto it.
  – Microprocessor/CPU
    • The brain and heart of the computer. It processes instructions to manipulate data
  – RAM or Main memory
    • RAM: Random Access Memory
    • High speed read/write memory for storing instructions and data
  – Buses: Collections of wires that interconnect components
    • FSB: Front-Side Bus interconnects the CPU and main memory (via the North Bridge)
Anatomy of a PC
[Figure: block diagram of a PC. Labeled blocks: CPU, North Bridge, RAM, Video Card, South Bridge, Audio, Hard Disk Drive (HDD), USB connectivity, Keyboard & Mouse, Super I/O, Network, PCI Slots, Parallel & Serial Ports, Floppy]
Common Units
• It is important to know and understand the common units used for
  – Time (base unit: second)
    • Derived units:
      – Millisecond (msec): 10^-3 seconds
      – Microsecond (usec): 10^-6 seconds
      – Nanosecond (nsec): 10^-9 seconds
      – Picosecond (psec): 10^-12 seconds
  – Frequency (base unit: Hertz or Hz)
    • Note that time and frequency are inversely related
    • Derived units:
      – Kilohertz (kHz): 10^3 Hz
      – Megahertz (MHz): 10^6 Hz
      – Gigahertz (GHz): 10^9 Hz
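Because time and frequency are inversely related, a clock frequency converts directly into a clock period. The following is a minimal sketch in C; the function name `period_ns` is an illustrative choice and does not come from the slides.

```c
#include <assert.h>

/* Period, in nanoseconds, of a clock running at freq_hz Hertz.
 * period (s) = 1 / frequency (Hz), and 1 second = 1e9 nanoseconds. */
double period_ns(double freq_hz) {
    assert(freq_hz > 0.0);   /* a zero or negative frequency has no period */
    return 1e9 / freq_hz;
}
```

For instance, `period_ns(1e9)` (a 1 GHz clock) gives a period of 1 nanosecond, and `period_ns(1e6)` (1 MHz) gives 1000 nanoseconds, that is, 1 microsecond.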
Common Units (Contd.)
• It is important to know and understand the common units used for
  – Memory size (base unit: byte)
    • Derived units:
      – Kilobyte (KB): 10^3 bytes
      – Megabyte (MB): 10^6 bytes
      – Gigabyte (GB): 10^9 bytes
      – Terabyte (TB): 10^12 bytes
      – Petabyte (PB): 10^15 bytes
    • Do not confuse the above units with KiB/MiB/GiB
      – Units with an “i” in the middle are powers of 1024 rather than 1000.
      – For example, 1 MiB = 1024^2 bytes.
  – Computational speed (base unit: FLOPS, floating point operations per second)
    • Derived units:
      – Megaflops (MFLOPS): 10^6 FLOPS
      – Gigaflops (GFLOPS): 10^9 FLOPS
      – Teraflops (TFLOPS): 10^12 FLOPS
      – Petaflops (PFLOPS): 10^15 FLOPS
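The difference between the decimal units (KB, MB, GB) and the binary units (KiB, MiB, GiB) grows with each step up. The sketch below computes both; the function names and the `power` parameter (1 for kilo/kibi, 2 for mega/mebi, and so on) are illustrative assumptions, not part of the slides.

```c
#include <assert.h>
#include <stdint.h>

/* Decimal units: 1000^power bytes (power 1 = KB, 2 = MB, 3 = GB, ...). */
uint64_t decimal_bytes(unsigned power) {
    assert(power <= 6);              /* 1000^6 still fits in 64 bits */
    uint64_t n = 1;
    for (unsigned i = 0; i < power; i++) {
        n *= 1000u;
    }
    return n;
}

/* Binary units: 1024^power bytes (power 1 = KiB, 2 = MiB, 3 = GiB, ...). */
uint64_t binary_bytes(unsigned power) {
    assert(power <= 6);              /* 1024^6 = 2^60 fits in 64 bits */
    return (uint64_t)1 << (10u * power);
}
```

With these definitions, 1 MB is 1,000,000 bytes while 1 MiB is 1,048,576 bytes, and 1 GiB exceeds 1 GB by 73,741,824 bytes.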
RAM
• Random Access Memory (RAM)
  – Volatile memory used to store data
    • Data is lost when power is lost
  – The memory most frequently accessed by the CPU
    • Therefore it must operate fast.
    • Typical access times are 20-70 nanoseconds (1 nanosecond = 10^-9 seconds)
  – A PC will not boot unless there is some RAM on the main board
  – Is available in a variety of sizes, technologies, and speeds
    • Latest and greatest is DDR3
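To put RAM access times in perspective, they can be expressed in CPU clock cycles. The sketch below is only an illustration: the function name is made up, and the 2 GHz clock used in the note that follows is a hypothetical value, not one taken from the slides.

```c
#include <assert.h>

/* Number of CPU clock cycles that elapse during one memory access.
 * cycles = latency (s) * frequency (Hz); with the latency in nanoseconds
 * and the frequency in GHz, the 1e-9 and 1e9 factors cancel. */
double cycles_per_access(double latency_ns, double freq_ghz) {
    assert(latency_ns >= 0.0 && freq_ghz > 0.0);
    return latency_ns * freq_ghz;
}
```

Under these assumptions, a 50 nanosecond access on a 2 GHz CPU spans 100 clock cycles, which is why the CPU benefits from faster memory close to it.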
North & South Bridge
• The North & South Bridges are special integrated circuits (ICs, aka chips) that interconnect various components on a computer
  – These are often called chipsets
    • Some computers (namely those from AMD) have the functionality of the North Bridge already fabricated on the CPU itself and don’t need a special chip for it.
  – They are specific to the microprocessor and the types of devices present on a computer
CPU
• The Central Processing Unit (CPU) aka Microprocessor (µP)
  – The brain of the computer
  – Its task is to execute instructions
    • It keeps executing instructions from the moment it is powered on to the moment it is powered off.
    • Execution of various instructions causes the computer to perform various tasks.
  – There is a wide range of CPUs
    • We will study CPUs in detail in this course
    • The dominant CPUs on the market today are from
      – Intel: Pentium (Desktop), Xeon (Server), Pentium-M (Mobile), XScale (Embedded)
      – AMD: Athlon (Desktop), Opteron (Server), Turion (Mobile), Geode (Embedded)
Key Characteristics of a CPU
• Several metrics are used to describe the characteristics of a CPU
  – Native “word” size
    • 32-bit or 64-bit
  – Clock speed
    • The clock speed of a CPU defines the rate at which the internal clock of the CPU operates
      – The clock sequences operations and determines the speed at which instructions are executed by the CPU
    • Modern CPUs operate in the gigahertz range (1 GHz = 10^9 Hz)
    • However, clock speed alone does not determine the overall ability of a CPU
  – Instruction set
    • More versatile CPUs support richer and more efficient instruction sets
    • Modern CPUs have instruction set extensions such as SSE, SSE2, and 3DNow! that can be used to boost the performance of many common applications
      – These instructions perform the work of several instructions but take less time
  – FLOPS: Floating Point Operations per Second
    • FLOPS are a better metric for measuring a CPU’s computational capabilities
    • Modern CPUs deliver 2-3 gigaflops (1 GFLOPS = 10^9 FLOPS)
    • FLOPS is also used as a metric for describing the computational capabilities of computer systems
      – Modern supercomputers deliver 1 teraflop (1 TFLOPS = 10^12 FLOPS).
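Since FLOPS is a rate, the time needed for a fixed amount of floating-point work follows by division. The sketch below assumes the machine sustains its rated FLOPS for the whole computation, which real programs rarely achieve; the function name is illustrative.

```c
#include <assert.h>

/* Seconds needed to perform `operations` floating-point operations on a
 * machine that sustains `flops` floating-point operations per second. */
double seconds_for(double operations, double flops) {
    assert(operations >= 0.0 && flops > 0.0);
    return operations / flops;
}
```

For example, 10^12 operations take 500 seconds at 2 GFLOPS but only 1 second at 1 TFLOPS.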
Key Characteristics of a CPU (Contd.)
• Several metrics are used to describe the characteristics of a CPU
  – Power consumption
    • Power is an important factor in today’s computing
    • Power is the product of the voltage applied to the CPU (V) and the amount of current drawn by the CPU (I)
      – The unit for voltage is the volt
      – The unit for current is the ampere (aka amp)
      – Power = V x I
      – Power is typically expressed in watts
      – Modern desktop processors consume anywhere from 35 to 100 watts
    • Lower power is better, as power is proportional to heat
      – More power implies the processor generates more heat, and heating is a big problem
  – Number of computational units (cores) per CPU
    • Some CPUs have multiple cores
    • Each core is an independent computational unit and can execute instructions in parallel
    • The more cores a CPU has, the better it is for multi-threaded applications
      – Single-threaded applications typically experience a slowdown on multi-core processors due to reduced clock speeds
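The relation Power = V x I can be applied in either direction. The following is a small sketch with made-up function names; the 1.25 V and 52 A figures in the note below are hypothetical values chosen to land inside the 35 to 100 watt range mentioned above.

```c
#include <assert.h>

/* Power in watts drawn by a CPU: P = V x I. */
double power_watts(double volts, double amps) {
    return volts * amps;
}

/* Current in amperes for a given power and supply voltage: I = P / V. */
double current_amps(double watts, double volts) {
    assert(volts != 0.0);   /* avoid division by zero */
    return watts / volts;
}
```

A CPU supplied with 1.25 V and drawing 52 A would consume 65 watts. Note how large the current is: at such low voltages, tens of amperes must be delivered to the chip.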
Key Characteristics of a CPU (Contd.)
• Several metrics are used to describe the characteristics of a CPU
  – Cache size and configuration
    • Cache is a small but high-speed memory that is fabricated along with the CPU
      – Cache size and speed trade off against each other: larger caches are slower
      – Cost of the CPU increases as size of cache increases
    • It is much faster than RAM
    • It is used to minimize the overall latency of accessing RAM
    • Microprocessors have a hierarchy of caches
      – L1 cache: Fastest and closest to the core components
      – L3 cache: Relatively slower and further away from the core
  – Example cache configurations (See comparative die images):
    • Quad core AMD Opteron (Shanghai)
      – 64 KB (Data) + 64 KB (Instr.) L1 cache per core
      – Unified 512 KB L2 cache per core
      – Unified 6 MB shared L3 cache (for 4 cores)
    • Quad core Intel Xeon (Nehalem)
      – 32 KB (Data) + 32 KB (Instr.) L1 cache per core
      – Unified 256 KB L2 cache per core
      – Unified 8 MB shared L3 cache (for 4 cores)
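The way a cache reduces the overall latency of accessing RAM can be estimated with the standard textbook average-access-time formula. This formula is not stated in the slides; the sketch below assumes a single cache level in front of RAM, and the hit rate and latencies used in the note are hypothetical.

```c
#include <assert.h>

/* Average memory access time with one cache level in front of RAM.
 * Every access pays the cache latency; only misses additionally pay
 * the RAM latency:  average = cache_ns + (1 - hit_rate) * ram_ns  */
double avg_access_ns(double hit_rate, double cache_ns, double ram_ns) {
    assert(hit_rate >= 0.0 && hit_rate <= 1.0);
    return cache_ns + (1.0 - hit_rate) * ram_ns;
}
```

With a 2 ns cache in front of 50 ns RAM, a hit rate of 0.5 gives an average of 27 ns. A hit rate of 1.0 gives just the cache latency, and a hit rate of 0.0 gives the cache latency plus the full RAM latency.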
Trends in computing
• Until recently, hardware performance improvements were achieved primarily through advances in microprocessor fabrication technologies:
  – Steady improvement in processor clock speeds
    • Faster clocks (within the same family of processors) provide higher FLOPS
  – Increase in number of transistors on-chip
    • More complex and sophisticated hardware to improve performance
    • Larger caches to provide rapid access to instructions and data
Moore’s Law
• The steady advancement in microprocessor technology was predicted by Gordon Moore (in 1965), co-founder of Intel
  – Moore’s law states that the number of transistors on microprocessors will double approximately every two years.
    • Many advancements in digital technologies can be linked to Moore’s law. These include:
      – Processing speed
      – Memory capacity
      – Speed and bandwidth of data communication networks
      – Resolution of monitors and digital cameras
  – Thus far, Moore’s law has steadily held true for about 40 years (from 1965 to about 2005)
  – Breakthroughs in miniaturization of transistors have been the key enabling technology
    • See comparative technical video (Courtesy Intel)
    • See trends from a non-technical perspective (Courtesy Intel)
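Doubling every two years compounds quickly. The sketch below projects a transistor count forward under that rule; the function name and the starting count in the note are hypothetical, and only whole two-year periods are counted.

```c
#include <assert.h>
#include <stdint.h>

/* Projected transistor count after `years`, assuming the count doubles
 * once every two years (partial periods are ignored). */
uint64_t moore_projection(uint64_t start_count, unsigned years) {
    unsigned doublings = years / 2;
    /* Guard against shifting past 64 bits or overflowing the result. */
    assert(doublings < 64 && start_count <= (UINT64_MAX >> doublings));
    return start_count << doublings;
}
```

Starting from a hypothetical 1,000 transistors, 10 years of doubling gives 32,000. Over 40 years there are 20 doublings, a growth factor of 1,048,576 (roughly a million).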
Moore’s Law vs. Intel’s roadmap
• Here is a graph illustrating the progress of Moore’s law based on Intel’s technological roadmap (obtained from Wikipedia)
Stagnation of Moore’s Law
• In the past few years we have reached the fundamental limits of IC fabrication technology (particularly lithography and interconnect)
  – It is no longer feasible to further miniaturize the transistors on the IC
    • They are already only a few atoms across, and at this scale the laws of physics behave differently, making further miniaturization extremely challenging
  – Heat dissipation has reached the breakdown threshold
    • Heat is generated as a part of regular transistor operation
    • Higher heat dissipation will cause the transistors to fail
• With the current state of the art, a single processor cannot yield more than 4 to 5 GFLOPS
  – How do we move beyond this barrier?
Multi-core and Multi-processors
• The solution to increasing the effective compute power is the use of multi-core, multi-processor computer systems along with suitable software
  – This is a paradigm shift in hardware and software technologies
  – Multiple cores and multiple processors are interconnected using a variety of high-speed data communication networks
  – Software plays a central role in harnessing the power of multi-core/multi-processor systems
• Most industry leaders believe this is the near future of computing!
Multi-core Trends
• Multi-core processors are most definitely the future of computing
  – Both Intel and AMD are pushing for a larger number of cores per CPU package
  – The Cell Broadband Engine (aka Cell) has 8 Synergistic Processing Elements (SPEs)
  – The Sun Microsystems Niagara has 8 cores, with each core capable of running 8 threads
• Here is a short video from Intel demonstrating their proof-of-concept, next generation Tera-chip designed to deliver a teraflop of compute power from a single CPU package.
Manufacturing ICs
• Manufacturing of Integrated Circuits (ICs) is a complex process
  – Almost all the phases are completely automated
    • Use Computer Aided Manufacturing (CAM)
    • Manufacturing plants where the ICs are fabricated are called “Fabs”
  – Everything is computerized and software driven
    • Extensive Computer Aided Design (CAD)
    • Testing of designs is performed using simulations
CPU Manufacturing Process
[Figure: flow diagram of the CPU manufacturing process. Labeled stages: Computer Aided Design (CAD); Simulation-based Testing; Mask Design for Photolithography; Silicon Manufacturing; Silicon Wafer; Multiple phases of Photolithography; Wafer with multiple CPUs; Testing & Dicer; Packaging; Tested Dies; Testing & Shipping to OEMs & Retail]
Computer Architecture
• The science that enables effective engineering of digital computers.
  – This is a complex field that spans various levels of hierarchy
    • Component level
    • Device level
    • Device interconnection
    • CPU design
      – This is where most of the R&D work lies
    • Development of firmware and BIOS
Why do we need Comp. Arch.?
• Effective use of modern computers
  – Optimal software development
    • Crucial for embedded computing
    • Important for system programming
      – Development of operating systems
      – Design of device drivers and custom software
    • Tap into high performance features
    • Avoid hidden bottlenecks
      – Know hidden pitfalls
  – Prudent economic investment
    • Know what you need & buy what you need
      – Don’t waste your money
    • Know what you are buying
  – Design & development of hardware
Goals of this class
• To understand:
  – Design of a PC
  – Working and functionality of various components and subsystems in a PC
• To develop an initial set of skills to program a computer at a low level via assembly
  – It is an important part of developing a hardware-software interface using a hierarchy of language translators
Hardware-Software Interface
• The interface between hardware and software is achieved using a hierarchy of translators
  – Hierarchy helps to balance:
    • Portability & Interoperability
    • Development overhead vs. performance
    • Design & manufacturing
Typical Hierarchy:
Program/software in a High-level Language (C/C++)
Compiler
Translated code in Assembly
Assembler
Machine Language
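if (a > b) {
    c = a;
} else {
    c = b;
}

        cmp a,b         # compare a and b
        jg  elsePart    # conditional jump to the else branch
        mov a,c         # c = a (operands read as source, destination)
        jmp endif       # skip over the else branch
elsePart:
        mov b,c         # c = b
endif: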
000101111010101000100101010101000111010101011010101010101010101
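The step from the high-level if/else to assembly can be pictured by rewriting the same logic in C using only a test and explicit jumps, which is roughly the shape a compiler lowers it to. This is a hand-written sketch, not actual compiler output; the function name `select_larger` is hypothetical, and the labels are borrowed from the assembly listing above.

```c
/* Same result as:  if (a > b) { c = a; } else { c = b; }
 * but written with an explicit test and jumps, in the style of assembly. */
int select_larger(int a, int b) {
    int c;
    if (!(a > b)) goto elsePart;   /* compare, then conditional jump */
    c = a;                         /* "then" branch */
    goto endif;                    /* unconditional jump over the else branch */
elsePart:
    c = b;                         /* "else" branch */
endif:
    return c;
}
```

Calling `select_larger(5, 3)` or `select_larger(3, 5)` yields 5 in both cases. Each `goto` corresponds to one jump instruction, which is why the assembly version needs two labels.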
Semantic Gap
• Semantic gap is a term that is used to describe
  – The disconnect between a high level programming language (such as Java, C++, or C) and the underlying hardware architecture
  – It refers to the distinction between the conceptual view (from a programmer’s perspective) and the actual operations of a CPU
• Small semantic gap is desirable
  – High level programming languages have large semantic gaps
    • Prevent effective use of underlying hardware
  – Assembly has a small semantic gap
    • Higher performance comes at the price of higher software development overheads
• The hierarchy of translators essentially attempts to bridge the semantic gap
  – Bridging the gap often requires significant human intervention
Plan of Action
• Study in a bottom-up fashion
  – Digital logic / Boolean algebra
    • Logic gates
    • Logic circuits (Interconnection of gates)
  – Number representation using digital logic
    • Memory circuits
    • Arithmetic circuits
  – Arithmetic & Logic circuits
  – Processor (or CPU)
    • Programming the CPU in Assembly language
      – We will spend a good deal of time here
    • Learn about performance enhancement strategies
      – Most computer architecture work lies in this area