Lecture 1: Introduction
Course Outline
The aim of this course:
Introduction to the methods and techniques of performance analysis of computer systems.
Solve computer performance analysis problems related to:
• measuring the performance of computer systems
• comparing computer systems
• predicting future performance under different configurations
• designing new applications that meet performance requirements
• planning capacity
Hands-on experiments on modern hardware/software systems
Course Outline
1. Introduction
2. Hardware and software aspects of computer systems
3. Performance metrics
4. Performance measurement tools and techniques
5. Benchmarking
6. Statistical analysis of performance experiments
7. Design of experiments
8. Processor performance
  • ALU
  • Pipelining
  • Optimizing program performance
9. Memory hierarchy
  • Cache performance
  • Optimizing program performance
10. Performance of multiprocessor systems
11. Simulation
12. Queueing Theory
Course Outline
Textbook:
• D. Lilja, “Measuring Computer Performance: A Practitioner's Guide”, Cambridge University Press
Reference Books:
• R. Jain, “The Art of Computer Systems Performance Analysis”, John Wiley
• P.J. Fortier, H.E. Michel, “Computer Systems Performance Evaluation and Prediction”, Digital Press
• K.R. Wadleigh, I.L. Crawford, “Software Optimization for High Performance Computing”, Prentice Hall
• R.E. Bryant, D.R. O’Hallaron, “Computer Systems: A Programmer’s Perspective”, Pearson
• J.L. Hennessy, D.A. Patterson, “Computer Architecture”, Morgan Kaufmann
• K.R. Wadleigh, I.L. Crawford, “High Performance Computing”, Prentice Hall
Course Outline
Grading:
• Assignments: 30%
• Midterm: 30%
• Final Exam: 40%
Performance Evaluation of Computer Systems
Computer systems consist of:
• Processor
• Memory
• Input/Output
• Operating system
• Network
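[Figure: a single computer, in which the processor exchanges instructions and data with memory and is connected to an input unit and an output unit; and a multiprocessor system, in which several processors (P) are connected by a network]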
Performance Evaluation of Computer Systems
Performance depends on:
• Technology
Technology
In recent years, microprocessors have become smaller and denser.
Year                   1945         2010
Computer               ENIAC        Laptop
Devices                18 000       17 000 000 000
Weight (kg)            27 200       2.8
Size (m³)              68           0.0018
Power (watts)          20 000       5.5
Cost ($)               4 630 000    1 000
Memory (bytes)         200          2 147 483 648
Performance (flop/s)   800          2 000 000 000
Moore’s Law
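Gordon Moore observed in 1965 that the number of transistors on a semiconductor chip was doubling roughly every year, and in 1975 he revised the rate to a doubling every two years. The often-quoted 18-month figure refers to chip performance rather than transistor count.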
Moore’s Law
• Number of transistors
• Performance
Both double roughly every 1.5 years.
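A doubling rate can be checked against any two data points: if a quantity grows from v0 to v1 over t years, it has doubled log2(v1/v0) times, so the average doubling period is t / log2(v1/v0). The sketch below applies this to the ENIAC and laptop figures from the Technology table. The function name `doubling_time_years` is an illustrative choice, not part of the course material.

```python
import math


def doubling_time_years(v0: float, v1: float, years: float) -> float:
    """Average time (in years) for a quantity to double while growing from v0 to v1."""
    doublings = math.log2(v1 / v0)
    return years / doublings


# ENIAC (1945) versus laptop (2010), values copied from the Technology table.
YEARS = 2010 - 1945
devices = doubling_time_years(18_000, 17_000_000_000, YEARS)
performance = doubling_time_years(800, 2_000_000_000, YEARS)

print(f"devices doubled every {devices:.2f} years on average")
print(f"performance doubled every {performance:.2f} years on average")
```

With only these two machines the result is about three years per doubling. Two isolated systems of very different size and cost are a crude sample, so this is a sanity check of the arithmetic rather than a test of the law.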
Top500 List, June 2013
Rank  Computer    Country  Vendor   Processor + accelerator + interconnect        Cores      Rmax (Pflop/s)  Rpeak (Pflop/s)
1     Tianhe-2    China    NUDT     Xeon 2.2 GHz + Xeon Phi + custom              3 120 000  33.9            54.9
2     Titan       USA      Cray     Opteron 2.2 GHz + Nvidia GPU + Cray Gemini    560 640    17.6            27.1
3     Sequoia     USA      IBM      BlueGene 1.6 GHz + custom                     1 572 864  17.2            20.1
4     K computer  Japan    Fujitsu  SPARC64 2.0 GHz + Tofu                        705 024    10.5            11.3
5     Mira        USA      IBM      BlueGene 1.6 GHz + custom                     786 432    8.6             10.1
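Rmax is the performance measured on the LINPACK benchmark and Rpeak is the theoretical peak, so their ratio shows how much of the peak each machine actually delivers. A minimal sketch using the Rmax and Rpeak values from the list above (the helper name `efficiency` is an assumption made for illustration):

```python
# (name, Rmax, Rpeak) in Pflop/s, copied from the June 2013 list.
systems = [
    ("Tianhe-2", 33.9, 54.9),
    ("Titan", 17.6, 27.1),
    ("Sequoia", 17.2, 20.1),
    ("K computer", 10.5, 11.3),
    ("Mira", 8.6, 10.1),
]


def efficiency(rmax: float, rpeak: float) -> float:
    """Fraction of the theoretical peak reached on the benchmark."""
    return rmax / rpeak


for name, rmax, rpeak in systems:
    print(f"{name:<12}{efficiency(rmax, rpeak):6.1%}")
```

The two accelerator-equipped systems at the top of the list reach roughly 60 to 65 percent of their peak, while the other three exceed 85 percent.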
Performance Units
Speed
1 Mflop/s   1 Megaflop/s   10^6 flop/second
1 Gflop/s   1 Gigaflop/s   10^9 flop/second
1 Tflop/s   1 Teraflop/s   10^12 flop/second
1 Pflop/s   1 Petaflop/s   10^15 flop/second
1 Eflop/s   1 Exaflop/s    10^18 flop/second
Storage
1 MB   1 Megabyte   10^6 bytes
1 GB   1 Gigabyte   10^9 bytes
1 TB   1 Terabyte   10^12 bytes
1 PB   1 Petabyte   10^15 bytes
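To get a feeling for these prefixes, the sketch below converts two figures quoted earlier in the lecture (a 2 Gflop/s laptop and the 33.9 Pflop/s Rmax of Tianhe-2) to plain flop/s and asks how long the laptop would need for the work Tianhe-2 performs in one second. The dictionary `PREFIX` and the function `to_flops` are illustrative names.

```python
# Decimal prefixes from the table above.
PREFIX = {"M": 1e6, "G": 1e9, "T": 1e12, "P": 1e15, "E": 1e18}


def to_flops(value: float, prefix: str) -> float:
    """Convert e.g. (33.9, 'P') to floating-point operations per second."""
    return value * PREFIX[prefix]


laptop = to_flops(2, "G")       # 2 Gflop/s
tianhe2 = to_flops(33.9, "P")   # 33.9 Pflop/s

# Seconds the laptop needs for one second of Tianhe-2 work.
seconds = tianhe2 / laptop
days = seconds / 86_400

print(f"{seconds:.3e} seconds, about {days:.0f} days")
```

Under these numbers, one second of Tianhe-2 corresponds to a little over half a year of laptop time.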
Moore’s Law
Limits of Moore’s Law:
Moore’s Law describes exponential growth, and exponential growth cannot last forever.
Heat is a problem in today’s CPUs
The size of atoms is the fundamental barrier
Moore’s Law Reinterpreted
Number of cores per chip doubles every 2 years, while clock speed decreases
• Multicore architectures
Performance Evaluation of Computer Systems
Performance depends on:
• Technology
• Instruction Set Architecture
Instruction Set Architecture (ISA)
Instruction Set Design:
RISC / CISC
• Code density
Number of operands
• Stack machines (0-operand)
• Accumulator machines (1-operand)
• Register machines (2-operand, 3-operand)
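The operand count changes how many instructions a computation needs. The sketch below evaluates d = (a + b) * c on three toy interpreters, one per machine style. The mnemonics (`PUSH`, `LOAD`, `ADD`, ...) and the interpreters are hypothetical teaching devices, not a real ISA, and the 3-operand version assumes the variables already sit in registers.

```python
def run_stack(program, memory):
    """0-operand machine: arithmetic pops its inputs from a stack."""
    stack = []
    for op, *args in program:
        if op == "PUSH":
            stack.append(memory[args[0]])
        elif op == "POP":
            memory[args[0]] = stack.pop()
        elif op in ("ADD", "MUL"):
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if op == "ADD" else left * right)
        else:
            raise ValueError(f"unknown instruction {op}")
    return memory


def run_accumulator(program, memory):
    """1-operand machine: every instruction works on an implicit accumulator."""
    acc = 0
    for op, addr in program:
        if op == "LOAD":
            acc = memory[addr]
        elif op == "STORE":
            memory[addr] = acc
        elif op == "ADD":
            acc += memory[addr]
        elif op == "MUL":
            acc *= memory[addr]
        else:
            raise ValueError(f"unknown instruction {op}")
    return memory


def run_three_operand(program, registers):
    """3-operand machine: dst = src1 op src2, all named explicitly."""
    for op, dst, src1, src2 in program:
        if op == "ADD":
            registers[dst] = registers[src1] + registers[src2]
        elif op == "MUL":
            registers[dst] = registers[src1] * registers[src2]
        else:
            raise ValueError(f"unknown instruction {op}")
    return registers


# d = (a + b) * c written for each machine style.
stack_prog = [("PUSH", "a"), ("PUSH", "b"), ("ADD",),
              ("PUSH", "c"), ("MUL",), ("POP", "d")]
acc_prog = [("LOAD", "a"), ("ADD", "b"), ("MUL", "c"), ("STORE", "d")]
reg_prog = [("ADD", "t", "a", "b"), ("MUL", "d", "t", "c")]

values = {"a": 2, "b": 3, "c": 4}
for name, run, prog in [("stack", run_stack, stack_prog),
                        ("accumulator", run_accumulator, acc_prog),
                        ("3-operand", run_three_operand, reg_prog)]:
    result = run(prog, dict(values))  # each run gets its own copy
    print(f"{name:<12}{len(prog)} instructions, d = {result['d']}")
```

Fewer operands per instruction means more instructions, but each one is shorter to encode, which is why operand count and code density are discussed together.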
Performance Evaluation of Computer Systems
Performance depends on:
• Technology
• Instruction Set Architecture
• Organization
Organization
Memory Hierarchy
Level                                                 Speed    Size
Within the processor (CPU registers, on-chip cache)   1 ns     Bytes
L2 cache (SRAM)                                       10 ns    KBytes
Main memory (DRAM)                                    100 ns   MBytes
Secondary storage (disk)                              10 ms    GBytes
Tertiary storage (tape/disk)                          10 s     TBytes
[Figure: the hierarchy drawn from top to bottom: CPU, registers, L1 cache, L2 cache, main memory, disk, tape]
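The gaps between levels are easier to grasp when rescaled. The sketch below takes the access times listed above and stretches them so that the fastest level takes one second instead of 1 ns. The dictionary keys and the function `scaled_seconds` are illustrative names.

```python
# Access times in seconds, taken from the memory hierarchy table.
latency_s = {
    "processor (registers, on-chip cache)": 1e-9,
    "L2 cache (SRAM)": 10e-9,
    "main memory (DRAM)": 100e-9,
    "secondary storage (disk)": 10e-3,
    "tertiary storage (tape)": 10.0,
}

SECONDS_PER_DAY = 86_400
fastest = min(latency_s.values())


def scaled_seconds(level: str) -> float:
    """Access time of a level if the fastest level took one second."""
    return latency_s[level] / fastest


for level in latency_s:
    s = scaled_seconds(level)
    print(f"{level:<40}{s:>16,.0f} s  ({s / SECONDS_PER_DAY:,.1f} days)")
```

On this scale a main-memory access takes under two minutes, a disk access about 116 days, and a tape access more than 300 years.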
Organization
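Manycore Chips
[Figure: single-core versus dual-core organization. Single-core: one CPU with registers and L1 cache, above an L2 cache and main memory. Dual-core: two CPUs, each with its own registers and L1 cache, sharing an L2 cache and main memory.]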
Performance Evaluation of Computer Systems
Performance depends on:
• Technology
• Instruction Set Architecture
• Organization
• Software
Software
The primary duty of software developers is to create functionally correct programs
Producing well-performing programs additionally requires performance evaluation as a part of software development
Performance Analysis Cycle
Include an optimization phase, just like the testing and debugging phases.
[Figure: code development yields a functionally complete and correct program, which enters a cycle of Measure, Analyze, and Modify / Tune; the outcome, a complete, correct and well-performing program, goes into usage]
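One pass through the cycle can be sketched in a few lines: start from a correct program, measure it, tune it, confirm that the result is unchanged, and measure again. The two functions below are made-up examples, and `measure` is a hypothetical helper built on the standard `timeit` module.

```python
import timeit


def sum_squares_loop(n):
    """Baseline: functionally correct, straightforward loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def sum_squares_formula(n):
    """Tuned version: closed form for 0^2 + 1^2 + ... + (n-1)^2."""
    return (n - 1) * n * (2 * n - 1) // 6


def measure(func, n, repeats=5):
    """Best-of-several timing (seconds for 10 calls) to reduce noise."""
    return min(timeit.repeat(lambda: func(n), number=10, repeat=repeats))


N = 10_000

# Tuning must not break correctness, so check that first.
assert sum_squares_loop(N) == sum_squares_formula(N)

baseline = measure(sum_squares_loop, N)     # Measure
tuned = measure(sum_squares_formula, N)     # Measure again after Modify / Tune

print(f"baseline: {baseline:.6f} s, tuned: {tuned:.6f} s")  # Analyze
```

Taking the minimum of several repetitions is one common way to damp interference from other processes; the statistical treatment of repeated measurements is a separate topic in the course outline.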
Systematic Approach to Performance Evaluation
1. Define the system
2. List services offered by the system
3. Select performance metrics
4. List system and workload parameters
5. Select factors and their values
6. Select evaluation technique
7. Select the workload
8. Design the experiment
9. Analyze the data
10. Present the results
An Example:
1. Define the system
[Figure: a client and a server connected by a network]
2. List services offered by the system
Service:
Remote procedure call
3. Select performance metrics
Metrics:
• Time taken for the service
  • Elapsed time
  • Local CPU time
  • Remote CPU time
• The rate at which the service can be performed
  • Calls per second
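On the client side, elapsed time, local CPU time and call rate can all be collected with standard timers. In the sketch below, `fake_remote_call` is a hypothetical stand-in for the remote procedure call (a short computation plus a sleep that represents network and server time). Remote CPU time is not captured here, since it would have to be measured on the server.

```python
import time


def fake_remote_call(payload_size):
    """Hypothetical stand-in for an RPC: local work plus a simulated wait."""
    checksum = sum(range(payload_size))  # local CPU work
    time.sleep(0.001)                    # stands in for network + remote time
    return checksum


def measure_calls(n_calls, payload_size):
    """Return elapsed time, local CPU time and call rate for n_calls calls."""
    wall_start = time.perf_counter()   # wall-clock timer
    cpu_start = time.process_time()    # CPU time of this process only
    for _ in range(n_calls):
        fake_remote_call(payload_size)
    elapsed = time.perf_counter() - wall_start
    local_cpu = time.process_time() - cpu_start
    return {
        "elapsed_s": elapsed,
        "local_cpu_s": local_cpu,
        "calls_per_s": n_calls / elapsed,
    }


result = measure_calls(50, 1000)
for metric, value in result.items():
    print(f"{metric:<12}{value:.4f}")
```

Because the client is idle while it waits, elapsed time is typically much larger than local CPU time, which is why the two are listed as separate metrics.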
4. List system and workload parameters
• System parameters
  • Speed of the network
  • Speed of the local CPU
  • Speed of the remote CPU
  • Operating system overhead
• Workload parameters
  • Time between successive calls
  • Number and sizes of the call parameters
5. Select factors and their values
Factors are the parameters to be varied; their values are called levels.
For example:
• Factor: speed of the network; 2 levels: short distance (on campus), long distance (across the country)
• Factor: sizes of the call parameters; 2 levels: small, large
• Factor: number of consecutive calls; 11 levels: 1, 2, 4, 8, …, 1024
6. Select evaluation technique
Three techniques:
• Analytical modeling
• Simulation
• Measuring the real system
7. Select the workload
Depending on the evaluation technique, the workload may be expressed in different forms.
• Analytical modeling
  • Probability of various requests
• Simulation
  • A trace of requests measured on a real system
• Measurement
  • User programs
8. Design the experiment
• In the example: 2 × 2 × 11 = 44 experiments
• Phase 1
  • Number of factors is large but number of levels is small
• Phase 2
  • Reduce the number of factors and increase the number of levels
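The count of 44 comes from a full factorial design, in which every level of every factor is combined with every level of the others. A minimal sketch that enumerates these combinations for the remote procedure call example (the dictionary keys are illustrative names for the three factors):

```python
from itertools import product

# Factors and levels from the remote procedure call example.
factors = {
    "network": ["short distance", "long distance"],
    "param_size": ["small", "large"],
    "consecutive_calls": [2 ** k for k in range(11)],  # 1, 2, 4, ..., 1024
}

# Full factorial design: one experiment per combination of levels.
experiments = [
    dict(zip(factors, levels)) for levels in product(*factors.values())
]

print(len(experiments))
print(experiments[0])
```

The number of experiments is the product of the level counts, so it grows quickly as factors are added. That is the motivation for starting with many factors at few levels and refining later.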
9. Analyze the data
• Analysis of variance
• Regression
• etc.
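As an illustration of the regression step, the sketch below fits a straight line, elapsed time = intercept + slope × number of calls, by ordinary least squares. The measurements are synthetic values invented for this example, not results from a real system, and `fit_line` is an illustrative helper.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic, made-up measurements: consecutive calls vs elapsed time (ms).
calls = [1, 2, 4, 8, 16, 32]
elapsed_ms = [5.2, 6.1, 8.3, 12.0, 20.4, 36.1]

slope, intercept = fit_line(calls, elapsed_ms)
print(f"per-call cost: {slope:.2f} ms, fixed overhead: {intercept:.2f} ms")
```

In such a model the slope can be read as the cost of one additional call and the intercept as the fixed overhead of a batch.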
10. Present the results
• Present the data in graphical form rather than as raw statistical results