denis-banks
Final Class, ECE472
• Midterm #2 due today
  – 1-5% extra credit for written report of Dally’s video
• Oral presentation of class project: today
• Graduate students: SDK example of running CUDA
• Undergraduate: extra credit (up to 1 whole class grade) for attempting CUDA on something
• Five points on FINAL-PROJECT for turning in the class evaluation
• Synthesized CPU: all of you will ‘see’ your final design in *real* silicon
  – 1 Million gates
What you learned in 10 weeks
• What is computer architecture
  – What technology/system constraints affect future computer design
    • Software; memory; hazards; computation; locality
    • Tradeoffs in instruction sets (MIPS; ARM; Intel)
• How to estimate performance
  – Execution time
  – Power consumption (static / dynamic)
• How to improve performance
  – Increase clock frequency
  – Increase pipelining
  – Increase parallelism
• How to improve power
  – Go parallel
  – Low VDD operation
  – Power/clock gating
• Caching
  – Idea of memory hierarchy
  – Memories designed for capacity; NOT SPEED
• Multi-core / parallelism
  – Parallelism is the way industry is moving
  – ILP (instruction-level parallelism) is limited
  – Hardware parallelism by itself is limited
  – Parallelism must be exposed explicitly to software to take advantage of it
    • Hardware can’t know what to parallelize
    • Hardware spends a lot of energy to parallelize
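The estimates listed above can be sketched with a few textbook formulas. This is a minimal illustration, not the course's own material: execution time as instruction count × CPI / clock frequency, dynamic power as α·C·VDD²·f, static power as VDD × leakage current, and Amdahl's law for the limit on parallel speedup. The function names and all numeric values in the usage lines are assumptions chosen for illustration.

```python
def execution_time(instruction_count, cpi, clock_hz):
    """CPU time in seconds: instructions x cycles-per-instruction / frequency."""
    return instruction_count * cpi / clock_hz


def dynamic_power(activity, capacitance_f, vdd, clock_hz):
    """Switching power in watts: alpha * C * VDD^2 * f."""
    return activity * capacitance_f * vdd ** 2 * clock_hz


def static_power(vdd, leakage_current_a):
    """Leakage power in watts: VDD * I_leak (drawn even when nothing switches)."""
    return vdd * leakage_current_a


def amdahl_speedup(parallel_fraction, n_cores):
    """Amdahl's law: the serial fraction bounds the achievable speedup."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cores)


if __name__ == "__main__":
    # Illustrative numbers only (assumed, not measured).
    print("time (s):", execution_time(1e9, 1.5, 2e9))

    # "Go parallel" + "low VDD": one core at 1.0 V / 2 GHz versus
    # two cores at 0.7 V / 1 GHz each (same total cycles per second).
    one_fast = dynamic_power(0.2, 1e-9, 1.0, 2e9)
    two_slow = 2 * dynamic_power(0.2, 1e-9, 0.7, 1e9)
    print("power ratio (two slow / one fast):", two_slow / one_fast)

    # Even with 90% parallel code, 10 cores give well under 10x.
    print("speedup:", amdahl_speedup(0.9, 10))
```

Because dynamic power scales with VDD², the two slower cores in this sketch use roughly half the switching power of the single fast core, but only if the software can actually keep both cores busy, which is where the Amdahl limit comes in.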
Practical
• Verilog – how to use Verilog to simulate logic design
  – Intel, Mentor, Synopsys, Tektronix, nVidia, FPGA synthesis (Xilinx; Altera)
• Test / Verification is DIFFICULT
  – 80% of the design costs for processors
• Parallel Programming – how to take sequential code and convert it into parallel code
  – CUDA is an example of this
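The sequential-to-parallel conversion can be sketched as follows. This is plain Python that only imitates the CUDA decomposition (a grid of blocks, each with a fixed number of threads, and a global index computed from the block and thread indices); it is not CUDA code and not the SDK example from the class. The SAXPY workload, the function names, and the block size are assumptions for illustration, and CPython threads will not give a real speedup here because of the global interpreter lock.

```python
from concurrent.futures import ThreadPoolExecutor


def saxpy_sequential(a, x, y):
    """Sequential reference: one loop walks every element in order."""
    out = [0.0] * len(x)
    for i in range(len(x)):
        out[i] = a * x[i] + y[i]
    return out


def saxpy_block(block_idx, block_dim, a, x, y, out):
    """Work of one simulated block: each simulated thread owns one element."""
    for thread_idx in range(block_dim):
        i = block_idx * block_dim + thread_idx  # global element index
        if i < len(x):  # guard: the last block may be only partly full
            out[i] = a * x[i] + y[i]


def saxpy_parallel(a, x, y, block_dim=4):
    """Same result as the loop above, split into independent blocks."""
    n = len(x)
    out = [0.0] * n
    num_blocks = (n + block_dim - 1) // block_dim  # ceiling division
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(saxpy_block, b, block_dim, a, x, y, out)
            for b in range(num_blocks)
        ]
        for f in futures:
            f.result()  # re-raise any exception from a worker
    return out


if __name__ == "__main__":
    x = [float(i) for i in range(10)]
    y = [1.0] * 10
    print(saxpy_parallel(2.0, x, y) == saxpy_sequential(2.0, x, y))
```

The conversion works because each loop iteration is independent: every block writes a disjoint range of `out`, so no synchronization is needed between blocks. Loops with dependencies between iterations need restructuring before they can be split this way.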
Final Words
• No Free Lunch
  – EE: making multi-cores is ‘simple’; making them useful is different
  – CS: software of the future will NOT get faster unless it is parallelized
• Both performance AND power are important
• Future embedded/high-end systems require an understanding of this EE/CS co-design
  – e.g. multi-core/GPUs within cellphones
  – Cloud computing (1000’s of nodes in a server farm)