Upload
lephuc
View
229
Download
4
Embed Size (px)
Citation preview
Power Struggles: Revisiting the RISC vs.CISC Debate on Contemporary ARM and x86Architectures
by Ernily Blern, laikrishnan Menon, and KarthikeyanSankaralingarn
Danilo Dominguez [email protected]
December 8, 2016
Iowa State University
1
Overview
• Presentation of the Paper• Background• Infrastructure• Methodology• Results• Conclusion
• Analysis of the Paper
2
Motivation
• Today’s computing landscape is starting to be shaped bysmartphones and tablets
• Previously, RISC architectures governed over smalldevices whereas CISC architectures ruled over serversand desktops
• Nowadays, both architectures are ”trespassing” each otherareas
3
Previous Work
• Previous studies compare performance of RISC vs. CISCISAs finding simple RISC instructions had betterperformance (Bhandarkar & Clark, 1991)
• Later studies conclude both ISAs have similar performanceif they apply aggressive ILP techniques (Isen, John, &John, 2009)
• Informal studies have suggest the power overheads ofCISC ISAs are intractable
4
This Study
• Revisit the RISC vs. CISC debate from a power andenergy perspective
• Previous comparisons focused on performance
• Show what is the role of the ISA and make a case forrevisiting the RISC vs. CISC question
5
RISC vs. CISC ISA
RISC CISC
Format Fixed length instructions Variable length instructions
Operations Single cycle operations Multi-cycle operations
Operands Few addressing modes Many addressing modes
* CISC instructions split into RISC-like micro-ops andmodern compilers pick mostly RISC-like instructions
6
Infrastructure
• Find implementations with similar microarchitecture andI/O and memory subsystems
• Use workloads for servers, desktops and mobile
• Linux 2.6 LTS kernel
• gcc 4.4 target independent with optimizations enabled (O3)
• perf tool used for performance measurements
• Watssup meter connected directed to the board for powermeasurements
• For instruction mix use DynamoRIO for x86 ISA and gem5for ARM ISA
7
Limitations
• Core: No platform uniformity across ISAs
• Tools: use gcc uniform optimizations. Specific architectureoptimizations less than 10% effect
11
Methodology: Performance Analysis
Mea-sure
Use perf tool to measureexecution time on benchmarks
Nor-malize
Normalize frequency’s im-pact using cycle count
CycleCounts
Present dynamic instructioncount measures to understand
differences in cycle count
Not ISAUse microarchitectural eventsto find performance expenses
sources different to ISA
12
Methodology: Power and Energy Analysis
Mea-sure
Present raw powermeasurements
Nor-malize
Factor out technology impactby scaling to 45nm and nor-malizing frequency to 1 GHz
Perfor-mance
Use raw energy to findinterplay between per-formance and power
ISAUse qualitative analysis tofind ISA power’s influence
13
Results: Cycle Count
• Performance gaps of up to1.5x from A8 to Atom
• Performance gaps of up to2.5x from A9 to i7
Cycle Count Normalized to i7
14
Results: Instruction Count
• Instruction count similaracross ISAs
• gcc tends to pick similarRISC-like instructions
• CPI was less in x86architectures: geometricmean of 3.4 for A8, 2.2 forA9, 2.1 for Atom and 0.7for i7
Instruction Count Normalized toi7
15
Results: Instruction Format and Mix
• Similar code densities inboth architectures
• Fraction of loads andstores similar across ISAfor all suites
• Large instruction counts forARM are due to absenceof FP instructions likefsincon, fy12xpl
• ISA effects areindistinguishable betweenx86 and ARMimplementations
Instruction Mix
SelectedInstruction Counts
16
Results: Microarchitecture
• Microarchitecture has the most impact on performance
• The large cache, accurate branch prediction and biggerissue width on OoO processors (A9 and i7) help x86 tohave better performance
• ISA has not considerable effect on performance
17
Results: Power and Energy
• With scaled frequency andtechnology, ISA isirrelevant
• i7 processors are designedto have high performancebut they are notpower-optimized
• Atom, on the other hand,have power consumptionon par with ARM A8 andA9
Tech. Independent Avg. PowerNormalized to A8
RawAverage Energy Normalized to
A8
18
Conclusions
• x86 CPI < ARM CPI
• ISA performance effects indistinguishable between x86and ARM
• Beyond micro-op translation, x86 ISA introduces nooverheads over ARM ISA
• Choice of performance optimizations impacts powerconsumption more than ISA
• ISA’s impact on energy is insignificant
19
Analysis of the Paper
• The paper was introduced as an analysis on the effect ofISAs on power and energy but end up presenting moredata on performance
• A more interesting study would be to find whatoptimizations for performance make i7 consume morepower
• Also, what factors make Atom to have performance on parof ARM A9 and be power-friendly
20
Bibliography i
References
Bhandarkar, D., & Clark, D. W. (1991). Performance fromarchitecture: comparing a risc and a cisc with similarhardware organization. In Acm sigarch computerarchitecture news (Vol. 19, pp. 310–319).
Isen, C., John, L. K., & John, E. (2009). A tale of twoprocessors: Revisiting the risc-cisc debate. In Specbenchmark workshop (pp. 57–76).
21