Keynote at the MP Workshop Bristol 5sep11
Abstract ... Why does it take millions of transistors to realise the broadcast radio receiver done in five in the 70's? Why are there millions of lines of software in products that are not programmable? And why do we throw things away before they break? The last 40 years saw the exponential growth of silicon capacity and the technology-led markets that ensued. Complexity was never an obstacle whilst design cost was 2nd order, but now that limits are upon us it is also apparent that people buy products, not technology ... so we strive to deliver elevated expectation despite Diminishing Returns.
My intent is to give context to subsequent Multicore discussions in and beyond this event. I will do this by looking at 'Efficiency' in a context of increasingly Diminishing Returns. This will lead quite naturally to Multicore (CMP); its rationale and its role. But it will also raise questions about the way(s) forward ... as technology moves out of the market driving seat.
1v0
Progress: Despite the Law of Diminishing Returns
OR Whatever happened to the 6 transistor radio?
Prof. Ian Phillips
Principal Staff Eng’r, ARM Ltd
ian.phillips@arm.com
Visiting Prof. at ...
Contribution to Industry Award 2008
Multi-Processor Workshop 5sep11
Things Haven’t Always Been Like This ...!
A Moment of Retrospection
I learned all I know in the past!
... I just have more of it than most people!
I’ll take you back 36yrs to 1975 ... New: Degree, Home, Job, Wife, Baby, Car, Phone ...
Mk1: Ian Phillips, C1975
Mk1: Human baby, C1975 (Aka Dylan-Paul Phillips)
Mark’N’: Mud-Hut, C1975 (Aka: 5 Manor Close)
... More-or-less the same today!
1975: Technology Products
Vauxhall Viva HB SL90
GPO Type 706 Telephone
... Recognisably the same, but very different today.
1975: Semiconductor Electronics
Domestically we had...
Portable Radio, Pocket Calculator, Hi-Fi (Partial), Colour TV (Partial)
... That’s about all!
Professionally we had...
Computers & First PCs, Radio Receivers, Satellites, Under-Sea Cables, Transmitters (Partial), TV Cameras (Partial), Telephony (Partial)
[Pictured: TI SR 51 Calculator 1975; BeoVision 3500 c1975; Stuart 5 Transistor Radio 1975; IBM 220PX c1975]
2010: Electronics Everywhere but Nowhere
Confusing Technology with Products ...
The Computer ... Or Is It?
Computer: A Machine for Computing ...
Computing ... A general term for algebraic (mathematical) manipulation of data ...
Numerated Phenomena IN (x) -> y=F(x,t,s) -> Processed Data/Information OUT (y)
... State and Time are factors in this. It can include phenomena ranging from human thinking to calculations with a narrower meaning. (Wikipedia)
Usually used to exercise analogies (models) of real-world situations; frequently in real-time.
... No mention of Implementation Technology in this!
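The relation y=F(x,t,s) can be illustrated with a minimal Python sketch, in which each output depends on the input x, the elapsed time t and the carried state s. The function name `smooth` and the choice of a first-order low-pass filter are illustrative assumptions, not something taken from the slide.

```python
import math

def smooth(x, dt, state, time_constant=1.0):
    """One step of y = F(x, t, s), here a first-order low-pass filter.

    x     : current input sample (the numerated phenomenon)
    dt    : time elapsed since the previous sample
    state : previous output (the carried state s)
    """
    alpha = 1.0 - math.exp(-dt / time_constant)  # weight given to the new input
    return state + alpha * (x - state)           # y, which is also the next state

state = 0.0
for _ in range(5):
    state = smooth(1.0, dt=1.0, state=state)
print(round(state, 4))
```

Nothing in the sketch says whether the filter is built from gears, valves or software, which is the point of the slide.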
Planet Motion Computer – Orrery c1700 (Mechanical Technology)
• Inventor: George Graham (1674-1751)
• Single-Task, Continuous Time, Analogue Mechanical Computing (With backlash!)
Babbage's Difference Engine 1837 (Mechanical Technology; (Re)construction c2000)
The difference engine consists of a number of columns, numbered from 1 to N. Each column is able to store one decimal number. The only operation the engine can do is add the value of a column n + 1 to column n to produce the new value of n. Column N can only store a constant, column 1 displays (and possibly prints) the value of the calculation on the current iteration.
Computer for Calculating Tables: A Basic ALU Engine
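The column-addition rule described above can be sketched in a few lines of Python. This is only an illustration of the method of finite differences; the function name `difference_engine` is hypothetical, and columns are indexed from 0 rather than from 1 as on the slide.

```python
def difference_engine(columns, steps):
    """Tabulate a polynomial using repeated addition only.

    columns[0] is the current value, columns[i] the i-th finite difference;
    the last column holds the constant difference and never changes.
    """
    cols = list(columns)
    results = []
    for _ in range(steps):
        results.append(cols[0])
        # Add column n+1 into column n, lowest column first.
        for n in range(len(cols) - 1):
            cols[n] += cols[n + 1]
    return results

# f(x) = x^2 at x = 0, 1, 2, ...: value 0, first difference 1, second difference 2
print(difference_engine([0, 1, 2], 6))
```

With the starting columns for x^2 the loop produces the squares 0, 1, 4, 9, 16, 25 without ever multiplying.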
Enigma ~1940 (Mechanical Technology)
Data Encryption/Decryption Computer
Colossus Computer 1944 (Valve/Mechanical Technology)
Code-Breaking Computer: A Data Processor
Digital Computer – Baby 1947 (Reconstruction) (Valve/Software Technology)
General Purpose, Quantised Time and Data, (Digital) Electronic Computing
Analogue Computer – AKAT c1960 (Transistor Technology)
General Purpose, Continuous Time, Approximate (Analogue) Electronic Computing
Evolution of Radio
Crystal Set: 1 Diode; c1925
BTH Tele-Verta Radio: 4 Valves, 1 Rectifier Valve; c1945
Bush Radio: 7 Transistors, 1 Diode; c1960
Evoke DAB Radio: 100 M Transistors, 2-3 Embedded Processors; c2005
Ian’s ‘Span’
Radio as Computation ...
Valve Technology, Transistor Technology, Integrated Circuit Technology
Vrf = Vi*100
Vlo = Cos(t*10^6)
Vif = Vrf*Vlo
Vro = 'Bandpass'(Vif*1000)
Single-Task, Continuous Time, Approximate (Analogue) Electronic Computing
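The four radio equations can be exercised numerically. The following is a sketch under stated assumptions: the sample rate, the station and local-oscillator frequencies and the input amplitude are invented for illustration, and the 'Bandpass' stage is stood in for by a single-bin DFT that measures only the difference frequency.

```python
import math

def tone_amplitude(samples, freq_hz, sample_rate_hz):
    """Amplitude of the component at freq_hz (single-bin DFT standing in for 'Bandpass')."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * freq_hz * k / sample_rate_hz) for k, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq_hz * k / sample_rate_hz) for k, s in enumerate(samples))
    return 2 * math.hypot(re, im) / n

FS = 1_000_000   # sample rate in Hz (assumed)
F_RF = 200_000   # wanted station in Hz (assumed)
F_LO = 150_000   # local oscillator in Hz (assumed)
N = 1000         # 1 ms of samples

vi = [0.001 * math.cos(2 * math.pi * F_RF * k / FS) for k in range(N)]   # antenna signal
vrf = [v * 100 for v in vi]                                              # Vrf = Vi*100
vlo = [math.cos(2 * math.pi * F_LO * k / FS) for k in range(N)]          # Vlo = Cos(...)
vif = [a * b for a, b in zip(vrf, vlo)]                                  # Vif = Vrf*Vlo
if_amplitude = tone_amplitude([v * 1000 for v in vif], F_RF - F_LO, FS)  # Vro stage
print(round(if_amplitude, 6))
```

Multiplying the two cosines yields sum and difference frequencies of equal size, so half of the amplified signal lands at the 50 kHz intermediate frequency that the last stage selects.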
Products Make Business
21c Businesses have to be
Operations and Competition is Global and so are Investors
Nationality has little meaning
Business needs
End-Customers buy Products, not Technology
Technologies enable Product Options
Business-Models make Money
New Products are
Design is a Cost/Risk to be Minimised
New Technology increases Cost/Risk ...
... But does not always increase Value
HW, SW, Mechanics, Optics, etc are (just) means to an end!
... New Technology ≠ Market Success (Any More)
High Performance Computing (HPC)
The exponential progression of Moore's Law has enabled the fantastic computation power we now take for granted ...
A Few Tens of headline-grabbing
A Few Hundred Million visible
... BUT ...
Tens of Billions of invisible, ubiquitous
... Gives impression that General Purpose Digital Computation is what it is all about.
Products are Solutions; Not The Other-Way Round ...
Embedded Computing Is ...
Entertainment, Remote Control, Security, Televisions, ID Cards, Memory, Logistics, Transport, Banking, Manufacturing, Energy, Communications, Medical
... etc, etc, etc.
High Performance (Embedded) Computing
Obvious:
Aesthetics, Performance, Brand & Image
Finance schemes, Dealership, Warranty, etc
Less-Obvious:
Business Model
Infrastructure ... Manufacture and Distribution; Road network, Fuel supply, Tyres; Service network & Training; Sales and Marketing, etc
Technologies: Internal Combustion Engine ... Bearings, Casting, Metal forming, Paint, Aerodynamics, Glass, Rubber, Suspension ... Manufacturing, Reliability, Quality ... Electronic Systems
The Evolution of Customer(kind)
Universe – 13.6Byr; Earth – 4.5Byr
(Us!) – appeared 35,000 yr ago. ‘Developed’ from Homo-Sapien (Wise Human) 100,000 yr ago. Primary Objective: Survive Nature (1,000 generations)
- appeared ~2,000 yr ago. Pythagoras, Socrates, Plato, Aristotle, Archimedes, ... Objective: Understand Nature (100 gen.)
- appeared ~1,000 yrs ago. Galileo, Descartes, (1000 ad); Electricity - William Gilbert (1600ad). Objective: Manipulate Nature (50 gen.)
- just 260 yrs ago. Industrial Revolution (1750: 8 gen’n). Year 0: Science Meets Exploitation. Objective: Exploit Nature (10 gen.)
[Pictured: Pythagoras, Galileo, Brunel]
... Remember: Real (Cro-Magnon) Customers Don't Buy Technology!
The iConic Must-Have Product ...
... Cool Design(California) & Manufacture(China) !
... Most Design Never Noticed or Valued
The Threshold of Magic
Everybody has a threshold, beyond which Functionality is Indistinguishable From Magic (1)!
Chemical Systems, Biological Systems, Economic Systems, Electronic Systems
The Incandescent Light: is the threshold for most non-scientific, but well educated people!
... It's not a crime to Not Understand Technology!
... The crime is not realising people don’t, when you are the one who suffers as a result!
1: Clarke: Any sufficiently advanced technology is indistinguishable from magic.
All Technologies Are Important ...
Exciting Technology ... At the Module
Inside the Case
iPhone 4's vibrator motor. Rear-facing 5 MP camera with 720p video at 30 FPS, tap to focus feature, and LED flash.
Source ... http://www.ifixit.com
Exciting Technology ... At the Module
Inside the Case
The Control Board.
Source ... http://www.ifixit.com
Exciting Technology ... Inside the Module
Inside The Control Board (a-side)
Visible Design-Team Members ...
A4 Processor, specified by Apple, designed and manufactured by Samsung ... The central unit that provides the iPhone 4 with its GP computing power. Inc. ARM A8 600 MHz CPU (also other ARM CPUs and IP?)
ST-Micro (3 axis gyroscope) - (ARM Partner)
Broadcom (Wi-Fi, Bluetooth, and GPS) - (ARM Partner)
Skyworks (GSM)
Triquint (GSM PA)
Infineon (GSM Transceiver) - (ARM Partner)
[Image labels: GPS; Bluetooth, EDR & FM]
Source ... http://www.ifixit.com
Exciting Technology ... Inside the Module
Inside The Control Board (b-side)
Visible Design-Team Members ...
Samsung (flash memory) - (ARM Partner)
Cirrus Logic (audio codec) - (ARM Partner)
AKM (Magnetic Sensor)
Texas Instruments (Touch Screen Controller and mobile DDR) - (ARM Partner)
Invisible Design-Team Members ...
OS & Drivers, GSM Security; Graphics, Video and Sound ...
Manufacturing, Assembly, Test, Certification ...
Source ... http://www.ifixit.com
Exciting Technology ... Inside ‘The Chip’
The A4 SIP Package (Cross-section)
[Image labels: 2 Memory Dies; Memory ‘Package’; Processor SOC Die; Glue; 4-Layer Platform ‘Package’]
The processor is the centre rectangle. The silver circles beneath it are solder balls. Two rectangles above are RAM die, offset to make room for the wirebonds.
Putting the RAM close to the processor reduces latency, making RAM faster, and cuts power.
Unknown Mfr (Memory); Samsung/ARM (Processor); Unknown (SIP Technology)
Source ... http://www.ifixit.com
The Phone: Heterogeneous Computation ...
• About 20 Chips in a Smart-Phone
• Processing: Audio, Video, RF, Touch, Temperature, Orientation, G-Force, Magnetism, Power
• Core Functions: GSM, GPS, WiFi, 3/4G Net, BlueTooth
• Application Functions: Applets, Games, Mail, Diary, Address-book, etc.
... Multi-Processing before we open the ‘App’n Processor’!
... Partitioning: The difference between a good and bad Product!
Commodity HMP In Qual. Today ...
Pocket ‘Super-Computer’ ...
Block-Diagram for a typical 40nm Mobile Computing & Smart-Phone Platform Chip
10 Programmable Processors
4 x A9 Processors (2x2): ~10,000 MIP
4 x MALI 400 Fragment Proc: ~1Gp/s
1 x MALI 400 Vertex Processor
1 x MALI Video CoDec
Plus Dedicated Processors
Smart MMUs, Smart Interrupt Controllers, Smart DMA Engines, Smart QoS and Power Mgt, Smart Cache & Memory Repair
Plus ... Customer Additions/Peripherals!
... ~15 Proc., ~2GHz, 1-2W ... Strong Application Focus
30nm Target Architectures
About 50MTr
About 50KTr
... Delivering ~5x speed (Architecture + Process + Clock)
There’s Design; and there’s Technical Design ...
Design: A Part-Formalised Process ...
Partition and Refine until every Thread has identified an Established (Reuse) path to Physical Implementation
Then Construct and Verify ...
Known-Links from Model-to-Reality (Reuse)
[Figure: Hierarchy of Reuse, High to Low. Concept Phone / Actual Phone; AAA cell, TFT-LCD, Electronics, GaAs Front End; Baseband, Std Radio Chip; ARM CPU, Signal Processing, MALI MPU; HW Support, FP Engine; Gates/MC.Code]
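The "partition and refine until every thread has a reuse path" rule can be sketched as a recursive check. Everything in this example is hypothetical: the contents of `REUSE_LIBRARY` and `PARTITION` borrow block names from the figure purely for illustration and do not describe a real design flow.

```python
# Hypothetical reuse library: blocks with an established path to implementation.
REUSE_LIBRARY = {"ARM CPU", "Std Radio Chip", "TFT-LCD", "AAA cell"}

# Hypothetical partitioning of a concept into sub-functions.
PARTITION = {
    "Phone": ["Electronics", "TFT-LCD", "AAA cell"],
    "Electronics": ["Baseband", "Std Radio Chip"],
    "Baseband": ["ARM CPU", "Signal Processing"],
}

def unresolved_threads(block):
    """Return the blocks that have neither a reuse path nor a further partition."""
    if block in REUSE_LIBRARY:
        return []
    children = PARTITION.get(block)
    if children is None:
        return [block]  # needs further refinement or new design work
    missing = []
    for child in children:
        missing.extend(unresolved_threads(child))
    return missing

print(unresolved_threads("Phone"))
```

In this toy partitioning only "Signal Processing" is left without a reuse path, so that is where refinement (or new design) would continue before construction starts.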
[Figure: Functional Analysis. Functions F1 to F5 are assigned as Threads to an Execution Platform: (F1) and (F3) on RTOS/Drivers, (F2) and (F5) on HW1 to HW4 behind a Hardware Interface, joined by Bus(es) and Processor(s)]
The Real-Time Execution of Models
It is about creating a Functional-Model and an Execution-Platform for it, to meet Functional and Non-Functional needs.
Design: A hierarchical mathematical process of Model Refinement and Verification; based on (Heuristic) Architectural decisions.
Implementation: Process of ‘bringing up’ and Validating the Functional-Model on the Physical Execution-Platform.
A good Solution is one that ...
1. Meets a valued human need
2. Is Manufacturable to support a Competitive Price/Biz Model
3. Works at least as well as your Competitors
4. Scores well on Aesthetic (Non-Functional) criteria
A bad Solution is one where the Technology Shows!
And so to Multi-Processors ...
The Argument for (C)MP
Potentially Much Better Power Efficiency than Large/Fast Uni-Processor
Power is a Major Problem ... On Die and In System.
But 10x-100x improvement required!
Potential to deliver Higher Performance than Uni-Processor
Can Amdahl’s Law be broken?
For GP applications difficult to see improvement after 3/4 processors
Potential to handle Redundancy Schemes
Many Processor ‘Tiles’ with NoC Connectivity
Potentially a small % malfunction can be ‘re-routed’
Potential to offer a Scalable, Standard Implementation
Reduces Chip Design/Masks/Qual’n/Production Cost by ~90%
Needs a new (tbd!) GP Software Methodology
Must work with Legacy (90% of designs are inherited).
... Can CMP actually deliver any (let alone all) of this?
Amdahl’s Law is Alive and Well
Speedup on parallel processors is limited by the sequential portion of the program.
Sequential portion need not be large to significantly constrain speedup!
[Figure: Maximum speedup (0 to 32) against Number of cores (0 to 32), with curves for 100%, 95%, 90%, 75% and 50% parallel]
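The curves in the plot follow from the standard form of Amdahl's Law, speedup = 1 / ((1 - p) + p / n), where p is the parallel fraction and n the number of cores. A minimal Python sketch (the function name `amdahl_speedup` is an illustrative choice) evaluates the right-hand edge of the chart:

```python
def amdahl_speedup(parallel_fraction, cores):
    """Maximum speedup when parallel_fraction of the work scales across cores."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

for p in (1.0, 0.95, 0.90, 0.75, 0.50):
    print(f"{p:.0%} parallel, 32 cores: {amdahl_speedup(p, 32):.2f}x")
```

Even with 95% of the work parallel, 32 cores deliver only about 12.5x, and at 50% parallel the speedup can never reach 2x however many cores are added.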
Many (C)MP Technologies ... even in the UK
Lots of history; no success ...
Transputer (Inmos 1978)
Highly Parallel Apps (Graphics)
Pixelfusion – 1,536 processors/chip
Clearspeed – 192 full 64 bit arch/chip
Picochip – Aimed at 3G Pico-Cells
Spinnaker – 18 processors (Scal. to 10^6); 18,000 neurons (Scal. to 10^9)
XMOS – New version of Transputer
Occam
Handel-C – HW synthesis (See also SystemC)
OpenCL – Smartphones / Tablets (GPGPU)
... Success(?) requires an End-Product and Market of appropriate Scale.
OpenCL Enables Heterogeneity
Plug-in architecture: Mali-T604 Vithar GPU, ARM/NEON, Custom Device, Custom Device, Video Decoder
Discovery of computational units
Scheduling of work
... It does not automatically solve “which computation where”
Architecture: A Viable Mix of Technology
YES: Power is a Major Concern ... Power-Efficiency the way of recovering it.
Away from.. ..towards.. ..wherever possible.
But so is Productivity, NRE Cost, TTM, Quality ...
Reuse: As much as possible: Mech, Elect, SW, Acoustic, RF, Stacks, OS, Displays, Keyboards, etc.
Teams: Use people who know how to do the work (duh!)
Use External Expertise: It is seldom a differentiating factor in your Product.
Producible: Make something that can be economically made (duh!)
Performance: Competitive; don’t push the bounds of possibility.
New Technology: As little as possible.
And so are Aesthetics ...
Colour, Style, Package, Availability, Quality, Business Model, etc ...
... Remember: The Product is the way to deliver a Compelling End-Customer Experience.
Conclusions
Multi-Processing makes sense in lots of Products today ...
It will seldom be entire solutions (ie: Small markets)
It will seldom be homogeneous
If the Work-Load and Programming Models are good.
Physical Concurrency makes sense in lots of Products ...
Mechanical, CPU/GPU, Optical, RF, MEM, SAW, etc
Simplifies Productivity, Design, Qualification, Quality and Reuse
Few Products (None) have the luxury of a Clean-Sheet Design ...
Legacy is unavoidable
CPU is not the answer for everything ...
‘Software’ is amongst the least energy efficient technology
DSP, Video HW and GPU can be (much) better
But Analogue and Mechanical are best
... Products can be Enabled or Disabled by MP Technology
The END ... Thanks for Listening
Reading & References
The Innovator's Dilemma: When New Technologies Cause Great Firms to Fail (Disruptive Tech.), by Clayton M. Christensen: HBS Press, 1997
Open Innovation: The New Imperative for Creating and Profiting from Technology (Research in 21C), by Henry William Chesbrough: HBS Press, 2003
The World Is Flat (Globalisation), by Thomas L. Friedman: Penguin, 2005
Staying Power (Business), by Michael Cusumano: Oxford, 2010
A Short History of Nearly Everything (A different view on what we know), by Bill Bryson: Black Swan, 2003
The Voyage of the Beagle (Scientific Observation), by Charles Darwin, 1860
An Essay on the Principle of Population (Natural Competition), by Thomas Malthus, 1798