Cell Broadband Engine
Spencer Dennis, Nicholas Barlow
The Cell Processor
◦ Objective: “[to bring] supercomputer power to everyday life”
◦ Bridge the gap between conventional CPUs and high-performance GPUs
History
Original patent application in 2002
Generations
◦ 90 nm - 2005
◦ 65 nm - 2007 (PowerXCell 8i)
◦ 45 nm - 2009
Design
Cost $400 million to develop
Team of 400 engineers
STI Design Center
◦ Sony
◦ Toshiba
◦ IBM
PS3
Employed as CPU
◦ Clocked at 3.2 GHz
◦ Theoretical maximum performance of 230.4 GFLOPS (single precision)
Utilized alongside NVIDIA RSX 'Reality Synthesizer' GPU
◦ Complemented graphical performance
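A peak-throughput figure for a chip like this is usually a back-of-envelope product of clock rate, vector width, and unit count. The sketch below shows one such estimate. The lane count, the "one fused multiply-add per lane per cycle" rate, and counting the PPE's vector unit at the same rate as an SPE are assumptions for illustration, not figures taken from the slides.

```python
# Back-of-envelope single-precision peak for a Cell-style chip.
# Assumptions (not stated on the slides): 4 lanes per 128-bit vector,
# one fused multiply-add (2 FLOPs) per lane per cycle, and the PPE's
# vector unit counted at the same rate as one SPE.
CLOCK_GHZ = 3.2                   # clock rate quoted for the PS3
LANES = 128 // 32                 # 32-bit floats per 128-bit vector
FLOPS_PER_LANE_PER_CYCLE = 2      # multiply + add
NUM_SPES = 8
PPE_VECTOR_UNITS = 1


def peak_gflops(units: int, clock_ghz: float = CLOCK_GHZ) -> float:
    """Peak GFLOPS for `units` vector units running at `clock_ghz`."""
    return units * LANES * FLOPS_PER_LANE_PER_CYCLE * clock_ghz


per_unit = peak_gflops(1)
total = peak_gflops(NUM_SPES + PPE_VECTOR_UNITS)
print(f"{per_unit:.1f} GFLOPS per unit, {total:.1f} GFLOPS total")
```

Real workloads land well below such a number, since it ignores memory traffic and assumes every cycle issues a full-width multiply-add.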
Architecture Overview
◦ 8 Synergistic Processing Elements (SPE)
◦ Single Dual Issue Power Processing Element (PPE)
◦ Memory IO Controller (MIC)
◦ Element Interconnect Bus (EIB)
◦ Bus Interface Controller (BIC)
SPU/SPE - Synergistic Processing Unit/Element
SXU - Synergistic Execution Unit
LS - Local Store
SMF - Synergistic Memory Flow Controller
EIB - Element Interconnect Bus
PPE - Power Processing Element
MIC - Memory IO Controller
BIC - Bus Interface Controller
Synergistic Processing Element (SPE)
128-bit dual-issue SIMD dataflow
◦ “Single Instruction Multiple Data”
◦ Optimized for data-level parallelism
◦ Designed for vectorized floating point calculations
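To make "Single Instruction Multiple Data" concrete, the sketch below emulates a 128-bit register as 16 packed bytes holding four single-precision floats, and applies one multiply-add "instruction" to all four lanes at once. The function names `pack_vec` and `simd_madd` are hypothetical and chosen for illustration; they are not SPU instructions or part of any Cell toolchain.

```python
import struct

LANES = 4  # 128 bits / 32-bit single-precision floats


def pack_vec(values):
    """Pack four floats into one 16-byte (128-bit) register image."""
    if len(values) != LANES:
        raise ValueError("a 128-bit vector holds exactly four 32-bit floats")
    return struct.pack("<4f", *values)


def simd_madd(a: bytes, b: bytes, c: bytes) -> bytes:
    """One emulated 'instruction': lane-wise a * b + c over all four lanes."""
    va, vb, vc = (struct.unpack("<4f", reg) for reg in (a, b, c))
    return struct.pack("<4f", *(x * y + z for x, y, z in zip(va, vb, vc)))


a = pack_vec([1.0, 2.0, 3.0, 4.0])
b = pack_vec([0.5, 0.5, 0.5, 0.5])
c = pack_vec([1.0, 1.0, 1.0, 1.0])
result = struct.unpack("<4f", simd_madd(a, b, c))
print(result)  # (1.5, 2.0, 2.5, 3.0)
```

A scalar loop would need four separate multiply-adds for the same result, which is why data-level parallel code maps well onto this kind of unit.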
SPE Continued
◦ Workhorses of the Processor
◦ Handle most of the computational workload
◦ Each contains its own instruction + data memory
◦ “Local Store”
▫ Embedded SRAM
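Because each SPE works out of its own small local memory, a large data set has to be streamed through it in pieces. The sketch below illustrates that pattern: copy a tile in, compute on it locally, copy it back out. The 256 KB capacity, the half-for-data budget, and the `dma_get` / `dma_put` helpers are assumptions made for illustration; they are stand-ins, not a real Cell SDK API.

```python
LOCAL_STORE_BYTES = 256 * 1024   # assumed capacity, not stated on the slide
ITEM_BYTES = 4                   # one single-precision float
# Hypothetical budget: half the store for data, the rest for code and stack.
TILE_ITEMS = (LOCAL_STORE_BYTES // 2) // ITEM_BYTES


def dma_get(main_memory, start, count):
    """Stand-in for a DMA transfer from main memory into the local store."""
    return main_memory[start:start + count]


def dma_put(main_memory, start, tile):
    """Stand-in for a DMA transfer from the local store back to main memory."""
    main_memory[start:start + len(tile)] = tile


def scale_in_tiles(main_memory, factor, tile_items=TILE_ITEMS):
    """Process a large array through a small, explicitly managed buffer."""
    transfers = 0
    for start in range(0, len(main_memory), tile_items):
        tile = dma_get(main_memory, start, tile_items)   # copy in
        tile = [x * factor for x in tile]                # compute locally
        dma_put(main_memory, start, tile)                # copy out
        transfers += 1
    return transfers


data = [float(i) for i in range(100_000)]
tiles = scale_in_tiles(data, 2.0)
print(tiles, data[:3], data[-1])
```

On real hardware the copy-in of the next tile would typically overlap with computation on the current one (double buffering); this sketch keeps the steps sequential for clarity.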
Power Processing Element (PPE)
Responsible for governing SPEs
◦ “Extensions” of the PPE
Shares main memory with the SPEs
◦ Can initiate accesses for SPE cores
Power Architecture
◦ Implements Power Architecture Hypervisor
▫ Can run multiple operating systems concurrently
Memory (1st generation)
◦ 32 KB split L1 instruction & data cache
◦ Unified 512 KB L2 cache
Element Interconnect Bus
High bandwidth internal bus
1st generation: 96 bytes/cycle
4 16-byte rings
◦ Each ring can handle up to 3 simultaneous data transfers
12 on and off ramps
◦ Each SPE + PPE
◦ Memory controller
◦ 2 off-chip I/O interfaces
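The bus figures above fit together arithmetically, as the sketch below works through. The ring count, ring width, per-ring transfer limit, and ramp list come from the slide. The assumption that the bus runs at half the processor clock is not on the slide and is what links 12 concurrent 16-byte transfers to 96 bytes per processor cycle; the 3.2 GHz clock is the PS3 figure.

```python
RINGS = 4
RING_WIDTH_BYTES = 16
TRANSFERS_PER_RING = 3
# Ramp count from the slide: one per SPE and PPE, plus memory controller and I/O.
RAMPS = {"SPE": 8, "PPE": 1, "memory controller": 1, "off-chip I/O": 2}
# Assumption (not on the slide): the bus runs at half the processor clock.
CORE_CYCLES_PER_BUS_CYCLE = 2
CLOCK_GHZ = 3.2  # PS3 clock rate

total_ramps = sum(RAMPS.values())
max_transfers = RINGS * TRANSFERS_PER_RING
bytes_per_core_cycle = max_transfers * RING_WIDTH_BYTES // CORE_CYCLES_PER_BUS_CYCLE
peak_gb_per_s = bytes_per_core_cycle * CLOCK_GHZ

print(f"{total_ramps} ramps, {max_transfers} concurrent transfers")
print(f"{bytes_per_core_cycle} bytes/cycle, about {peak_gb_per_s:.1f} GB/s peak")
```

Like any peak figure, this assumes every ring is fully occupied with non-conflicting transfers on every bus cycle.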
Memory Flow Controller
Asynchronous Memory Controller
Retrieves data from main memory to the SPEs’ local stores and the PPE’s cache.
Supports two Rambus XDR memory banks
Bus Interface Controller
Provides asynchronous interface between EIB and IO interfaces
Two flexible IO interfaces to rest of system
◦ One interface can be reconfigured to provide a Symmetric Multiprocessing (SMP) interface
Contains pervasive unit
◦ Provides test, debug and monitoring functionality
▫ Chip-level error checking
◦ Provides clock generation & distribution control
◦ Power on Reset Unit (POR)
▫ Responsible for unit initialization
◦ Performance monitoring
Power Management Unit (PMU)
◦ Allows software-controlled power reduction
Thermal Management Unit (TMU)
Developing for Cell
Octopiler
◦ Takes high-level sequential code and parallelizes it to optimize it for a multiprocessor system
▫ High-level languages
◦ Divides code nine ways
▫ 8 sets of instructions are written for the SPEs
▫ The final set is written for the PowerPC PPE
GCC
◦ IBM-sourced plugins for Cell PPU/SPU development
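The nine-way split can be pictured as one control routine plus eight copies of a data-parallel kernel. The sketch below does that split by hand for a simple sum-of-squares loop, using threads as stand-ins for SPEs. It illustrates the shape of the result only; it is not how Octopiler analyzes or transforms code, and the names `ppe_main`, `spe_kernel`, and `split_evenly` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_SPES = 8


def split_evenly(n_items, parts):
    """Return (start, stop) ranges covering n_items in `parts` contiguous chunks."""
    base, extra = divmod(n_items, parts)
    ranges, start = [], 0
    for i in range(parts):
        stop = start + base + (1 if i < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def spe_kernel(data, start, stop):
    """Data-parallel part: what each of the eight workers would run."""
    return sum(x * x for x in data[start:stop])


def ppe_main(data):
    """Control part: partition the loop, dispatch it, combine the results."""
    ranges = split_evenly(len(data), NUM_SPES)
    with ThreadPoolExecutor(max_workers=NUM_SPES) as pool:
        partials = list(pool.map(lambda r: spe_kernel(data, *r), ranges))
    return sum(partials)


data = list(range(1000))
total = ppe_main(data)
print(total == sum(x * x for x in data))  # True
```

The original sequential loop is a single `sum(x * x for x in data)`; the value of a parallelizing compiler is that the programmer writes only that line.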
SPU ISA
SPU ISA (cont’d)
Applications (In Depth)
Console Gaming
◦ PS3
▫ PPE controls 6 SPEs, delegating tasks to them
▫ 1 SPE is OS reserved, 1 SPE is redundant
Supercomputing
◦ IBM BladeCenter QS Series
▫ Easy scalability
Password cracking
◦ High parallelism allows for high floating point brute force performance
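Brute-force search parallelizes well because the candidate space can be cut into independent slices, one per worker. The sketch below splits a toy keyspace (three lowercase letters) across six workers, mirroring the six SPEs available on the PS3. The use of SHA-256, the keyspace, and the function names are assumptions chosen for illustration; threads stand in for SPEs and this is not Cell-specific code.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from string import ascii_lowercase

ALPHABET = ascii_lowercase
LENGTH = 3    # toy keyspace: 26**3 candidates
WORKERS = 6   # mirrors the six SPEs available to PS3 applications


def candidate(index):
    """Map an integer in [0, 26**LENGTH) to a fixed-length lowercase string."""
    chars = []
    for _ in range(LENGTH):
        index, digit = divmod(index, len(ALPHABET))
        chars.append(ALPHABET[digit])
    return "".join(reversed(chars))


def search(target_hex, start, stop):
    """Check one slice of the keyspace; return the match or None."""
    for i in range(start, stop):
        word = candidate(i)
        if hashlib.sha256(word.encode()).hexdigest() == target_hex:
            return word
    return None


def crack(target_hex, workers=WORKERS):
    """Split the keyspace into one slice per worker and search them in parallel."""
    space = len(ALPHABET) ** LENGTH
    step = -(-space // workers)  # ceiling division
    slices = [(s, min(s + step, space)) for s in range(0, space, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for found in pool.map(lambda r: search(target_hex, *r), slices):
            if found is not None:
                return found
    return None


target = hashlib.sha256(b"spe").hexdigest()
print(crack(target))  # spe
```

No slice depends on another, so adding workers shortens the search roughly in proportion until the keyspace is exhausted.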
Conclusion
Discontinued in 2009
◦ Difficult development environment
▫ Programmer-managed SPE memory
▫ Explicit parallelism
▫ Two separate ISAs
Idea still lives on…
◦ General Purpose GPU
▫ Intel Larrabee Architecture
▫ Intel Many Integrated Core Architecture
▫ AMD FireStream
▫ Nvidia Tesla