2008 IBM Corporation (IBM CONFIDENTIAL)
IBM Systems and Technology Group
Sales Conference
Cell/B.E. processor-based systems and software offerings: IBM BladeCenter QS22 and SDK 3.0
The challenge today
For many years, organizations have relied on performance gains from increasing clock speeds of traditional microprocessor architectures.
This approach has been challenged by the physical limitations of semiconductors and by traditional processor architecture implementations.
High performance computing (HPC) applications need a fundamentally new technology and approach to the system-level architecture to achieve the desired level of performance.
Cell Broadband Engine (Cell/B.E.) Technology
IBM, Sony, Toshiba Alliance formed in 2000
March, 2001 - STI Design Center opened in Austin, TX
April, 2004 - Single Cell BE operational
July, 2004 - 2-way SMP operational
February, 2005 - first technical disclosures at ISSCC
May, 2005 - first public demonstration of Cell/B.E. processor-based system at E3
August, 2005 - published technical details of Cell/B.E. architecture
November, 2005 - published open source SDK & Cell/B.E. simulator
August, 2006 - introduced the very first Cell/B.E. processor-based server to the market
For a higher level of absolute performance and efficiency
IBM commitment to innovation
2006: Create initial platforms for experimentation (BladeCenter QS20)
2007: Produce systems for early adoption and solution enablement (BladeCenter QS21)
2008: Produce robust, production-ready systems for targeted industry applications (IBM BladeCenter QS22)
IBM SDK for Multicore Acceleration 3.0
IBM BladeCenter QS22 (PowerXCell 8i processor): Extraordinary double precision floating point performance. Large memory capability. Ready for the most demanding production applications.
Cell Broadband Engine Architecture (CBEA) Technology Roadmap
[Roadmap chart, timeline 2006 to 2010]
Performance enhancements / scaling:
  IBM PowerXCell 8i (1+8 eDP SPE), 65nm SOI
  IBM PowerXCell 32ii, 45nm SOI
Cost reduction:
  Cell/B.E. (1+8), 90nm SOI
  Cell/B.E. (1+8), 65nm SOI
  Cell/B.E. (1+8), 45nm SOI
Compatible code and security base across entire line
Legend: Concept, Committed. Dashed outlines indicate concept designs.
All future dates and specifications are estimations only; subject to change without notice.
IBM PowerXCell 8i processor benefits
Sets a new performance standard
  Accelerates computationally intense workloads such as analytics, multimedia and vector processing
  Efficient computation per watt
Designed for flexibility
  Wide variety of application domains
  Cell can cover a wide range of application space with its capabilities in floating point operations, integer operations, data streaming / throughput support, and real-time support
  Exploits C/C++, Fortran programming models
Enhanced security capability
  Virtual trusted computing environment for security
The new PowerXCell 8i processor builds on the Cell Broadband Engine Architecture and combines a general-purpose Power Architecture core of modest performance with eight enhanced synergistic processing elements optimized for extreme double precision and single precision computational performance
PowerXCell 8i processor
  65 nm
  9 cores, 10 threads
  230.4 GFlops peak (SP) at 3.2GHz
  108.8 GFlops peak (DP) at 3.2GHz
  Up to 25 GB/s memory bandwidth
  Up to 75 GB/s I/O bandwidth
  92 Watts @ 3.2GHz
  Top frequency >4GHz (observed in lab)
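The two peak figures are consistent with a simple per-cycle operation count. The sketch below is an illustration only: the slide does not give this breakdown. It assumes each of the eight SPEs issues one fused multiply-add (2 flops) per cycle on a 4-wide single precision or 2-wide double precision vector, and that the PPE contributes a 4-wide single precision FMA and a scalar double precision FMA per cycle. The function and constant names are hypothetical.

```python
# Peak GFlops sketch for a 3.2 GHz part with 8 SPEs and 1 PPE.
# The per-unit lane counts passed in below are assumptions, not slide data.
CLOCK_GHZ = 3.2
SPE_COUNT = 8
FLOPS_PER_FMA = 2  # a fused multiply-add counts as two floating point ops


def peak_gflops(spe_lanes: int, ppe_lanes: int) -> float:
    """Return peak GFlops given SIMD lanes per SPE and per PPE."""
    spe_part = SPE_COUNT * spe_lanes * FLOPS_PER_FMA * CLOCK_GHZ
    ppe_part = ppe_lanes * FLOPS_PER_FMA * CLOCK_GHZ
    return spe_part + ppe_part


sp_peak = peak_gflops(spe_lanes=4, ppe_lanes=4)  # single precision
dp_peak = peak_gflops(spe_lanes=2, ppe_lanes=1)  # double precision

print(f"SP peak: {sp_peak:.1f} GFlops")
print(f"DP peak: {dp_peak:.1f} GFlops")
```

Under these assumptions the SPEs account for most of the total, which matches the slide's point that the SPEs, not the Power core, carry the floating point workload.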
Intel's x86 quad-core processors are Dual Chip Modules (DCMs): two of these processors stacked vertically and packaged together
PowerXCell 8i uses the space and power and delivers more than 2.3x the GFlops of a traditional architecture
On any traditional processor, the ratio of cores to cache, prediction, and related items illustrated here remains at ~50% of the chip area
Example Server Dual Core: 349 mm2, 3.4 GHz @ 150W, 2 cores, ~27.2 SP GFlops, 1.3b transistors @ 65nm
Example Desktop Quad Core: 214 mm2, 3 GHz @ 130W, 4 cores, ~96 SP GFlops, 820m transistors @ 45nm
PowerXCell 8i Nine Core: 109 mm2, 3.2 GHz @ 75W, 9 cores, ~230 SP GFlops, 250m transistors @ 65nm
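The comparison above can be checked with a few divisions. The sketch below uses only the figures quoted on this slide (taking the quad-core die area as 214 mm2) and derives the GFlops ratio plus two density metrics; the dictionary layout and names are illustrative assumptions.

```python
# Figures copied from the three example processors on the slide.
chips = {
    "Example Server Dual Core": {"area_mm2": 349, "watts": 150, "sp_gflops": 27.2},
    "Example Desktop Quad Core": {"area_mm2": 214, "watts": 130, "sp_gflops": 96},
    "PowerXCell 8i Nine Core": {"area_mm2": 109, "watts": 75, "sp_gflops": 230},
}


def efficiency(chip: dict) -> dict:
    """Derive GFlops per watt and per mm2 of die area."""
    return {
        "gflops_per_watt": chip["sp_gflops"] / chip["watts"],
        "gflops_per_mm2": chip["sp_gflops"] / chip["area_mm2"],
    }


cell = chips["PowerXCell 8i Nine Core"]
quad = chips["Example Desktop Quad Core"]
ratio_vs_quad = cell["sp_gflops"] / quad["sp_gflops"]

for name, chip in chips.items():
    eff = efficiency(chip)
    print(f"{name}: {eff['gflops_per_watt']:.2f} GFlops/W, "
          f"{eff['gflops_per_mm2']:.2f} GFlops/mm2")
print(f"PowerXCell 8i vs quad core: {ratio_vs_quad:.2f}x SP GFlops")
```

The ratio against the quad-core example comes out just under 2.4, in line with the "more than 2.3x" claim.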
BladeCenter QS22 PowerXCell 8i
Core Electronics
  Two 3.2GHz PowerXCell 8i Processors
  SP: 460 GFlops peak per blade
  DP: 217 GFlops peak per blade
  Up to 32GB DDR2 800MHz
  Standard blade form factor
  Support BladeCenter H chassis
Integrated features
  Dual 1Gb Ethernet (BCM5704)
  Serial/Console port, 4x USB on PCI
Optional
  Pair 1GB DDR2 VLP DIMMs as I/O buffer (2GB total) (46C0501)
  4x SDR InfiniBand adapter (32R1760)
  SAS expansion card (39Y9190)
  8GB Flash Drive (43W3934)
[Block diagram of the QS22 blade. Labeled components: two PowerXCell 8i processors, DDR2 memory, two IBM South Bridge chips, Rambus FlexIO, PCI-E x16, PCI-E x8, PCI-X, HSC*, PCI legacy connector, 4x USB 2.0 (USB to BC mid plane), 2x 1GbE (GbE to BC mid plane), 2 UART, SPI, flash drive, Flash/RTC/NVRAM, optional 2-port IB x4 HCA on HSDC (IB-4x to BC-H high speed fabric/mid plane)]
*The HSC interface is not enabled on the standard products. This interface can be enabled on custom system implementations for clients by working with the Cell services organization in IBM Industry Systems.
Performance highlights
Performance is an order of magnitude better than general purpose processors (GPP) for media and certain applications that can take advantage of its Single Instruction Multiple Data (SIMD) capability
Performance of its simple Power Processor Element (PPE) is comparable to that of a traditional GPP
Each Synergistic Processor Element (SPE) is able to perform mostly the same as a GPP running at the same frequency
Key performance advantage comes from its eight de-coupled SPE engines with dedicated resources including large register files and DMA channels
Accelerates targeted applications with extraordinary processing capabilities
  Floating-point operations
  Integer operations
  Data streaming / throughput support
  Real-time support
Open architecture allows for optimization at compiler and application level
  Performance gains from tuning compilers and applications can be significant
  Tools/simulators are provided to assist in performance optimization efforts
IBM BladeCenter QS22
QS22 is the RIGHT choice for intensive streaming and/or single and double precision floating point workloads
QS22 is OPEN: based on Power Architecture and running Linux OS
QS22 is EASY to deploy and to integrate into the existing IT infrastructure and/or workloads
  Coexists with and complements all other blade server offerings (Intel, AMD, POWER)
  Ready to scale out and deploy in production environments
QS22 is GREEN: more than 1.7 SP (or 0.8 DP) GFLOPS per watt
Premier blade for HPC workloads
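The deck does not state the blade power behind the GFLOPS-per-watt claim. As a rough sanity check, the sketch below back-solves the power budget implied by combining that claim with the per-blade peaks quoted earlier in the deck (460 SP and 217 DP GFlops). It assumes both efficiency figures refer to peak throughput of one blade, which the slide does not confirm; the variable names are illustrative.

```python
# Per-blade peak figures and efficiency claims quoted in the deck.
BLADE_SP_GFLOPS = 460
BLADE_DP_GFLOPS = 217
SP_GFLOPS_PER_WATT = 1.7  # stated as "more than"
DP_GFLOPS_PER_WATT = 0.8  # stated as "more than"

# "More than X GFLOPS per watt" implies power below peak / X.
sp_power_bound_w = BLADE_SP_GFLOPS / SP_GFLOPS_PER_WATT
dp_power_bound_w = BLADE_DP_GFLOPS / DP_GFLOPS_PER_WATT

print(f"Implied power bound from SP figure: {sp_power_bound_w:.0f} W")
print(f"Implied power bound from DP figure: {dp_power_bound_w:.0f} W")
```

Both figures point at roughly the same power envelope, which suggests the two efficiency numbers were derived from a single blade power value.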
IBM SDK for Multicore Acceleration and related tools
The IBM SDK is a complete tools package that simplifies programming for the Cell Broadband Engine Architecture
Libraries and frameworks
  Accelerated Library Framework (ALF)
  Data Communication and Synchronization (DaCS)
  Basic Linear Algebra Subroutines (BLAS)
  Standardized SIMD math libraries
GNU tool chain
Performance Tools
Eclipse-based IDE
Simulator
IBM XL C/C++ compiler*: Optimized compiler for use in creating Cell/B.E. optimized applications. Offers:
  improved performance
  automatic overlay support
  SPE code generation
*XLC compiler is a complementary product to SDK
Denotes software components included in the SDK for Multicore Acceleration
IBM SDK for Multicore Acceleration value
Designed to be highly reliable, simple to acquire and easy to use
  Complete, integrated kit
  Production-ready tools from IBM
  IBM warranty and support
Based on industry standards to ease the transition to the Cell/B.E.
  Eclipse-based Integrated Development Environment
  Standard, base libraries
  Third-party libraries can be plugged in
Designed to make it easy to port and optimize applications for the QS21 and QS22
  Enhancements to enable new features in QS22
  Performance tuning tools to help optimize algorithms without re-writing the entire application
  Tools designed to help you partition an application across a hybrid Cell/B.E. and x86 platform
Cell Programming Approaches are fully customizable!
1. Native Programming: Compilers, Intrinsics, DMA, etc.
2. Assisted Programming: Libraries, Frameworks
3. Case Tools / Complete Hardware Abstraction: User tool-driven
Toward 1: Increasing programmer control over Cell/B.E. resources
Toward 3: Decreasing programmer attention to architectural details
Workloads ideal for PowerXCell 8i and QS22
Market & Solution Specific Assets:
  Digital Media
  Financial Services Sector
  Home Media / Consumer Electronics
  Information Based Medicine
  Digital Video Surveillance
  Aerospace and Defense
  Electronic Design Automation
  Chemicals & Petroleum
Real-time Analytics, Processing of Data
Information Synthesis, Analysis
Unstructured Data, Multimodal Search, Data Transforms, Pattern Matching
Image/Video Creation/Mgt, Presentation of Data
Visualization, Imaging
Extreme Stream Computation and Bandwidth requirements
PowerXCell 8i is suited for applications which demand extraordinary floating point performance
Public sector HPC solutions
IBM components:
  IBM BladeCenter QS21 & QS22
  IBM SDK for Multicore Acceleration
  IBM Cell/B.E. math libraries
  IBM hybrid computing solution (custom offering)
  PXCAB
ISV applications:
  Development tools from RapidMind, Gedae, Wind River, etc.
  A growing number of university and government research labs with external collaborative missions are exercising existing and emerging science codes
The solution is designed to offer:
  Petaflop scalability and reliability
  Lower power and space footprint
  Lower total cost of ownership
Performance advantages:
  Science codes such as SPaSM, VPIC, Milagro, Sweep3D, accelerated up to 4-9X faster than AMD Opteron single core (Source: LANL - www.lanl.gov/roadrunner)
Enable government labs, agencies, and academic research centers to run high performance codes faster, less expensively, and with lower power consumption than existing computing architectures
*See Notes on Benchmarks, charts 46 and 47
Aerospace & defense solutions
IBM components:
  IBM BladeCenter QS21 & QS22
  IBM SDK for Multicore Acceleration
  IBM Cell/B.E. math libraries
  IBM hybrid computing solution (custom offering)
  PXCAB
ISV applications:
  Gedae stream, image and signal programming environment
  RapidMind development tools
  Wind River VxWorks RTOS and WorkBench Tools
Performance advantages:
  FFT workloads up to 7.7x faster than 3.0 GHz 2-core Woodcrest x2*
  Double Precision Matrix Multiplication up to 2.6x faster than 2.66GHz 4-core Clovertown*
Enhance competitiveness, demonstrate innovation and capture significant government contracts through dramatic performance improvements in real time signal and image processing
"As a time-served radar architect, I can say that Cell/Gedae is something of a dream and should rightly impact the new design market; it is an opportunity that the DoD should not fail to grasp."
- John Roulston, SCImus Solutions, March 2007
*See Notes on Benchmarks, charts 46 and 47
Digital content creation solutions
IBM components:
  IBM BladeCenter QS21 & QS22
  IBM SDK for Multicore Acceleration
  IBM Cell/B.E. math libraries
  IBM hybrid computing solution (custom offering)
  PXCAB
  IBM iRT scalable real-time ray tracer
ISV applications:
  RapidMind development tools
The solution is designed to offer:
  Rapid turn around of digital assets
  More realistic simulation
  An open and flexible solution based on standards
  Scalability and reliability
Performance advantages:
  1080p ray-traced images computed in milliseconds*
  1080p ambient occlusion images computed in seconds*
IBM solutions enable Media and Entertainment companies to produce the next generation of animated feature films, games, and advertising content
*See Notes on Benchmarks, charts 46 and 47
Digital video surveillance solutions
IBM components:
  IBM BladeCenter QS21/QS22
  IBM Total Storage
  IBM DVS ADK
ISV applications:
  Codec libraries
  Video distribution software
The solution is designed to offer:
  H.264 encoding
  Encoders for analog cameras
  Transcoding to save storage and network costs
  Decoding acceleration to reduce workstation costs and improve robustness
  Better management and scalability
  Network-based surveillance
  Compute density: with two processors per blade, 14 blades to a chassis, and two chassis to a rack, it is possible to have as many as 672 H.264 encoders in the rack
Performance advantage:
  One Cell/B.E. processor running at 3.2 GHz can encode 12 channels of standard definition video at 30 fps to H.264 (main profile, including CABAC) [1]
[1] Source: IBM Research benchmark
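The 672-encoder figure follows directly from the numbers on this slide. A minimal sketch of the arithmetic, assuming every processor in the rack is dedicated to encoding at the quoted 12 channels each (the constant names are illustrative):

```python
# Rack density figures quoted on the slide.
PROCESSORS_PER_BLADE = 2
BLADES_PER_CHASSIS = 14
CHASSIS_PER_RACK = 2
CHANNELS_PER_PROCESSOR = 12  # SD video at 30 fps, H.264 main profile

processors_per_rack = (
    PROCESSORS_PER_BLADE * BLADES_PER_CHASSIS * CHASSIS_PER_RACK
)
encoders_per_rack = processors_per_rack * CHANNELS_PER_PROCESSOR

print(f"{processors_per_rack} processors per rack")
print(f"{encoders_per_rack} H.264 encoders per rack")  # prints 672
```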
Solutions deliver hardware and enablement for high-density, highly scalable encoding, transcoding, and compositing for digital video surveillance
[Diagram: PTZ and coax cameras, 16 camera inputs, aggregation unit, IBM BladeCenter QS21/QS22 in IBM BladeCenter-H (14 card slots), IBM Total Storage. Callout: 672 encoders in a rack!]
*See Notes on Benchmarks, charts 46 and 47
EDA solutions
IBM components:
  Cell/B.E. hybrid cluster
  IBM BladeCenter QS21
  IBM System x / IBM BladeCenter
  IBM Cluster 1350 integrated cluster
  Storage: DS4000, N series, DCS9550
ISV applications:
  Mentor Graphics Calibre nmOPC and OPCVerify
The solution is designed to offer:
  Significant run time acceleration: leverages Cell/B.E. strengths to offer significant speed-up when compared to existing solutions in the market, reducing design turnaround time
  Scalability and reliability: blade form factor improves scalability, compute density and reliability
Accelerate computational lithography workload to address turnaround time challenges and at the same time reduce total cost of the computing infrastructure
Financial market analytics solutions
IBM components:
  IBM BladeCenter QS22
  IBM SDK for Multicore Acceleration
  Dynamic Application Virtualization
  Cell/B.E. math libraries
ISV applications:
  NAG - Math & Stat Software
  Platform Symphony - Grid Computing Environment
  Encirq Event Processing Platform
The solution is designed to offer:
  Flexibility and scalability
    IBM BladeCenter QS22 integrates with other BladeCenter products
    IBM SDK, DAV, third party applications for ease of adoption within existing infrastructure
  Technical Services with skilled programming expertise and subject matter experts
  Power, space and cooling advantages
Performance advantage:
  Collateralized Debt Obligation (CDO) - 7.5X faster than 2.8 GHz 4-core Harpertown*
  650 million European options/sec using Monte Carlo simulations on QS22 blade*
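For readers unfamiliar with the workload behind the options figure, the sketch below shows a textbook Monte Carlo price for a single European call under geometric Brownian motion, checked against the Black-Scholes closed form. It is not the benchmark code and says nothing about how the QS22 result was obtained; the function names and all parameter values are hypothetical. Each simulated path is independent of the others, which is the property that lets this kind of calculation spread across many SIMD lanes and cores.

```python
import math
import random


def mc_european_call(s0, strike, rate, sigma, years, n_paths, seed=0):
    """Monte Carlo price of a European call under geometric Brownian motion."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    drift = (rate - 0.5 * sigma ** 2) * years
    vol = sigma * math.sqrt(years)
    payoff_sum = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)                 # one normal draw per path
        terminal = s0 * math.exp(drift + vol * z)
        payoff_sum += max(terminal - strike, 0.0)
    return math.exp(-rate * years) * payoff_sum / n_paths


def black_scholes_call(s0, strike, rate, sigma, years):
    """Closed-form reference price used to sanity-check the simulation."""
    d1 = (math.log(s0 / strike) + (rate + 0.5 * sigma ** 2) * years) / (
        sigma * math.sqrt(years)
    )
    d2 = d1 - sigma * math.sqrt(years)
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return s0 * cdf(d1) - strike * math.exp(-rate * years) * cdf(d2)


# Hypothetical contract: at-the-money call, 1 year, 5% rate, 20% volatility.
estimate = mc_european_call(100.0, 100.0, 0.05, 0.2, 1.0, n_paths=200_000)
reference = black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0)
print(f"Monte Carlo: {estimate:.3f}, Black-Scholes: {reference:.3f}")
```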
Enable financial market professionals to perform highly complex analytics with the required speed and accuracy to support trade execution and improve their firm's competitive position
*See Notes on Benchmarks, charts 46 and 47
Medical imaging solutions
IBM components:
  IBM BladeCenter QS21 & QS22
  IBM SDK for Multicore Acceleration
  IBM Cell/B.E. math libraries
  IBM hybrid computing solution (custom offering)
  PXCAB
ISV applications:
  Advanced image and text analytics
  High-performance image compression
The solution is designed to offer:
  3D image reconstruction, registration, volume rendering, segmentation
  On-demand compression/decompression
Performance advantage:
  16x improvement on MRI image reconstruction over Opteron system
  11x improvement on CT image reconstruction over 3.0GHz Xeon system
  48x improvement on image registration over 3GHz Pentium 4
  200x shear-warp volume visualization over TI TMS320C80 processor
  40:1 CT study data compression
  (Source for all above: Mayo Clinic - http://www.mayoclinic.org/news2007-rst/3996.html)*
Improve the efficiency, productivity, and quality of patient care through dramatic performance improvements in the transmission and analysis of medical images
*See Notes on Benchmarks, charts 46 and 47
Seismic solutions
IBM components:
  IBM BladeCenter QS22
  IBM SDK for Multicore Acceleration
  IBM Cell/B.E. math libraries
  IBM hybrid computing solution (custom offering)
  PXCAB
  Standard math, vector math, FFT, BLAS, MPI and tridiagonal solver
ISV applications:
  Simudyne
  Customers' own proprietary code
The solution is designed to offer:
  High-performance, highly accurate rendering of geologic structures
  Cost effective HPC environment that has significant performance increases
  Scalability and reliability
Performance advantages:
  FFT workloads up to 7.7x faster than 3.0 GHz 2-core Woodcrest x2*
  Double Precision Matrix Multiplication up to 2.6x faster than 2.66GHz 4-core Clovertown*
Improve the speed and accuracy of geologic visualization to reduce the cost of evaluating potential targets for oil and gas yielding potential
*See Notes on Benchmarks, charts 46 and 47
QS22 summary
The QS22 is based on the new PowerXCell 8i processor built on an enhanced version of the Cell Broadband Engine Architecture
The QS22 offers the capabilities you need for your most demanding computational requirements
Offers extraordinary double precision and single precision floating point performance
Supports up to 32GB of processor memory
IBM is working with ISVs and customers to accelerate workloads on the QS22 in targeted application areas
The QS22 is extremely efficient, offering more than 1.7 SP (or 0.8 DP) GFLOPS per watt of energy
BladeCenter QS22 is Right, Open, Easy and Green
Premier blade for HPC workloads
IBM SDK for Multicore Acceleration summary
Designed to be highly reliable, simple to acquire and easy to use
  Complete, integrated kit
  Production-ready tools from IBM
  IBM warranty and support
  RHEL 5.2 Enterprise support
Based on industry standards to ease the transition to the Cell/B.E. architecture
  Eclipse-based Integrated Development Environment
  Standard, base libraries
  Third-party libraries can be plugged in
Designed to make it easy to port and optimize applications for the QS22
  Performance tuning tools to help optimize algorithms without re-writing the entire application
  Tools designed to help you partition an application across a hybrid Cell/B.E. and x86 platform
Cell/B.E. architecture reaches wide and deep from consumer products to high performance computing
[Chart of Cell/B.E.-based products across market segments: Consumer, Business, Enterprise, High Performance Computing. Axis label: Increasing support for scale and datacenter]
  SCE PS3 (Cell/B.E. + GPU)
  Sony Cell/B.E. Computing Unit (Cell/B.E. + GPU + AV I/O)
  Toshiba SpursEngine (SPUs + Host)
  Mercury 1u Dual Cell
  PowerXCell 8i PCI card (Cell/B.E. + Host)
  IBM BladeServer (2 Cell/B.E. or PowerXCell 8i)
  Mini-Roadrunner
  Roadrunner (16,000 PowerXCell 8i + AMD)
  Custom
Common OSs, Infrastructure, Tools, Libraries, Code: the SAME SPE code runs from end to end