TAIPEI | SEP. 21-22, 2016
Benedict Khoo (FAE Manager, APAC Region), September 21st 2016
We Keep IT Green™ "Earth-friendly" Solutions
EVOLUTION OF SUPERMICRO GPU SERVER SOLUTION
ABOUT THE COMPANY…
Our DNA…
INNOVATION
• Server Building Blocks methodology
• Application optimized
• First to market
OPEN PLATFORM
• True open platform
• Commoditization with innovation
ENERGY EFFICIENCY
• Excellence in thermal / cooling design
• Titanium power supply
• Perfection in green computing
Supermicro® (NASDAQ:SMCI) is a global leader in high-performance, high-efficiency server technology and innovation. We develop and provide end-to-end green computing solutions to the datacenter, cloud computing, enterprise IT, big data, HPC, and embedded markets. Our solutions range from complete server, storage, blade and workstations to full racks, networking devices, server management software and technology support and services.
We perform the majority of our R&D efforts in-house, which increases the communication and collaboration between design teams, streamlines the development process and reduces time-to-market. We have developed a set of design principles which allow us to aggregate individual industry-standard components and materials to develop truly optimized server boards, chassis, power supplies, networking and storage devices. This building block approach allows us to provide a broad range of SKUs, and enables us to build and deliver application-optimized solutions based upon customers' requirements.
KEY ADVANTAGES / FEATURES
Widest Range of Supported Solutions (up to 7U)…
Highest Density Solutions which can support up to 10x GPGPU Solutions per Node
Designs that maximize performance per Watt, per Sq. Ft., and per Dollar…
Unique Green Computing Architecture features…
Full Bandwidth capable for optimal I/O performance…
http://www.supermicro.com.tw/white_paper/white_paper_1U_4GPU_Server.pdf
More than 50 SKUs offered…
GPU Solutions – HPC/Grid Optimized
Tesla S1070
PCI-E x16
1U 4-GPU Standalone Server
2008 – GPGPU: Where it started…
1U Twin™
2009 – Hybrid Computing Pioneer
Integrated GPU Server
The fastest 1U server in the world
2011 – GPU Blades
GPU Server & Workstation
2015 – 1U 4-GPU Optimized
7U 10-blade 20-GPUs
✓ High efficiency power supplies at full capacity
✓ Excellent thermal design
✓ Non-blocking air-flow
✓ Greatest performance layout
✓ No re-driver required; no latency
“The most comprehensive product line in the Industry”
2013 – GPU FatTwin™
2016 – Next Gen GPU Innovation - Latency and Performance Optimized
SUPERMICRO SUPERBLADE REVIEW
HTTP://WWW.SERVETHEHOME.COM/SUPERMICRO-SUPERBLADE-GPU-SYSTEM-REVIEW-SBE-710Q-R90-CHASSIS/
Using the Supermicro GPU Super-Blade platform we quickly saw the benefits in terms of: higher density, higher power supply efficiency, easier maintenance, significantly reduced cabling, and easier upgrades/expansion.
We were impressed by how easy it was to use and manage the system.
3rd Party Server Reviews
Rating: 97%
“As we have said before, the case used for the 7048GR-TR Workstation is simply the best in quality, craftsmanship, and features. … it is our go-to case every time. The 7048GR-TR Workstation is designed for maximum uptime with hot-swappable drives and cooling fans, and includes dual redundant power supplies.”
— TweakTown
Rating: 9.7
“Overall, for those looking to cram four GPU’s into a small 1U form factor for dense compute or even VDI applications, the Supermicro 1028GQ-TRT is an excellent solution. With 10Gbase-T networking, the server is easy to integrate into existing datacenter infrastructure so long as the rack is able to handle higher-power rated gear.
… we find the 4028GR-TR is a well designed system that has the ability to handle high performance work loads. Moving to a large 4U server allows larger capacity cooling systems to be installed that keep the system cool while running extreme work loads. This is a trade off vs smaller 1U systems which have higher density but operate at close to maximum heat load capacities.”
— ServeTheHome
STAC-A2 Benchmarks
The STAC-A2 Benchmark suite is the industry standard for testing technology stacks used for compute-intensive analytic workloads involved in pricing and risk management. In all, the STAC-A2 specifications deliver nearly 200 test results related to performance, scaling, efficiency, and quality, which are detailed in this report.
Test System: Supermicro SYS-1028GR-TR server
World Record Results: Fastest warm time to date in the baseline end-to-end Greeks benchmark (GREEKS.TIME.WARM). This was 1.27x the speed of the next-fastest system (SUT ID: INTC150811).
https://www.supermicro.com.tw/products/nfo/Green500.cfm
Supermicro 1U GPU Solution at GSIC Center
- Ranked 1st on the World's Green500 List of Computer Systems
Optimized Portfolio with Highest Rack-level GPU Density
Best-in-class technology designed for highly parallel applications to deliver ultimate performance, flexibility, and scalability
1018GR (Cost Effective)
• Single Haswell/Broadwell CPU
• 8 DDR4 DIMMs
• 6x 2.5” HS HDD bays
• 2 Double-Width GPUs
• 1 x8 PCIe 3.0 slot
• 1x 1400W Platinum PWS
1028GR (Mainstream)
• Dual Haswell/Broadwell CPUs
• 16 DDR4 DIMMs
• 4x 2.5” HS HDD bays
• 3 Double-Width GPUs
• 1 x8 PCIe 3.0 slot
• 2x 1600W Platinum PWS
1028GQ (Parallel Optimized)
• Dual Haswell/Broadwell CPUs
• 16 DDR4 DIMMs
• 4x 2.5” HS HDD bays
• 4 Double-Width GPUs
• Active/Passive GPUs
• 2 x8 PCIe 3.0 slots
• 2x 2000W Platinum PWS
Optimized Portfolio with Highest Node-level GPU Density
Best-in-class technology designed for highly parallel applications to deliver ultimate performance, flexibility, and scalability
7048GR (Mission Critical)
• 4U Chassis
• Dual Haswell/Broadwell CPUs w/ IPMI
• 16 DDR4 DIMMs
• 8x 3.5” HS HDD bays
• 4 Double-Width GPUs
• x16/x8/x4 – 4/2/1**
• 2x 2000W Titanium PWS
2028GR (Mainstream)
• 2U Chassis
• Dual Haswell/Broadwell CPUs
• 16 DDR4 DIMMs
• 10x 2.5” HS HDD bays
• 6 Double-Width GPUs
• 1 x8 PCIe 3.0 slot
• 2x 2000W Platinum PWS
4028GR (Parallel Optimized)
• 4U Chassis
• Dual Haswell/Broadwell CPUs
• 24 DDR4 DIMMs
• 24x 2.5” HS HDD bays
• 8 Double-Width GPUs
• 2 x8 PCIe 3.0 slots; 1 x4 PCIe 2.0 slot
• 4x 1600W Platinum PWS
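The density claims on the two portfolio slides can be checked with simple arithmetic. The sketch below is illustrative only: the GPU counts and the 2U/4U chassis heights come from the slides, while the 1U height for the 1018GR, 1028GR and 1028GQ is inferred from the ratio chart and the quoted reviews, and the 42U usable rack height is an assumption. The names `GpuServer`, `gpus_per_u` and `gpus_per_rack` are hypothetical helpers, not part of any Supermicro tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuServer:
    model: str
    rack_units: int  # chassis height in U
    gpus: int        # double-width GPUs per chassis


# GPU counts are taken from the slides. The 1U height of the 10xx models
# is an assumption inferred from the ratio chart and the quoted reviews.
PORTFOLIO = [
    GpuServer("1018GR", 1, 2),
    GpuServer("1028GR", 1, 3),
    GpuServer("1028GQ", 1, 4),
    GpuServer("2028GR", 2, 6),
    GpuServer("4028GR", 4, 8),
    GpuServer("7048GR", 4, 4),
]


def gpus_per_u(server: GpuServer) -> float:
    """GPUs per rack unit for a single chassis."""
    return server.gpus / server.rack_units


def gpus_per_rack(server: GpuServer, usable_units: int = 42) -> int:
    """GPUs in a rack filled only with this chassis.

    usable_units=42 is an assumed rack height; power and cooling
    limits are deliberately ignored here.
    """
    return (usable_units // server.rack_units) * server.gpus


def densest(portfolio: list[GpuServer]) -> GpuServer:
    """Chassis with the highest GPU count per rack unit."""
    return max(portfolio, key=gpus_per_u)


if __name__ == "__main__":
    for s in PORTFOLIO:
        print(f"{s.model}: {gpus_per_u(s):.1f} GPU/U, "
              f"{gpus_per_rack(s)} GPUs per 42U rack")
```

Under these assumptions the 1028GQ gives the highest rack-level density (4 GPUs per U), while the 4028GR gives the highest node-level count (8 GPUs per chassis). In practice a rack's power and cooling budget would likely cap the achievable figure well before every slot is filled.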
Widest Portfolios
RACK / TOWER / MULTI-NODE
6:2 (2U)
4:2)
3:2 (1U)
2:1 (1U)
3:2 (4U/4Node)
4:2 (2U)
1:2 (1U)
1:2 (2U/2Node)
3*:2 (WS)
6:2 (4U / 2Node)
8:2 (4U)
4:2 (1U)
2:2 (7U / 10Node)
3:2 (2U)
3*:2 (4U)
HIGHER DENSITY
GPU ENABLED
GPU OPTIMIZED
1:1 (WS)
RATIO: GPU:CPU
* Supports a maximum of 2x Double-Width GPUs
THE LEADING SOLUTIONS (NEW)
GPU Optimized Server Portfolio
New Generation High Performance Optimal Solutions…
CUSTOMER PAIN POINTS
Machine Learning / AI applications work on large datasets that go well beyond the capacity of a single GPU.
PROBLEM SOLUTION
Aggregate GPU resources to tackle large dataset computation, in conjunction with high speed connectivity to minimize latency
Generic ARCHITECTURE (diagram labels: QPI, PCIe, PCIe)
Latency is a major bottleneck, based on many 8x GPU designs. With constant communication, the QPI + PCIe path is a major constraint. A symmetric PCIe design is NOT efficient for Machine Learning applications.
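A rough way to see why the symmetric layout hurts is to count how many GPU-to-GPU pairs have to cross the CPU interconnect. The sketch below is a toy model, not a measurement: it assumes the symmetric 8-GPU design splits the GPUs four and four across two CPU root complexes, so that any pair on different roots must traverse QPI. The functions `build_topology`, `cross_root_pairs` and `cross_root_fraction` are hypothetical names introduced for illustration.

```python
from itertools import combinations


def build_topology(gpus_per_root: list[int]) -> dict[int, int]:
    """Map each GPU index to the CPU root complex it is attached to.

    gpus_per_root[i] is the number of GPUs behind root complex i.
    """
    topology: dict[int, int] = {}
    gpu = 0
    for root, count in enumerate(gpus_per_root):
        for _ in range(count):
            topology[gpu] = root
            gpu += 1
    return topology


def cross_root_pairs(topology: dict[int, int]) -> list[tuple[int, int]]:
    """GPU pairs on different roots, i.e. pairs that must cross QPI."""
    return [
        (a, b)
        for a, b in combinations(sorted(topology), 2)
        if topology[a] != topology[b]
    ]


def cross_root_fraction(topology: dict[int, int]) -> float:
    """Fraction of all GPU pairs whose traffic crosses the CPU interconnect."""
    total = len(list(combinations(topology, 2)))
    return len(cross_root_pairs(topology)) / total if total else 0.0


# Assumed layouts: 4 + 4 GPUs behind two CPUs versus all 8 behind one root.
symmetric = build_topology([4, 4])
single_root = build_topology([8])

if __name__ == "__main__":
    print("symmetric  :", len(cross_root_pairs(symmetric)), "pairs cross QPI")
    print("single root:", len(cross_root_pairs(single_root)), "pairs cross QPI")
```

In this model 16 of the 28 possible GPU pairs in the 4 + 4 layout cross QPI, and none do when all GPUs sit under one root. The model says nothing about actual latency or bandwidth, which depend on the PCIe switches and CPUs involved.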
Highest Density NVIDIA GPU Solution
MAXWELL/PASCAL READY
• Active/Passive GPU support
• Supports the latest Maxwell/Pascal GPUs
• Supports a 10-GPU configuration
X10 SUPERMICRO ADVANTAGE
● PERFORMANCE: GPUs under single CPU Root
● FLEXIBILITY: Supports up to 10x Active/Passive GPUs
● GPU RDMA: Direct Internode GPU Interconnect
● EFFICIENCY: Titanium-rated Power Supply
● DESIGN: No GPU preheating
ADVANTAGES
• A GPU compute unit on one ROOT can train twice as fast and explore networks twice as large.
• Distributed training across eight GPUs allows scaling the size and speed of the networks by another factor of two.
The most flexible parallel computing solution in the market. Optimized for GPU peering, this architecture enables faster Machine Learning Training by up to with GPUs under a single CPU root!
Single Root Complex Design for World Class Latency Optimized Solution
Super High Computing Capability
Highest Performance / Watt Capabilities
Optimized Solution With NVIDIA Pascal GPU Architecture
PASCAL GPU READY
• Performance – 10 TFLOPs FP32
• NVLink Advanced Technology
• 3D Memory - 2x Memory Bandwidth
X10 SUPERMICRO ADVANTAGE
● PERFORMANCE: 8x PASCAL with GPUs IN 1U/4U
● NVLINK: 80GB/s High Bandwidth GPU Interconnect
● RDMA FABRIC: 4x Direct Low Latency Data Access
● EFFICIENCY: Titanium-rated Power Supply
● DESIGN: No GPU preheating
ADVANTAGES
• All GPUs are capable of Peer-to-Peer direct access to all other GPUs’ memory, as well as direct transfer (memcpy) operations via NVLink at high bandwidth
• High performance for collective communications
• PCIe bandwidth is fully available for host and/or NIC communication during inter-GPU communication
Unparalleled 1U platform for the highest parallel applications. No one else can do so much in a 1U! Up to NVIDIA GPU with Pascal Architecture in , supporting Optimized GPU RDMA
THANK YOU
For more information, please talk to our representatives
WWW.SUPERMICRO.COM/GPU