Jen-Hsun Huang, Co-Founder and CEO, NVIDIA SC15 | Nov. 16, 2015 ACCELERATED COMPUTING: THE PATH FORWARD

SOURCE: images.nvidia.com/events/sc15/pdfs/SC5101-accelerated-computing-path...

COMMODITY DISRUPTS CUSTOM

SOURCE: Top500

ACCELERATED COMPUTING: THE PATH FORWARD

“ It’s time to start planning for the end of Moore’s Law, and it’s worth pondering how it will end, not just when.”

Robert Colwell

Director, Microsystems Technology Office, DARPA

NVIDIA ACCELERATES COMPUTING

Productive Programming Model & Tools

Expert Co-Design

Accessibility

APPLICATION

MIDDLEWARE

SYS SW

LARGE SYSTEMS

PROCESSOR

Fast GPU Engineered for High Throughput

[Chart: TFLOPS by year, 2008 to 2014, NVIDIA GPU vs. x86 CPU. Labeled GPUs: M1060, M2090, K20, K40, K80]

Fast GPU + Strong CPU

ACCELERATORS SURGE IN WORLD’S TOP SUPERCOMPUTERS

100+ accelerated systems now on Top500 list

1/3 of total FLOPS powered by accelerators

NVIDIA Tesla GPUs sweep 23 of 24 new accelerated supercomputers

Tesla supercomputers growing at 50% CAGR over past five years

[Chart: Top500: # of Accelerated Supercomputers, 2013 to 2015]
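
To put the 50% CAGR figure quoted on this slide in perspective, the short sketch below computes the total growth it implies over five years. It assumes the standard compound-growth definition of CAGR; the function name `growth_multiple` is illustrative, and the slide itself does not give the underlying system counts.

```python
def growth_multiple(cagr: float, years: int) -> float:
    """Total growth factor implied by a compound annual growth rate."""
    return (1.0 + cagr) ** years


# 50% CAGR sustained over five years, the figures quoted on the slide.
multiple = growth_multiple(0.50, 5)
print(f"{multiple:.2f}x")  # 1.5 ** 5 = 7.59375, so this prints 7.59x
```

Under that assumption, five years at 50% per year corresponds to roughly a 7.6x increase overall.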

MACHINE LEARNING HPC’S 1ST CONSUMER KILLER-APP

“NADELLA: SMART AGENTS LIKE CORTANA WILL REPLACE THE WEB BROWSER” -BI

FACEBOOK MESSENGER ADDS FACIAL RECOGNITION

YOUTUBE: CLICK-TO-BUY ADS

GOOGLE PHOTOS: ML-POWERED FEATURES

MICROSOFT OPEN-SOURCES DMTK

GOOGLE OPEN-SOURCES TENSORFLOW

TESLA FOR MACHINE LEARNING

10M Users 40 years of video/day

270M Items sold/day 43% on mobile devices

TESLA M40 POWERFUL: Fastest Deep Learning Performance

TESLA M4 LOW POWER: Highest Hyperscale Throughput

HYPERSCALE SUITE

GPU Accelerated FFmpeg

Image Compute Engine

GPU REST Engine

MACHINE LEARNING REVOLUTIONIZING TRANSPORTATION

“Toyota Invests $1 Billion in Artificial Intelligence in U.S.”

— U.S. News & World Report

END-TO-END MACHINE LEARNING PLATFORM FOR AUTONOMOUS CARS

DIGITS DevBox

DRIVE PX

[Chart: NVDRIVENET on KITTI Object Detection. X-axis dates: 7/8, 7/22, 8/5, 8/19, 9/2, 9/16, 9/30, 10/14, 10/28, 11/10. Y-axis: 30% to 100%. Data labels: 39%, 45%, 55%, 62%, 66%, 72%, 75%, 79%, 83%. Series labels: NVIDIA, BAIDU]

MACHINE LEARNING REVOLUTIONIZING AUTONOMOUS MACHINES

JETSON TX1
Supercomputer on a Module

GPU 1 TFLOPS, 256-core Maxwell

CPU 64-bit ARM A57s

Memory 4GB LPDDR4, 26 GB/s

Power Under 10W

10x Energy Efficiency, Alexnet

[Chart: Images / Sec / Watt, Intel Core i7-6700K (Skylake) vs. Jetson TX1]

Under 10W for typical use cases

PC GAMING

SUPERCOMPUTING EVERYWHERE

Titan X for PC

Tesla in the Cloud

Jetson TX1 for Robots

DRIVE PX for Auto

Ian Buck, VP of Accelerated Computing, NVIDIA SC15 | Nov. 16, 2015

ACCELERATED COMPUTING: THE PATH FORWARD

TESLA ACCELERATES DISCOVERY AND INSIGHT

270M Items sold/day 43% on mobile devices

SIMULATION

TESLA ACCELERATED COMPUTING

VISUALIZATION MACHINE LEARNING

“Approximately a third of HPC systems operating today are equipped with accelerators and nearly half of all newly deployed systems have them.”

ACCELERATED COMPUTING: A TIPPING POINT FOR HPC, Intersect360 Nov 2015

70% OF TOP HPC APPS NOW ACCELERATED

INTERSECT360 SURVEY OF TOP APPS

Top 10 HPC Apps: 90% Accelerated

Top 50 HPC Apps: 70% Accelerated

VASP NOW ACCELERATED

Typically Consumes 10-25% of HPC System

1 Dual K80 Server 1.3x

4 Dual CPU Servers 1.0x

Intersect360, Nov 2015, “HPC Application Support for GPU Computing”
Dual-socket Xeon E5-2690 v2 3GHz, Dual Tesla K80, FDR InfiniBand
Dataset: NiAl-MD Blocked Davidson

370 GPU-Accelerated Applications

www.nvidia.com/appscatalog

TESLA FOR SIMULATION

LIBRARIES

TESLA ACCELERATED COMPUTING

LANGUAGES DIRECTIVES

ACCELERATED COMPUTING TOOLKIT

TESLA K80
World’s Fastest Accelerator for HPC

[Chart: # of Days, Tesla K80 Server vs. Dual CPU Server]

AMBER Benchmark: PME-JAC-NVE Simulation for 1 microsecond
CPU: E5-2698v3 @ 2.3GHz, 64GB System Memory, CentOS 6.2

CUDA Cores 2496

Peak DP 1.9 TFLOPS

Peak DP w/ Boost 2.9 TFLOPS

GDDR5 Memory 24 GB

Bandwidth 480 GB/s

Power 300 W

Simulation Time from 1 Month to 1 Week

5x Faster AMBER Performance

APPLICATION PERFORMANCE BOOSTS DATA CENTER THROUGHPUT

TESLA K80: 5X FASTER

[Chart: Speed-up vs Dual CPU, K80 vs. CPU, for QMCPACK, LAMMPS, CHROMA, NAMD, AMBER]

CPU: Dual E5-2698 [email protected] 3.6GHz, 64GB System Memory, CentOS 6.2
GPU: Single Tesla K80, Boost enabled

1/3 OF NODES ACCELERATED, 2X SYSTEM THROUGHPUT

CPU-only System: 100 Jobs Per Day

Accelerated System: 220 Jobs Per Day
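
The throughput claim on this slide can be sanity-checked with a simple back-of-the-envelope model. The sketch below is an assumption, not the method behind the slide: it supposes jobs are spread evenly across nodes and that every job on an accelerated node runs at the full 5x speed-up. The function name `system_throughput` is illustrative.

```python
def system_throughput(baseline_jobs: float,
                      accelerated_fraction: float,
                      speedup: float) -> float:
    """Jobs per day when a fraction of nodes runs `speedup` times faster.

    Assumes work is spread evenly over nodes, so each node's share of the
    baseline throughput scales independently.
    """
    cpu_share = (1.0 - accelerated_fraction) * baseline_jobs
    gpu_share = accelerated_fraction * baseline_jobs * speedup
    return cpu_share + gpu_share


# Slide figures: 100 jobs/day baseline, 1/3 of nodes accelerated, 5x faster.
jobs = system_throughput(100, 1 / 3, 5.0)
print(round(jobs))  # about 233 jobs per day under this idealized model
```

This idealized estimate lands a little above the 220 jobs per day shown on the slide, which is consistent with the "2X system throughput" headline; the slide does not state its own assumptions, so the gap may simply reflect a workload mix where not every application reaches 5x.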

OPENACC DELIVERS TRUE PERF PORTABILITY
Paving the Path Forward: Single Code for All HPC Processors

Application Performance Benchmark

Speedup vs Single CPU Core (CPU: MPI + OpenMP / CPU: MPI + OpenACC / CPU + GPU: MPI + OpenACC)

359.MINIGHOST (MANTEVO): 4.1x / 5.2x / 7.1x

NEMO (CLIMATE & OCEAN): 4.3x / 5.3x / 7.1x

CLOVERLEAF (PHYSICS): 7.6x / 11.9x / 30.3x

359.miniGhost: CPU: Intel Xeon E5-2698 v3, 2 sockets, 32-cores total, GPU: Tesla K80- single GPU
NEMO: Each socket CPU: Intel Xeon E5-2698 v3, 16 cores; GPU: NVIDIA K80 both GPUs
CLOVERLEAF: CPU: Dual socket Intel Xeon CPU E5-2690 v2, 20 cores total, GPU: Tesla K80 both GPUs

TESLA HYPERSCALE FOR MACHINE LEARNING

10M Users 40 years of video/day

270M Items sold/day 43% on mobile devices

TESLA M40 POWERFUL: Fastest Deep Learning Performance

TESLA M4 LOW POWER: Highest Hyperscale Throughput

HYPERSCALE SUITE

GPU Accelerated FFmpeg

Image Compute Engine

GPU REST Engine

TESLA M40
World’s Fastest Accelerator for Deep Learning

8x Faster Caffe Performance

[Chart: # of Days, Tesla M40 vs. CPU]

Caffe Benchmark: AlexNet training throughput based on 20 iterations, CPU: E5-2697v2 @ 2.70GHz. 64GB System Memory, CentOS 6.2

CUDA Cores 3072

Peak SP 7 TFLOPS

GDDR5 Memory 12 GB

Bandwidth 288 GB/s

Power 250W

Reduce Training Time from 8 Days to 1 Day

TESLA M4
Highest Throughput Hyperscale Workload Acceleration

CUDA Cores 1024

Peak SP 2.2 TFLOPS

GDDR5 Memory 4 GB

Bandwidth 88 GB/s

Form Factor PCIe Low Profile

Power 50 – 75 W

Video Processing 4x

Image Processing 5x

Video Transcode 2x

Machine Learning Inference 2x

H.264 & H.265, SD & HD

Stabilization and Enhancements

Resize, Filter, Search, Auto-Enhance

Preliminary specifications. Subject to change.
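
One way to read the "low power" positioning of the M4 against the M40 is peak single-precision throughput per watt. The sketch below derives that ratio from the peak SP and power figures on the two spec slides. It is an illustration only: the figures are peak (and, for the M4, preliminary) specifications rather than measured application throughput, and the function name `gflops_per_watt` is hypothetical.

```python
def gflops_per_watt(peak_tflops: float, watts: float) -> float:
    """Peak GFLOPS per watt from a peak-TFLOPS rating and a power figure."""
    return peak_tflops * 1000.0 / watts


# Tesla M40: 7 TFLOPS peak SP at 250 W (from the M40 spec slide).
m40 = gflops_per_watt(7.0, 250)

# Tesla M4: 2.2 TFLOPS peak SP, quoted with a 50 to 75 W power range.
m4_at_75w = gflops_per_watt(2.2, 75)
m4_at_50w = gflops_per_watt(2.2, 50)

print(f"M40: {m40:.1f} GFLOPS/W")
print(f"M4:  {m4_at_75w:.1f} to {m4_at_50w:.1f} GFLOPS/W")
```

On these peak numbers the M4 comes out at or above the M40 in throughput per watt across its quoted power range, which fits the slide's emphasis on hyperscale throughput at low power.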

TESLA FOR VISUALIZATION

IRAY

TESLA ACCELERATED COMPUTING

INDEX OPTIX

VISUALIZATION TOOLS FOR HPC

GROWING ADOPTION IN CLIMATE & WEATHER

MeteoSwiss Deploys World’s First Accelerated Weather Supercomputer

2x higher resolution for daily forecasts

14x more simulation with ensemble approach for medium range forecasts

NOAA Chooses Tesla To Improve Weather Forecast Research

Develop global model with 3km resolution, a five-fold increase over today’s resolution

Improved resolution requires 40x higher computational complexity

NEXT-GEN SUPERCOMPUTERS ARE GPU-ACCELERATED

SIMULATION

TESLA ACCELERATED COMPUTING

VISUALIZATION MACHINE LEARNING

SUMMIT

SIERRA

U.S. Dept. of Energy

Pre-Exascale Supercomputers for Science

IBM Watson

Breakthrough Natural Language Processing for Cognitive Computing

NOAA

New Supercomputer for Next-Gen Weather Forecasting

ENTERPRISE

ACCELERATED SCIENCE AND DATA ANALYTICS ON DISPLAY AT SC’15

World’s first in-situ weather simulation, running on MeteoSwiss supercomputer

Simulation of deadly tornado that hit El Reno, Oklahoma on May 24, 2011

Analyze quantum effects in nanowire using 10,800 GPUs on TITAN supercomputer

Predicting drug reactions for personalized medicine with GPU-powered IBM Spark
