
PowerPoint Presentation 2017... · 2017-12-15 · PowerPoint Presentation Author: Andrew Putnam Subject: Microsoft Research 2013 Created Date: 4/22/2017 11:56:21 AM


2013 2014 2015 2016


Source: Bob Brodersen, Berkeley Wireless group


Perf/W relative to CPUs, ordered from more flexible to more efficient:

• CPUs (1X): Today’s standard, most programmable, good for services changing rapidly.
• Manycore CPUs (3X): Many simple cores (10s to 100s per chip), useful if software can be fine-grain parallel, difficult to maintain.
• GPUs (5-30X): Good for data parallelism by merged threads (SIMD), high memory bandwidth, power hungry.
• FPGAs (5-30X): Most radical fully programmable option. Good for streaming/irregular parallelism. Power efficient but currently need to program in H/W languages.
• Structured ASICs (20-100X): Lower-NRE ASICs with lower performance/efficiency. Includes domain-specific (programmable) accelerators.
• Custom ASICs (>100X): Highest efficiency. Highest NRE costs. Requires high volume. Good for functions in very widespread use that are stable for many years.

Programming model labels: Conventional programming, Alternative programming, Can’t change functionality.


• Field Programmable Gate Array
• FPGAs are a sea of generic logic and interconnect
• “Silicon Legos”: build them into exactly the right circuit for each task
• Special-purpose hardware (FPGAs) is faster and more efficient than general-purpose hardware (CPUs)
• Change the hardware anytime!
• 100 ms to 1 second reconfiguration time


[Diagram labels: DRAM, PCIe 0, PCIe 1, 40G Network]


ASICs FPGAs

Source: Bob Brodersen, Berkeley Wireless group


Software FPGA ASIC


ALL WITH FPGAs!


• Altera Stratix V GS D5
• 172k ALMs, 2,014 M20Ks, 1,590 DSPs
• 8GB DDR3-1333
• 32 MB Configuration Flash
• PCIe Gen 3 x8
• 8 lanes to Mini-SAS SFF-8088 connectors
• Powered by PCIe slot

[Board diagram labels: Stratix V, 8GB DDR3, PCIe Gen3 x8, 4x 20 Gbps Torus Network, Config Flash]


Data Center Server (1U, ½ width)


1,632 Servers with FPGAs Running Bing Page Ranking Service (~30,000 lines of C++)

More compute time for improving relevance

Reduced # of servers


200+ Cloud Services: Diversity
1+ billion customers · 20+ million businesses · 90+ markets worldwide

• 2.4+ million emails per day
• 5.8+ billion worldwide queries each month
• 1 in 4 enterprise customers
• 50+ billion minutes of connections handled each month
• 48+ million users in 41 markets
• 50+ million active users
• 400+ million active accounts
• 250+ million active users
• 8.6+ trillion objects in Microsoft Azure storage


Catapult v2 FPGA Card

[Diagram labels: CPU, FPGA, 4GB DDR, 40 Gb NIC, ToR, PCIe Gen 3 x8 links, 40G NIC and TOR]


[Diagram labels: CPU, NIC, PCIe Gen 3 (two links)]

Compute Acceleration


[Diagram labels: CPU, NIC, 40G Ethernet, PCIe Gen 3 (two links)]

Compute Acceleration
Network Acceleration
Hardware as a Service


[Figure: round-trip latency (us, 0 to 25) versus number of reachable hosts/FPGAs (log scale, 1 to 1,000,000). Series: LTL average latency and LTL 99.9th percentile for LTL L0 (same TOR), LTL L1, and LTL L2, with example L0 and L1 latency histograms and examples of L2 latency histograms for different pairs of FPGAs. Shown for comparison: 6x8 torus latency for the Catapult Gen1 torus, which can reach up to 48 FPGAs. Annotations: 10K, 100K, 250K.]
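The 48-FPGA limit of the Catapult Gen1 6x8 torus can be made concrete with a hop-count sketch. This is an illustration only: it assumes a plain 2D torus with wraparound links in both dimensions and shortest-path routing, and the names `torus_hops` and `max_hops` are hypothetical, not part of any Catapult software.

```python
from itertools import product


def torus_hops(a, b, rows=6, cols=8):
    """Shortest hop count between nodes a and b on a rows x cols 2D torus.

    Assumes wraparound links in both dimensions (an assumption; the slide
    only states a 6x8 torus).
    """
    dr = abs(a[0] - b[0])
    dc = abs(a[1] - b[1])
    # On a ring, going the other way around may be shorter.
    return min(dr, rows - dr) + min(dc, cols - dc)


def max_hops(rows=6, cols=8):
    """Largest shortest-path distance from node (0, 0) to any node."""
    return max(torus_hops((0, 0), n, rows, cols)
               for n in product(range(rows), range(cols)))


nodes = list(product(range(6), range(8)))
print(len(nodes))   # 48 nodes, matching the 48-FPGA limit on the slide
print(max_hops())   # 7 = 6 // 2 + 8 // 2
```

Under these assumptions no FPGA is more than 7 hops from any other, but the network still ends at 48 nodes, whereas the LTL tiers are plotted out to hundreds of thousands of reachable hosts.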


2013 2014 2015 2016


DNNs

Compression

Encryption


[Figure: normalized load and latency (1.0 to 7.0) over Day 1 to Day 5. Series: 99.9% software latency, 99.9% FPGA latency, average software load, average FPGA query load.]
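The 99.9% latency curves report tail latency rather than the mean. As a rough illustration of that metric, the sketch below computes a nearest-rank 99.9th percentile over synthetic samples. The data, the `percentile` helper, and the nearest-rank definition are all assumptions made for illustration; the slides do not say how the percentile was computed.

```python
import math
import random


def percentile(samples, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    rank = math.ceil(pct / 100.0 * len(ordered))  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Synthetic latencies (arbitrary units): mostly fast, with a rare slow tail.
random.seed(0)
latencies = [random.gauss(1.0, 0.1) for _ in range(9980)]
latencies += [random.uniform(5.0, 7.0) for _ in range(20)]

mean = sum(latencies) / len(latencies)
p999 = percentile(latencies, 99.9)
print(f"mean={mean:.2f}  p99.9={p999:.2f}")
```

In this synthetic data the mean stays close to the typical request while the 99.9th percentile lands in the slow tail, which is why a tail percentile is the more demanding number to track for an interactive service.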


• Xpress8 L2 (5.6GB/s): 30x throughput, 20% compression loss, in-line compression
• Xpress9 L6 (300MB/s): 4x throughput, no compression loss, short/mid-term data
• LZMA (5MB/s): 5x throughput, 5% compression loss, long-term storage

* Measured on Canterbury dataset
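The three operating points trade throughput against compression ratio. Xpress is not available in the Python standard library, so the sketch below substitutes `zlib` at a fast and a slow level plus `lzma`, purely as stand-ins, to show how such a trade-off could be measured in software. The sample data and the `measure` helper are hypothetical, and the numbers it prints say nothing about the FPGA results listed here.

```python
import lzma
import time
import zlib


def measure(name, compress, decompress, data):
    """Return (name, compression ratio, MB/s) for one codec, checking the round trip."""
    start = time.perf_counter()
    packed = compress(data)
    elapsed = time.perf_counter() - start
    assert decompress(packed) == data  # lossless round trip
    ratio = len(data) / len(packed)
    mb_per_s = len(data) / 1e6 / max(elapsed, 1e-9)
    return name, ratio, mb_per_s


# Hypothetical sample input: repetitive text, so every codec can shrink it.
data = b"the quick brown fox jumps over the lazy dog\n" * 20000

results = [
    measure("zlib level 1", lambda d: zlib.compress(d, 1), zlib.decompress, data),
    measure("zlib level 9", lambda d: zlib.compress(d, 9), zlib.decompress, data),
    measure("lzma", lzma.compress, lzma.decompress, data),
]
for name, ratio, mb_per_s in results:
    print(f"{name:13s} ratio={ratio:8.1f}  {mb_per_s:8.1f} MB/s")
```

Running the same harness over a standard corpus instead of the synthetic input would give numbers that are easier to compare across codecs.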


[Diagram labels: VM, Guest OS, Hypervisor, VFP, NIC, GFT/FPGA]


[Figure: normalized load and latency (1.0 to 7.0) over Day 1 to Day 5. Series: 99.9% software latency, 99.9% FPGA latency.]

And ruiner of accelerator dreams


http://research.microsoft.com/catapult
