Bringing 'Intelligence' to Enterprise Storage Drives

Page 1: Bringing 'Intelligence' to Enterprise Storage Drives

Bringing ‘Intelligence’ to Enterprise Storage Drives

Neil Werdmuller, Director Storage Solutions

Arm

Flash Memory Summit 2018, Santa Clara, CA

Page 2: Bringing 'Intelligence' to Enterprise Storage Drives

Who am I?

• 28 years' experience in embedded
• Lead the storage solutions team
• Work closely with the industry's top storage suppliers
• Previously in wireless at Texas Instruments
• BSc in Computer Science from Portsmouth University (UK)
• I enjoy brewing beer at home!


Page 3: Bringing 'Intelligence' to Enterprise Storage Drives

What will we cover today?

• What benefit does in-storage compute bring
• What is needed for in-storage compute
• Ecosystem support available
• Machine Learning in-storage


Page 4: Bringing 'Intelligence' to Enterprise Storage Drives


Arm computing is everywhere

• 21Bn Arm-based chips shipped in 2017
• #1 shipping processor in storage devices
• >5Bn people using Arm-based mobile phones
• 120Bn Arm-based chips to date

Page 5: Bringing 'Intelligence' to Enterprise Storage Drives

Why computation is moving to storage

Bandwidth • Reliability • Power • Security • Cost • Latency

Page 6: Bringing 'Intelligence' to Enterprise Storage Drives


Moving data to compute

1. Compute waits for data
• Takes time to move data across the fabric
2. Adds latency
• Multiple layers of interfaces and protocols
• Data is copied many times
• Bottlenecks often exist
3. Consumes bandwidth and power
• Moving data is expensive
• Data copies increase system DRAM use

Flow: 1. Request data from storage → 2. Move data to compute → 3. Compute → 4. Move results to storage

Page 7: Bringing 'Intelligence' to Enterprise Storage Drives


In-storage compute

1. Compute happens on the data
• Moved from flash to in-drive DRAM and processed
2. Lowest possible latency
• No additional protocols – just flash to DRAM
3. Minimum bandwidth and power
• Data remains on the drive – only results are delivered (see the sketch below)
4. Data-centric processing
• Workloads specific to the computation are deployed to the drive
5. Security
• Unencrypted data does not leave the drive

Flow: 1. Request operation → 2. Compute → 3. Return result
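A rough, entirely hypothetical Python sketch of the difference between this flow and the one on the previous slide: the host either pulls every block across the interface and filters locally, or asks the drive to run the filter next to the data and return only the matching records. The Drive class, its read_block/execute_filter methods, the block/record sizes and the byte counters are illustrative assumptions, not a real drive API.

```python
# Hypothetical sketch contrasting host-side filtering with in-storage filtering.
# The Drive class and its methods are illustrative assumptions, not a real drive API.

BLOCK_SIZE = 4096          # assumed bytes per logical block
RESULT_ROW = 64            # assumed bytes per returned result record

class Drive:
    """Toy model of a drive holding blocks of records."""
    def __init__(self, blocks):
        self.blocks = blocks               # list of blocks, each a list of records
        self.bytes_over_interface = 0      # traffic crossing PCIe/SATA/SAS

    def read_block(self, lba):
        # Host-side path: every block crosses the host interface.
        self.bytes_over_interface += BLOCK_SIZE
        return self.blocks[lba]

    def execute_filter(self, predicate):
        # In-storage path: compute runs next to the flash; only results move.
        hits = [r for blk in self.blocks for r in blk if predicate(r)]
        self.bytes_over_interface += len(hits) * RESULT_ROW
        return hits

def filter_on_host(drive, predicate):
    """Pull every block to the host, then filter locally."""
    hits = []
    for lba in range(len(drive.blocks)):
        hits.extend(r for r in drive.read_block(lba) if predicate(r))
    return hits

if __name__ == "__main__":
    blocks = [[{"id": b * 100 + i, "temp": (b + i) % 50} for i in range(100)]
              for b in range(1000)]
    hot = lambda rec: rec["temp"] > 45

    host_side, in_storage = Drive(blocks), Drive(blocks)
    assert filter_on_host(host_side, hot) == in_storage.execute_filter(hot)
    print("host-side bytes moved: ", host_side.bytes_over_interface)
    print("in-storage bytes moved:", in_storage.bytes_over_interface)
```

Running the toy example, the host-side path moves every block across the interface while the in-storage path moves only the small result set, which is the bandwidth and power argument in a nutshell.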

Page 8: Bringing 'Intelligence' to Enterprise Storage Drives

Compute in SSD controllers

Compute:
• Frontend: host interface + Flash Translation Layer (Cortex-R or Cortex-A series)
• Backend: flash management (Cortex-R or Cortex-M series)
• Accelerators: encryption, LDPC,… Arm NEON, ML, FPGA…

Memory: DRAM, ~1GB for each 1TB of flash
Storage: 256GB to 64TB of flash
Interfaces: PCIe/SATA/SAS…


(SSD SoC block diagram: frontend processor(s) for host control over PCIe/SATA/SAS, NVMe command queues, the flash translation layer, encryption, compression and cache/buffer management; DRAM holding the cache and FTL tables; backend processors for flash system management, block management (wear levelling, bad block mapping), host/flash address translation, garbage collection and ECC/LDPC error correction/detection with read scrubbing; memory controllers driving the NAND flash.)
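The "~1GB of DRAM for each 1TB of flash" rule of thumb above is driven largely by the flash translation layer's mapping tables; a quick back-of-envelope check, assuming the common (but not universal) choice of 4 KiB logical pages and 4-byte map entries:

```python
# Back-of-envelope FTL table sizing (assumes 4 KiB logical pages and
# 4-byte logical-to-physical map entries, which are common but not universal).
flash_capacity = 1 << 40            # 1 TiB of flash
page_size      = 4 << 10            # 4 KiB logical page
entry_size     = 4                  # bytes per map entry

entries   = flash_capacity // page_size     # ~268 million pages
ftl_bytes = entries * entry_size            # ~1 GiB of mapping table

print(f"{entries:,} map entries -> {ftl_bytes / (1 << 30):.2f} GiB of DRAM "
      "for the FTL tables alone, before cache and buffers")
```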

Page 9: Bringing 'Intelligence' to Enterprise Storage Drives

What is needed for in-storage compute?

Application processor to run a high-level OS (HLOS)
• Runs a high-level OS through a memory management unit
• Linux for open-source software stacks
• All major Linux distributions run on Arm
• Networking protocol stacks: Ethernet, TCP/IP, RDMA…
• Linux workloads: NVMe-oF, databases, file systems, SDS, custom applications,…
• Containerization for workload deployment and portability

Accelerators for specific workloads or for Machine Learning (ML)
• Potentially combined with additional accelerators: custom hardware, ML, FPGA, GPU, DSP…

Custom workloads can be run without an applications processor, but this is complex to develop and deploy.

Page 10: Bringing 'Intelligence' to Enterprise Storage Drives

In-storage compute evolution

Separate Cortex-A series processor
• Enables any SSD (or HDD) to run Linux
• Wide performance range, from Cortex-A5 to Cortex-A76

Single SoC for cost/latency reduction
• Lower latency by removing the internal (PCIe) interface
• Separation of the applications processor and the SSD processing
• Shared DRAM and other SoC resources

Combined frontend/applications processor
• Hypervisor separates the SSD frontend from Linux
• Lowest cost and tightest integration
• Lowest possible latency
• Highest internal bandwidth

(Diagrams: three architectures, each with a network interface, DRAM, NV memory and backend processors. 1: a separate applications processor attached to the SSD's frontend controller processor over an internal PCIe interface; 2: applications processor and frontend controller processor on a single SoC sharing DRAM; 3: a combined applications and frontend processor.)

Page 11: Bringing 'Intelligence' to Enterprise Storage Drives

The benefits of in-storage compute

Scalability of compute
• From a single, low-power core to multiple clusters of high-performance cores

Flexibility
• One SSD SoC that is suitable for in-storage compute, Edge SSD, NVMe-oF,…

Security
• TrustZone isolates Linux from the SSD functionality
• Processing of data is all done on the drive
• Decrypted data remains on the drive

(Diagrams: the single-SoC and combined applications/frontend processor architectures from the previous slide.)

Page 12: Bringing 'Intelligence' to Enterprise Storage Drives


In-storage compute

1. Compute happens on the data
• Moved from flash to in-drive DRAM and processed
2. Lowest possible latency
• No additional protocols – just flash to DRAM
3. Minimum bandwidth and power
• Data remains on the drive – only results are delivered
4. Data-centric processing
• Workloads specific to the computation are deployed to the drive
5. Security
• Unencrypted data does not leave the drive

Flow: 1. Request operation → 2. Compute → 3. Return result

Page 13: Bringing 'Intelligence' to Enterprise Storage Drives

Linux ecosystem on Arm

www.linaro.org

Page 14: Bringing 'Intelligence' to Enterprise Storage Drives


A few ‘Works on Arm’ partners

www.worksonarm.com

Page 15: Bringing 'Intelligence' to Enterprise Storage Drives

Flexible, Scalable ML Solutions

(Chart: ML performance (ops/second) versus ML capabilities, spanning Cortex-M/A CPUs, Mali GPUs and Arm NPUs up to typical data center ML hardware, across use cases from keyword detection, pattern training, voice & image recognition, smart cameras and image enhancement through to autonomous driving.)

Deliver use cases with multiple hardware solutions; choose the best balance of ML performance versus capabilities for each use case.

Only Arm can enable ML everywhere

Page 16: Bringing 'Intelligence' to Enterprise Storage Drives

Machine Learning 'training'

(Diagram: training data is fed into a neural network to produce the model.)

For each piece of data used to train the model, millions of model parameters are adjusted. The process is repeated many times until the model delivers satisfactory performance.
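A minimal sketch of that training loop, assuming a tiny two-layer NumPy network on synthetic data (the layer sizes, learning rate and epoch count are arbitrary illustrative choices, and real models have millions of parameters rather than a few hundred):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic training data: 256 samples, 8 features, binary labels.
X = rng.normal(size=(256, 8))
y = (X.sum(axis=1) > 0).astype(float).reshape(-1, 1)

# Model parameters: a tiny two-layer network.
W1, b1 = rng.normal(scale=0.1, size=(8, 16)), np.zeros(16)
W2, b2 = rng.normal(scale=0.1, size=(16, 1)), np.zeros(1)
lr = 0.1

for epoch in range(200):                      # the process is repeated many times
    # Forward pass.
    h = np.maximum(X @ W1 + b1, 0)            # ReLU hidden layer
    p = 1 / (1 + np.exp(-(h @ W2 + b2)))      # sigmoid output
    # Backward pass: every parameter is adjusted a little for this data.
    grad_out = (p - y) / len(X)
    grad_W2 = h.T @ grad_out
    grad_b2 = grad_out.sum(axis=0)
    grad_h = grad_out @ W2.T * (h > 0)
    grad_W1 = X.T @ grad_h
    grad_b1 = grad_h.sum(axis=0)
    W2 -= lr * grad_W2; b2 -= lr * grad_b2
    W1 -= lr * grad_W1; b1 -= lr * grad_b1

print("final training loss:",
      float(-(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9)).mean()))
```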

Page 17: Bringing 'Intelligence' to Enterprise Storage Drives

Machine Learning 'inference'

(Diagram: a new input passes through the neural network of the trained model and an output is produced, with example confidences of 96.4% and 97.4%.)

When new data is presented to the trained model, large numbers of multiply-add operations are performed using the new data and the model parameters. The process is performed once per input, rather than being repeated as in training.
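A matching sketch of inference: a single forward pass per new input, dominated by multiply-accumulate operations on the stored parameters. The random stand-in weights here are purely illustrative; in practice they would be the parameters produced by training.

```python
import numpy as np

def infer(x, params):
    """One forward pass: a handful of matrix multiply-adds plus activations."""
    W1, b1, W2, b2 = params
    h = np.maximum(x @ W1 + b1, 0)            # multiply-add, then ReLU
    p = 1 / (1 + np.exp(-(h @ W2 + b2)))      # multiply-add, then sigmoid
    return p

# 'params' would normally be the trained weights; random stand-ins for illustration.
rng = np.random.default_rng(1)
params = (rng.normal(size=(8, 16)), np.zeros(16),
          rng.normal(size=(16, 1)), np.zeros(1))

new_sample = rng.normal(size=(1, 8))          # previously unseen data
confidence = float(infer(new_sample, params)[0, 0])
print(f"class 1 with {confidence:.1%} confidence")
```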

Page 18: Bringing 'Intelligence' to Enterprise Storage Drives


Project Trillium: Arm’s ML computing platform

Page 19: Bringing 'Intelligence' to Enterprise Storage Drives


Optimized low-level functions for CPU and GPU

• Most popular CV and ML functions

• Supports common ML frameworks

• Over 80 functions in all

• Quarterly releases

• CMSIS-NN separately targets Cortex-M

Enable faster deployment of CV and ML

• Targeting CPU (NEON) and GPU (OpenCL)

• Significant performance uplift compared to OSS alternatives (up to 15x)

Key Function Categories

Neural network

Convolutions

Colour manipulation

Feature detection

Basic arithmetic

GEMM

Pyramids

Filters

Image reshaping

Mathematical functions

Publicly available now (no fee, MIT license): developer.arm.com/technologies/compute-library


Arm Compute Library

(Software stack: Application → TensorFlow / Caffe etc. → Arm NN → Compute Library → Cortex-A, Mali, or the Arm ML Processor.)
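The Compute Library itself is C++ and not shown here, but a rough illustration of why hand-optimized GEMM routines matter: the sketch below times a naive pure-Python matrix multiply against NumPy's BLAS-backed one. The exact ratio depends on the machine and is unrelated to the 15x figure quoted above.

```python
import time
import numpy as np

def naive_gemm(a, b):
    """Triple-loop general matrix multiply, with no blocking or vectorization."""
    n, k, m = len(a), len(b), len(b[0])
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            out[i][j] = acc
    return out

size = 128
a = np.random.rand(size, size)
b = np.random.rand(size, size)

t0 = time.perf_counter()
naive = naive_gemm(a.tolist(), b.tolist())
t1 = time.perf_counter()
fast = a @ b                                   # optimized BLAS kernel under the hood
t2 = time.perf_counter()

assert np.allclose(naive, fast)
print(f"naive GEMM: {t1 - t0:.3f}s  optimized GEMM: {t2 - t1:.5f}s")
```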

Page 20: Bringing 'Intelligence' to Enterprise Storage Drives

Enterprise SSD already has considerable compute performance
• Cortex-A series already adopted by some Arm partners

In-storage compute delivers at low cost and low power, with the lowest latency

Machine Learning use cases are growing rapidly

In-storage compute and Edge SSD open up many possibilities
• Please download this presentation: COMP-301-1, "Bringing Intelligence to Enterprise Storage Drives"
• If you missed my first talk on Tuesday, please download that presentation too: ARCH-102-1, "Transforming an SSD into a Cost-Effective Edge Server"


Bringing Intelligence to SSDs

Page 21: Bringing 'Intelligence' to Enterprise Storage Drives

For more information, visit storage.arm.com.

[email protected]/nwerdmuller


To learn more…

Page 22: Bringing 'Intelligence' to Enterprise Storage Drives

© 2018 Arm Limited

Thank You! Danke! Merci! Gracias! Kiitos! 감사합니다! धन्यवाद!

Page 23: Bringing 'Intelligence' to Enterprise Storage Drives


The Arm trademarks featured in this presentation are registered trademarks or trademarks of Arm Limited (or its subsidiaries) in the US and/or elsewhere. All rights reserved. All other marks featured may be trademarks of their respective owners.

www.arm.com/company/policies/trademarks