
Enterprise Data Trends for Hybrid Multi-clouds


Page 1: Enterprise Data Trends for Hybrid Multi-clouds

Enterprise Data Trends for Hybrid Multi-clouds
The semiconductor crisis, data growth, and new opportunities with cloud, AI, and models

© 2020 NetApp, Inc. All rights reserved.

Naresh Patel, Vice President & Chief Architect, NetApp

November 3, 2020

Page 2: Enterprise Data Trends for Hybrid Multi-clouds


Overview

[Diagram: the talk's framing, Layers × Lenses. Layers: Tech Trends, Global Data, Servers/Storage, CPU complex. Lenses: Process / Move / Store, the Scale Cube, and the semiconductor CRISIS]

Page 3: Enterprise Data Trends for Hybrid Multi-clouds


Lenses
• Data-focused interplay of the three things you can do with data: process it, move it, store it
• AI-infused decision making using real-time data vs. HW/SW designers making design-time choices
• Opportunities for performance/energy models and metrics

Layers
• TECHNOLOGY TRENDS
• GLOBAL DATA / STORAGE SCALING AND PROPERTIES
• SERVER AND STORAGE SCALING
• CPU / MEMORY COMPLEX

Page 4: Enterprise Data Trends for Hybrid Multi-clouds

How does data get processed, moved, and stored at the component level?

[Diagram: Process / Move / Store triangle]

New data can be processed, moved, and stored. Semiconductors play a key role in each.


• Store: semiconductor, magnetic, optical
• Move: semiconductor, copper / optical
• Process: semiconductor

Edge to core to multi-cloud

Multiple servers and storage

CPU-memory complex

Fundamental building blocks

Page 5: Enterprise Data Trends for Hybrid Multi-clouds


Page 6: Enterprise Data Trends for Hybrid Multi-clouds

[Chart: transistor density (millions per mm²) vs. year, 2012–2022, scale 0–250]


• Moore’s Law “demands”:
  1. 2X transistors per wafer every 2 years
  2. Cost per million transistors goes down ~50%
  3. Less power from smaller transistors

• But costs and power are not coming down as fast with process shrinks:
  • Wafer fabrication costs are higher
  • Higher resistance of thinner traces reduces the power benefits
  • Chips ship with “dark” silicon to handle physical limits

Cost per transistor is no longer going down in leading-edge semiconductor processes

Microprocessor Cost Trends

[Chart: relative cost per transistor vs. year (2005–2020, log scale). The historical trend $\propto 1000 \cdot 2^{-t/2}$ (a >50% p.a. drop) has flattened to a <10% p.a. drop at the leading edge; annotations: Apple A14 @ 5nm, 3nm? Source: Linley Group]

Page 7: Enterprise Data Trends for Hybrid Multi-clouds


• Wright’s Law: for every doubling of cumulative units produced, the cost to manufacture one unit falls by a constant percentage (a factor of $2^{-b}$).

Wright’s Law – or “Experience” Power Law
As Moore’s Law slows down, Wright’s Law may provide a better fit

$f(x) = x^{-b} \;\Rightarrow\; f(2x) = 2^{-b}\,f(x)$

[Charts: relative cost per unit vs. years, with curves $\propto 1/t^{b}$ if unit production grows linearly and $\propto 1/2^{t/2}$ if unit production grows exponentially; and relative cost per unit vs. cumulative number of units produced]

• Unit costs depend on the unit production curve over time, $u(t)$
  • When unit production grows exponentially over time, Wright’s Law gives the same unit-cost trajectory over time as Moore’s Law
  • When unit production grows linearly, unit cost follows a power law over time

• Implications
  • Chip designs leveraging existing process nodes
  • Creates opportunities for scaling via new computer architectures
  • New breakthrough?
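To make the linear-vs-exponential contrast concrete, here is a minimal Python sketch of Wright's Law under the two production curves; the learning exponent b and the production curves are illustrative assumptions, not numbers from the talk.

```python
import numpy as np

# Wright's Law: unit cost falls by a constant factor 2**(-b) per
# doubling of cumulative production, i.e. cost(u) ~ u**(-b).
b = 0.4                        # illustrative learning exponent (assumed)
t = np.linspace(0, 12, 121)    # years

u_linear = 1 + t               # cumulative units growing linearly
u_expo = 2 ** (t / 2)          # cumulative units doubling every 2 years

def relative_cost(u):
    """Unit cost under Wright's Law, normalized to 1 at t = 0."""
    return (u / u[0]) ** (-b)

# Exponential production makes Wright's Law reproduce Moore-style
# exponential cost decay (a constant % drop per year); linear
# production yields only a power-law decline, ~1/t**b at large t.
print("exponential production, year 12:", relative_cost(u_expo)[-1])
print("linear production,      year 12:", relative_cost(u_linear)[-1])
```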

Page 8: Enterprise Data Trends for Hybrid Multi-clouds

Emerging CPU and Memory Trends
General-purpose CPUs are being augmented for speed and efficiency

Moore’s Law scaling slows
• Manufacturing improvements continue
• Better packaging: chiplet approach to overcome single monolithic die challenges for 3D scaling beyond 5nm
• Novel domain-specific architectures, esp. AI

General-purpose CPU becoming the system bottleneck
• Composable architectures (including disaggregated resources) via SW APIs
• Scale-out architectures for public and private clouds
• Scale-up with HW accelerators and domain-specific architectures: GPUs, AI/ML, …

Persistent memory
• Memory-based computing provides synchronous access for simpler, more efficient programming
• Multi-TB scale

Processing inside memory


Page 9: Enterprise Data Trends for Hybrid Multi-clouds

Emerging storage media trends
Ongoing Flash $/GB reduction by adding layers, but a new breakthrough is needed to reduce cost per Flash cell


New approaches for differentiated workloads
• Bring compute closer to media to take advantage of high internal BW
• Wider adoption of AI/ML techniques for large immutable data sets

Cost/GB reduction
• Flash cost reduction via stacking layers continues
• Cost reduction from QLC, with an endurance trade-off
• Need breakthrough media to challenge NL-SAS HDDs in $/GB

Low-latency media (better latency than TLC NAND Flash)
• As a read cache
• As a “write absorber”
• As a network-attached replacement for server-side SSDs

Page 10: Enterprise Data Trends for Hybrid Multi-clouds

Data Access in the Memory / Storage Hierarchy
Storage hierarchy choices are getting crowded


Source: P. Faraboschi, “The Data Access Continuum”, SC’19 MCHPC

[Figure: the data-access continuum across the memory/storage hierarchy, from CPU memory down to tape]

Page 11: Enterprise Data Trends for Hybrid Multi-clouds

Emerging Networking Trends
Network scaling continues its upward trend


Link speeds continue upward trend beyond 100 GbE
• Switch ports going to 400 GbE, using 4 lanes of 100G SerDes
• Public clouds creating faster transitions to the next generation
• Next step is 800 GbE

Smart NICs with processing capability
• Offer HW offload capability and relatively easy programmability
• Direct placement benefits of RDMA on client and storage sides
• Security features

NVMe-oF for storage connectivity, scaling well to 1000s of nodes
• RoCEv2, TCP/IP for the most robust transport layers

Page 12: Enterprise Data Trends for Hybrid Multi-clouds

How data gets processed, moved, and stored across the Edge, Core, and Multi-Cloud

[Diagram: Process / Move / Store triangle]

How might we find the “best home” for a workload across Edge, Core, and Multi-cloud?


• Store: Most data created at the edge (endpoints, cloudlets); core for sovereignty; increased cloud storage for consumers
• Move: Data fabric to allow movement across edge, core, and multi-cloud; move the right data, to the right place, at the right time!
• Process: Apps at the edge for real time; apps in the core DC for security/compliance; apps in the cloud for scale and cloud native; containers and serverless FaaS

Edge to core to multi-cloud

Multiple servers and storage

CPU-memory complex

Fundamental building blocks

Page 13: Enterprise Data Trends for Hybrid Multi-clouds


Global Data Creation and Installed Storage
Data growth continues in the amount created and the amount stored

[Chart: all data created per year, 2008–2024, vs. data stored in the installed base by media (HDD, SSD, NVM, Optical, Tape). The “Data Arrival Rate” (creation) keeps increasing; the “Queue Length” is the data held in the installed base.]
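The arrival-rate / queue-length framing is Little's Law; a hedged back-of-envelope reading (the figures below are illustrative, not taken from the chart):

```latex
L = \lambda W
\qquad\text{e.g.}\qquad
\lambda \approx 2~\mathrm{ZB/yr}\ \text{stored},\quad
W \approx 5~\mathrm{yr}\ \text{retention}
\;\Longrightarrow\;
L \approx 10~\mathrm{ZB}\ \text{installed base}
```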

Page 14: Enterprise Data Trends for Hybrid Multi-clouds


Post-processing the evolution of drives provides useful insights across the population. Real-time analysis of the drive population’s collective lifetime of experience enables better prediction and decision making.

Example: Evolution of “data reads per day” on a set of drives

[Charts: two panels of SSD Drive Reads Per Day (y: 0 to 4 and 0 to 4.5) vs. Power On Years for Drive (x: 0.25 to 4.5 and 0.25 to 5.5), one curve per drive group, 18 groups each]
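As a sketch of the kind of post-processing behind such charts, assuming a hypothetical telemetry table (the file and column names below are invented for illustration, not NetApp's actual schema):

```python
import pandas as pd

# Hypothetical per-drive, per-day telemetry: drive_id, poh_hours
# (cumulative power-on hours), bytes_read_day, capacity_bytes.
df = pd.read_csv("drive_telemetry.csv")

# "Drive reads per day" = fraction of the full drive read per day.
df["reads_per_day"] = df["bytes_read_day"] / df["capacity_bytes"]
# Bucket each sample by drive age, in quarter-year steps as in the chart.
df["poh_years"] = df["poh_hours"] / (24 * 365)
df["age_bucket"] = (df["poh_years"] * 4).round() / 4

# Evolution across the population: per-drive mean in each age bucket,
# then the median over drives, tracing the curve as drives age.
evolution = (df.groupby(["drive_id", "age_bucket"])["reads_per_day"]
               .mean()
               .groupby("age_bucket")
               .median())
print(evolution.head())
```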

Page 15: Enterprise Data Trends for Hybrid Multi-clouds

How does data get processed, moved, and stored across server nodes?

[Diagram: Process / Move / Store triangle]

How might we find the right balance of processing, movement, and storage for a given workload?


• Store: Rapid shift to SSDs for high IOPS; unlock internal memory/SSD BW
• Move: Data movement getting faster (PCIe Gen4/5 or Ethernet); less movement saves energy
• Process: Bottleneck for many workloads; process in memory / storage / network; distribute processing across servers

Edge to core to multi-cloud

Multiple servers and storage

CPU-memory complex

Fundamental building blocks

Page 16: Enterprise Data Trends for Hybrid Multi-clouds

What Type of Processing is being Offloaded from the x86 CPU?
Pensando Distributed Services Card: 100 GbE networking and easily programmable offloads, but with ASIC-like speeds


[Diagram: x86 CPU storage controller (processes executing x86 instructions) sharing memory with the Pensando PCIe card (ARM cores, P4 engines, HW engines, 100 GbE); bulk data and transforms, with data integrity, run on the card]

Page 17: Enterprise Data Trends for Hybrid Multi-clouds


How does the Processing Offload to Pensando help Performance?
Moving bulk data and transforms from main memory to the Pensando card reduces CPU usage and latency


[Diagram: storage CPU cores with LLC and main memory, attached via PCIe to the Pensando card and its 100 GbE network link; bulk data and transforms move to the card instead of through main memory]

Offloading bulk data transforms:
• Saves CPU cycles directly
• Reduces memory BW
• Helps maintain consistent latency

[Chart: relative latency vs. load for k = 1, 2, 3, marking the half-latency knee; higher variability means higher latency at a given load]
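One common reading of that knee, using the textbook M/M/1 formula as an illustration (not NetApp's measured curves):

```python
import numpy as np

rho = np.linspace(0.05, 0.95, 181)   # offered load (utilization)
latency = 1.0 / (1.0 - rho)          # M/M/1: response time / service time

# Relative latency doubles its unloaded value at rho = 0.5; that is one
# way to read a "half-latency knee". With more service-time variability
# (M/G/1), the same doubling happens at lower load, which is why shaving
# variability via offload buys usable headroom.
knee = rho[np.argmin(np.abs(latency - 2.0))]
print(f"latency doubles near rho = {knee:.2f}")
```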

Page 18: Enterprise Data Trends for Hybrid Multi-clouds

Y-axis, Vertical Scaling: Offload “Bulk Data & Transforms” for Storage

• Storage efficiency
  • Compression, compaction
  • Deduplication / hash
• Data protection
  • Checksums
  • RAID operations
• Security computation
  • Encrypt data at rest
  • Encrypt data in motion
• Networking
  • NIC, RDMA


“Data touch” / CPU-intensive functions move to specialized HW on the “fabric”
⇒ Pensando’s card with P4+ engines, HW engines, and ARM cores

Benefits compound via ONTAP software:
⇒ Primary data snapshots, clones on box
⇒ Preserve storage efficiency for secondary data services
⇒ Tiering data to public clouds

[Diagram: y-axis split of monolith D into components c1–c4]

Page 19: Enterprise Data Trends for Hybrid Multi-clouds

Z-Axis, Scale-out for High-Tech Industry Applications

FlexGroup on A400 with Pensando for clustering: a scalable, high-performance data container

• Linear scale for performance and capacity
• Operational simplicity: single mount point with automated load and space distribution
• Consistent high performance: predictable, consistent low latency; all-flash containers
• High resiliency: NetApp® ONTAP® nondisruptive operations


Single namespace: apps for electronic design automation, high-tech, oil and gas, media and entertainment

[Diagram: z-axis split into shards d1–d4, each with its own controller C]

Page 20: Enterprise Data Trends for Hybrid Multi-clouds

X-axis, Horizontal Scaling: Accelerate Your Global Namespace
FlexCache volumes distribute hot data over Pensando cluster ports


NetApp® FlexCache® volumes
• Sparsely populated volumes that can be cached on the same cluster as the origin volumes, or on a different cluster, to accelerate data access

Performance acceleration for hot volumes
• Cache reads and metadata for CPU-intensive workloads
• Provides different mount points to avoid hot volumes
• Cache data within the cluster (intra-cluster)

Cross-data-center data distribution
• Cache across multiple data centers to reduce WAN latencies
• Bring data closer to compute and users
• Between NetApp AFF, FAS, or ONTAP Select systems


[Diagram: x-axis scaling, cloned data/controller (D/C) pairs]

Page 21: Enterprise Data Trends for Hybrid Multi-clouds

Scale in All 3 Dimensions – System & Data
Scalability = ability to handle a growing amount of work (CPU and data)


• One system & data: monolithic architecture (no splits)

• x-axis, Horizontal Technical Duplication: scale by cloning pieces of data
  • Technical architecture layering
  • Many nodes, each a clone, load balanced
  • Data reads from replicas; writes to one node

• y-axis, Functional Decomposition: scale by splitting into different things
  • Components: modules/microservices
  • Split by function, service, data affinity

• z-axis, Data Partitioning: scale by splitting into similar contexts / shards
  • Split by users, large modules, hash, ...
  • Near-infinite scale

Source: the Scale Cube, http://theartofscalability.com/

[Diagram: the scale cube, with a monolithic D/C pair at the origin, x-axis clones (replicated D/C pairs), y-axis functional components c1–c4, and z-axis shards d1–d4]
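As a toy illustration of the z-axis split (hash-based sharding; the function and shard count below are hypothetical, not a NetApp API):

```python
import hashlib

NUM_SHARDS = 4  # d1..d4 in the slide's z-axis picture

def shard_for(user_id: str, shards: int = NUM_SHARDS) -> int:
    """Route each user to one shard with a stable hash, so requests
    for the same context always land on the same partition."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shards

# Same user always maps to the same shard; users spread evenly.
for user in ("alice", "bob", "carol"):
    print(user, "->", "d%d" % (shard_for(user) + 1))
```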

Page 22: Enterprise Data Trends for Hybrid Multi-clouds

How data gets processed, moved, and stored inside the CPU-Memory Complex

[Diagram: Process / Move / Store triangle]

How might we get more performance and energy efficiency out of the CPU and memory?


• Store: Memory bound by latency; persistent memory for capacity; CXL for large memory spaces
• Move: Need an energy focus; data hops are “hidden”; perf and energy metrics
• Process: Bottleneck for many workloads; decentralize processing; computational memory

Edge to core to multi-cloud

Multiple servers and storage

CPU-memory complex

Fundamental building blocks

Page 23: Enterprise Data Trends for Hybrid Multi-clouds


• Many data center workloads have low instructions per cycle (IPC)
  • Back-end bound: waiting to fill caches / memory latency
  • Dependent data accesses: serial memory accesses

• Characteristics of data-intensive workloads
  • Request and response over the network
  • Simultaneously access many contexts (fan-in)
  • Areas of “tax”: protocol buffers, RPCs, memory moves, compression, memory allocation, hashing

• Memory latencies going higher with bigger sockets
  • Move these “data intensive” functions away from the CPU to Ethernet fabric-attached offload HW
  • Move processing into memory for better latency and energy efficiency
  • Symmetric multi-threaded cores

“100% CPU utilization” hides the underlying bottlenecks: the cache hierarchy and memory latency. The CPU pipeline is stalled for >80% of the time for many workloads!

Source: S. Kanev et al., Profiling a warehouse-scale computer, ISCA’15
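A back-of-envelope model of why dependent (serial) accesses crater IPC; all numbers below are illustrative assumptions, not measurements:

```python
# Assumed, illustrative machine parameters.
MEM_LATENCY_CYCLES = 300   # load miss to DRAM on a large socket
ISSUE_WIDTH = 4            # peak instructions per cycle
INSTRS_PER_MISS = 50       # useful instructions between misses

def approx_ipc(mlp: float) -> float:
    """IPC when `mlp` misses overlap: each group of INSTRS_PER_MISS
    instructions effectively pays MEM_LATENCY_CYCLES / mlp cycles."""
    cycles_per_instr = MEM_LATENCY_CYCLES / (mlp * INSTRS_PER_MISS)
    return min(ISSUE_WIDTH, 1.0 / cycles_per_instr)

print("independent accesses (MLP 10): IPC ~", round(approx_ipc(10), 2))
print("dependent accesses   (MLP 1):  IPC ~", round(approx_ipc(1), 2))
```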

Page 24: Enterprise Data Trends for Hybrid Multi-clouds

MB/G/1 Energy-Performance Queue with Batched Arrivals


[Diagram: queue with batched arrivals, service time, and response time (latency) labeled]

• Random arrivals with rate λ of batched requests; specified batch-size distribution (Pr(1), Pr(2), Pr(3), …)
• Specify n general service-time distributions: a different one for each queue depth (1, 2, 3, …, n), and the same service-time distribution for queue lengths n, n+1, …
• System powers down when the queue is empty (power-down time distribution)
• System powers up on arrival of a new item to an empty queue (power-up time distribution)

MODEL INPUT: arrival rate, 1 discrete distribution, n+2 continuous distributions (n service times, power-up, and power-down times), and power usage for ON and OFF.
MODEL OUTPUT: response-time distribution (mean, variance, and higher moments), energy usage.

Example: EP queue + reinforcement learning to make power-usage choices (and service-time choices) under energy constraints
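A minimal discrete-event sketch of such an energy-performance queue, simplified to a single service-time distribution and ignoring the power-down transition; everything here is an assumption for illustration, not the talk's full MB/G/1 model:

```python
import random, statistics

def simulate_ep_queue(lam, batch_pmf, service, powerup,
                      p_on=10.0, p_off=1.0, n_batches=100_000, seed=1):
    """FCFS queue with Poisson batch arrivals (rate lam), a general
    service-time sampler, and a power-up delay paid whenever a batch
    finds the system empty (i.e., powered down)."""
    rng = random.Random(seed)
    sizes, weights = zip(*batch_pmf.items())
    now = 0.0      # arrival clock
    free = 0.0     # time the server finishes all queued work
    busy = 0.0     # accumulated serving (ON) time
    resp = []
    for _ in range(n_batches):
        now += rng.expovariate(lam)
        if now >= free:                    # empty queue: pay power-up
            free = now + powerup(rng)
        for _ in range(rng.choices(sizes, weights)[0]):
            s = service(rng)
            free += s
            busy += s
            resp.append(free - now)        # per-request response time
    energy_rate = (p_on * busy + p_off * (now - busy)) / now
    return statistics.mean(resp), energy_rate

# Example: exponential service, 30% of batches carry two requests.
mean_latency, watts = simulate_ep_queue(
    lam=0.5, batch_pmf={1: 0.7, 2: 0.3},
    service=lambda r: r.expovariate(1.2),
    powerup=lambda r: 0.5)
print(f"mean response time {mean_latency:.2f}, avg power {watts:.2f}")
```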

Page 25: Enterprise Data Trends for Hybrid Multi-clouds


Summary

[Diagram: the opening Layers × Lenses framing revisited. Layers: Tech Trends, Global Data, Servers/Storage, CPU complex. Lenses: Process / Move / Store, the Scale Cube, CRISIS; now annotated with AI-infusion, Renaissance, and Performance & Energy metrics]