23
2019 Storage Developer Conference. © Andrei Konan. All Rights Reserved. 1 Origin of Million IOPs throughput and microsecond latencies in NVMe Enterprise SSDs Andrei Konan Enterprise SSD Firmware Engineer

Origin of Million IOPs throughput and microsecond latencies ......(*) – 8KB physical page * 4 planes * 3 pages per one- shot, (**) – single 4KB Host Read will occupy entire Die

  • Upload
    others

  • View
    6

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Origin of Million IOPs throughput and microsecond latencies ......(*) – 8KB physical page * 4 planes * 3 pages per one- shot, (**) – single 4KB Host Read will occupy entire Die

2019 Storage Developer Conference. © Andrei Konan. All Rights Reserved. 1

Origin of Million IOPs throughputand microsecond latenciesin NVMe Enterprise SSDs

Andrei KonanEnterprise SSD Firmware Engineer

Page 2: Origin of Million IOPs throughput and microsecond latencies ......(*) – 8KB physical page * 4 planes * 3 pages per one- shot, (**) – single 4KB Host Read will occupy entire Die

2019 Storage Developer Conference. © Andrei Konan. All Rights Reserved. 2

Topics to cover Introduction SSD Generic Architecture

NAND Flash basics Over-provisioning Read and Write Operation Flow

Workloads definition Bottlenecks Affecting Enterprise features Performance calculator

Page 3: Origin of Million IOPs throughput and microsecond latencies ......(*) – 8KB physical page * 4 planes * 3 pages per one- shot, (**) – single 4KB Host Read will occupy entire Die

2019 Storage Developer Conference. © Andrei Konan. All Rights Reserved. 3

Abstract

Creation of new SSD product is driven by two main aspects: cost reduction and performance increase. Presentation will cover second major aspect of new SSD development – Performance and Quality-of-Service (QoS) improvements. Having SSD cost at the same level as competitors leaves only single opportunity to win the business – better throughput, consistency and latency. Nowadays Enterprise SSDs have broken Million IOPs throughput level and advertising microsecond latencies for some workloads.

Full text paper can be found at the following link https://www.linkedin.com/posts/andrejkonan_origin-of-million-iops-throughput-for-nvme-activity-6582879656997859328-Y4aO

Page 4: Origin of Million IOPs throughput and microsecond latencies ......(*) – 8KB physical page * 4 planes * 3 pages per one- shot, (**) – single 4KB Host Read will occupy entire Die

2019 Storage Developer Conference. © Andrei Konan. All Rights Reserved. 4

Introduction

Page 5: Origin of Million IOPs throughput and microsecond latencies ......(*) – 8KB physical page * 4 planes * 3 pages per one- shot, (**) – single 4KB Host Read will occupy entire Die

2019 Storage Developer Conference. © Andrei Konan. All Rights Reserved. 5

Introduction Two main aspects of new product development

Cost reduction

Performance improvement

PCIe 4.0 speed is 8GB/s (2 million IOPs of 4KB size)

Majority of SSDs barely meet even PCIe 3.0 speeds

Page 6: Origin of Million IOPs throughput and microsecond latencies ......(*) – 8KB physical page * 4 planes * 3 pages per one- shot, (**) – single 4KB Host Read will occupy entire Die

2019 Storage Developer Conference. © Andrei Konan. All Rights Reserved. 6

SSD Generic Architecture

Page 7: Origin of Million IOPs throughput and microsecond latencies ......(*) – 8KB physical page * 4 planes * 3 pages per one- shot, (**) – single 4KB Host Read will occupy entire Die

2019 Storage Developer Conference. © Andrei Konan. All Rights Reserved. 7

NAND Flash basics – operation Page based Program, Block based Erase

TLC Physical Block example

Wordline 0

Wordline 1

Wordline 2

Wordline 0

Wordline 1

Wordline 2

Page 0Page 1Page 2

Page 3Page 4Page 5

Page 6Page 7Page 8

Program Erase

Wordline 0

Wordline 1

Wordline 2

Page 0Page 1Page 2

Page 3Page 4Page 5

Page 6Page 7Page 8

One-shot

Page 8: Origin of Million IOPs throughput and microsecond latencies ......(*) – 8KB physical page * 4 planes * 3 pages per one- shot, (**) – single 4KB Host Read will occupy entire Die

2019 Storage Developer Conference. © Andrei Konan. All Rights Reserved. 8

NAND Flash basics – geometry Multiple working in parallel units (Planes) in one chip NAND Chip

CE 0 CE 1

Die 0 Die 1 0 1

Plane 0 Plane 1 Plane 2 Plane 3

Physical Block Can Read and Program in Parallel with some limitations

Page 9: Origin of Million IOPs throughput and microsecond latencies ......(*) – 8KB physical page * 4 planes * 3 pages per one- shot, (**) – single 4KB Host Read will occupy entire Die

2019 Storage Developer Conference. © Andrei Konan. All Rights Reserved. 9

NAND Flash basics – metrics Page Read (tR) = 50us One-Shot TLC Page Program (tPROG) = 2000us Block Erase (tBERS) = 6000us

QoS (Latency) related features: Program / Erase Suspend for Read Plane Interleaving for Read

Page 10: Origin of Million IOPs throughput and microsecond latencies ......(*) – 8KB physical page * 4 planes * 3 pages per one- shot, (**) – single 4KB Host Read will occupy entire Die

2019 Storage Developer Conference. © Andrei Konan. All Rights Reserved. 10

Over-provisioning – definition A. Flash chip size is fixed by NAND Vendor

B. Drive external capacity is fixed by IDEMA standard

C. (A) – (B) is SSD over-provisioning (OP)

Read Intensive Drives OP is ~10%

Mixed Workload Drives OP is ~30%

Page 11: Origin of Million IOPs throughput and microsecond latencies ......(*) – 8KB physical page * 4 planes * 3 pages per one- shot, (**) – single 4KB Host Read will occupy entire Die

2019 Storage Developer Conference. © Andrei Konan. All Rights Reserved. 11

Over-provisioning – impact Bigger OP = Less garbage collection (GC) in background Less GC = More time to Handle Host Write Traffic GC vs Host Write Traffic ratio is named Write Amplification (WA)

Read Intensive Drive (OP~10%): WA is ~ 4:1 Mixed Workload Drive (OP~30%): WA is ~ 2:1

ObsoleteValidErased

GarbageCollection

Page 12: Origin of Million IOPs throughput and microsecond latencies ......(*) – 8KB physical page * 4 planes * 3 pages per one- shot, (**) – single 4KB Host Read will occupy entire Die

2019 Storage Developer Conference. © Andrei Konan. All Rights Reserved. 12

Write and Read Operation

Page 13: Origin of Million IOPs throughput and microsecond latencies ......(*) – 8KB physical page * 4 planes * 3 pages per one- shot, (**) – single 4KB Host Read will occupy entire Die

2019 Storage Developer Conference. © Andrei Konan. All Rights Reserved. 13

Write Operation Flow

Host SSD

NAND

NAND

NAND

SRAM DRAM

Front End

Flash Translation Layer BackEnd

Controller

PCIe

Cache

(1) Write

(2) Status

(3) Flush

Power Loss Protection POSCAPs

GarbageCollector

(4) GC

ECC

Page 14: Origin of Million IOPs throughput and microsecond latencies ......(*) – 8KB physical page * 4 planes * 3 pages per one- shot, (**) – single 4KB Host Read will occupy entire Die

2019 Storage Developer Conference. © Andrei Konan. All Rights Reserved. 14

Read Operation Flow

Host SSD

NAND

NAND

NAND

SRAM DRAM

Front End

Flash Translation Layer BackEnd

Controller

PCIe

(1) Host Read

(3) Status

Power Loss Protection POSCAPs

ECC

(2) NAND Read

(2’) Retry

Page 15: Origin of Million IOPs throughput and microsecond latencies ......(*) – 8KB physical page * 4 planes * 3 pages per one- shot, (**) – single 4KB Host Read will occupy entire Die

2019 Storage Developer Conference. © Andrei Konan. All Rights Reserved. 15

Workloads and Bottlenecks

Page 16: Origin of Million IOPs throughput and microsecond latencies ......(*) – 8KB physical page * 4 planes * 3 pages per one- shot, (**) – single 4KB Host Read will occupy entire Die

2019 Storage Developer Conference. © Andrei Konan. All Rights Reserved. 16

Workloads definition – parameters Queue Depth – maximal amount of SSD outstanding commands

Block Size – 128KB for Sequential access, 4KB for Random

Access Type – Read, Write, Trim, Mixed

Drive State – Fresh (Formatted), Steady (Sustained)

Examples

QD256 4K RND WR

QD1 128K SEQ RD

Page 17: Origin of Million IOPs throughput and microsecond latencies ......(*) – 8KB physical page * 4 planes * 3 pages per one- shot, (**) – single 4KB Host Read will occupy entire Die

2019 Storage Developer Conference. © Andrei Konan. All Rights Reserved. 17

Workloads definition – metrics Bandwidth / Throughput

Sequential - GB/s, Random - KIOPs/MIOPs

Consistency Enterprise requires over 95%

QoS (Latency) Average Low nines: 99%, 99.9%, 99.99% High nines: 99.999%, 99.9999%, 99.99999%

Page 18: Origin of Million IOPs throughput and microsecond latencies ......(*) – 8KB physical page * 4 planes * 3 pages per one- shot, (**) – single 4KB Host Read will occupy entire Die

2019 Storage Developer Conference. © Andrei Konan. All Rights Reserved. 18

Bottlenecks Host interface (PCIe, SATA, SAS) ECC encoder / decoder bandwidth DRAM interface speed NAND operation (tPROG, tR, tBERS) NAND Endurance (Retry Count) NAND interface speed NAND Channel and Die count Internal CPU instruction bandwidth Enterprise Feature

Page 19: Origin of Million IOPs throughput and microsecond latencies ......(*) – 8KB physical page * 4 planes * 3 pages per one- shot, (**) – single 4KB Host Read will occupy entire Die

2019 Storage Developer Conference. © Andrei Konan. All Rights Reserved. 19

Affecting Enterprise Features Power Loss Protection

Power Consumption

Consistent Performance is more preferred than Peak one

User Data Reliability and Security is of top priority

Attention up to 10-nines QoS requirements

Extra high densities (up to 64TB)

Overheating

Page 20: Origin of Million IOPs throughput and microsecond latencies ......(*) – 8KB physical page * 4 planes * 3 pages per one- shot, (**) – single 4KB Host Read will occupy entire Die

2019 Storage Developer Conference. © Andrei Konan. All Rights Reserved. 20

Performance calculator

Page 21: Origin of Million IOPs throughput and microsecond latencies ......(*) – 8KB physical page * 4 planes * 3 pages per one- shot, (**) – single 4KB Host Read will occupy entire Die

2019 Storage Developer Conference. © Andrei Konan. All Rights Reserved. 21

Performance calculator – Sequential

HostFront End

Back End

NAND

Controller

PCIe Gen46.5 GB/s

500MHz, 8b8 GB/s

tPROG : 2ms98KB(*) / tPROG128 NAND Dies

6.2 GB/s

ECC

X GB/s

HostFront End

Back End

NAND

Controller

PCIe Gen46.5 GB/s

500MHz, 8b8 GB/s

tR : 50us32KB(**) / tR

128 NAND Dies>10 GB/s

ECC

X±50% GB/s

Sequential Write

Sequential Read

(*) – 8KB physical page * 4 planes * 3 pages per one-shot, (**) – 8KB physical page * 4 planes

Page 22: Origin of Million IOPs throughput and microsecond latencies ......(*) – 8KB physical page * 4 planes * 3 pages per one- shot, (**) – single 4KB Host Read will occupy entire Die

2019 Storage Developer Conference. © Andrei Konan. All Rights Reserved. 22

Performance calculator – Random

HostFront End

Back End

NAND

Controller

PCIe Gen41.7 MIOPs

500MHz, 8b2 MIOPs

400 KIOPs

tPROG : 2ms98KB(*) / tPROG128 NAND Dies

1.6 MIOPs320 KIOPs

ECC

Y MIOPs

HostFront End

Back End

NAND

Controller

PCIe Gen41.7 MIOPs

500MHz, 8b2 MIOPs

tR : 50us4KB(**) / tR

128 NAND Dies2.5 MIOPs

ECC

Y±50% MIOPs

4KB Random Write

4KB Random Read

(*) – 8KB physical page * 4 planes * 3 pages per one-shot, (**) – single 4KB Host Read will occupy entire Die

Cache GC

Only 20% throughput is for Host Write, 80% for GCWrite Amplification is 1:4

< 1.5 MIOPs

Page 23: Origin of Million IOPs throughput and microsecond latencies ......(*) – 8KB physical page * 4 planes * 3 pages per one- shot, (**) – single 4KB Host Read will occupy entire Die

2019 Storage Developer Conference. © Andrei Konan. All Rights Reserved. 23

Thanks a lot!Q & A