From Open-Channel SSDs to Zoned Namespaces

Matias Bjørling, Director, Western Digital

STORAGE

Forward-Looking Statements

Safe Harbor | Disclaimers

This presentation contains forward-looking statements that involve risks and uncertainties, including, but not limited to, statements regarding our solid-state technologies, product development efforts, software development and potential contributions, growth opportunities, and demand and market trends. Forward-looking statements should not be read as a guarantee of future performance or results, and will not necessarily be accurate indications of the times at, or by, which such performance or results will be achieved, if at all. Forward-looking statements are subject to risks and uncertainties that could cause actual performance or results to differ materially from those expressed in or suggested by the forward-looking statements.

Key risks and uncertainties include volatility in global economic conditions, business conditions and growth in the storage ecosystem, impact of competitive products and pricing, market acceptance and cost of commodity materials and specialized product components, actions by competitors, unexpected advances in competing technologies, difficulties or delays in manufacturing, and other risks and uncertainties listed in the company’s filings with the Securities and Exchange Commission (the “SEC”) and available on the SEC’s website at www.sec.gov, including our most recently filed periodic report, to which your attention is directed. We do not undertake any obligation to publicly update or revise any forward-looking statement, whether as a result of new information, future developments or otherwise, except as required by law.

Open-Channel SSDs


I/O Isolation, Predictable Latency, Data Placement & I/O Scheduling

Ubiquitous Workloads

Efficiency of the Cloud requires many different workloads to a single SSD


[Figure: Databases, Sensors, Analytics, Virtualization, and Video workloads directed to one Solid-State Drive]

Open-Channel SSD Concepts

Chunks

• Write sequentially within an LBA range

• Requires reset for rewrites

• Borrows from HDD’s SMR specification (ZAC/ZBC)

• Optimized for SSD physical constraints

• Align writes to media layout

Parallel Units

• Host can direct I/Os to separate workloads

• Stripes across single or multiple dies.

• Parallel units inherit the throughput and latency characteristics of the underlying media

• Served by I/O determinism (NVMe™ 1.4)

Chunks & Parallel Units

[Figure: Application-1, Application-2, and Application-3 above SSD firmware and flash, contrasted with Application-1/FTL-1, Application-2/FTL-2, and Application-3/FTL-3 above SSD firmware and flash. Disk LBA range divided in chunks: Chunk 0, Chunk 1, Chunk 2, Chunk 3, ..., Chunk X.]

Industry Adoption

Hyper-scalers, all-flash array vendors, and large storage system vendors that have been considering or are using Open-Channel SSD architectures can now benefit from standardization and a broader eco-system.

Key concepts to be introduced into the NVMe™ specification

Zoned Namespaces (ZNS)

Technical Proposal in the NVMe™ working group

Standardizes zone interface as an approach to:

• Reduce device-side write amplification

• Reduce over-provisioning

• “Note that excessive over-provisioning is similar to early replacement -- in both cases you buy more devices.”

- Mark Callaghan, Facebook

• Reduce DRAM in SSDs

• Highest cost after NAND itself

• Improve latency outliers and throughput

• Reduces device-side data movement

• The tail at scale

• Enable software eco-system.

• Everyone benefits from software improvements!


Zoned Namespaces similar to ZBC/ZAC for SMR HDDs

• Storage capacity is divided into zones

• Each zone is written sequentially

• Interface optimized for SSDs

• Align with media characteristics

• Zone size aligned to media (e.g., NAND block sizes)

• Zone capacity aligned to physical media sizes

• Reduce NAND media erase cycles (Write amp.)

[Figure: Disk LBA range divided in zones (Zone 0, Zone 1, Zone 2, Zone 3, ..., Zone X), with the write pointer position marked inside a zone.]

• Write commands advance the write pointer

• Reset write pointer commands rewind the write pointer
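The write-pointer behavior described above can be illustrated with a small model. This is a minimal sketch, not an implementation of the NVMe proposal: the class name `Zone`, its fields, and the use of `ValueError` for rejected writes are assumptions made for illustration.

```python
class Zone:
    """Toy model of one zone; all sizes are counted in logical blocks."""

    def __init__(self, start_lba, capacity):
        self.start_lba = start_lba
        self.capacity = capacity
        self.write_pointer = start_lba  # next LBA that may be written

    def write(self, lba, num_blocks):
        # A write is accepted only if it starts exactly at the write pointer.
        if lba != self.write_pointer:
            raise ValueError("write does not start at the write pointer")
        if self.write_pointer + num_blocks > self.start_lba + self.capacity:
            raise ValueError("write would exceed the zone")
        # A successful write advances the write pointer.
        self.write_pointer += num_blocks

    def reset(self):
        # A reset rewinds the write pointer to the start of the zone.
        self.write_pointer = self.start_lba


zone = Zone(start_lba=1024, capacity=16)
zone.write(1024, 8)   # sequential write: accepted
zone.write(1032, 4)   # continues at the write pointer: accepted
zone.reset()          # rewrites are possible again from LBA 1024
```

Rewriting data inside the zone without a reset is rejected in this model, which mirrors the "requires reset for rewrites" rule.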

Zone Information

Zone State

• Empty, Implicitly Opened, Explicitly Opened, Closed, Full, Read Only, Offline

• Empty -> Open -> Full -> Empty -> ….

Zone Reset

• Full -> Empty
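The state cycle listed above can be sketched as a lookup table. This is a deliberately simplified illustration: the state names come from the slide, but the event names (`"write"`, `"fill"`, `"reset"`) and the `next_state` helper are hypothetical, and only the Empty -> Open -> Full -> Empty cycle is modeled. Transitions the slide does not list are left out rather than guessed.

```python
from enum import Enum


class ZoneState(Enum):
    EMPTY = "Empty"
    IMPLICITLY_OPENED = "Implicitly Opened"
    EXPLICITLY_OPENED = "Explicitly Opened"
    CLOSED = "Closed"
    FULL = "Full"
    READ_ONLY = "Read Only"
    OFFLINE = "Offline"


# Only the cycle shown on the slide; event names are illustrative assumptions.
TRANSITIONS = {
    (ZoneState.EMPTY, "write"): ZoneState.IMPLICITLY_OPENED,
    (ZoneState.IMPLICITLY_OPENED, "write"): ZoneState.IMPLICITLY_OPENED,
    (ZoneState.IMPLICITLY_OPENED, "fill"): ZoneState.FULL,
    (ZoneState.FULL, "reset"): ZoneState.EMPTY,
}


def next_state(state, event):
    """Return the state after `event`, or raise if this sketch does not model it."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"transition not modeled in this sketch: {state.value} + {event}")


state = ZoneState.EMPTY
for event in ("write", "write", "fill", "reset"):
    state = next_state(state, event)
```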

Zone Size & Zone Capacity

• Zone Size is fixed

• Zone Capacity is the writeable area within a zone

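[Figure: Zone X - 1, Zone X, and Zone X + 1 laid out side by side. Zone X begins at its Zone Start LBA; its Zone Capacity (e.g., 500MB) covers the writeable part of the Zone Size (e.g., 512MB).]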
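The relationship between Zone Size and Zone Capacity can be worked through with the example figures above (512MB size, 500MB capacity). This is a sketch under assumptions: a 4096-byte logical block, zones laid out back to back from LBA 0, and MB taken as MiB. The helper name `zone_layout` is hypothetical.

```python
MIB = 1024 * 1024


def zone_layout(index, zone_size_bytes=512 * MIB, zone_capacity_bytes=500 * MIB,
                block_size=4096):
    """Return the LBA layout of zone `index`, counted in logical blocks."""
    size_blocks = zone_size_bytes // block_size
    capacity_blocks = zone_capacity_bytes // block_size
    start_lba = index * size_blocks  # zone size is fixed, so starts are evenly spaced
    return {
        "start_lba": start_lba,
        "last_writable_lba": start_lba + capacity_blocks - 1,
        "next_zone_start_lba": start_lba + size_blocks,
        # LBAs between the capacity and the zone size cannot be written.
        "unwritable_blocks": size_blocks - capacity_blocks,
    }


layout = zone_layout(2)
# layout["start_lba"] == 262144, layout["last_writable_lba"] == 390143
# layout["next_zone_start_lba"] == 393216, layout["unwritable_blocks"] == 3072
```

Keeping the zone size fixed keeps zone start LBAs evenly spaced, while the capacity can follow the physical media size.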

Zone Append

• Low scalability on multiple writers to a zone

• Write Queue Depth per Zone = 1

• IOPS: 80K vs 880K using QEMU and 300K vs 1400K on bare metal

• ZAC/ZBC requires strict write ordering

• Limits write performance and increases host overhead

• Big challenge with software eco-system, HBAs, etc.

• Introducing Zone Append

• Append data to a zone without defining offset

• Drive returns where data was written in the zone

[Figure: two charts of KIOPS (y-axis, up to 1500) against Number of Jobs (x-axis, 1 to 4). Panels: "QEMU (i7-6700K, 4K, QD1, RW, libaio)" and "Metal (i7-6700K, 4K, QD1, RW, libaio)". Series in each panel: 1 Zone, 4 Zones, Zone Append / No Zones. Annotation: Zone Lock Contention.]

Zone Write Example

3x Writes (4K, 8K, 16K) – Queue Depth = 1

[Figure: one zone receiving Write0 (4K), Write1 (8K), and Write2 (16K) in order; the write pointer (WP) advances after each write.]

Host takes on the overhead of serializing I/Os.

• Insignificant when using HDDs

• Significant when using SSDs

Zone Queue Depth = 1

Only one Write outstanding per zone

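Zone Append Example

3x Writes (4K, 8K, 16K) – Queue Depth = 3

[Figure: two possible layouts of the same three writes in one zone. In the first, the data lands as 4K (Write0), 8K (Write1), 16K (Write2); in the second, as 8K (Write0), 4K (Write1), 16K (Write2). The write pointer (WP) advances after each write.]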

Zone Queue Depth >= 1

Multiple writes outstanding per zone

Drive takes on the responsibility of serializing I/Os.

Scalable for both HDDs and SSDs.
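The difference between the two examples can be shown with a small simulation of Zone Append: the host submits data without an offset, and the drive picks the location and reports it back. This is an illustrative sketch only. The names `AppendZone` and `submit_appends`, the 4K logical block size, and the way arrival order is passed in are assumptions, not part of the proposal.

```python
class AppendZone:
    """Toy zone that supports append; sizes are in 4K logical blocks (assumed)."""

    def __init__(self, start_lba=0, capacity_blocks=1024):
        self.start_lba = start_lba
        self.capacity_blocks = capacity_blocks
        self.write_pointer = start_lba

    def append(self, num_blocks):
        """Place data at the current write pointer and return where it landed."""
        if self.write_pointer + num_blocks > self.start_lba + self.capacity_blocks:
            raise ValueError("append would exceed the zone capacity")
        assigned_lba = self.write_pointer
        self.write_pointer += num_blocks
        return assigned_lba


def submit_appends(zone, sizes_in_blocks, arrival_order):
    """Process requests in the order they reach the drive; map request id -> LBA."""
    return {req: zone.append(sizes_in_blocks[req]) for req in arrival_order}


sizes = [1, 2, 4]  # 4K, 8K, 16K writes expressed in 4K blocks

in_order = submit_appends(AppendZone(), sizes, arrival_order=[0, 1, 2])
# in_order == {0: 0, 1: 1, 2: 3}

reordered = submit_appends(AppendZone(), sizes, arrival_order=[1, 0, 2])
# reordered == {1: 0, 0: 2, 2: 3}
```

Both arrival orders succeed and fill the same seven blocks. With regular writes at queue depth 1, the host would have had to wait for each completion before issuing the next write so that every request targets the current write pointer.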

ZNS: Synergy w/ ZAC/ZBC software ecosystem

• ZAC/ZBC is the interface for SMR hard-drives

• Reuse existing work already applied for ZAC/ZBC hard-drives

• Existing ZAC/ZBC-aware file systems & device mappers “just work”

• Few changes needed to support ZNS

• Integrate directly with file-systems or applications

• No host-side FTL

• No 1GB DRAM per 1TB Media requirement

• Code for ZAC/ZBC already in production at technology adopters and broadly available in the Linux® eco-system.

[Figure: Linux software stack for zoned storage. User Space: Applications, libzbc, util-linux, fio, blktests. Linux Kernel: Regular File-Systems (xfs), Logical Block Device (dm-zoned), File-System with SMR Support (f2fs, btrfs), Block Layer, SCSI/ATA, NVMe. Devices: SMR HDD (ZBC/ZAC) and ZNS SSD. * = Enhanced data paths for SMR drives.]

ZNS Support in Linux

Shows up as a host-managed Zoned Block Device

Zone Size = 1GB = 0x200000 (2097152) logical blocks at a 512B logical block size
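On Linux, the zone model and zone size of a block device are exposed through the sysfs attributes `queue/zoned` and `queue/chunk_sectors` (the latter counted in 512-byte sectors). The sketch below reads them and converts the zone size to bytes. The helper name `zoned_info` and the disk name `nvme0n1` are hypothetical, and the `sysfs_root` parameter exists only so the function can be pointed at a test directory.

```python
from pathlib import Path

SECTOR_BYTES = 512  # sysfs reports chunk_sectors in 512-byte sectors


def zoned_info(disk, sysfs_root="/sys/block"):
    """Read the zone model and zone size of `disk` from sysfs."""
    queue = Path(sysfs_root) / disk / "queue"
    model = (queue / "zoned").read_text().strip()
    zone_size_sectors = int((queue / "chunk_sectors").read_text())
    return {
        "model": model,  # e.g. "host-managed" for a zoned device, "none" otherwise
        "zone_size_sectors": zone_size_sectors,
        "zone_size_bytes": zone_size_sectors * SECTOR_BYTES,
    }


# The arithmetic behind the figures above: 1 GiB / 512 B = 2097152 = 0x200000.
assert (1 << 30) // SECTOR_BYTES == 2097152 == 0x200000

# Example call on a machine with a zoned device (disk name is an assumption):
# info = zoned_info("nvme0n1")
```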

Zone Information

List of zones including metadata

File-system Integration

The File-System is the “FTL” – Manages mapping table, OP, and GC strategy.

[Figure: software stack for file-system integration. User Space: Applications, util-linux, fio, blktests. Linux Kernel: File-System with Zone Support (f2fs), Block Layer, NVMe. Device: ZNS SSD.]

Format f2fs file-system

Read and write through f2fs on a ZNS drive

Western Digital and the Western Digital logo are registered trademarks or trademarks of Western Digital Corporation or its affiliates in the US and/or other countries. Linux® is the registered trademark of Linus Torvalds in the U.S. and other countries. All other marks are the property of their respective owners.
