27
Cao, Gang Network Platforms Group, Intel Corporation May 2018

Cao, Gang Network Platforms Group, Intel Corporation May 2018 · Intel® Builders QoS in SPDK •Implemented in the common Bdev layer Below the application protocols (e.g., iSCSI,

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Cao, Gang Network Platforms Group, Intel Corporation May 2018 · Intel® Builders QoS in SPDK •Implemented in the common Bdev layer Below the application protocols (e.g., iSCSI,

Cao, Gang

Network Platforms Group, Intel Corporation

May 2018

Page 2: Cao, Gang Network Platforms Group, Intel Corporation May 2018 · Intel® Builders QoS in SPDK •Implemented in the common Bdev layer Below the application protocols (e.g., iSCSI,

Notices & DisclaimersIntel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration.

No computer system can be absolutely secure.

Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. For more complete information about performance and benchmark results, visit http://www.intel.com/benchmarks .

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit http://www.intel.com/benchmarks .

Benchmark results were obtained prior to implementation of recent software patches and firmware updates intended to address exploits referred to as "Spectre" and "Meltdown." Implementation of these updates may make these results inapplicable to your device or system.

Intel® Advanced Vector Extensions (Intel® AVX)* provides higher throughput to certain processor operations. Due to varying processor power characteristics, utilizing AVX instructions may cause a) some parts to operate at less than the rated frequency and b) some parts with Intel® Turbo Boost Technology 2.0 to not achieve any or maximum turbo frequencies. Performance varies depending on hardware, software, and system configuration and you can learn more at http://www.intel.com/go/turbo.

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Cost reduction scenarios described are intended as examples of how a given Intel-based product, in the specified circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction.

Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data are accurate.

© 2018 Intel Corporation. Intel, the Intel logo, and Intel Xeon are trademarks of Intel Corporation in the U.S. and/or other countries.

*Other names and brands may be claimed as property of others.

Page 3: Cao, Gang Network Platforms Group, Intel Corporation May 2018 · Intel® Builders QoS in SPDK •Implemented in the common Bdev layer Below the application protocols (e.g., iSCSI,

Intel® Builders 3

Agenda• Why QoS needed

• Rate limiting on Bdev internals

• Extension of QoS on vBdev

• Future work

Page 4: Cao, Gang Network Platforms Group, Intel Corporation May 2018 · Intel® Builders QoS in SPDK •Implemented in the common Bdev layer Below the application protocols (e.g., iSCSI,

Intel® Builders 4https://creativecommons.org/licenses/

Page 5: Cao, Gang Network Platforms Group, Intel Corporation May 2018 · Intel® Builders QoS in SPDK •Implemented in the common Bdev layer Below the application protocols (e.g., iSCSI,

Intel® Builders 5https://creativecommons.org/licenses/

Page 6: Cao, Gang Network Platforms Group, Intel Corporation May 2018 · Intel® Builders QoS in SPDK •Implemented in the common Bdev layer Below the application protocols (e.g., iSCSI,

Intel® Builders

QoS in SPDK• Implemented in the common Bdev layer

Below the application protocols (e.g., iSCSI, NVMe-oF, vhost…)

Above the real backend devices (e.g., NVMe SSD, AEP, Remote Connected Disk…)

• Strict Rate Limiting (v18.04)

IOPS

Page 7: Cao, Gang Network Platforms Group, Intel Corporation May 2018 · Intel® Builders QoS in SPDK •Implemented in the common Bdev layer Below the application protocols (e.g., iSCSI,

Intel® Builders

Backend

QoS USAGE1. Start the SPDK target

2. Assign the bdev to the application

3. Send I/Os to the bdev

4. Enable QoS (i.e., IOPS rate limiting) on the bdev via RPC

Example: NVMe0n1 with initial 100K IOPS

5. Adjust the QoS at runtime on demand via RPC

Example: Update NVMe0n1 limit to 50K

6. Disable the QoS at any time via RPC

SPDK Target

Bdev

Application

100K50K

I/O

Page 8: Cao, Gang Network Platforms Group, Intel Corporation May 2018 · Intel® Builders QoS in SPDK •Implemented in the common Bdev layer Below the application protocols (e.g., iSCSI,

Intel® Builders 8

How do We achieve Rate limiting on bdev?

Page 9: Cao, Gang Network Platforms Group, Intel Corporation May 2018 · Intel® Builders QoS in SPDK •Implemented in the common Bdev layer Below the application protocols (e.g., iSCSI,

Intel® Builders

Rate Limiting Key Considerations1. Based on Bdev Core Design

a. I/O Channel, Bdev Channel, Bdev Stats

2. Based on common SPDK Concepts

a. Asynchronous, Lockless, Event Driven

3. Friendly Policy

a. Clear workflow on I/O handling

b. Extensible for other policies

Page 10: Cao, Gang Network Platforms Group, Intel Corporation May 2018 · Intel® Builders QoS in SPDK •Implemented in the common Bdev layer Below the application protocols (e.g., iSCSI,

Intel® Builders

Rate Limiting Work FlowMajor SPDK architectural components:

1. Protocol layer for applications

2. Bdev layer for QoS on exposed devices for applications

3. Logical volume layer for managing backend shared devices

Out-of-band RPC method for QoS management besides start-up QoS setting

Storage Protocols Conf

Block Device Abstraction (BDEV)

NVMe

NVMe PCIe Driver

NVMe SSDs

SPDK Stack

Application

NVMeoF/iSCSI

Storage Applications

VM Container

Manage-ment

Hypervisor/OS storage stack (SCSI, Virtio)

Data Path

JSONRPC

Vhost

AppStack

Logical Volume

1. App I/O1. App I/O

2. Rate Limiting

3. Device I/O

Enable, AdjustDisable, Query

Control Path

Start-up

Page 11: Cao, Gang Network Platforms Group, Intel Corporation May 2018 · Intel® Builders QoS in SPDK •Implemented in the common Bdev layer Below the application protocols (e.g., iSCSI,

Intel® Builders

Rate Limiting Resource ManagementI/O Thread

1. Create I/O Channel (associated bdev channel)

QoS Thread

2. Create QoS bdev channel & assign QoS thread

3. Create & Register QoS Poller

QoS Configured

RPC Enable QoS

I/O Handling

4. Destroy I/O Channel

Cleanup I/O for that I/O Channel

5. Destroy Completion

6. Unregister Poller and Destroy QoS Bdev Channel

When All I/O Channels Destroyed

RPC Disable QoS

Cleanup I/O for all I/O Channels

Page 12: Cao, Gang Network Platforms Group, Intel Corporation May 2018 · Intel® Builders QoS in SPDK •Implemented in the common Bdev layer Below the application protocols (e.g., iSCSI,

Intel® Builders

Rate Limiting Normal I/O HandlingI/O Thread

QoS Thread

1. I/O received

2. Sent I/O via Message

3. Queue I/O 4. Send queued I/O down 5. Handle I/O completion

6. Sent I/O Completion via Message

7. Call I/O Completion Callback

Periodically Run

Check Allowed I/O

Page 13: Cao, Gang Network Platforms Group, Intel Corporation May 2018 · Intel® Builders QoS in SPDK •Implemented in the common Bdev layer Below the application protocols (e.g., iSCSI,

Intel® Builders https://creativecommons.org/licenses/

Page 14: Cao, Gang Network Platforms Group, Intel Corporation May 2018 · Intel® Builders QoS in SPDK •Implemented in the common Bdev layer Below the application protocols (e.g., iSCSI,

Intel® Builders

Extension: Per Client Qos on Shared Backend

https://creativecommons.org/licenses/

QoSQoSQoSQoS

QoS

QoS

QoS

QoS achieved at SPDK Target SideQoS saw at Client Side

QoS

QoS

QoSBdev vBdev

Dev

Offer the QoS control at target side

Page 15: Cao, Gang Network Platforms Group, Intel Corporation May 2018 · Intel® Builders QoS in SPDK •Implemented in the common Bdev layer Below the application protocols (e.g., iSCSI,

Intel® Builders

Future qos Work1. Rate Limiting (v18.07)

Bandwidth

Read / Write separate control

2. I/O Prioritization

Data & Metadata

Read & Write

3. Along with more functionalities from common SPDK Bdev

Page 16: Cao, Gang Network Platforms Group, Intel Corporation May 2018 · Intel® Builders QoS in SPDK •Implemented in the common Bdev layer Below the application protocols (e.g., iSCSI,

Intel® Builders 16

IRC

Mailing list

Community meeting

Trello

Github

Gerrithub

Page 17: Cao, Gang Network Platforms Group, Intel Corporation May 2018 · Intel® Builders QoS in SPDK •Implemented in the common Bdev layer Below the application protocols (e.g., iSCSI,

Intel® Builders 17

Thank you

Page 18: Cao, Gang Network Platforms Group, Intel Corporation May 2018 · Intel® Builders QoS in SPDK •Implemented in the common Bdev layer Below the application protocols (e.g., iSCSI,
Page 19: Cao, Gang Network Platforms Group, Intel Corporation May 2018 · Intel® Builders QoS in SPDK •Implemented in the common Bdev layer Below the application protocols (e.g., iSCSI,

Intel® Builders 19

Backup

Page 20: Cao, Gang Network Platforms Group, Intel Corporation May 2018 · Intel® Builders QoS in SPDK •Implemented in the common Bdev layer Below the application protocols (e.g., iSCSI,

Intel® Builders

Architecture

Drivers

StorageServices

StorageProtocols

iSCSI Target

NVMe-oF*Target

SCSI

vhost-scsiTarget

NVMe

NVMe Devices

Blobstore

NVMe-oF*

Initiator

Intel® QuickDataTechnology Driver

Block Device Abstraction (bdev)

Ceph RBD

Linux AIO

Logical Volumes

3rd Party

NVMe

NVMe* PCIeDriver

Released

1H‘18

vhost-blkTarget

BlobFS

Integration

RocksDB

Ceph

Core

ApplicationFramework

GPT

PMDKblk

virtioscsi

VPP TCP/IP

QEMU

Cinder

QoS

Linux nbd

RDMA

DPDKEncryption

virtioblk

Page 21: Cao, Gang Network Platforms Group, Intel Corporation May 2018 · Intel® Builders QoS in SPDK •Implemented in the common Bdev layer Below the application protocols (e.g., iSCSI,

Intel® Builders

The SPDK app Framework provides the glue

21

How do we combine SPDK components FOR QOS?

Page 22: Cao, Gang Network Platforms Group, Intel Corporation May 2018 · Intel® Builders QoS in SPDK •Implemented in the common Bdev layer Below the application protocols (e.g., iSCSI,

Intel® Builders

Reactor 1Reactor 0 Reactor N…

Core 0 Core NCore 1

Po

ller

Events

Po

ller

… Po

ller

… Po

ller

I/O Device I/O Device I/O Device…

Events Events

22

App Framework Components

Event

Reactor

Poller

I/O ChannelI/O Device

Page 23: Cao, Gang Network Platforms Group, Intel Corporation May 2018 · Intel® Builders QoS in SPDK •Implemented in the common Bdev layer Below the application protocols (e.g., iSCSI,

Intel® Builders 23

PollerEssentially a “task” running on a reactor

Primarily checks async events

Can run periodically on a timer

Example: poll hardware completion queue

Callback runs to completion on reactor thread

Completion handler may send an event

CQ

I/O Device

Poller

I/O completion callback

Submit I/O

SQ

Page 24: Cao, Gang Network Platforms Group, Intel Corporation May 2018 · Intel® Builders QoS in SPDK •Implemented in the common Bdev layer Below the application protocols (e.g., iSCSI,

Intel® Builders 24

EventCross-thread communication

Function pointer + arguments

One-shot message passed between reactors

Multi-producer/single-consumer ring

Runs to completion on reactor thread

Reactor A

Events

Reactor B

Allocate andcall event

Execute and free event

Core 0 Core 1

Page 25: Cao, Gang Network Platforms Group, Intel Corporation May 2018 · Intel® Builders QoS in SPDK •Implemented in the common Bdev layer Below the application protocols (e.g., iSCSI,

Intel® Builders 25

I/O ChannelAbstracts hardware I/O queues

Register I/O devices

Create I/O channel per thread/device combination

Provides hooks for driver resource allocation

I/O channel creation drives poller creation

Pervasive in SPDK

I/O Device

Page 26: Cao, Gang Network Platforms Group, Intel Corporation May 2018 · Intel® Builders QoS in SPDK •Implemented in the common Bdev layer Below the application protocols (e.g., iSCSI,

Intel® Builders

CoreHardware

Software

Thread

I/O Channel

Bdev Channel SPDKThread Model

Bdev (Block Device)

Run to Completion Lockless Scalable

SPDKEnvironment

BDEV Components

Page 27: Cao, Gang Network Platforms Group, Intel Corporation May 2018 · Intel® Builders QoS in SPDK •Implemented in the common Bdev layer Below the application protocols (e.g., iSCSI,

Intel® Builders

Per Client QoS on Shared Backend Details1. vBdev is also a Bdev

2. QoS control path and data path on common Bdev module and also work for vBdev

3. Create Pass Through vBdevs on shared Bdev as the N : 1 mapping

4. Each client views its Dev from the exclusive vBdev

5. Apply QoS on each vBdev to achieve per Client’s QoS requirement

6. No need to have the QoS control at the client side

Note: for those vBdevs sharing same Bdev. 1 vBdev is Read/Write and N-1 is Read only.