SoftRDMA: Rekindling High Performance Software RDMA over Commodity Ethernet Mao Miao, Fengyuan Ren, Xiaohui Luo, Jing Xie, Qingkai Meng, Wenxue Cheng Dept. of Computer Science and Technology, Tsinghua University,

SoftRDMA: Rekindling High Performance Software …conferences.sigcomm.org/events/apnet2017/slides/softrdma.pdfSoftware RDMA over Commodity Ethernet ... control to prevent packet drops


SoftRDMA: Rekindling High Performance Software RDMA over Commodity Ethernet

Mao Miao, Fengyuan Ren, Xiaohui Luo, Jing Xie, Qingkai Meng, Wenxue Cheng

Dept. of Computer Science and Technology, Tsinghua University


Background

• Remote Direct Memory Access (RDMA)
  - Protocol offload
    • Reduces CPU overhead and bypasses the kernel
  - Memory pre-allocation and pre-registration
  - Zero-Copy
    • Data transferred directly from user space
• Dominant RDMA network protocols
  - InfiniBand (IB)
  - RDMA over Converged Ethernet (RoCE)
  - Internet Wide Area RDMA Protocol (iWARP)


Background

• InfiniBand (IB)
  - Custom network protocol and purpose-built hardware
  - Lossless L2 uses hop-by-hop, credit-based flow control to prevent packet drops
• Challenges
  - Incompatible with Ethernet infrastructure
  - High cost
    • DC operators need to deploy and manage two separate networks
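
The hop-by-hop, credit-based flow control mentioned above can be illustrated with a toy model of a single hop. This is a minimal sketch, not InfiniBand's actual link-layer protocol: the class `CreditLink`, its methods, and the buffer count are hypothetical names and values chosen for illustration.

```python
from collections import deque

class CreditLink:
    """Toy model of one hop: the sender transmits only while it holds credits."""

    def __init__(self, receiver_buffers):
        self.credits = receiver_buffers   # credits advertised by the receiver
        self.rx_queue = deque()           # packets buffered at the receiver
        self.sent = 0
        self.stalled = 0                  # send attempts deferred, never dropped

    def try_send(self, pkt):
        if self.credits == 0:
            self.stalled += 1             # sender waits instead of overflowing the peer
            return False
        self.credits -= 1
        self.rx_queue.append(pkt)
        self.sent += 1
        return True

    def receiver_drain(self, n=1):
        """Receiver frees up to n buffers and returns one credit per freed buffer."""
        for _ in range(min(n, len(self.rx_queue))):
            self.rx_queue.popleft()
            self.credits += 1

link = CreditLink(receiver_buffers=4)
pending = deque(range(10))
max_occupancy = 0
while pending:
    if link.try_send(pending[0]):
        pending.popleft()
    else:
        link.receiver_drain(2)            # receiver catches up, credits flow back
    max_occupancy = max(max_occupancy, len(link.rx_queue))

print(link.sent, link.stalled, max_occupancy)
```

Because a packet leaves the sender only when a receive buffer is known to be free, the receiver's queue never exceeds its advertised capacity, which is the sense in which the link is lossless.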


Background

• RDMA over Converged Ethernet (RoCE)
  - Essentially IB over Ethernet
  - Routability
    • RoCEv2 adds UDP and IP layers to provide that capability
  - Lossless L2
    • Priority-based Flow Control (PFC)
• Challenges
  - Complex and restrictive configuration
  - Perils of using PFC in a large-scale deployment
    • Head-of-line blocking, unfairness, congestion spreading, etc.
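
The head-of-line blocking problem follows from PFC pausing a whole priority class on a link rather than an individual flow. The sketch below is a deliberately simplified model of that behavior; the flow names, priority values, and the `transmit` helper are hypothetical.

```python
# Frames queued on one upstream link: (flow name, priority class).
# Flow names and priority values are hypothetical.
link_queue = [
    ("rdma-to-congested-port", 3),
    ("rdma-to-idle-port", 3),
    ("best-effort", 0),
]

def transmit(queue, paused_priorities):
    """Split queued frames into those sent and those held back by a PFC pause."""
    sent = [flow for flow, prio in queue if prio not in paused_priorities]
    held = [flow for flow, prio in queue if prio in paused_priorities]
    return sent, held

# The downstream switch pauses priority 3 because ONE destination port is congested.
sent, held = transmit(link_queue, paused_priorities={3})
print("sent:", sent)
print("held:", held)
```

The flow headed to the idle port is held along with the flow that caused the congestion, since both share the paused priority. This is the victim-flow effect the slide lists among the perils of PFC.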


Background

• Internet Wide Area RDMA Protocol (iWARP)
  - Enables RDMA over the existing TCP/IP stack
    • Leverages TCP/IP's reliability and congestion control mechanisms to ensure scalability, routability, and reliability
  - Only the NICs (RNICs) need to be specially built; no other changes are required
• Challenges
  - Specially built RNICs


Motivation

• Common challenges for RDMA deployment
  - Specific, non-backward-compatible devices
    • Incompatibility with Ethernet
    • Inflexibility of hardware devices
  - High cost
    • Equipment replacement
    • Extra burden of operation and management
• Is it possible to design software RDMA (SoftRDMA) over commodity Ethernet devices?


Motivation

• Software framework evolution
  - High-performance packet I/O
    • Intel DPDK, netmap, PacketShader I/O (PSIO)
  - High-performance user-level stacks
    • mTCP, IX, Arrakis, Sandstorm
  - Driving technical changes
    • Pre-allocation and reuse of memory resources
    • Zero-Copy
    • Kernel bypassing
    • Batch processing
    • Affinity and prefetching
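
Two of the techniques listed above, memory pre-allocation with reuse and batch processing, can be sketched together. This is an illustrative model only, loosely inspired by the burst-style receive found in high-performance packet I/O frameworks; `BufferPool`, `rx_burst`, and all sizes are hypothetical.

```python
class BufferPool:
    """Fixed set of buffers allocated once at start-up and recycled afterwards."""

    def __init__(self, count, size):
        self._free = [bytearray(size) for _ in range(count)]

    def get(self):
        return self._free.pop() if self._free else None

    def put(self, buf):
        self._free.append(buf)

    def free_count(self):
        return len(self._free)

def rx_burst(wire, pool, max_burst):
    """Pull up to max_burst packets in one call, amortizing per-call overhead."""
    batch = []
    while wire and len(batch) < max_burst:
        buf = pool.get()
        if buf is None:                  # pool exhausted: stop, do not allocate
            break
        payload = wire.pop(0)
        buf[:len(payload)] = payload     # same-length slice write into a reused buffer
        batch.append((buf, len(payload)))
    return batch

pool = BufferPool(count=8, size=64)
wire = [b"pkt%d" % i for i in range(20)]  # stand-in for packets arriving at the NIC
calls = 0
received = []
while wire:
    batch = rx_burst(wire, pool, max_burst=8)
    calls += 1
    for buf, n in batch:
        received.append(bytes(buf[:n]))   # consume the packet
        pool.put(buf)                     # recycle the buffer for the next burst

print(calls, len(received), pool.free_count())
```

Twenty packets are handled in three receive calls instead of twenty, and no buffer is allocated after start-up.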


Motivation

• RDMA and recent software advances share a very similar design philosophy
  - Memory resource pre-allocation
  - Zero-Copy
  - Kernel bypassing
• Can we design SoftRDMA on top of high-performance packet I/O?
  - Performance comparable to RDMA schemes
  - No customized devices required
  - Compatible with Ethernet infrastructure


SoftRDMA Design: Dedicated Userspace iWARP Stack

[Figure: three candidate software iWARP stack layouts. Each stacks Applications, Verbs API, RDMAP, DDP, MPA, TCP, IP, and NIC driver / Data Link, and the layouts differ in where the user space / kernel space boundary falls. The fully user-level layout runs over DPDK.]

• User-level iWARP + Kernel-level TCP/IP
• Kernel-level iWARP + Kernel-level TCP/IP
• User-level iWARP + User-level TCP/IP
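
To make the role of the MPA layer in this stack concrete: TCP delivers a byte stream, so MPA re-introduces message boundaries that DDP needs for placing data. The sketch below frames and unframes messages in the general shape of an MPA FPDU (16-bit length, payload, padding to a 4-byte boundary, CRC32c). It is a simplified illustration and not SoftRDMA's code: markers are omitted, the byte order chosen for the CRC field is an assumption, and the function names are hypothetical.

```python
import struct

def crc32c(data):
    """Bitwise CRC32c (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF

def mpa_frame(ulpdu):
    """Build a marker-less frame: length | payload | pad | CRC."""
    header = struct.pack("!H", len(ulpdu))
    pad = b"\x00" * (-(2 + len(ulpdu)) % 4)      # align header+payload to 4 bytes
    body = header + ulpdu + pad
    return body + struct.pack("!I", crc32c(body))  # CRC byte order: assumption

def mpa_unframe(stream):
    """Return (payload, remaining stream); raise ValueError on a CRC mismatch."""
    (length,) = struct.unpack_from("!H", stream, 0)
    padded = 2 + length + (-(2 + length) % 4)
    (crc,) = struct.unpack_from("!I", stream, padded)
    if crc != crc32c(stream[:padded]):
        raise ValueError("CRC mismatch")
    return stream[2:2 + length], stream[padded + 4:]

# Two upper-layer segments travel back to back in one TCP byte stream.
tcp_stream = mpa_frame(b"first DDP segment") + mpa_frame(b"abc")
seg1, rest = mpa_unframe(tcp_stream)
seg2, rest = mpa_unframe(rest)
print(seg1, seg2, rest)
```

The receiver recovers both segments from the concatenated stream without any out-of-band delimiter, which is what lets the layers above MPA treat TCP as if it carried discrete messages.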


SoftRDMA Design: Dedicated Userspace iWARP Stack

• In-kernel stack
  - Mode-switching overhead
  - Complexity of kernel modification
• User-level stack
  - Eliminates mode-switching overhead
  - More freedom in stack design


One-Copy versus Zero-Copy

• Seven steps for packets from NIC to App
• Two-Copy
  - Step 6: copied from the RX ring buffer for stack processing
  - Step 7: copied to the different Apps' buffers after processing

[Figure: kernel receive path from NIC to application. Labeled components: NIC (DMA engine, interrupt generator), device driver with RX ring buffer, hardware interrupt, NIC interrupt handler (netif_rx_schedule(), raised softirq check), softirq net_rx_action (poll_queue(), dev->poll), IP/TCP stack processing in the kernel, and App1/App2 read() in user space. Control flow and data flow are drawn separately. The 1st COPY occurs at Step 6 (stack processing) and the 2nd COPY at Step 7 (App recv).]


One-Copy versus Zero-Copy

[Figure: the same NIC-to-application receive path, with the 1st COPY (Step 6) and 2nd COPY (Step 7) marked.]

• One-Copy
  - Memory mapping between kernel and user space removes the copy in Step 7
• Zero-Copy
  - Sharing the DMA region removes the copy in Step 6
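
The difference between the three modes can be shown by counting copies and checking which buffers alias the DMA region. This is a conceptual sketch using Python `memoryview` objects as stand-ins for mapped memory; the `receive` function and its mode names are hypothetical.

```python
def receive(mode, dma_region, length):
    """Return (buffer seen by the app, number of copies) for a receive mode."""
    copies = 0
    dma_view = memoryview(dma_region)[:length]
    if mode == "zero-copy":
        stack_buf = dma_view                          # stack works in the DMA region
    else:
        stack_buf = memoryview(bytearray(dma_view))   # Step 6 copy
        copies += 1
    if mode == "two-copy":
        app_buf = memoryview(bytearray(stack_buf))    # Step 7 copy
        copies += 1
    else:
        app_buf = stack_buf                           # mapped to the app, not copied
    return app_buf, copies

dma = bytearray(b"payload-from-nic")
modes = ("two-copy", "one-copy", "zero-copy")
results = {m: receive(m, dma, len(dma)) for m in modes}

dma[0] = ord("X")   # the NIC reuses the DMA slot for a new packet
for mode, (buf, copies) in results.items():
    print(mode, copies, bytes(buf[:1]))
```

Only the Zero-Copy buffer observes the overwrite, because it still aliases the DMA region. That aliasing is what makes Zero-Copy cheap and also what ties up the DMA slot until the application is finished with the data.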



One-Copy versus Zero-Copy

• Two obstacles to Zero-Copy in stack processing
  - Where should input packets for different Apps be placed, and how should they be managed?
    • Before stack processing, the stack does not know the application-designated location for an input packet
  - Is the DMA region large enough, or can it be reused fast enough, to hold input packets?
    • The DMA region is finite and fixed, and can hold only up to thousands of input packets
• SoftRDMA adopts One-Copy
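
The second obstacle, a finite DMA region, can be illustrated with a small simulation. Under Zero-Copy a slot stays occupied until the application consumes the packet. Under One-Copy the payload is copied out and the slot is recycled at once. The model below is a sketch under simplifying assumptions (one arrival per tick, ample application memory, a hypothetical `simulate` function and parameter values) and is not a measurement of SoftRDMA.

```python
from collections import deque

def simulate(mode, ring_slots, arrivals, app_period):
    """Count drops when the app consumes one packet every app_period ticks."""
    in_ring = 0            # DMA slots currently occupied
    app_queue = deque()    # packets waiting for the application
    dropped = 0
    for tick in range(arrivals):
        if in_ring < ring_slots:
            in_ring += 1
            app_queue.append(tick)
            if mode == "one-copy":
                in_ring -= 1          # payload copied out, slot recycled at once
        else:
            dropped += 1              # no free DMA slot for the incoming packet
        if tick % app_period == app_period - 1 and app_queue:
            app_queue.popleft()
            if mode == "zero-copy":
                in_ring -= 1          # slot released only when the app is done
    return dropped

zero_copy_drops = simulate("zero-copy", ring_slots=4, arrivals=20, app_period=2)
one_copy_drops = simulate("one-copy", ring_slots=4, arrivals=20, app_period=2)
print(zero_copy_drops, one_copy_drops)
```

With a slow consumer, the Zero-Copy run exhausts its slots and starts dropping packets, while the One-Copy run trades a memory copy for immediate slot reuse.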


SoftRDMA Threading Model

• Traditional multi-threading model (c1)
  - One thread for App processing, the other for packet RX/TX
  - Good for throughput thanks to batch processing
  - Higher latency due to thread switching and communication cost

[Figure (c1): two threads (Thread 1, Thread 2) with App and TCP/IP blocks, linked by event conditions and batched syscalls.]


SoftRDMA Threading Model

• Run-to-completion threading model (c2)
  - Runs all stages (packet RX/TX, App processing, …) to completion
  - Does improve latency
  - Sophisticated processing may cause packet loss

[Figure (c2): Thread 1 and Thread 2, with App, TCP/IP, and NIC driver blocks in user space.]


SoftRDMA Threading Model

• SoftRDMA threading model (c3)
  - One thread for packet RX, including One-Copy; the other for App processing and packet TX
  - Accelerates the packet receiving process
  - App processing and packet TX run within one thread to improve efficiency and reduce latency

[Figure (c3): Thread 1 and Thread 2, with App and TCP/IP blocks.]
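
The division of labor in model (c3) can be sketched with two threads and one hand-off queue. This only illustrates the split described on the slide and is not SoftRDMA's implementation: the queue, the sentinel, and the packet contents are hypothetical, and Python threads stand in for pinned native threads.

```python
import queue
import threading

rx_ring = [b"req-%d" % i for i in range(5)]  # stand-in for packets in the RX ring
app_inbox = queue.Queue()                     # hand-off between the two threads
tx_log = []                                   # stand-in for the NIC TX queue
STOP = object()                               # sentinel that ends the demo

def rx_thread():
    """Thread 1: packet RX only, including the single copy into app memory."""
    for pkt in rx_ring:
        app_inbox.put(bytearray(pkt))         # the One-Copy step
    app_inbox.put(STOP)

def app_tx_thread():
    """Thread 2: application processing and packet TX in the same thread."""
    while True:
        msg = app_inbox.get()
        if msg is STOP:
            break
        reply = msg.upper()                   # application processing
        tx_log.append(reply)                  # TX follows with no extra thread hop

threads = [
    threading.Thread(target=rx_thread),
    threading.Thread(target=app_tx_thread),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print([bytes(m) for m in tx_log])
```

Keeping RX on its own thread means slow application work cannot stall the receive path, while the reply is sent from the thread that produced it, avoiding a second hand-off.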


SoftRDMA Implementation

• 20K lines of code, 7.8K of them new
  - DPDK I/O
  - User-level TCP/IP based on the lwIP raw API
  - MPA/DDP/RDMAP layers of iWARP

• RDMA Verbs


SoftRDMA Performance

• Experiment configuration
  - Dell PowerEdge R430
  - Intel 82599ES 10GbE NIC
  - Chelsio T520-SO-CR 10GbE RNIC
• Four RDMA implementation schemes
  - Hardware-supported RNIC (iWARP RNIC)
  - User-level iWARP based on kernel sockets (Kernel Socket)
  - User-level iWARP based on the DPDK-based lwIP sequential API (Sequential API)
  - User-level iWARP based on the DPDK-based lwIP raw API (SoftRDMA)


SoftRDMA Implementation

• Short Message (≤ 10KB) Transfer
  - Latency is close
    • SoftRDMA: 6.63 µs/64B, 6.80 µs/1KB, 52.20 µs/10KB
    • RNIC: 3.59 µs/64B, 5.29 µs/1KB, 16.27 µs/10KB
  - Throughput falls far behind
    • Acceptable for short-message delivery


SoftRDMA Implementation

• Long Message (10KB-500KB) Transfer
  - Latency is close
    • SoftRDMA: 101.36 µs/100KB, 500.06 µs/500KB
    • RNIC: 93.45 µs/100KB, 432.50 µs/500KB
  - Throughput is close
    • SoftRDMA: 1461.71 Mbps/10KB, 7893.31 Mbps/100KB
    • RNIC: 8854.16 Mbps/10KB, 8917.44 Mbps/100KB
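
To put the reported figures side by side, the short script below computes SoftRDMA-to-RNIC ratios directly from the numbers on the two performance slides. It introduces no new measurements; the dictionary names are just illustrative.

```python
# Message size -> (SoftRDMA, RNIC); all values are taken from the slides.
latency_us = {
    "64B": (6.63, 3.59),
    "1KB": (6.80, 5.29),
    "10KB": (52.20, 16.27),
    "100KB": (101.36, 93.45),
    "500KB": (500.06, 432.50),
}
throughput_mbps = {
    "10KB": (1461.71, 8854.16),
    "100KB": (7893.31, 8917.44),
}

latency_ratio = {s: soft / rnic for s, (soft, rnic) in latency_us.items()}
throughput_share = {s: soft / rnic for s, (soft, rnic) in throughput_mbps.items()}

for size, ratio in latency_ratio.items():
    print(f"{size:>6}: SoftRDMA latency is {ratio:.2f}x the RNIC latency")
for size, share in throughput_share.items():
    print(f"{size:>6}: SoftRDMA reaches {share:.1%} of RNIC throughput")
```

The ratios make the trend explicit: the latency gap is about 1.85x at 64B and about 3.2x at 10KB, then narrows to roughly 1.08x at 100KB, while throughput rises from about 16.5% of the RNIC's at 10KB to about 88.5% at 100KB.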


Next work

• A more stable and robust user-level stack
• Use NIC hardware features to accelerate protocol processing
  - TSO/LSO/LRO
  - Memory-based scatter/gather for Zero-Copy
• More comparison experiments
  - Tests among SoftRDMA, iWARP NIC, and RoCE NIC
  - Tests on 40GbE/50GbE devices


Conclusion

• SoftRDMA: a high-performance software RDMA implementation over commodity Ethernet
  - A dedicated userspace iWARP stack based on high-performance network I/O
  - One-Copy
  - A carefully designed threading model


Thanks! Q&A