VIA and Its Extension To TCP/IP Network
Yingping Lu (lu@cs.umn.edu)
Based on the paper “Queue Pair IP, …” by Philip Buonadonna
Motivation
High-performance computing and clustering applications require a high-throughput, low-latency communication facility.
Traditional TCP/IP is not designed for high-throughput, low-latency communication.
Application software has not kept pace with the increase in I/O speed.
Sources of overhead: memory copies, checksum computation, interrupts, context switching.
[Figure: Bandwidth Comparison. Throughput (MB/s) versus message length (bytes, 256 to 32768) for TCP/IP and VIA.]
VIA Solution
VIA (Virtual Interface Architecture) is an industry standard developed by Microsoft, Compaq, and Intel.
Key features of VIA:
Reduce memory copies (zero-copy)
Direct user-level access to NIC hardware
Eliminate the OS kernel from the critical path
Collapse the layers of the ISO/OSI model
Offload CPU processing to an intelligent NIC
VIA Components
Consumer: the end entity that uses VIA functions to communicate; can be user-level or kernel-level; uses VIPL for programming.
VI User Agent: implements the OS-bypass agent.
Kernel Agent: device driver; handles security and OS-related issues.
VIA-capable NIC (Channel Adapter): implements VIA communication.
Programming Abstraction
Queue pair components: send queue, receive queue, completion queue (status).
Data movement operations: Send/Receive, RDMA Read, RDMA Write.
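The relationship between the work queues and the completion queue can be sketched as follows. This is a toy model only: `QueuePair`, `nic_process`, and the status tuples are hypothetical names chosen for illustration and are not VIPL identifiers.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class QueuePair:
    """Toy queue pair; field names are illustrative, not VIPL identifiers."""
    send_queue: deque = field(default_factory=deque)
    recv_queue: deque = field(default_factory=deque)


def nic_process(qp: QueuePair, completion_queue: deque) -> None:
    """Stand-in for the NIC: drain both work queues and record one status each.

    Simplification: a real receive element completes only when data arrives.
    """
    while qp.send_queue:
        completion_queue.append(("send", qp.send_queue.popleft(), "done"))
    while qp.recv_queue:
        completion_queue.append(("recv", qp.recv_queue.popleft(), "done"))


qp = QueuePair()
completion_queue = deque()
qp.send_queue.append("wqe-s0")    # consumer posts a send element
qp.recv_queue.append("wqe-r0")    # consumer posts a receive element
nic_process(qp, completion_queue)
statuses = list(completion_queue)
```

The consumer only ever touches the queues; the status of finished work comes back through the completion queue rather than through a system call.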
Memory Access
Memory registration: memory must be registered before use; the system pins the memory region; the NIC uses DMA to transfer data from memory to the NIC.
Memory protection: registered memory is associated with a VI consumer and is valid only for that consumer.
Gather/scatter lists:
Gather list: a list of registered source data buffers (read).
Scatter list: a list of registered destination data buffers (write).
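A minimal sketch of the registration and protection rules above, assuming a simple table keyed by consumer name. `RegistrationTable`, `MemoryHandle`, and `check_access` are hypothetical names; page pinning is only noted in a comment, not performed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryHandle:
    owner: str      # the VI consumer that registered the region
    start: int      # first byte address of the region
    length: int     # region size in bytes


class RegistrationTable:
    """Hypothetical stand-in for the kernel agent's registration state."""

    def __init__(self):
        self._regions = []

    def register(self, owner: str, start: int, length: int) -> MemoryHandle:
        # A real kernel agent would also pin the pages here.
        handle = MemoryHandle(owner, start, length)
        self._regions.append(handle)
        return handle

    def check_access(self, owner: str, handle: MemoryHandle,
                     address: int, size: int) -> bool:
        """Allow access only inside a region registered by the same consumer."""
        if handle not in self._regions or handle.owner != owner:
            return False
        return (handle.start <= address
                and address + size <= handle.start + handle.length)


table = RegistrationTable()
handle = table.register("consumer-A", start=4096, length=8192)
own_access = table.check_access("consumer-A", handle, 4096, 8192)
foreign_access = table.check_access("consumer-B", handle, 4096, 16)
overrun_access = table.check_access("consumer-A", handle, 8192, 8192)
```

Here `own_access` is allowed, while the access by another consumer and the access that runs past the end of the region are both refused.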
Descriptor
A work queue element placed into a queue pair (send or receive queue).
Contains a control segment and a list of address segments.
Specifies the operation command, memory address, and size.
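As an illustration, a descriptor can be modeled as a control field plus a list of address segments, and a gather list can be resolved by concatenating the named buffers. The field names and the `gather` helper are assumptions made for this sketch; the real descriptor layout is defined by the VIA specification.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AddressSegment:
    offset: int   # start of the buffer inside registered memory
    length: int   # number of bytes in this segment


@dataclass
class Descriptor:
    operation: str                    # control segment: "send", "recv", ...
    segments: List[AddressSegment]    # gather list (send) or scatter list (recv)

    def total_length(self) -> int:
        return sum(seg.length for seg in self.segments)


def gather(memory: bytes, desc: Descriptor) -> bytes:
    """Concatenate the source buffers named by a gather list, in order."""
    return b"".join(memory[s.offset:s.offset + s.length] for s in desc.segments)


memory = b"HEADER....PAYLOAD"
desc = Descriptor("send", [AddressSegment(0, 6), AddressSegment(10, 7)])
message = gather(memory, desc)
```

The two non-contiguous buffers leave the host as one message, which is what lets a consumer send a header and a payload without first copying them into a single buffer.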
Doorbell
An asynchronous mechanism to notify the VI NIC that a new work queue element has been posted.
The doorbell can be a register in the NIC accessed by both the CPU and the NIC.
[Figure: diagram with the labels VIPL, VI NIC, and Descriptor.]
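One way to picture the doorbell is as a shared counter: the host increments it after posting, and the NIC reads and clears it to learn how many new elements to fetch. This is a sketch under that assumption; actual doorbell encodings differ between NICs, and the `Doorbell` class is hypothetical.

```python
from collections import deque


class Doorbell:
    """Toy doorbell: a counter the host increments and the NIC drains."""

    def __init__(self):
        self.pending = 0

    def ring(self, count: int = 1) -> None:
        # Host side: announce newly posted work queue elements.
        self.pending += count

    def drain(self) -> int:
        # NIC side: read how many elements were announced, then reset.
        count, self.pending = self.pending, 0
        return count


send_queue = deque()
doorbell = Doorbell()

# Host posts two descriptors, ringing after each post.
for name in ("desc-0", "desc-1"):
    send_queue.append(name)
    doorbell.ring()

# NIC later notices the doorbell and fetches exactly that many elements.
fetched = [send_queue.popleft() for _ in range(doorbell.drain())]
```

The host never waits for the NIC: ringing is a single register write, and the NIC picks up the work whenever it next services the doorbell.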
Operation Example - Send/Receive
Sender
Consumer: register the send buffer; post a Send work queue element.
Channel Adapter: send out the data and header; data are retrieved directly from consumer memory.
Receiver
Consumer: register the receive buffer; post a receive buffer in the receive queue.
Channel Adapter: receive packets from the sender; find a receive queue element in the receive queue; move data directly to the buffer specified in that element.
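The send/receive steps above can be walked through with a small simulation. `Endpoint` and its methods are hypothetical names; the "wire" is a direct method call, and the `bytes(buf)` copy stands in for the packet on the network rather than for a host-side copy.

```python
from collections import deque


class Endpoint:
    """Toy consumer plus channel adapter; names are illustrative only."""

    def __init__(self):
        self.registered = set()     # ids of registered buffers
        self.recv_queue = deque()   # posted receive buffers

    def register(self, buf: bytearray) -> None:
        self.registered.add(id(buf))

    def post_recv(self, buf: bytearray) -> None:
        if id(buf) not in self.registered:
            raise ValueError("receive buffer is not registered")
        self.recv_queue.append(buf)

    def post_send(self, buf: bytearray, peer: "Endpoint") -> int:
        if id(buf) not in self.registered:
            raise ValueError("send buffer is not registered")
        # bytes(buf) models the packet on the wire.
        return peer.deliver(bytes(buf))

    def deliver(self, payload: bytes) -> int:
        # Adapter side: take the next receive element and fill it directly.
        if not self.recv_queue:
            raise RuntimeError("no receive buffer posted")
        target = self.recv_queue.popleft()
        if len(payload) > len(target):
            raise ValueError("receive buffer too small")
        target[:len(payload)] = payload
        return len(payload)


sender, receiver = Endpoint(), Endpoint()
out_buf, in_buf = bytearray(b"hello"), bytearray(8)
sender.register(out_buf)
receiver.register(in_buf)
receiver.post_recv(in_buf)              # receive must be posted first
sent = sender.post_send(out_buf, receiver)
```

Note the ordering constraint the sketch makes visible: if the receiver has not posted a buffer before the data arrives, the adapter has nowhere to place it.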
Operation Example - RDMA Write
Initiator
Consumer: register the sending buffer address; get the receiver’s address; post an RDMA Write.
Channel Adapter: send out the data with a header (the operation and the receiving address); data are retrieved directly from the sender buffer.
Receiver
Consumer: register the receiving buffer address; send the address, R-key, and length to the initiator.
Channel Adapter: receive the data; check the validity of the address in the RDMA header; move data directly to the memory specified in the RDMA header.
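The receiver-side validity check can be sketched as below. The R-key scheme shown (a small integer mapped to a registered range) is an assumption for illustration, and `RdmaTarget` is a hypothetical class, not part of any VIA library.

```python
class RdmaTarget:
    """Toy receiver-side adapter; the R-key scheme here is an assumption."""

    def __init__(self, size: int):
        self.memory = bytearray(size)
        self._regions = {}      # r_key -> (start, length)
        self._next_key = 1

    def register(self, start: int, length: int) -> int:
        if start < 0 or length < 0 or start + length > len(self.memory):
            raise ValueError("region outside memory")
        r_key = self._next_key
        self._next_key += 1
        self._regions[r_key] = (start, length)
        return r_key

    def rdma_write(self, r_key: int, address: int, data: bytes) -> bool:
        """Place data at address if the header names a valid registered range."""
        region = self._regions.get(r_key)
        if region is None:
            return False
        start, length = region
        if address < start or address + len(data) > start + length:
            return False
        self.memory[address:address + len(data)] = data
        return True


target = RdmaTarget(64)
key = target.register(start=16, length=8)           # receiver advertises (16, key, 8)
ok = target.rdma_write(key, 16, b"ABCDEFGH")        # write fits the region
rejected = target.rdma_write(key, 20, b"ABCDEFGH")  # would overrun the region
```

Unlike Send/Receive, no receive element is consumed on the target: the header alone tells the adapter where the data goes, which is why the address check matters.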
Summary of VIA
Goal: low latency and high throughput through direct access to the NIC and zero-copy.
Architecture components: consumer (VIPL), UA, KA, VI NIC.
Main concepts: queue pairs, memory pinning, gather/scatter, descriptor, doorbell.
Operations: Send/Receive, RDMA Read, RDMA Write.
Why QP/IP
TCP/IP networks are robust and ubiquitous.
However, TCP/IP is not designed for high-performance, low-latency communication.
The queue pair abstraction provides a way to offload CPU processing, shorten the critical data path, and provide zero-copy memory access.
Integrating QP and IP may reduce latency and improve throughput between end-node applications connected through a TCP/IP network.
Challenges to QP/IP
Provide a VIPL that supports QP/IP
Integrate connection setup
Handle message segmentation
Implement the TCP/IP mechanisms on the NIC
Handle message boundaries for TCP
Handle zero-copy in the event of packet loss
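The message-boundary challenge arises because TCP delivers a byte stream, while queue pairs deal in whole messages. One common way to recover boundaries is a length prefix on each message; the sketch below shows that generic technique and is not necessarily how QP/IP solves the problem. The 4-byte header format and the `frame`/`Deframer` names are assumptions.

```python
import struct

HEADER = struct.Struct("!I")  # 4-byte big-endian length prefix (an assumption)


def frame(message: bytes) -> bytes:
    """Prefix a message with its length so the receiver can find its end."""
    return HEADER.pack(len(message)) + message


class Deframer:
    """Rebuilds whole messages from arbitrarily split stream chunks."""

    def __init__(self):
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> list:
        self._buffer += chunk
        messages = []
        while len(self._buffer) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buffer)
            end = HEADER.size + length
            if len(self._buffer) < end:
                break               # message not complete yet
            messages.append(bytes(self._buffer[HEADER.size:end]))
            del self._buffer[:end]
        return messages


stream = frame(b"first") + frame(b"second message")
deframer = Deframer()
received = []
for i in range(0, len(stream), 7):   # 7-byte chunks ignore message boundaries
    received += deframer.feed(stream[i:i + 7])
```

Even though the stream is cut at arbitrary points, the receiver hands back exactly the two original messages, each of which could then complete one receive queue element.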
QPIP Components
FSMs: Doorbell FSM, Sched/XMT FSM, RECV FSM, Mgmt FSM
Major data abstractions: QPs, CQs, TCP Control Block (TCB)
QPIP Prototype
Three components:
Application library: PostSend(), PostRecv(), Poll(), Wait()
Kernel driver: initialization, address mapping mechanism, interrupt service
Network interface firmware: implements the TCP, UDP, and IPv6 protocols
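The difference between a polling call and a waiting call on a completion queue can be illustrated as follows. This assumes that Poll() returns immediately and Wait() blocks until a completion arrives, which the slide does not spell out; the Python class below is a hypothetical analogue built on the standard `queue` module, not the prototype's library.

```python
import queue
import threading


class CompletionQueue:
    """Toy completion queue with non-blocking and blocking retrieval."""

    def __init__(self):
        self._q = queue.Queue()

    def complete(self, status: str) -> None:
        # Called by the (simulated) NIC when a work element finishes.
        self._q.put(status)

    def poll(self):
        # Return a completion if one is ready, else None, without blocking.
        try:
            return self._q.get_nowait()
        except queue.Empty:
            return None

    def wait(self, timeout=None):
        # Block until a completion arrives; raises queue.Empty on timeout.
        return self._q.get(timeout=timeout)


cq = CompletionQueue()
before = cq.poll()                      # nothing has completed yet
threading.Timer(0.05, cq.complete, args=("send done",)).start()
after = cq.wait(timeout=2.0)            # blocks until the timer fires
```

Polling keeps latency low at the cost of CPU time spent spinning, while waiting frees the CPU but depends on an interrupt or wake-up, which is consistent with the kernel driver providing an interrupt service.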
Summary
Integrates the QP concept from VIA with the ubiquitous TCP/IP network.
Provides low-latency, high-throughput communication for SANs.
QP/IP contains the Doorbell FSM, Sched/XMT FSM, RECV FSM, and Mgmt FSM, along with the QP, CQ, and TCB data structures.
Demonstrates comparable performance and much lower CPU utilization with modest hardware.
Programmability adds the flexibility to adapt to the evolution of TCP/IP and to scheduling requirements.
Issues
How to integrate TOE into the mechanism?
How to effectively handle message boundaries in TCP to support upper-level applications, e.g., iSCSI?
How to handle segmentation?
How to support zero-copy in the case of packet loss?
How to extend this to a WAN environment (more unpredictability, fluctuation of latency and available bandwidth, congestion, LFN)?
How to effectively support OSD communication?