2004 MAPLD/205, Tirat-Gefen. Mapping of scalable RDMA protocols to ASIC/FPGA platforms. Yosef Gavriel Tirat-Gefen, PhD, Senior Member IEEE, Chief Scientist, Castel Systems Inc. & Dept. Physics and Astronomy, George Mason University, Fairfax, VA. [email protected]

Mapping of scalable RDMA protocols to ASIC/FPGA platforms

Yosef Gavriel Tirat-Gefen, PhD

Senior Member IEEE

Chief Scientist

Castel Systems Inc.

& Dept. Physics and Astronomy

George Mason University

Fairfax, VA

[email protected]

Presentation Overview

• Motivation

• TCP Off-loading

• Zero-copying

• RDMA protocol

• RDMA protocol stack

• Structure of a RDMA card

• Results

• Conclusion

Motivation

Enabling high-bandwidth WAN applications

[Figure: a WAN connecting two supercomputers or server farms, two terabyte storage systems, and a workstation]

Applications

• Distributed command and control

• Signal processing (e.g. RADAR)

• Real-time sharing of intelligence data

• Distributed large-scale computation and simulation of aerospace problems

• Extension of storage area networks over a wide area network (WAN)

• Enabling technology for modern supercomputing installations

Traditional TCP/IP Networking

[Figure: two end hosts, each with the stack Application/O.S., TCP, Layer 3 (IP), Layer 2 (MAC), Layer 1 (PHY), connected through a router drawn with two Layer 3 / Layer 2 / Layer 1 stacks]

Standard Data Flow on TCP/IP

[Figure: data path from the Application A memory space through the TCP buffer/stack memory space and L3, L2, L1, across the WAN/LAN, then up through L1, L2, L3 and the TCP buffer/stack memory space to the Application B memory space]

Standard Data Flow on TCP/IP

• Traditional TCP/IP copies data from application memory to the TCP memory buffer

• CPU cycles are lost to buffer copying

• The CPU is overwhelmed at rates above 2.5 Gbps

• TCP/IP off-loading helps, but it does not solve the problem on the receiver side
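
To give a feel for why buffer copying becomes a bottleneck, the sketch below estimates the memory traffic generated by copying at a given line rate. It is a back-of-the-envelope model, not a measurement from the presentation: the function name is hypothetical, and it assumes every copy reads each payload byte once and writes it once, ignoring caches, headers and protocol processing.

```python
def copy_traffic_mbytes_per_s(line_rate_gbps: float, copies: int) -> float:
    """Estimate memory traffic (MB/s) caused by buffer-to-buffer copies.

    Assumption: each copy reads every payload byte once and writes it once,
    so one copy moves twice the payload rate through the memory system.
    """
    payload_mbytes_per_s = line_rate_gbps * 1e9 / 8 / 1e6  # bits/s to MB/s
    return 2 * copies * payload_mbytes_per_s


if __name__ == "__main__":
    for rate_gbps in (1.0, 2.5, 10.0):
        traffic = copy_traffic_mbytes_per_s(rate_gbps, copies=1)
        print(f"{rate_gbps:>4} Gbps, one copy: {traffic:.1f} MB/s of memory traffic")
```

Under these assumptions, a single copy at 2.5 Gbps already corresponds to 625 MB/s of combined read and write traffic, and the figure grows linearly with both the line rate and the number of copies.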

TCP/IP off-load processing

Application/O.S.

TCP

Layer 3 (IP)

Layer 2 (MAC)

Layer 1 (Phy)

Application/O.S.

TCP/IP offload Processor (TOE)

Mapped to hardware

Zero-copying and TCP offloading processing

TCP off-load Processor TOE/NIC Card

Host Main Memory

Host CPU Cache Memory

Host CPU

Network buffer

Receive Buffer

WAN/LAN

Zero-copying and TCP offloading processing

• Zero-copying is still not achieved, because the receive buffer is still copied to the application memory space

• TCP/IP off-loading is not scalable

• RDMA protocols provide a solution
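
The difference between a copying and a zero-copy hand-off can be illustrated in host software. The sketch below is only an analogy using Python's built-in `bytearray` and `memoryview`; the buffer contents and variable names are made up for the example, and it says nothing about how a TOE or RDMA card is actually built.

```python
# A stand-in for a receive buffer holding a 6-byte header and a payload.
rx_buffer = bytearray(b"HEADERpayload-bytes")

# Copying path: slicing allocates new storage and duplicates the payload bytes.
copied = bytes(rx_buffer[6:])

# Zero-copy path: the memoryview slice refers to the same memory as rx_buffer.
view = memoryview(rx_buffer)[6:]

# Changing the underlying buffer shows which object shares its storage.
rx_buffer[6] = ord("P")

print(copied)          # the copy keeps the old byte
print(view.tobytes())  # the view sees the new byte
```

The copy is independent of the receive buffer, which is exactly the duplicated work the slide refers to, while the view gives the consumer access to the data in place.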

RDMA data-flow for WAN applications

[Figure: Host CPU A and Host CPU B, each with a host memory containing the application memory space and an RDMA NIC card; the two RDMA NIC cards are connected through the WAN]

Scalable WAN-RDMA for bandwidths above 10 Gbps

RDMA NIC Card for WAN

RDMA Engine

Rx Buffer

MAC PHY WAN

10 Gbps links

Host

> 10 Gbps

DMA channel

Tx Buffer

The RDMA protocol layers and our prototype

ULP (e.g. iSCSI, NFS)

RDMA

DDP

MPA SCTP

TCP

Layer 3 (e.g. IP)

Layer 2 (MAC)

Layer 1 (PHY)

FPGA implementation

FPGA and off-the-shelf MAC/PHY chips

Running on Host CPU
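
The DDP (Direct Data Placement) layer in the stack above is what lets incoming data land directly in application memory. The sketch below models its tagged-buffer idea in software: each segment carries a steering tag naming a registered buffer and an offset into it, so segments can be placed as they arrive, in any order, without an intermediate receive buffer. The class and field names are hypothetical and the model omits headers, protection checks and completion handling; it is not the engine described in this presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedSegment:
    stag: int       # steering tag naming a registered buffer (hypothetical field)
    offset: int     # byte offset inside that buffer
    payload: bytes  # data to place


class PlacementEngine:
    """Toy model of tagged direct data placement into registered buffers."""

    def __init__(self) -> None:
        self._buffers = {}

    def register(self, stag: int, buffer: bytearray) -> None:
        # Keep a view of the application's buffer, not a private copy.
        self._buffers[stag] = memoryview(buffer)

    def place(self, seg: TaggedSegment) -> None:
        target = self._buffers[seg.stag]
        end = seg.offset + len(seg.payload)
        if seg.offset < 0 or end > len(target):
            raise ValueError("segment falls outside the registered buffer")
        # Write straight into application memory at the stated offset.
        target[seg.offset:end] = seg.payload


app_buffer = bytearray(12)
engine = PlacementEngine()
engine.register(stag=7, buffer=app_buffer)

# Segments arrive out of order; placement does not depend on arrival order.
for seg in (
    TaggedSegment(7, 8, b"WAN!"),
    TaggedSegment(7, 0, b"RDMA"),
    TaggedSegment(7, 4, b"over"),
):
    engine.place(seg)

print(app_buffer.decode())
```

Because every segment is self-describing, the receiver needs no reassembly buffer for tagged data, which is the property that removes the receive-side copy.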

Overall Hardware/Firmware Organization of the WAN RDMA card

[Figure: block diagram with the following blocks]

PCI-Express/Hyper-transport Interface

RDMA Protocol Engine

SCTP Protocol Engine

Data stream split/join unit

Layer 3 (IP) Processor

SAR (x4)

10GE/OC-192 framer (x4)

Tx Memory controller

Rx Memory controller

Rx Memory Bank (x2)

PHY (x4)

IP/Firmware module
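
The block diagram names a data stream split/join unit feeding four framer/PHY paths but does not say how it stripes data. One plausible scheme, shown purely as an assumption, is round-robin striping of fixed-size cells tagged with sequence numbers, so that the far end can restore order regardless of per-link skew. The function names, cell size and link count below are illustrative only.

```python
from typing import List, Tuple

Cell = Tuple[int, bytes]  # (sequence number, payload chunk)


def split_stream(data: bytes, n_links: int, cell_size: int) -> List[List[Cell]]:
    """Stripe data across n_links in round-robin order, one cell at a time."""
    links: List[List[Cell]] = [[] for _ in range(n_links)]
    for seq, start in enumerate(range(0, len(data), cell_size)):
        links[seq % n_links].append((seq, data[start:start + cell_size]))
    return links


def join_stream(links: List[List[Cell]]) -> bytes:
    """Merge cells from all links and restore order by sequence number."""
    cells = sorted((cell for link in links for cell in link), key=lambda c: c[0])
    return b"".join(payload for _, payload in cells)


if __name__ == "__main__":
    data = bytes(range(256)) * 4
    links = split_stream(data, n_links=4, cell_size=48)
    # Reading the links in a different order models skew between physical paths.
    assert join_stream(list(reversed(links))) == data
    print([len(link) for link in links])
```

With four 10 Gbps links, a scheme of this kind is one way an aggregate rate above 10 Gbps could be presented to the host, at the cost of reordering logic and buffering at the join side.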

Present Results

• Currently using Xilinx Virtex-II/Virtex-II Pro as target devices for our cores

• Data indicate that most of the key cores will fit in one FPGA device (Virtex-II)

• The aggregate of all cores currently spans several FPGAs

• Inter-device communication is an issue; the PCB design needs care

• We are currently trying to accommodate most of the cores in one FPGA

• Most of the cores will be made available free of charge to researchers in non-profit or government organizations

Conclusion

• The advent of Hyper-transport/PCI-Express and VITA (embedded computing) standards will enable local I/O bandwidths above 10 Gbps

• Extending the RDMA protocol enables large bandwidths over wide area networks

• The proposed cores are intended to meet the natural growth of bandwidth requirements in commercial, defense and aerospace applications