Embedded Transport Acceleration Intel Xeon Processor as a Packet Processing Engine Abhishek Mitra Professor: Dr. Bhuyan

Embedded Transport Acceleration

Intel Xeon Processor as a Packet Processing Engine

Abhishek Mitra

Professor: Dr. Bhuyan

Why do we need PPE / TOE

• The problem is TCP termination
  – Involves reconstructing a stream of coherent data from many independent packets
  – Compute-intensive task
  – Requires roughly 10 times the performance of TCP routing
  – A 400-MHz MIPS CPU consumes all of its cycles trying to terminate a Fast Ethernet 100 Mbps channel
  – A 200-MHz IXP1200 has similar TCP performance
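
A back-of-the-envelope calculation puts the 400-MHz figure in perspective. The sketch below only divides the clock rate by the line rate. The function name is hypothetical, and the 1500-byte frame size is an assumed typical Ethernet payload, not a number from the slides.

```python
def cycles_per_byte(cpu_hz: float, link_bps: float) -> float:
    """CPU cycles available per byte when the link runs at full rate."""
    bytes_per_second = link_bps / 8
    return cpu_hz / bytes_per_second


# Figures from the slide: 400-MHz CPU, 100-Mbps Fast Ethernet.
budget = cycles_per_byte(400e6, 100e6)
print(budget)         # cycles available per byte
print(budget * 1500)  # cycles per 1500-byte frame (assumed frame size)
```

If the link is saturated and the CPU is fully consumed, this budget of about 32 cycles per byte is roughly what TCP termination costs on that processor.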

Packet Processing Engine

• Computing and Memory Resources
  – Necessary for communication processing
  – Scalable (throughput)
  – Extensible (newer protocols and applications)
  – Programmable (changing standards)

• Intel Xeon is extensible and programmable
  – Future (multi-core in a single chip)

• This is a particular reason why ETA is being researched

ETA S/W Architecture

• Host and Server Partitioning
  – Host
    • General purpose OS and application processes
  – PPE
    • All communication centric tasks are processed
  – Interface
    • Asynchronous queues in a cache-coherent, shared host memory
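
The asynchronous queue idea can be illustrated with a single-producer, single-consumer ring. This is a minimal sketch in Python, not the ETA implementation: the class and method names are hypothetical, and a real shared-memory version would also need memory-ordering guarantees that Python does not model.

```python
class RingQueue:
    """Fixed-size single-producer, single-consumer ring (illustrative only)."""

    def __init__(self, capacity: int) -> None:
        self.slots = [None] * capacity
        self.head = 0  # next slot to read; advanced only by the consumer
        self.tail = 0  # next slot to write; advanced only by the producer

    def post(self, item) -> bool:
        """Producer side: enqueue without blocking; False if the ring is full."""
        nxt = (self.tail + 1) % len(self.slots)
        if nxt == self.head:
            return False
        self.slots[self.tail] = item
        self.tail = nxt
        return True

    def poll(self):
        """Consumer side: dequeue without blocking; None if the ring is empty."""
        if self.head == self.tail:
            return None
        item = self.slots[self.head]
        self.slots[self.head] = None
        self.head = (self.head + 1) % len(self.slots)
        return item


# The host posts work and continues; the PPE picks it up later.
q = RingQueue(4)
q.post("send descriptor A")
q.post("send descriptor B")
print(q.poll())
```

Because each index is advanced by only one side, neither side has to wait for the other, which is the property an asynchronous host-to-PPE interface relies on. One slot is left unused so that a full ring can be told apart from an empty one.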

ETA S/W Stack

[Figure: ETA software stack, with the labels NATIVE TCP/IP and ACCELERATED]

ETA host-engine interface

• Set of queuing structures (DTI)
• DTI (Direct Transport Interfaces)
  – Based on InfiniBand and VI Architecture
  – DTI also supports TCP connection commands
  – Buffer pools to buffer TCP streams
  – Parent DTIs listen for new TCP connections
  – When the ETA host accepts a new connection, a child DTI is created to service the TCP session
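
The parent/child relationship resembles a listening socket that hands out per-connection sockets. The sketch below models only that relationship. Every class, field, and method name is hypothetical, and the list standing in for a buffer pool is a simplification.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Peer = Tuple[str, int]  # (remote address, remote port)


@dataclass
class ChildDTI:
    peer: Peer
    stream: List[bytes] = field(default_factory=list)  # stands in for a buffer pool


@dataclass
class ParentDTI:
    port: int
    pending: List[Peer] = field(default_factory=list)
    children: Dict[Peer, ChildDTI] = field(default_factory=dict)

    def on_connect_request(self, peer: Peer) -> None:
        """A new TCP connection has arrived on the listening port."""
        self.pending.append(peer)

    def accept(self) -> ChildDTI:
        """Host accepts the oldest pending connection; a child DTI services it."""
        peer = self.pending.pop(0)
        child = ChildDTI(peer)
        self.children[peer] = child
        return child


parent = ParentDTI(port=80)
parent.on_connect_request(("10.0.0.2", 5000))
child = parent.accept()
print(child.peer)
```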

DTI Structure

Direct Transport Interface

• Send Queue, Receive Queue [Host to PPE, vice versa]
• Event Queue [Post Event notice to Host]
• Doorbells [Host writes signals directly to ETA PPE]
• Data buffers [ETA PPE buffers data in the following cases]
  – Source / target buffers are not pre-conditioned
  – PPE receives TCP segments w/o receive descriptors on receive queue
  – TCP segments are out of order
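
The buffering conditions on the receive side can be sketched as a small state machine. This is a toy model under stated assumptions, not the prototype's logic: all names are hypothetical, each descriptor is assumed to hold exactly one segment, and host buffers are assumed to be at least as large as a segment.

```python
from collections import deque


class ReceivePath:
    """Toy model of one DTI's receive side (all names are hypothetical)."""

    def __init__(self) -> None:
        self.receive_queue = deque()  # host buffers posted as receive descriptors
        self.event_queue = deque()    # completion notices posted to the host
        self.data_buffers = {}        # sequence number -> payload held by the PPE
        self.next_seq = 0             # next in-order byte expected

    def post_receive(self, host_buffer: bytearray) -> None:
        """Host posts a receive descriptor; buffered data may now be delivered."""
        self.receive_queue.append(host_buffer)
        self._drain()

    def on_segment(self, seq: int, payload: bytes) -> None:
        """Stage the segment, then deliver whatever is deliverable.

        An in-order segment with a descriptor waiting leaves the buffer at once.
        """
        self.data_buffers[seq] = payload
        self._drain()

    def _drain(self) -> None:
        # Deliver only while the next in-order segment is held AND a descriptor is posted.
        while self.next_seq in self.data_buffers and self.receive_queue:
            payload = self.data_buffers.pop(self.next_seq)
            host_buffer = self.receive_queue.popleft()
            host_buffer[:len(payload)] = payload
            self.next_seq += len(payload)
            self.event_queue.append(("recv_done", len(payload)))


rx = ReceivePath()
rx.on_segment(0, b"abcd")  # no descriptor posted yet, so the PPE holds the data
buf = bytearray(4)
rx.post_receive(buf)       # descriptor arrives, data moves to the host buffer
print(bytes(buf))
```

A segment stays in `data_buffers` in two of the cases the slide lists: when no receive descriptor has been posted, and when it arrives ahead of the expected sequence number.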

ETA PPE SW

• ETA architecture: Independent of PPE implementation
  – Fixed device, a specialized engine, or a CPU
  – ETA aware PPE must support DTI structures
  – Execute packet processing function on behalf of host (termination of TCP / IP)
  – Support an interface to the network

The Prototype

• Dual Processor (Xeon CPUs)
  – Host CPU0
  – PPE CPU1
    • Establish and terminate TCP/IP sessions on behalf of host
  – No special hardware developed
  – Use of standard tools
  – Gigabit Ethernet cards with modified drivers
  – Shared memory interface between host and PPE

SW Environment

• Linux Kernel 2.4
• PPE SW is a loadable kernel module
  – Supports DTI
  – Affinity for one processor (CPU1)
  – Never yields control of processor, implying dedicated use of CPU1 as PPE
  – PPE polls NIC descriptors in shared memory
  – DTI structures in shared host memory
  – CPU and PPE communicate via doorbells
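
The polling behaviour described above can be sketched as a loop that checks doorbells and NIC receive descriptors on every pass and never sleeps. This is a user-space Python illustration only, not kernel code: the function names, the `log` list, and the `max_passes` bound are hypothetical, and a dedicated PPE would loop indefinitely on its own CPU.

```python
from collections import deque


def ppe_poll_once(doorbells: deque, nic_rx_ring: deque, log: list) -> bool:
    """One pass of the polling loop; returns True if any work was found."""
    did_work = False
    while doorbells:                 # host rang a doorbell for some DTI
        dti_id = doorbells.popleft()
        log.append(("service_dti", dti_id))
        did_work = True
    while nic_rx_ring:               # NIC filled a receive descriptor
        frame = nic_rx_ring.popleft()
        log.append(("rx_frame", len(frame)))
        did_work = True
    return did_work


def ppe_main_loop(doorbells: deque, nic_rx_ring: deque, log: list, max_passes: int) -> None:
    """Busy-polls without sleeping; max_passes exists only so the sketch terminates."""
    for _ in range(max_passes):
        ppe_poll_once(doorbells, nic_rx_ring, log)


work_log = []
ppe_main_loop(deque([3]), deque([b"x" * 60]), work_log, max_passes=2)
print(work_log)
```

Polling avoids interrupt and scheduling overhead on the PPE, at the cost of keeping that processor busy even when there is no traffic.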

Hardware Platform

• The Prototype can run on any Linux multiprocessor kernel

• One server, with five Ethernet links

• Five clients are COTS servers running Linux and TTCP

ETA Test Environment

Measurement and Analysis

• Comparison between ETA and standard Linux dual processor server
  – ETA leaves more than 80% of CPU idle
  – Tx throughput increases considerably
  – Receive performance lower, because ETA uses memory-memory copy from packet buffer to destination buffer

Performance with HT

HT results in ~50% increase in Tx performance

Receive performance lower, because ETA uses memory-memory copy from packet buffer to destination buffer

ETA HT NoCopy: Test path, w/o data copy, enhanced Rx performance

Related Work

• TOEs have been developed
  – Devices attached to the server’s I/O subsystem
  – Use separate specialized processing and memory resources

• ETA uses processing and memory resources of the server instead
