NetTM FPGA-based processors Martin Labrecque Connections 2010



Page 1: NetTM FPGA-based processors Martin Labrecque Connections 2010

NetTM FPGA-based processors

Martin Labrecque Connections 2010

Page 2: NetTM FPGA-based processors Martin Labrecque Connections 2010

NetFPGA

• NetFPGA is a network card
– Virtex II Pro 50 FPGA
– 4 Gigabit Ethernet ports
– 1 PCI interface @ 33 MHz
– 64 MB DDR2 SDRAM @ 200 MHz

FPGA can implement any digital circuit

Where’s the secret weapon?

Page 3: NetTM FPGA-based processors Martin Labrecque Connections 2010

FPGA

• Soft processors: processors in the FPGA fabric
• User uploads program to soft processor
• Easier to program software than hardware for FPGAs
• Customizable with accelerators

[Figure: FPGA block diagram containing Processor(s), a DDR controller, and an Ethernet MAC. The processor block is drawn as a datapath with a PC, instruction memory, register array (regA, regB, regW, datA, datB, datW), sign extension (Xtnd), ALU, zero test, and data memory.]

Process packets in software! Fast enough?

Soft Processors in FPGAs

Page 4: NetTM FPGA-based processors Martin Labrecque Connections 2010

The application defines the requirements

Home networking (~100 Mbps/link)

Edge routing (≥ 1 Gbps/link)

Scientific instruments (< 100 Mbps/link)
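The link rates above can be turned into a rough per-packet cycle budget. The sketch below assumes worst-case 64-byte Ethernet frames, each followed on the wire by the standard 8-byte preamble and 12-byte inter-frame gap, and a hypothetical 125 MHz soft-processor clock. The clock figure and all function names are assumptions for illustration; the slides do not state them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical soft-processor clock; NOT taken from the slides. */
#define ASSUMED_CLOCK_HZ 125000000ULL

/* Bits occupied on the wire by one Ethernet frame:
 * frame bytes + 8 bytes preamble/SFD + 12 bytes inter-frame gap. */
uint64_t wire_bits(uint32_t frame_bytes)
{
    return ((uint64_t)frame_bytes + 8u + 12u) * 8u;
}

/* Packets per second a link delivers when every frame has this size. */
uint64_t packets_per_second(uint64_t link_bps, uint32_t frame_bytes)
{
    return link_bps / wire_bits(frame_bytes);
}

/* Clock cycles one processor can spend per packet and still keep up
 * with one link (rounded down). */
uint64_t cycles_per_packet(uint64_t clock_hz, uint64_t link_bps,
                           uint32_t frame_bytes)
{
    assert(link_bps > 0);
    return clock_hz * wire_bits(frame_bytes) / link_bps;
}
```

Under these assumptions, `cycles_per_packet(ASSUMED_CLOCK_HZ, 1000000000ULL, 64)` gives 84 cycles per packet at 1 Gbps, and the same call at 100 Mbps gives 840. That gap is one way to see why edge routing is the demanding case.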

What is NetTM specifically?

Performance In Packet Processing

Page 5: NetTM FPGA-based processors Martin Labrecque Connections 2010

Multiprocessor System Diagram

[Figure: multiprocessor system with packet input and packet output, an input buffer and output buffer, input and output memories, a synchronization unit, a data cache, off-chip DDR, and two 4-thread processors, each shown with an instruction cache (I$).]

Where does performance come from?

Page 6: NetTM FPGA-based processors Martin Labrecque Connections 2010

Multithreaded processors?

• Use concurrent threads to improve throughput
• Does that impact the software?
– Yes, if you want to use multiple threads.
• Do I have to know how the processor works?
– No.

How is it programmed then?

Page 7: NetTM FPGA-based processors Martin Labrecque Connections 2010

Sample program

void main(void)
{
    while (1) {
        char* pkt = get_next_packet();
        process_pkt(pkt);
        send_pkt(pkt);
    }
}

Threads already exist here!

• Compile by typing ‘make’
• Upload to board from host computer
• Program starts automatically

Page 8: NetTM FPGA-based processors Martin Labrecque Connections 2010

Packet Processing Example

packet = get_packet();
connection = database->lookup(packet);
if(connection == NULL)
    connection = database->add(packet);
connection->count++;
global_packet_count++;

SINGLE-THREADED MULTI-THREADED

Parallelism in the presence of locking?

[Figure labels: "Atomic" (twice), marking critical sections in the code]

packet = get_packet();
connection = database->lookup(packet);
if(connection == NULL)
    connection = database->add(packet);
connection->count++;
global_packet_count++;
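One way to read the locked, multi-threaded version is as a single lock held around all of the shared work. The sketch below is a minimal illustration under stated assumptions: a hypothetical fixed-size connection table keyed by a flow id, and a C11 `atomic_flag` spinlock standing in for whatever synchronization the real system provides. None of the names (`struct database`, `db_lookup`, `db_add`, `process_packet`) come from the slides or from NetThreads, and the slide's two atomic sections are merged into one here for brevity.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CONNECTIONS 16  /* hypothetical table size */

struct connection {
    uint32_t flow_id;
    uint32_t count;
    int in_use;
};

struct database {
    atomic_flag lock;                      /* one coarse lock for everything */
    struct connection slots[MAX_CONNECTIONS];
    uint32_t global_packet_count;
};

void db_init(struct database *db)
{
    atomic_flag_clear(&db->lock);
    for (size_t i = 0; i < MAX_CONNECTIONS; i++) {
        db->slots[i].flow_id = 0;
        db->slots[i].count = 0;
        db->slots[i].in_use = 0;
    }
    db->global_packet_count = 0;
}

/* Linear search; returns NULL if the flow is unknown. */
struct connection *db_lookup(struct database *db, uint32_t flow_id)
{
    for (size_t i = 0; i < MAX_CONNECTIONS; i++)
        if (db->slots[i].in_use && db->slots[i].flow_id == flow_id)
            return &db->slots[i];
    return NULL;
}

/* Claims the first free slot; returns NULL if the table is full. */
struct connection *db_add(struct database *db, uint32_t flow_id)
{
    for (size_t i = 0; i < MAX_CONNECTIONS; i++) {
        if (!db->slots[i].in_use) {
            db->slots[i].in_use = 1;
            db->slots[i].flow_id = flow_id;
            db->slots[i].count = 0;
            return &db->slots[i];
        }
    }
    return NULL;
}

/* Returns 1 if the packet was counted, 0 if the table was full. */
int process_packet(struct database *db, uint32_t flow_id)
{
    /* Spin until the lock is acquired: every thread serializes here. */
    while (atomic_flag_test_and_set_explicit(&db->lock, memory_order_acquire))
        ;
    struct connection *c = db_lookup(db, flow_id);
    if (c == NULL)
        c = db_add(db, flow_id);
    int ok = (c != NULL);
    if (ok) {
        c->count++;
        db->global_packet_count++;
    }
    atomic_flag_clear_explicit(&db->lock, memory_order_release);
    return ok;
}
```

With a single lock, packets from unrelated connections still wait on each other, which is the cost the slide's question points at.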

Page 9: NetTM FPGA-based processors Martin Labrecque Connections 2010

Packet Processing Example

[Figure labels: "Atomic" (twice), marking critical sections in the code]

MULTI-THREADED

packet = get_packet();
connection = database->lookup(packet);
if(connection == NULL)
    connection = database->add(packet);
connection->count++;
global_packet_count++;

[Figure annotations: "No Parallelism", "Optimistic Parallelism across Connections", "Opportunity for Parallelism"]

Allow optimistic parallelism in some critical sections!
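The idea of optimistic parallelism can be illustrated by comparing the addresses that two critical sections touch. The sketch below is a software illustration only, not a description of NetTM's hardware: it records hypothetical read and write sets per transaction and reports a conflict when one transaction writes an address the other reads or writes. All type and function names are assumptions.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_ACCESSES 8  /* hypothetical per-transaction limit */

struct access_set {
    const void *addr[MAX_ACCESSES];
    size_t n;
};

struct txn {
    struct access_set reads;
    struct access_set writes;
};

bool set_contains(const struct access_set *s, const void *addr)
{
    for (size_t i = 0; i < s->n; i++)
        if (s->addr[i] == addr)
            return true;
    return false;
}

/* Adds addr to the set; returns false if the set is full. */
bool set_add(struct access_set *s, const void *addr)
{
    if (set_contains(s, addr))
        return true;
    if (s->n == MAX_ACCESSES)
        return false;
    s->addr[s->n++] = addr;
    return true;
}

bool txn_note_read(struct txn *t, const void *addr)
{
    return set_add(&t->reads, addr);
}

bool txn_note_write(struct txn *t, const void *addr)
{
    return set_add(&t->writes, addr);
}

bool sets_overlap(const struct access_set *a, const struct access_set *b)
{
    for (size_t i = 0; i < a->n; i++)
        if (set_contains(b, a->addr[i]))
            return true;
    return false;
}

/* Two transactions conflict if either one writes something the other
 * reads or writes. Read-read sharing is not a conflict. */
bool txn_conflict(const struct txn *a, const struct txn *b)
{
    return sets_overlap(&a->writes, &b->writes)
        || sets_overlap(&a->writes, &b->reads)
        || sets_overlap(&a->reads, &b->writes);
}
```

Two packets that only update the `count` of different connections would not conflict under this check, while two packets that both increment `global_packet_count` always would.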

Page 10: NetTM FPGA-based processors Martin Labrecque Connections 2010

Hardware Transactional Approach

• Modify main memory directly: reduce copies, faster commit

[Figure: processor1 and processor2 with a data cache and off-chip DDR; two "x" marks appear in the diagram.]

• Detect conflicts prior to corrupting main memory
• Undo changes on transaction abort

Hardware extracts parallelism for you, for free!
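The combination of writing memory in place and undoing changes on abort resembles an undo log. The sketch below shows that general technique in software, under the assumption of a small fixed-capacity log of 32-bit words. It is not NetTM's hardware design, and every name in it is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define UNDO_LOG_CAPACITY 16  /* hypothetical log size */

struct undo_entry {
    uint32_t *addr;      /* location that was overwritten */
    uint32_t old_value;  /* value it held before the write */
};

struct txn {
    struct undo_entry log[UNDO_LOG_CAPACITY];
    size_t n;
};

void txn_begin(struct txn *t)
{
    t->n = 0;
}

/* Saves the old value, then writes memory in place.
 * Returns false (and writes nothing) if the log is full. */
bool txn_write(struct txn *t, uint32_t *addr, uint32_t value)
{
    if (t->n == UNDO_LOG_CAPACITY)
        return false;
    t->log[t->n].addr = addr;
    t->log[t->n].old_value = *addr;
    t->n++;
    *addr = value;
    return true;
}

/* Commit is cheap: memory is already up to date, so just drop the log. */
void txn_commit(struct txn *t)
{
    t->n = 0;
}

/* Abort restores old values in reverse order of the writes. */
void txn_abort(struct txn *t)
{
    while (t->n > 0) {
        t->n--;
        *t->log[t->n].addr = t->log[t->n].old_value;
    }
}
```

In this scheme the common case (commit) does no copying, and the rarer case (abort) pays for the rollback, which matches the "faster commit" trade-off named in the first bullet.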

Page 11: NetTM FPGA-based processors Martin Labrecque Connections 2010

Conclusions

• System is easy to program in C
• Parallel threads deliver performance
• Time to results is within minutes

[Figure: two panels labeled LOCKS and TRANSACTIONAL, each showing Thread1 to Thread4; an "x" appears in the TRANSACTIONAL panel.]

Page 12: NetTM FPGA-based processors Martin Labrecque Connections 2010

Questions and Discussion

NetThreads v1.0 available online
Search for: netfpga+netthreads

For NetTM, ask: Martin [email protected]

For NetFPGA cluster access, ask: Professor Yashar Ganjali

We have lots of machines available for projects
We hope to see many projects/build synergy