Managing High Performance Data Pipeline Execution with an FPGA Processor
Israel, May 4, 2010. Presenter: Ben Hor – Xilinx, Inc. Authors: Glenn Steiner, Dan Isaacs – Xilinx, Inc.; David Pellerin – Impulse Accelerated Technologies


Page 1: Xilinx   track g

Israel, May 4, 2010

Managing High Performance Data Pipeline Execution with an FPGA Processor

Presenter: Ben Hor – Xilinx, Inc.

Authors: Glenn Steiner, Dan Isaacs – Xilinx, Inc.
David Pellerin – Impulse Accelerated Technologies

Page 2: Xilinx   track g

Agenda

1. What is Control Plane/Data Plane Processing and Why Might I Need It?

2. FPGAs Enable Balancing Computation Between a Processor and Application Specific Logic

3. Implementation of a Control/Data Plane System is Straightforward

4. Case Study: An HD Video Recognition System

5. Connecting the Embedded Processor to the FPGA with Linux

6. Summary

Page 3: Xilinx   track g

What is Control Plane / Data Plane Processing and Why Might I Need It?

Page 4: Xilinx   track g

Challenge Example: HD Video Streaming

• 720p, 74.25 MHz Pixel Rate
– 222.75 MB/s data rate
– Hypothetical Dual Core – 2.5 GHz, Dual Issue (2 instructions per clock)
• 10 GHz Instruction Rate: 22.4 instructions per byte of data processed
• What about OS overhead?
– Task switching times
– Interrupt latency
– All bus bandwidth eaten up with video data
• Can’t Do It With a Standard Processor

Page 5: Xilinx   track g

Coprocessing: An Effective Way of Accelerating Software

• Distributes the load

• Move computational load where it belongs

• Dedicated processing element(s) provide dramatic acceleration

Page 6: Xilinx   track g

A Look at Coprocessing Architectures

• Fully Decoupled
– Common, but not interesting for this topic
• Single / Multi-Instruction Accelerator
– FPU
• Loosely Coupled - Separated Functions
– Message / Control Passing
– Typically Used for Control Plane / Data Plane Processing

Page 7: Xilinx   track g

What is Control Plane / Data Plane

[Block diagram. Control Plane: Control Plane Processor (OS, User Interface). Data Plane: Data In, three Coprocessors, Data Out. The planes are linked by the Processor Bus or Dedicated Control Channel(s).]

Page 8: Xilinx   track g

Control / Data Plane Example

• Control plane: controls the state of network elements
– Route selection
– RSVP, capability signaling, etc.
– Exception handling
• Data plane: manages data packets
– Packet forwarding
– Packet differentiation
– Buffering, link scheduling

Adapted from: Active correlation between the control and data plane – Z. Morley Mao

Page 9: Xilinx   track g

FPGAs Enable Computation Balancing Between a Processor and Application Specific Logic

Page 10: Xilinx   track g

FPGAs: Ideal for Coprocessing

• Tight integration between FPGA & Processor
– Reduced Latency
– Matched clock rates
• Configure the processors to meet system requirements
– Configure Processors
– Configure the Coprocessors
• Flexible logic enables experimentation

Page 11: Xilinx   track g

External Processor Challenges

• Latency for control signals to coprocessor
• Pin challenges
– Many pins reduce latency but at higher power & part cost
– High speed serial (PCIe) minimizes pins at cost of latency & power
• May not be the lowest cost solution
• FPGA embedded processors solve these challenges and enable performance balancing

Page 12: Xilinx   track g

Implementation of a Control Plane / Data Plane System is Straightforward

Page 13: Xilinx   track g

Building The Control Plane / Data Plane System

• Assemble the Control Plane processor
• Assemble the Data Pipeline
– Combining IP generated by multiple tools
– C to HDL Tools may be an effective option
• Control the Pipe with Processor and OS

Page 14: Xilinx   track g

Assemble the Control Plane Processor

Page 15: Xilinx   track g

Page 16: Xilinx   track g

Page 17: Xilinx   track g

Assemble and Connect the Data Plane

Multiple Languages/Tools/Flows to create Coprocessors
– Low Level
  • Hand Crafted - RTL (VHDL/Verilog)
– High Level
  • MATLAB / Simulink
  • ‘C’ to FPGA (HDL)
  • ‘C’ Variants

Page 18: Xilinx   track g

CASE STUDY: HD VIDEO RECOGNITION SYSTEM

Page 19: Xilinx   track g

The Case Study Problem

• 720p HD Video Stream
– DVI Input and DVI Output
• Locate the clown fish in the video
• Highlight the clown fish
• Continuously track the fish
• Adjust spotlight size based upon likelihood of match

Page 20: Xilinx   track g

The Architected Solution

• How Control Plane Processor Was Created

• How the Data Processing Pipeline Was Created

Page 21: Xilinx   track g

Base Processor Reference Design

[Block diagram. Xilinx MicroBlaze Processor running Linux, with: Block RAM, System ACE Compact Flash, ICC, GPIO (LEDs), GPIO (DIP Switch), GPIO (Push Buttons), Debug Module, UART, Multiport Memory Controller, DDR2 Memory, Clock Generator, Reset Module.]

Page 22: Xilinx   track g

DVI Pass-through Reference Design
Basic “real-time” video processing

[Block diagram: DVI Input, Image Processing, DVI Output.]

Page 23: Xilinx   track g

DVI Pass-through Reference Design
Basic “real-time” video processing

[Block diagram. DVI Input, Image Processing, DVI Output. Image Processing blocks: DVI In, Gamma In, 2D FIR Filter, Gamma Out, DE Gen, DVI Out.]

• Streaming pixel processing
• Streaming video data
• MicroBlaze controls filter coefficients in “real-time”
• Simple design example for customer IP integration
• System Generator custom video accelerator pcore

Page 24: Xilinx   track g

Integrated Control/Data Plane System

[Block diagram. Data plane: DVI In, Gamma In, 2D FIR Filter, Object Detection (New Pipeline Element), Gamma Out, DVI Out. Control plane: Xilinx MicroBlaze Processor System with filter control (UART), attached to the pipeline over the Processor Local Bus (PLB).]

The processor is used to dynamically configure filters.

Page 25: Xilinx   track g

HD Object Detection & Highlighting

Page 26: Xilinx   track g

Connecting the Embedded Processor to the FPGA with Linux

Page 27: Xilinx   track g

Control the Pipe with Linux

• Linux is Now the #1 OS for Embedded FPGA Systems
• Newest Generation Is More “Real-Time”
• Large Public Code Base
• Mostly Free
• FPGA IO Drivers Available

Page 28: Xilinx   track g

Configure Linux for the IO Device

1. // Declare the entry point called when the custom driver is loaded into the Linux kernel
module_init(xll_example_init);

2. // Register the driver to a specific device number: 253
err = register_chrdev_region(devno, 1, "custom_io_example");

bash# mknod /dev/custom_io_example0 c 253 0

Page 29: Xilinx   track g

Controlling the Data Pipe with the Linux Application

1. // Open custom I/O device from Linux application
int custom_io_ex_open(struct inode *inode, struct file *filp)

2. // Read / write to custom peripheral I/O using standard Linux read/write function calls
– ssize_t custom_io_ex_read(struct file *filp, char __user *buf, size_t count, loff_t *f_pos)
– ssize_t custom_io_ex_write(struct file *filp, const char __user *buf, size_t count, loff_t *f_pos)

Page 30: Xilinx   track g

SUMMARY

• FPGAs enable computational balancing between an FPGA-based processor and a data processing pipeline, reducing development risks
• Offloading streaming data processing tasks to an FPGA data-plane processing pipeline can enable meeting performance objectives
• An FPGA-based single-chip control-plane and data-plane processing solution can reduce cost and development time
• Offloading enables the processor to handle a multitude of other tasks

Page 31: Xilinx   track g

Thank You

Glenn Steiner, Dan Isaacs – Xilinx, Inc.
David Pellerin – Impulse Accelerated Technologies