Israel, May 4, 2010
Managing High Performance Data Pipeline Execution with an FPGA Processor
Presenter: Ben Hor – Xilinx, Inc
Authors: Glenn Steiner, Dan Isaacs – Xilinx, Inc.; David Pellerin – Impulse Accelerated Technologies
Agenda
1. What is Control Plane/Data Plane Processing and Why Might I Need It?
2. FPGAs Enable Balancing Computation Between a Processor and Application Specific Logic
3. Implementation of a Control/Data Plane System is Straightforward
4. Case Study: An HD Video Recognition System
5. Connecting the Embedded Processor to the FPGA with Linux
6. Summary
What is Control Plane / Data Plane Processing and Why Might I Need It?
Challenge Example: HD Video Streaming
• 720p at a 74.25 MHz pixel rate – 222.75 MB/s data rate
  – Hypothetical dual-core processor – 2.5 GHz, dual issue (2 instructions per clock)
• 10 GHz instruction rate – 22.4 instructions per byte of data processed
• What about OS overhead?
  – Task switching times
  – Interrupt latency
  – All bus bandwidth eaten up with video data
• Can’t Do It With a Standard Processor
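The arithmetic on this slide can be checked with a few lines of C. This is only a sketch under stated assumptions: 3 bytes per pixel (24-bit RGB) is assumed because that is what turns a 74.25 MHz pixel rate into 222.75 MB/s, and the function names are hypothetical. Dividing 10 G instructions/s by 222.75 MB/s gives about 44.9 instructions per byte. The slide's 22.4 figure is reproduced if every byte is counted twice (once read in, once written out), which is an assumption here and is not stated on the slide.

```c
#include <assert.h>

/* Bytes per second for a pixel stream; bytes_per_pixel = 3 assumes 24-bit RGB. */
double stream_bytes_per_sec(double pixel_rate_hz, double bytes_per_pixel)
{
    assert(pixel_rate_hz > 0.0 && bytes_per_pixel > 0.0);
    return pixel_rate_hz * bytes_per_pixel;
}

/* Peak instruction rate: cores x clock x instructions issued per clock. */
double peak_instr_per_sec(int cores, double clock_hz, int issue_width)
{
    assert(cores > 0 && issue_width > 0 && clock_hz > 0.0);
    return (double)cores * clock_hz * (double)issue_width;
}

/* Instruction budget per byte; passes = 2 counts each byte once in and once out
 * (an assumption used to match the slide's figure). */
double instr_per_byte(double instr_rate, double bytes_per_sec, int passes)
{
    assert(passes > 0 && bytes_per_sec > 0.0);
    return instr_rate / (bytes_per_sec * (double)passes);
}
```

Calling `instr_per_byte(peak_instr_per_sec(2, 2.5e9, 2), stream_bytes_per_sec(74.25e6, 3.0), 2)` gives the per-byte budget before any OS overhead is subtracted, which is why the slide concludes that a standard processor falls short.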
Coprocessing: An Effective Way of Accelerating Software
• Distributes the load
• Moves the computational load to where it belongs
• Dedicated processing element(s) provide dramatic acceleration
A Look at Coprocessing Architectures
• Fully Decoupled
  – Common, but not interesting for this topic
• Single / Multi-Instruction Accelerator
  – FPU
• Loosely Coupled - Separated Functions
  – Message / Control Passing
  – Typically Used for Control Plane / Data Plane Processing
What is Control Plane / Data Plane
[Diagram: Control Plane – a Control Plane Processor (OS) with a User Interface. Data Plane – three Coprocessors between Data In and Data Out. The two planes are linked by the Processor Bus or Dedicated Control Channel(s).]
Control / Data Plane Example
• Control plane: controls the state of network elements
  – Route selection
  – RSVP, capability signaling, etc.
  – Exception handling
• Data plane: manages data packets
  – Packet forwarding
  – Packet differentiation
  – Buffering, link scheduling
Adapted from: Active correlation between the control and data plane – Z. Morley Mao
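The split in the networking example can be sketched in C as a forwarding table that the control plane fills in occasionally and the data plane consults for every packet. This is an illustrative sketch, not code from the presentation: the structure, the function names, and the `PUNT_TO_CONTROL` value are hypothetical, and the longest-match rule assumes contiguous netmasks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ROUTES 8
#define PUNT_TO_CONTROL (-1)  /* no route: hand the packet to the control plane */

/* Hypothetical forwarding table shared by the two planes. */
struct route { uint32_t prefix; uint32_t mask; int out_port; };
struct fwd_table { struct route routes[MAX_ROUTES]; size_t count; };

/* Control plane: route selection installs an entry (slow path, infrequent). */
int control_plane_add_route(struct fwd_table *t, uint32_t prefix,
                            uint32_t mask, int out_port)
{
    assert(t != NULL);
    if (t->count >= MAX_ROUTES)
        return -1;                          /* table full */
    t->routes[t->count].prefix = prefix & mask;
    t->routes[t->count].mask = mask;
    t->routes[t->count].out_port = out_port;
    t->count++;
    return 0;
}

/* Data plane: per-packet forwarding decision (fast path).
 * Picks the matching entry with the numerically largest mask and returns
 * PUNT_TO_CONTROL when nothing matches (exception handling). */
int data_plane_forward(const struct fwd_table *t, uint32_t dst_addr)
{
    assert(t != NULL);
    int port = PUNT_TO_CONTROL;
    uint32_t best_mask = 0;
    int found = 0;
    for (size_t i = 0; i < t->count; i++) {
        const struct route *r = &t->routes[i];
        if ((dst_addr & r->mask) == r->prefix &&
            (!found || r->mask > best_mask)) {
            port = r->out_port;
            best_mask = r->mask;
            found = 1;
        }
    }
    return port;
}
```

The point of the design is that `data_plane_forward` is small, fixed-function, and called per packet, which is the kind of work that maps well onto dedicated logic, while `control_plane_add_route` runs rarely and can stay in software.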
FPGAs Enable Computation Balancing Between a Processor and Application Specific Logic
FPGAs: Ideal for Coprocessing
• Tight integration between FPGA & Processor
  – Reduced Latency
  – Matched clock rates
• Configure the processors to meet system requirements
  – Configure Processors
  – Configure the Coprocessors
• Flexible logic enables experimentation
External Processor Challenges
• Latency for control signals to coprocessor
• Pin challenges
  – Many pins reduce latency but at higher power & part cost
  – High speed serial (PCIe) minimizes pins at cost of latency & power
• May not be the lowest cost solution
• FPGA embedded processors solve these challenges and enable performance balancing
Implementation of a Control Plane / Data Plane System is Straightforward
Building The Control Plane / Data Plane System
• Assemble the Control Plane processor
• Assemble the Data Pipeline
  – Combining IP generated by multiple tools
  – C to HDL Tools may be an effective option
• Control the Pipe with Processor and OS
Assemble the Control Plane Processor
Assemble and Connect the Data Plane
Multiple Languages/Tools/Flows to create Coprocessors
– Low Level
  • Hand Crafted - RTL (VHDL/Verilog)
– High Level
  • Matlab / Simulink
  • ‘C’ to FPGA (HDL)
  • ‘C’ Variants
CASE STUDY: HD VIDEO RECOGNITION SYSTEM
The Case Study Problem
• 720P HD Video Stream
  – DVI Input and DVI Output
• Locate the clown fish in the video
• Highlight the clown fish
• Continuously track the fish
• Adjust spotlight size based upon likelihood of match
The Architected Solution
• How Control Plane Processor Was Created
• How the Data Processing Pipeline Was Created
Base Processor Reference Design
[Diagram: block diagram of the base design. Blocks: Xilinx MicroBlaze Processor, Block RAM, System ACE Compact Flash, ICC, GPIO LEDs, GPIO DIP Switch, GPIO Push Buttons, Debug Module, UART, Multiport Memory Controller, DDR2 Memory, Clock Generator, Reset Module. Software: Linux.]
DVI Pass-through Reference Design
Basic “real-time” video processing
[Diagram: DVI Input, Image Processing, DVI Output]
DVI Pass-through Reference Design
Basic “real-time” video processing
[Diagram: DVI Input, Image Processing, DVI Output. Image Processing blocks: DVI In, Gamma In, 2D FIR Filter, Gamma Out, DE Gen, DVI Out]
• Streaming pixel processing
• Streaming video data
• MicroBlaze controls filter coefficients in “real-time”
• Simple design example for customer IP integration
• System Generator custom video accelerator pcore
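A software reference model can make the "processor controls the filter coefficients" idea concrete. The sketch below is not the reference design's implementation. It assumes a 3x3 kernel, 8-bit pixels, and a hypothetical `struct fir3x3` standing in for the coefficient registers that the control plane would write and the pipeline would read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical coefficient block: written by the control plane (processor),
 * read by the data plane for every pixel. The 3x3 size is an assumption. */
struct fir3x3 {
    int16_t coeff[9];   /* row-major 3x3 kernel */
    int shift;          /* right shift applied to the sum to normalize it */
};

/* Control-plane side: load a new kernel, for example between frames. */
void fir3x3_set(struct fir3x3 *f, const int16_t *k, int shift)
{
    assert(f != NULL && k != NULL && shift >= 0 && shift < 16);
    for (int i = 0; i < 9; i++)
        f->coeff[i] = k[i];
    f->shift = shift;
}

/* Data-plane side: filter one interior pixel (x, y) of an 8-bit image that
 * is w pixels wide. The caller keeps (x, y) at least one pixel from the border. */
uint8_t fir3x3_pixel(const struct fir3x3 *f, const uint8_t *img,
                     int w, int x, int y)
{
    assert(f != NULL && img != NULL && w >= 3 && x >= 1 && x < w - 1 && y >= 1);
    int32_t acc = 0;
    for (int r = 0; r < 3; r++)
        for (int c = 0; c < 3; c++)
            acc += (int32_t)f->coeff[r * 3 + c] * img[(y + r - 1) * w + (x + c - 1)];
    if (acc < 0)
        return 0;                   /* clamp before shifting a negative value */
    acc >>= f->shift;
    return (uint8_t)(acc > 255 ? 255 : acc);
}
```

In the FPGA the per-pixel loop would be unrolled into parallel multipliers running at the pixel clock. Only `fir3x3_set` corresponds to work the processor does.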
Integrated Control/Data Plane System
[Diagram: video pipeline blocks DVI In, Gamma In, 2D FIR Filter, Object Detection (labeled “New Pipeline Element”), Gamma Out, DVI Out; Xilinx MicroBlaze Processor System attached via the Processor Local Bus (PLB); Filter control (UART)]
The processor is used to dynamically configure filters
HD Object Detection & Highlighting
Connecting the Embedded Processor to the FPGA with Linux
Control the Pipe with Linux
• Linux is Now the #1 OS for Embedded FPGA Systems
• Newest Generation Is More “Real-Time”
• Large Public Code Base
• Mostly Free
• FPGA IO Drivers Available
Configure Linux for the IO Device
1. // Load the custom driver into the Linux kernel
   module_init(xll_example_init);
2. // Register the driver to a specific device number - 253
   err = register_chrdev_region(devno, 1, "custom_io_example");
3. // Create the matching character device node (major 253, minor 0)
   bash# mknod /dev/custom_io_example0 c 253 0
Controlling the Data Pipe with the Linux Application
1. // Open the custom I/O device from the Linux application; the application's open() call reaches this driver handler
   int custom_io_ex_open(struct inode *inode, struct file *filp)
2. // Read / write to the custom peripheral I/O using standard Linux read/write function calls
   – ssize_t custom_io_ex_read(struct file *filp, char __user *buf, size_t count, loff_t *f_pos)
   – ssize_t custom_io_ex_write(struct file *filp, const char __user *buf, size_t count, loff_t *f_pos)
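The handlers above live in the driver. On the application side, the same device would be reached with the ordinary POSIX `open`, `read`, `write`, and `close` calls. The following is a minimal user-space sketch: the helper names `dev_write_block` and `dev_read_block` are hypothetical, and the meaning of the bytes exchanged (for example filter coefficients) depends on the driver and is not specified in the slides.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write len bytes to the device node, then close it.
 * Returns 0 on success, -1 on any error or short write. */
int dev_write_block(const char *dev_path, const void *buf, size_t len)
{
    assert(dev_path != NULL && buf != NULL);
    int fd = open(dev_path, O_WRONLY);     /* kernel invokes the driver's open handler */
    if (fd < 0)
        return -1;
    ssize_t n = write(fd, buf, len);       /* dispatched to the driver's write handler */
    int rc = (n == (ssize_t)len) ? 0 : -1;
    if (close(fd) != 0)
        rc = -1;
    return rc;
}

/* Read up to len bytes from the device node.
 * Returns the number of bytes read, or -1 on error. */
ssize_t dev_read_block(const char *dev_path, void *buf, size_t len)
{
    assert(dev_path != NULL && buf != NULL);
    int fd = open(dev_path, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = read(fd, buf, len);        /* dispatched to the driver's read handler */
    close(fd);
    return n;
}
```

With the node created by the `mknod` command shown earlier, a call such as `dev_write_block("/dev/custom_io_example0", coeffs, sizeof coeffs)` would be the application's entire view of the data pipe. Everything else stays in the driver and the FPGA logic.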
SUMMARY
• FPGAs enable computational balancing between an FPGA-based processor and a data processing pipeline, reducing development risks
• Offloading streaming data processing tasks to an FPGA data-plane processing pipeline can enable meeting performance objectives
• An FPGA-based single-chip control-plane and data-plane processing solution can reduce cost and development time
• Offloading enables the processor to handle a multitude of other tasks
Thank You
Glenn Steiner, Dan Isaacs – Xilinx, Inc.; David Pellerin – Impulse Accelerated Technologies