16
1 12/9/04 Virtual Prototyping of Advanced Space System Architectures based on RapidIO Sponsor: Sponsor: Honeywell DSES Space, Clearwater, FL Principal Investigator: Principal Investigator: Dr. Alan D. George Graduate Research Assistants Graduate Research Assistants: David Bueno, Ian Troxel, Chris Conger, Adam Leko HCS Research Laboratory, ECE Department University of Florida

12/9/04 1 Virtual Prototyping of Advanced Space System Architectures based on RapidIO Sponsor: Honeywell DSES Space, Clearwater, FL Principal Investigator:

Embed Size (px)

Citation preview

Page 1: 12/9/04 1 Virtual Prototyping of Advanced Space System Architectures based on RapidIO Sponsor: Honeywell DSES Space, Clearwater, FL Principal Investigator:

112/9/04

Virtual Prototyping of Advanced SpaceSystem Architectures based on RapidIO

Sponsor:Sponsor:

Honeywell DSES Space, Clearwater, FL

Principal Investigator:Principal Investigator:

Dr. Alan D. George

Graduate Research AssistantsGraduate Research Assistants::

David Bueno, Ian Troxel, Chris Conger, Adam Leko

HCS Research Laboratory, ECE Department

University of Florida

Page 2: 12/9/04 1 Virtual Prototyping of Advanced Space System Architectures based on RapidIO Sponsor: Honeywell DSES Space, Clearwater, FL Principal Investigator:

212/9/04

Outline

I. Project Overview

II. BackgroundI. RapidIO (RIO)

II. Ground-Moving Target Indicator (GMTI)

III. Synthetic Aperture Radar (SAR)

III. Partitioning Methods

IV. Modeling Environment and Models

V. Experiments and Results

VI. Conclusions

Page 3: 12/9/04 1 Virtual Prototyping of Advanced Space System Architectures based on RapidIO Sponsor: Honeywell DSES Space, Clearwater, FL Principal Investigator:

312/9/04

Project Overview Simulative analysis of Space-Based Radar (SBR) systems based on

RapidIO interconnection networks RapidIO (RIO) is a high-performance, switched interconnect for embedded

systems High system scalability (boards, processors, memory units, sensors, etc.) Far better bisection bandwidth than existing bus-based technologies

Study optimal method of constructing scalable RIO-based systems for Ground Moving Target Indicator (GMTI) and Synthetic Aperture Radar (SAR) Identify system-level tradeoffs in system designs Discrete-event simulation of RapidIO network,

processing elements, and GMTI/SAR algorithms Identify limitations of RIO design for SBR Determine effectiveness of various GMTI/SAR

algorithm partitionings over RIO network

Image courtesy http://www.afa.org/magazine/aug2002/0802radar.asp

Page 4: 12/9/04 1 Virtual Prototyping of Advanced Space System Architectures based on RapidIO Sponsor: Honeywell DSES Space, Clearwater, FL Principal Investigator:

412/9/04

Background- RapidIO Three-layered, embedded system interconnect architecture

Logical – memory mapped I/O, message passing, and globally shared memory Transport – controls routing of packet from source to destination Physical – serial and parallel

Point-to-point, packet-switched interconnect Peak single-link throughput ranging from 2 to 64 Gb/s Focus on 16-bit parallel LVDS RIO implementation for satellite systems

Image courtesy G. Shippen, “RapidIO Technical Deep Dive 1: Architecture & Protocol,” Motorola Smart Network Developers Forum, 2003.

Page 5: 12/9/04 1 Virtual Prototyping of Advanced Space System Architectures based on RapidIO Sponsor: Honeywell DSES Space, Clearwater, FL Principal Investigator:

512/9/04

Background- GMTI

GMTI used to track moving targets on ground Estimated processing requirements range from

40 (aircraft) to 280 (satellite) GFLOPs GMTI broken into four stages:

Pulse Compression (PC) Doppler Processing (DP) Space-Time Adaptive Processing (STAP) Constant False-Alarm Rate detection (CFAR)

Incoming data organized as 3-D matrix (data cube) Data reorganization (“corner turn”) necessary between stages for processing efficiency Size of each cube dictated by Coherent Processing Interval (CPI)

PulseCompression

DopplerProcessing

Space-TimeAdaptive

Processing(STAP)

ConstantFalse Alarm

Rate(CFAR)

Receive Cube

Send Results

Corner Turn Partitioned along range dimension

Partitioned along pulse dimension

DATA CUBE

Ranges

Pu

lse

s

Page 6: 12/9/04 1 Virtual Prototyping of Advanced Space System Architectures based on RapidIO Sponsor: Honeywell DSES Space, Clearwater, FL Principal Investigator:

612/9/04

GMTI – Parallel Partitioning Straightforward partitioning

Entire system works in parallel on a single data cube

All-to-all personalized communication used to perform corner-turn

Result latency must be ≤ 1 CPI

Staggered partitioning Processors work in small groups, one data

cube for each group Incoming data cubes are sent to groups via

round-robin distribution Result latency must be ≤ N × CPI

N = number of processor groups

Pipelined partitioning Processors work in small groups, each group

responsible for one stage of algorithm Corner turns “for free” Result latency must be ≤ N × CPI

N = number of stages

time

1 CPI

PE #4

PE #3

PE #2

PE #1

PE #4

PE #3

PE #2

PE #1

PE #4

PE #3

PE #2

PE #1

PE #4

PE #3

PE #2

PE #1

PC DP STAP CFAR

Straightforward Partitioning

Pipelined Partitioning

Data Cube0

Data Cube1

Data Cube2

Data Cube3

Data Cube4

Data Cube5

timestart

CPI 0 CPI 1 CPI 2 CPI 3 CPI 4

Staggered Partitioning

Page 7: 12/9/04 1 Virtual Prototyping of Advanced Space System Architectures based on RapidIO Sponsor: Honeywell DSES Space, Clearwater, FL Principal Investigator:

712/9/04

SAR Algorithm Flow and Partitioning SAR composed of 7 sub-tasks

Processing of 2-D radar data Each sub-task’s optimal data partitioning varies Processed out of global memory

Required data gathering and image processing times are extensive 16-second Coherent Processing Interval (CPI) Gigabytes of data, processed in smaller chunks (kB to MB range) Long processing time places short deadline on tolerable time to

report results

Data size stays constant throughout algorithm

Range-Pulse Compression

Polar Reformatting

Pulse FFT

Range FFT

Pulse FFT

Auto-focus

Magnitude Function

Range dimension

Pulse dimension

Range/pulse blocks

Page 8: 12/9/04 1 Virtual Prototyping of Advanced Space System Architectures based on RapidIO Sponsor: Honeywell DSES Space, Clearwater, FL Principal Investigator:

812/9/04

Model Library Overview RIO system modeling library created using Mission Level Designer (MLD), a

commercial discrete-event simulation modeling tool C++-based, block-level, hierarchical modeling tool

Algorithm modeling accomplished via script-based processing All processing nodes read from a global script file to determine

when/where to send data, and when/how long to compute

Our model library includes: RIO central-memory switch Compute node with RIO endpoint SBR traffic source/sink RIO logical message-passing layer Transport and parallel physical

layers

Model of Compute Nodewith RIO Endpoint

Page 9: 12/9/04 1 Virtual Prototyping of Advanced Space System Architectures based on RapidIO Sponsor: Honeywell DSES Space, Clearwater, FL Principal Investigator:

912/9/04

GMTI Processor Board Models System contains many processor boards connected via backplane Each processor board contains one RIO switch and four processors Processors modeled with three-stage

finite state machine Send data Receive data Compute

Behavior of processors controlledwith script files Script generator converts high-level

GMTI parameters to script Script is fed into simulations

Model ofFour-Processor Board

Scriptgenerator

SimulationGMTI & system parameters

Processor scriptsend…

receive…

Page 10: 12/9/04 1 Virtual Prototyping of Advanced Space System Architectures based on RapidIO Sponsor: Honeywell DSES Space, Clearwater, FL Principal Investigator:

1012/9/04

System Design Constraints 16-bit parallel 250MHz DDR RapidIO link

(1 GB/s) Expected basis of RadHard component performance

by time RIO and SBR ready to fly in ~2008 to 2010 Systems composed of processor boards

interconnected by RIO backplane 4 processors per board 8 Floating-Point Units (FPUs) per processor One 8-port central-memory switch per board; implies

4 connections to backplane per board

Page 11: 12/9/04 1 Virtual Prototyping of Advanced Space System Architectures based on RapidIO Sponsor: Honeywell DSES Space, Clearwater, FL Principal Investigator:

1112/9/04

7-Board System

Backplane and System Models

High bandwidth requirements for GMTI algorithm dictate architecture for SAR systems Expectation that systems must eventually support both SAR and GMTI Same backplane efficiently supports four-, five-, six-, and seven-board configurations

Can use smaller switches on backplane to conserve power if fewer than seven boards needed Three-board system possible using similar configuration with only two backplane switches

4-Switch Non-blocking Backplane

Backplane-to-Board 0, 1, 2, 3 Connections

Backplane-to-Board 4, 5, 6, and Data Source/GM Connections

Page 12: 12/9/04 1 Virtual Prototyping of Advanced Space System Architectures based on RapidIO Sponsor: Honeywell DSES Space, Clearwater, FL Principal Investigator:

1212/9/04

GMTI Performance Results Studied RapidIO system

designs for space-based GMTI Important conclusions:

Non-blocking backplane extremely important for GMTI

GMTI not sensitive to latency of individual packets Cut-through routing

unnecessary in switches RapidIO transmitter- and

receiver-controlled flow control perform almost identically

Straightforward partitioning method provides lowest latency, but with least efficient use of resources

Staggered partitioning (by board) method is very efficient, but with very high latency

Pipelined method is a compromise

CPI Latencies

0

256

512

768

1024

1280

1536

32000 40000 48000 56000 64000

Number of ranges

La

ten

cy (

ms)

Straightforward, 7boards

Staggered, 5 boards

Pipelined, 6 boards

Pipelined, 7 boards

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

Straightforward,7 boards

Staggered,5 boards

Pipelined,6 boards

Pipelined,7 boards

Para

llel E

ffici

ency

Page 13: 12/9/04 1 Virtual Prototyping of Advanced Space System Architectures based on RapidIO Sponsor: Honeywell DSES Space, Clearwater, FL Principal Investigator:

1312/9/04

SAR Performance Results Message-passing logical layer

involves higher overhead All operations have responses CPI latency is constant as chunk size

increases Low levels of contention in network for

all cases Logical I/O layer is inherently better

for current mapping (global-memory based) Approximately ½ of operations are writes,

which require no response Contention increases as chunk size

increases May be possible to use

synchronization to reduce contention, and double-buffering for latency hiding Small processing times reduce benefit of

double-buffering Smart use of double-buffering only on

certain tasks with long computation may maximize benefit of this approach

CPI Completion Latency vs. Chunk Size

170.000

180.000

190.000

200.000

210.000

220.000

64 128 256 512 1024

Per-Processor Chunk Size (KB)

Late

ncy

(ms)

MP

IO

CPI Completion Latency vs. Chunk Size

165

170

175

180

185

190

195

200

205

210

64 128 256 512 1024

Per-Processor Chunk Size (KB)

Re

sult

La

ten

cy (

ms)

2xBuffer

1xBuffer

Page 14: 12/9/04 1 Virtual Prototyping of Advanced Space System Architectures based on RapidIO Sponsor: Honeywell DSES Space, Clearwater, FL Principal Investigator:

1412/9/04

Conclusions RapidIO is an emerging high-performance interconnect technology that

promises to be an important enabling-technology for future generations of latency- or throughput-bound applications Strong industry support, but very little scholarly research on RapidIO to date

Powerful SBR/RapidIO simulation and modeling capabilities developed at UF GMTI models SAR models RapidIO “Construction Set” Altia-MLD interface library

GMTI is extremely network-intensive Various partitioning options, with straightforward partitioning producing lowest latency Staggered partitioning has highest latency but uses network most efficiently

Network requirements of SAR easily handled by networks designed for GMTI Biggest challenge of SAR is memory requirements

Could benefit from further in-depth study on processor/memory architecture issues RIO Logical I/O layer found to benefit SAR CPI completion latency via responseless write operations

Well-suited to distributed-memory approach Double-buffering may improve performance of SAR application

Increases local memory requirements by factor of 2 to 3 Cut-through routing significantly benefits neither GMTI nor SAR in studies thus far

Experimental testbed for RIO system research is under development First step will be 2-node RIO testbed: Xilinx RIO cores and two small FPGA boards recently acquired Looking into options to build larger and more powerful testbed (comments welcome)

Page 15: 12/9/04 1 Virtual Prototyping of Advanced Space System Architectures based on RapidIO Sponsor: Honeywell DSES Space, Clearwater, FL Principal Investigator:

1512/9/04

Project Publications in 2004

RIO Systems Paper at HPEC’04 D. Bueno, C. Conger, A. Leko, I. Troxel and A. George, “Virtual

Prototyping and Performance Analysis of RapidIO-based System Architectures for Space-Based Radar,” Proc. of High-Performance Embedded Computing (HPEC) Workshop, MIT Lincoln Lab, Lexington, MA, Sep. 28-30, 2004.

RIO Networking Paper at LCN’04 D. Bueno, A. Leko, C. Conger, I. Troxel, and A. George,

“Simulative Analysis of the RapidIO Embedded Interconnect Architecture for Real-Time, Network-Intensive Applications,” Proc. of 29th IEEE Conference on Local Computer Networks (LCN) via the IEEE Workshop on High-Speed Local Networks (HSLN), Tampa, FL, Nov. 16-18, 2004.

Page 16: 12/9/04 1 Virtual Prototyping of Advanced Space System Architectures based on RapidIO Sponsor: Honeywell DSES Space, Clearwater, FL Principal Investigator:

1612/9/04

Future Work Potential follow-ons to current RapidIO work

Produce “Day in the life of SBR” simulation (with SAR and GMTI)

Develop GMTI-specific Altia GUI (the current Altia simulation is SAR-specific)

Develop “Day in the life of SBR” GUI Explore ways to increase the performance of SAR through

double-buffering and synchronization Examine the effects of per-node memory size on SAR and

GMTI traffic Development of RapidIO system testbed Couple RIO work with new research in reconfigurable

computing (RC) for advanced space systems Future projects are in planning stages