Upload
lily-stafford
View
217
Download
0
Embed Size (px)
Citation preview
High Performance Reconfigurable ComputingHigh Performance Reconfigurable Computing
Ann Gordon-Ross, Ph.D.Ann Gordon-Ross, Ph.D.NSF CHREC CenterNSF CHREC Center
Assistant Professor of ECE, University of FloridaAssistant Professor of ECE, University of Florida
(on behalf of faculty/staff of CHREC at UF, GWU, BYU, and VT)(on behalf of faculty/staff of CHREC at UF, GWU, BYU, and VT)
August 1, 2008August 1, 2008
2
What is CHREC?What is CHREC? NSF Center for High-Performance Reconfigurable Computing
Unique US national research center in this field, established Jan’07 Leading research groups in RC/HPC/HPEC @ four major universities
University of Florida (lead) George Washington University Brigham Young University Virginia Tech
Under auspices of I/UCRC Program at NSF Industry/University Cooperative Research CenterIndustry/University Cooperative Research Center
CHREC is supported by CISE & Engineering Directorates @ NSF CHREC is both a National Center and a Research Consortium
University groups serve as research base (faculty, students, staff) Industry & government organizations are research partners, sponsors,
collaborators, advisory board, & technology-transfer recipients
founding sites (2007-)
expansion sites (2008-)
3
Objectives for CHRECObjectives for CHREC Serve as foremost national research center in this field
Basis for long-term partnership and collaboration amongst industry, academe, and government; a research consortium
RC: from supercomputers to high-speed embedded systems
Directly support research needs of our Center members Highly cost-effective manner with pooled, leveraged resources and
maximized synergy
Enhance educational experience for a large set of high-quality graduate and undergraduate students Ideal recruits after graduation for Center members, many US
Advance knowledge and technologies in this field Commercial relevance ensured with rapid technology transfer
4
Research Interaction
Basic Applied/Development
University Industry
I/U Centers
NSF Model for I/UCRC NSF Model for I/UCRC CentersCenters
5
CHREC MembersCHREC Members1. AFRL Munitions Directorate 2. AFRL Space Vehicles Directorate3. Altera 4. AMD5. Arctic Region Supercomputing Center 6. Boeing7. Cadence 8. GE Aviation Systems9. Gedae10. Harris Corp. 11. Hewlett-Packard 12. Honeywell13. IBM Research 14. Intel 15. L-3 Communications16. Lockheed Martin MFC17. Lockheed Martin SSC18. Los Alamos National Laboratory19. Luna Innovations20. NASA Goddard Space Flight Center21. NASA Langley Research Center 22. NASA Marshall Space Flight Center23. National Instruments 24. National Reconnaissance Office25. National Security Agency26. Network Appliance27. Office of Naval Research 28. Raytheon 29. Rincon Research Corp. 30. Rockwell Collins 31. Sandia National Laboratory NM
>30 members with >40 memberships in
2008
>30 members with >40 memberships in
2008
BLUE = founding member since 2007
ORANGE = new member in 2008
* Underline = member supporting multiple CHREC memberships & students in 2008.
6
Arctic Region Supercomputing
Center
NASA Marshall
Altera
Intel
Hewlett-Packard
AFRL Munitions Dir.
Raytheon
Rockwell Collins
L-3 Communications
NASA Goddard
NSA
NRO
Rincon Research
Corp.
ONR
Luna Innovations
NASA Langley
Harris
Network Appliance
GE Aviation Systems
Boeing
Los Alamos National Lab
Sandia National Lab
Honeywell
National Instruments
IBM Research
Cadence
Lockheed Martin MFC
Gedae
CHREC (BYU)
CHREC (UF)
CHREC (GWU)
CHREC (VT)
Lockheed Martin SSC
AMD
AFRL Space Vehicles Dir.
7
University of Florida (lead) Dr. Alan D. George, Professor of ECE – Center Director Dr. Herman Lam, Associate Professor of ECE Dr. K. Clint Slatton, Assistant Professor of ECE and CCE Dr. Greg Stitt, Assistant Professor of ECE Dr. Ann Gordon-Ross, Assistant Professor of ECE Dr. Saumil Merchant, Post-doc Research Scientist
George Washington University Dr. Tarek El-Ghazawi, Professor of ECE – GWU Site Director Dr. Ivan Gonzalez, Research Scientist / Visiting Faculty in ECE Dr. Vickram Krishnanarayana, Dr. Proshanta Saha, and Dr. Harald
Simmler, Post-doc Research Scientists Brigham Young University
Dr. Brent E. Nelson, Professor of ECE – BYU Site Director Dr. Michael J. Wirthlin, Associate Professor of ECE Dr. Michael Rice, Professor of ECE Dr. Brad L. Hutchings, Professor of ECE
Virginia Tech Dr. Shawn A. Bohner, Associate Professor of CS – VT Site Director Dr. Peter Athanas, Professor of ECE Dr. Wu-Chun Feng, Associate Professor of CS and ECE Dr. Francis K.H. Quek, Professor of CS
CHREC features a strong team of >40 graduate students spanning the four university sites.
CHREC features a strong team of >40 graduate students spanning the four university sites.
CHREC FacultyCHREC Faculty
8
Membership Fee StructureMembership Fee Structure NSF provides base funds for CHREC via I/UCRC grants
Base grant to each participating university site to defray admin costs Industry and govt. partners support CHREC through memberships
NOTE: Each membership is associated with ONE university Partners may hold multiple memberships (supporting multiple students) at
one or multiple sites (AFRL/MD, GSFC, Honeywell, LANL, MSFC, NRO, NSA, & SNL in ‘08)
Membership Fee: $35K per annum Why $35K unit? Base cost of graduate student for one year
Stipend, tuition, and related expenses (IDC is waived, otherwise >$50K) Fee represents tiny fraction of budget (1-2%) & benefits of Center
CHREC budget projected at ~$3M/yr in 2008 Equivalent to >$10M if Center founded in govt. or industry
Each university invests in various costs of CHREC operations 25% matching of industry membership contributions (equip.) Indirect Costs waived on membership fees (~1.5× multiplier) Matching on administrative personnel costs
More bangfor your buck!
9
Center Membership Center Membership BenefitsBenefits Research and collaboration
Selection of project topics that membership resources support Direct influence over cutting-edge research of prime interest Review of results on semiannual formal basis & continual informal basis Rapid transfer of results and IP from projects @ ALL sites of CHREC
Leveraging and synergy Highly leveraged and synergistic pool of funding resources Cost-effective R&D in today’s budget-tight environment, ideal for ROI
Multi-member collaboration Many benefits between members e.g. new industrial partnerships & teaming opportunities
Personnel Access to strong cadre of faculty, students, post-docs
Recruitment Strong pool of students with experience on industry & govt. R&D issues
Facilities Access to university research labs with world-class facilities
e.g. RC testbed equipment from Alpha Data, Altera, Celoxica, Cray, DRC, GiDEL, MathStar, Nallatech, SGI, SRC, XDI, Xilinx, et al. plus custom
10
CHREC & OpenFPGACHREC & OpenFPGA
CHREC
ProductionUtilization
OpenFPGACommunity
Research context and support
Technology innovations
Stan
dard
s an
d Va
lidat
ion
Supp
ort a
nd D
irect
ionEm
erging challenges
Preproduction prototypesDiagram c/o
Dr. Eric Stahlberg
11
Education & OutreachEducation & Outreach CHREC is enabling advancements at all its sites
New & updated courses Degree curricula enhancements Student internship connections Visiting scholars
Example: new RC courses @ Florida site New undergraduate (EEL4930) & graduate (EEL5934)
dual-listed courses in RC began in Fall Term 2007 Lectures, lab experiments, research projects
Fundamental topics Special topics from research in CHREC
Supported by new RC teaching cluster Sponsored in part by $25K education grant from Rockwell Collins Cluster of 21 RC servers each housing Nallatech PCI-X card with
Xilinx Virtex-4 LX100 user FPGA
12
Reformationsand RC
13
Architecture ReformationArchitecture Reformation End of wave (Moore’s Law) riding fclk + ILP (CPU)
Explicit parallelism & multicore the new wave Many promising technologies on new wave
Fixed & reconfigurable multicore device architectures Many R&D challenges lie on new wave
Tried & true methods no longer sufficient; complexity abounds Semantic gap widening between applications & systems
e.g. App developers must now understand & exploit parallelism Inherent traits of fixed device architectures
App-specific: inflexible, expensive (e.g. ASIC) App-generic: power, cooling, & speed challenges (e.g. Opteron) Many niches between extremes (Cell, DSP, GPU, NP, etc.)
Reconfigurable architectures promise best of both worlds Speed, flexibility, low-power, adaptability, economy of scale, size Bridging embedded & general-purpose computing, superset of fixed
14
What is a Reconfigurable What is a Reconfigurable Computer?Computer? System capable of changing hardware structure to address application demands Static or dynamic reconfiguration Reconfigurable computing, configurable computing,
custom computing, adaptive computing, etc. Often a mix of conventional fixed & reconfigurable
devices (e.g. control-flow CPUs, data-flow FPLDs)
Enabling technology? Field-programmable multicore devices FPGA is “King” (but space is broadening)
Applications? Vast range – computing and embedded worlds Faster, smaller, less power & heat, adaptable &
versatile, selectable precision, high comp. density
Performance
Flexibility
General-PurposeProcessors
ASICs
Special-Purpose Processors
(e.g. DSPs, NPs)
ReconfigurableComputing
(e.g. FPGAs)
FPGAECA
FPCAFPOAMPPATILEXPPet al.
FPGAECA
FPCAFPOAMPPATILEXPPet al.
15
Opportunities for RC?Opportunities for RC?
From Satellites to Supercomputers!
From Satellites to Supercomputers!10-100x speedups with 10-100x speedups with 2-10x energy savings 2-10x energy savings
not uncommonnot uncommon
16
When and Where to Apply When and Where to Apply RC?RC? When do we need?
When performance & versatility are critical Hardware gates targeted to application-specific requirements System mission or applications change over time
When the environment is restrictive Limited power, weight, area, volume, etc. Limited communications bandwidth for work offload
When autonomy and adaptivity are paramount Where do we need?
In conventional servers, clusters, and supercomputers (HPC) Field-programmable hardware fits many demands High DoP, finer grain, direct data-flow mapping, bit manipulation,
selectable precision, direct control over H/W (e.g. perf. vs. power) In space, air, sea, undersea, and ground systems (HPEC)
Embedded & deployable systems can reap many advantages w/ RC
Performance
Power
17
Multicore/Many-Core Multicore/Many-Core TaxonomyTaxonomy
17
Devices with segregated RA & FA resources; can use either in stand-alone mode
Devices with segregated RA & FA resources; can use either in stand-alone mode
Spectrum of Granularity In Each ClassSpectrum of Granularity In Each Class
MCRiding the new Riding the new MC wave of MC wave of Moore’s LawMoore’s Law
18
Reconfigurability FactorsReconfigurability Factors
18
Register
++
xx
8x8 Multiply
(Processing Element)
64-bit Multiply
(Processing Element)
DDR2 SDRAM
RC Device
DDR2 Memory Controller
Datapath Device Memory PE/Block Precision
Interface Mode Power Interconnect
PE
PE
PE
PE
PE
PE
PE
PE
PE
PE PE
MEM MEM
PE
PE1Prg-A
PE3Prg-C
PE4Prg-D
Register Register
Register Register
Register
xx
Register
Register
8x8 MAC
(Processing Element)
24-bit Multiply
(Processing Element)
RLDRAM Memory Controller
RLDRAM SDRAM
64 KB X 32 64 KB X 64
PE1Prg-A
PE2Prg-BPE2
Prg-A
PE3Prg-A
PE4Prg-A
PE
PE
PE
Performance
Power
PE PE
MEM MEM
19
Future ConvergenceFuture Convergence Rising development costs & other
factors drive convergence As seen in many other technologies
Device architecture convergence? Many-core driven by densities Heterogeneous?
Cell as initial example Intel and AMD both cite heterogeneous
MC in their future To extent complexity is manageable
Reconfigurable Performance + versatility Adaptive for many apps, missions Avoid limitations of fixed architectures Manage issues of heterogeneity
Source: ASIC Design in the Silicon Sandbox: A Complete Guide to Building Mixed-Signal Integrated Circuits. © 2006, the McGraw-Hill Companies.
20
Application ReformationApplication Reformation Dawn of reformation in application development methods
Driven by architecture reformation; complexity management Holistic concepts, methods, & tools must emerge
Semantic gap widening between apps & archs MC world (fixed or RC), explicit parallelism
Architectures increasingly complex to target by apps New to fixed MC world, familiar to RC/FPGA & HPC worlds
Optimizing compiler ≠ parallelizing compiler Domain scientist involved in comp. structure of their app
How do we bridge semantic gap? Focus upon computational fundamentals
Formal models, complexity management via abstraction, encapsulation Learn lessons from other engineering fields
e.g. aerospace engineers do not flight-test first, why must we? Build basis for an RC engineering discipline
Leverage where practical for fixed MC world
Elephant in Living RoomElephant in Living Room
21
Reformation in App Reformation in App DevelopmentDevelopment FDTE as formal model
I. Formulation Strategic design playground,
abstraction, prediction
II. Design Tactics, coding, details
III. Translation Conversion to executable form
IV. Execution Services, debug, optimization
Applies throughout computing We focus on RC, which involves
hardware & software design
I. Formulation
(a) Algorithm design exploration
(b) Architecture design exploration
(c) Performance prediction (speed, area, etc.)
II. Design
(a) Linguistic design semantics and syntax
(b) Graphical design semantics and syntax
(c) Hardware/ software codesign
III. Translation
(a) Compilation
(b) Libraries and linkage
(c) Technology mapping (synthesis, place & route)
IV. Execution
(a) Test, debug, and verification
(b) Performance analysis and optimization
(c) Run-time services
Spectrum of application development phases
22
ResearchActivities
23
2007 CHREC Projects 2007 CHREC Projects (Florida Site)(Florida Site)F1-07F1-07: Simulative Performance Prediction Before you invest major $$$ in new systems, software design,
& hardware design, better to first predict potential benefits
F2-07F2-07: Performance Analysis & Profiling Without new concepts and powerful tools to locate and resolve
performance bottlenecks, max. speedup is extremely elusive
F3-07F3-07: Application Case Studies RC for HPC or HPEC is relatively new & immature; need to
build/share new knowledge with apps & tools from case studies
F4-07F4-07: Partial Run-Time Reconfiguration Many potential advantages to be gained in performance,
adaptability, power, safety, fault tolerance, security, etc.
F5-07F5-07: FPLD Device Architectures & Tradeoffs How to understand and quantify performance, power, et al.
advantages of FPLDs vs. competing processing technologies
Performance Prediction
Performance Analysis
Application Case Studies & HLLs
Systems Architecture
Device Architecture
F1
F2
F3
F4
F5
Perfo
rman
ce, A
dapt
abilit
y, F
ault
Tole
ranc
e, S
cala
bility
, Pow
er, D
ensit
y
24
2007 CHREC Projects (GWU 2007 CHREC Projects (GWU Site)Site)G1-07-07: SW/HW Partitioning & Co-Scheduling
Algorithms and tools for profiling, partitioning, co-scheduling, and targeting of RC Systems
G4-07-07: High-Level Languages Productivity
Insight to understand underlying differences among available tools, guide programmer in choosing correct language, impact future HLL development
G5-07-07: Library Portability and Acceleration Cores
Framework for portable and reusable library of hardware cores populated with key modules for RC
Initial focus upon case studies in computational biology & medical imaging
25
2008 CHREC Projects2008 CHREC Projects University of Florida Site
F1-08: System-Level Formulation for Alg/Arch Exp Abstraction layer exploring complex alg. & arch. formulations
F2-08: Application Performance Analysis Extending run-time performance analysis to HLL-based RC apps
F3-08: Case Studies in Multi-FPGA Application Design New insight in multi-device apps, extend RAT prediction models
F4-08: Reconfigurable Fault Tolerance & Partial Reconfig. System-level FT, exploiting RTR and PR for dynamic response to rad hazards
F5-08: Device Characterization & Design Space Exploration Extending studies to broader range of RC devices (FPCA, ECA, TILE, etc.)
George Washington University Site G5-08: Library Portability for HLL Acceleration Cores
Exploring and defining portable interface framework (PFIF) G6-08: Intelligent Deployment of IP Cores
Identify HW tasks, deploy intelligently (grouping, IP interconnect) G7-08: Partial Run-Time Reconfiguration for HPRC
Explore PR for HPC apps to reduce RTR delay, HW virtualization
CPU 3CPU 2CPU 1CPU 0
904 MB/s88%
10 MB/s1%
812 MB/s79 %
914 MB/s89%
1.79GB /s72 %
6MB/s0%
2.50 GB/s100 %
0MB /s0%
0MB/s0%
FPGA 0 FPGA 1
2.76GB /s69%
CPU 4 CPU 5
1.98 GB /s99%
Network
210 MB/s10%
FPGA 2
Throughput (MB/s )
Time (sec )
IDLE75%
PHASE 19%
PHASE 216 %
Potential Bottlenecks CPU Interconnect
121 MB/s12%
691 MB/s67 %
26
2008 CHREC Projects2008 CHREC Projects Brigham Young University Site
B1-08: Core Library Framework for HPC/HPEC Framework for encapsulating details of reusable circuit cores
B2-08: Heterogeneous Architectures for HPEC RC Device characterizations with RC/Fixed hybrids (FPGA, Cell, GPU)
B3-08: High-Reliability RC Design Tools and Techniques Device-level FT, auto. insertion of SEU mitigation, SEU estimation & detection
B4-08: Reliable RC DSP/Comm Systems Application-specific techniques for DSP/communications system design
Virginia Tech Site V1-08: Model-Based Engineering Framework for HPRC Applications
Adapt model-based design methods for RC, feature SDR for case studies V2-08: Process-to-Core Mapping for Advanced Architectures
Explore process-to-core mappings for hybrid multicore and RC architectures
Library Standard
Coregen JHDL Vendor1 OpenFPGA
…Libraries
ToolsHLL (Matlab/Fortran) HLL (C/C++/ SysC)
32 data3 control 32 data3 control
Processor configuration
FPGAA
FPGAB
FPGAC
SRAM SRAM SRAM
32 data3 control 32 data3 control
Processor configuration
FPGAA
FPGAB
FPGAC
SRAM SRAM SRAM
PlatformIndependent
Models
ComputationIndependent
Models
Platform SpecificModels
29
Conclusions
30
ConclusionsConclusions RC making inroads in ever-broadening areas
HPC and HPEC; from satellites to supercomputers!
As with any new field, early adopters are brave at heart Face challenges with design methods, tools, apps, systems, etc. Fragmented technologies with gaps and proprietary limitations
Research & technology challenges abound Application FDTE, device/system arch., FT, RTR, PR, etc. CHREC sites and partners leading key R&D projects
Industry/university collaboration is critical to meet challenges Incremental, evolutionary advances will not lead to ultimate success Researchers must take more risks, explore & solve tough problems Industry & government as partners, catalysts, tech-transfer recipients
Formulation
Design
Translation
Execution
31
AcknowledgementsAcknowledgementsWe express our gratitude for support of CHREC by National Science Foundation
Program managers & assistants, center evaluator, panel reviewers CHREC Industry and Government Partners
>30 members holding >40 memberships in 2008 University administrations @ CHREC sites
University of Florida George Washington University Brigham Young University Virginia Tech
Equipment and tools vendors providing support Aldec, Altera, Celoxica, Cray, DRC, Gedae, GiDEL, Impulse, Intel,
Mellanox, Nallatech, SGI, SRC, Synplicity, Voltaire, XDI, Xilinx