Customizable Domain-Specific Computing
Proposal for NSF “Expedition in Computing” Program

Point of Contact: Prof. Jason Cong, cong@cs.ucla.edu

Participating Universities: UCLA (lead), Rice, Ohio State, and UC Santa Barbara
(Complete list of PI/Co-PI available inside)

Focus: Power/Energy Efficient Computation
Current Solution: Parallelization

Parallelization

Source: Shekhar Borkar, Intel

Our Proposal: Beyond Parallelization – Customizable Domain-Specific Computing

Parallelization

Customization

Adapt the architecture to application

Source: Shekhar Borkar, Intel

Motivation and Vision

A few facts
• We have sufficient computing power for most applications
• Each user/enterprise needs high computing power for only a limited set of tasks in his/her application domain
• Application-specific integrated circuits (ASICs) can lead to 10,000x+ better power-performance efficiency, but are too expensive to design and manufacture

Our vision and approach
A general, customizable platform for the given domain(s)
• Can be customized to a wide range of applications in the domain with novel compilation and runtime systems
• Can be mass-produced with cost efficiency
• Can be programmed efficiently

Goal: A “supercomputer-in-a-box” with 100x performance/power improvement via customization for the intended domain(s)

Analogy: Advance of civilization via specialization/customization

Application Domains: Medical Image Processing & Hemodynamic Simulation

Medical imaging has transformed healthcare
An in vivo method for understanding disease development and patient condition
Estimated to be $100 billion/year
More powerful & efficient computation can help
• Fewer exposures using compressive sensing with a lower sampling frequency
• Better clinical assessment using improved registration and segmentation algorithms to provide quantitative measures of disease (e.g., cancer)

Hemodynamic simulation
Very useful for surgical procedures involving blood flow and vasculature

Both may take hours to days to construct
Clinical requirement: 1-2 min

[Figures: Intracranial aneurysm reconstruction with hemodynamics; magnetic resonance (MR) angiography of an aneurysm]

Application Domains: Medical Image Processing Pipeline

[Figure: medical image processing pipeline, with the algorithm shown for each stage: reconstruction (compressive sensing), denoising (total variational algorithm), registration (fluid registration), segmentation (level set methods), analysis (Navier-Stokes equations)]

Medical images exhibit sparsity, and can be sampled at a rate well below that required by classical Shannon-Nyquist theory
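For reference, the Navier-Stokes equations named for the analysis stage are commonly written in the incompressible form below. This is the standard textbook statement, not a transcription of the slide: v is the velocity field, p the pressure, and f(x, t) a body force, while the density ρ and kinematic viscosity ν are symbols assumed here.

```latex
\begin{aligned}
\frac{\partial \mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\,\mathbf{v}
  &= -\frac{1}{\rho}\,\nabla p + \nu\,\Delta\mathbf{v} + \mathbf{f}(\mathbf{x},t), \\
\nabla\cdot\mathbf{v} &= 0 .
\end{aligned}
```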

Application Domains: Medical Image Processing Pipeline

[Figure: medical image processing pipeline (reconstruction, denoising, registration, segmentation, analysis) and its algorithms (compressive sensing, total variational algorithm, fluid registration, level set methods, Navier-Stokes equations), annotated with their computation and communication characteristics]

Characteristics shown for the algorithms:
• Non-iterative, highly parallel, local & global communication; sparse linear algebra, structured grid, optimization methods
• Parallel, global communication; dense linear algebra, optimization methods
• Local communication; sparse linear algebra, n-body methods, graphical models
• Local communication; dense linear algebra, spectral methods, MapReduce
• Iterative, local or global communication; dense and sparse linear algebra, optimization methods

• These algorithms have diverse computation & communication patterns

• A single, homogeneous system cannot perform very well on all of these algorithms

• Need architecture customization and hardware-software co-optimization

• Include many common computation kernels (“motifs”)

• Applicable to other domains

Bi-harmonic registration (using the same algorithm on all platforms)

Platform              Speedup   Power
CPU (Xeon 2.0 GHz)    1x        ~100 W
GPU (Tesla C1060)     93x       ~150 W
FPGA (xc4vlx100)      11x       ~5 W
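The speedup and power figures above can be combined into a rough performance-per-watt comparison. The following is a minimal sketch, assuming the listed power values are representative averages and that efficiency can be approximated as speedup divided by power, normalized to the CPU; the dictionary and function names are illustrative only.

```python
# Speedup and approximate power (W) for bi-harmonic registration, from the slide.
PLATFORMS = {
    "CPU (Xeon 2.0 GHz)": (1.0, 100.0),
    "GPU (Tesla C1060)": (93.0, 150.0),
    "FPGA (xc4vlx100)": (11.0, 5.0),
}


def relative_efficiency(platforms, baseline="CPU (Xeon 2.0 GHz)"):
    """Return each platform's performance per watt relative to the baseline."""
    base_speedup, base_power = platforms[baseline]
    base_ppw = base_speedup / base_power
    return {
        name: (speedup / power) / base_ppw
        for name, (speedup, power) in platforms.items()
    }


if __name__ == "__main__":
    for name, eff in relative_efficiency(PLATFORMS).items():
        print(f"{name}: {eff:.0f}x performance per watt vs. CPU")
```

Under these assumptions the GPU comes out at about 62x and the FPGA at about 220x the CPU's performance per watt, even though the FPGA's raw speedup is the lower of the two.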

3D median filter: for each voxel, compute the median of the 3 x 3 x 3 neighboring voxels

Platform              Algorithm                    Speedup   Power
CPU (Xeon 2.0 GHz)    Quick select                 1x        ~100 W
GPU (Tesla C1060)     Median of medians            70x       ~140 W
FPGA (xc4vlx100)      Bit-by-bit majority voting   1200x     ~3 W
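The slide names bit-by-bit majority voting as the FPGA algorithm but does not spell it out. The sketch below shows the commonly described bit-serial scheme in software form: the median is built from the most significant bit down, and any value that disagrees with the majority at some bit is known to lie above or below the median and keeps voting that way. It assumes 8-bit unsigned voxels, leaves border voxels unchanged, and is not necessarily the authors' hardware design; all function names are hypothetical.

```python
def median_by_majority_voting(values, bit_width=8):
    """Median of an odd number of unsigned ints, decided bit by bit (MSB first)."""
    n = len(values)
    if n % 2 == 0:
        raise ValueError("expects an odd number of values")
    # None = still a candidate; 0 / 1 = known to be below / above the median.
    frozen = [None] * n
    median = 0
    for bit in range(bit_width - 1, -1, -1):
        # Candidates vote with their own bit; frozen values repeat their fixed vote.
        votes = [
            ((v >> bit) & 1) if f is None else f
            for v, f in zip(values, frozen)
        ]
        majority = 1 if 2 * sum(votes) > n else 0
        median |= majority << bit
        # A candidate that disagrees with the majority cannot be the median.
        for i, vote in enumerate(votes):
            if frozen[i] is None and vote != majority:
                frozen[i] = vote
    return median


def median_filter_3d(volume, bit_width=8):
    """Apply a 3 x 3 x 3 median to interior voxels of volume[z][y][x]."""
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    out = [[row[:] for row in plane] for plane in volume]  # borders copied as-is
    for z in range(1, depth - 1):
        for y in range(1, height - 1):
            for x in range(1, width - 1):
                window = [
                    volume[z + dz][y + dy][x + dx]
                    for dz in (-1, 0, 1)
                    for dy in (-1, 0, 1)
                    for dx in (-1, 0, 1)
                ]
                out[z][y][x] = median_by_majority_voting(window, bit_width)
    return out
```

Each bit position needs only a population count over the 27 window values, which is one reason this formulation suits a hardware pipeline better than a comparison-based selection.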

Overview of the Proposed Research

[Figure: research overview, with the following labeled elements: domain-specific modeling (healthcare applications); domain characterization; application modeling; architecture modeling; Customizable Heterogeneous Platform (CHP) with reconfigurable RF-I bus, reconfigurable optical bus, transceiver/receiver, and optical interface; “Design once, invoke many times”]

CHP Creation – Design Space Exploration

Key questions: Optimal trade-off of efficiency & customizability. Which options to fix at CHP creation? Which to be set by the CHP mapper?

Custom instructions & accelerators
• Amount of programmable fabric
• Shared vs. private accelerators
• Custom instruction selection
• Choice of accelerators
• …

Core parameters
• Frequency & voltage
• Datapath bit width
• Instruction window size
• Issue width
• Cache size & configuration
• Register file organization
• # of thread contexts
• …

NoC parameters
• Interconnect topology
• # of virtual channels
• Routing policy
• Link bandwidth
• Router pipeline depth
• Number of RF-I enabled routers
• RF-I channel and bandwidth allocation
• …
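One way to picture the split between creation-time and mapping-time options is as two parameter spaces that an explorer enumerates. This is a minimal sketch only: the slide lists parameter names, so every option value below, and the choice of which parameters are fixed at creation versus set by the mapper, is a hypothetical placeholder.

```python
from itertools import product

# Hypothetical values; which parameters belong to which group is an assumption.
CREATION_TIME = {
    "issue_width": [2, 4],
    "noc_topology": ["mesh", "ring"],
    "num_rfi_routers": [4, 16],
}
MAPPING_TIME = {
    "frequency_ghz": [1.0, 2.0],
    "rfi_channels": [2, 8],
}


def enumerate_configs(space):
    """Yield every combination of option values as a dict."""
    names = list(space)
    for combo in product(*(space[name] for name in names)):
        yield dict(zip(names, combo))


def design_space_size(creation, mapping):
    """Count creation-time designs, mapper settings, and their pairs."""
    n_creation = sum(1 for _ in enumerate_configs(creation))
    n_mapping = sum(1 for _ in enumerate_configs(mapping))
    return n_creation, n_mapping, n_creation * n_mapping
```

Even this toy space has 8 creation-time designs with 4 mapper settings each, which illustrates why the space grows quickly as more of the listed parameters are added.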

[Figure: Customizable Heterogeneous Platform (CHP) composed of caches ($), fixed cores, custom cores, and programmable fabric blocks, with a reconfigurable RF-I bus, reconfigurable optical bus, transceiver/receiver, and optical interface]

CHP Mapping – Compilation and Runtime Software Systems for Customization

Goal: Efficient compiler and runtime support to map domain-specific specifications to customizable hardware

Adapt the CHP to a given application for drastic performance/power efficiency improvement

[Figure: CHP mapping flow, with the following labeled elements: domain-specific applications; programmer; abstract execution; domain-specific programming model (domain-specific coordination graph and domain-specific language extensions); application characteristics; CHP architecture models; source-to-source CHP mapper; C/C++ code; analysis annotations; C/SystemC behavioral spec; C/C++ front-end; reconfiguring and optimizing back-end; RTL synthesizer (xPilot); binary code for fixed & customized cores; RTL for programmable fabric; customized target code; performance feedback; adaptive runtime (lightweight threads and adaptive configuration)]

Center for Domain-Specific Computing (CDSC) Organization

                           UCLA                        Rice               UCSB    Ohio State
Domain-specific modeling   Bui, Reinman, Potkonjak     Sarkar, Baraniuk   -       Sadayappan
CHP creation               Chang, Cong, Reinman        -                  Cheng   -
CHP mapping                Cong, Palsberg, Potkonjak   Sarkar             Cheng   Sadayappan
Application modeling       Aberle, Bui, Vese           Baraniuk           -       -
Experimental systems       All (led by Cong & Bui)     All                All     All

Team: Aberle, Baraniuk, Bui, Chang, Cheng, Cong (Director), Palsberg, Potkonjak, Reinman, Sadayappan, Sarkar (Associate Dir), Vese

A diversified & highly accomplished team: 8 in CS&E; 1 in EE; 2 in medical school; 1 in applied math

Milestones (Year 1 through Year 5; items in each track are listed in year order)

Application modeling
• Form benchmark sets in medical imaging and hemodynamics and establish baseline results
• Demonstration of benchmark sets on Prototype 1a
• Model the benchmark sets on DSCG & DSLE and drive the CHP optimizations
• Demonstration of benchmark sets on optimized CHP runtime environment
• Evaluation of benchmarks on final CHP and quantification of the impact on real-world clinical data

Domain-specific specification
• Develop Domain Specific Coordination Graph (DSCG) with abstract metrics
• Implementation of DSCG+DSLE executable models for benchmark sets; identification of abstract execution metrics to guide CHP exploration
• Refinement of DSCG+DSLE executable models for benchmark sets
• Public release of DSCG infrastructure and the DSCG+DSLE executable models for benchmark sets

CHP creation
• CHP hierarchical simulation infrastructure
• CHP initial design-space tuning; domain-specific component synthesis & selection
• Refinement of CHP design-space exploration with detailed simulation
• CHP design-space exploration with full system simulation
• System integration

CHP mapping
• Source-to-source CHP mapper for Prototype 1a; fine-grained task scheduling system with locality and load balance adaptations
• Design of software reliability components
• Reconfiguring and optimizing back-end transformations; phase-based adaptations in adaptive runtime
• Support of software reliability
• Demonstration of the full CHP mapping system on Prototypes 1a & 2

Experimental systems
• Initial CHP prototype with COTS components (Prototype 1a)
• Prototype RF-I chip (Prototype 1b) with traffic generators and multicast
• CHP testbed (Prototype 2) prototyping on FPGAs
• CHP testbed tapeout (Prototype 2)
• Full system integration and demonstration

Milestones for Experimental Platforms

Prototype 1a: Heterogeneous integration of off-the-shelf CMPs + GPUs + FPGAs, e.g.,
• Intel Xeon CPU + Xilinx V5 FPGA (via FSB) + Nvidia Tesla GPU (via PCI-express 2.0)
• Initial HW platform for CHP compilation and runtime system development

Prototype 1b: RF-interconnect prototype
• RF-I implementation at 45nm CMOS with multiple digital cores/traffic generators
• Performance, power, and reliability study

Prototype 2: Final CHP implementation for the proposed healthcare domains
• Single-chip integration or 3D integration

[Figure: RF-I tape-out at IBM 90nm CMOS]

Integrated Research and Education

New courses planned based on the research
• “Architecture and Compilation for Domain-specific Computing”
• “Computational Techniques for Medical Imaging”
• “Programming Models and Application Development for Domain-specific Computing”
  • With projects for new domains, e.g., scientific computing, VLSI CAD, and digital entertainment
• May be jointly taught (multi-disciplinary)
• Developed and shared via Connexions (cnx.org), an open-access education platform now with over 1M users/month (based at Rice)

Graduate student training
• Estimated around 18 students in total across four campuses
• Seminars and workshops on interdisciplinary research, career development, ethics, entrepreneurship …

Undergraduate student training
• 10 summer research fellowships each year, via UCLA FOCUS, Rice AGEP, and similar programs

Outreach to high-school students
• 5-7 high-school summer scholarships each year, via the UCLA SMARTS program

Outreach Partner: Frontier Opportunities in Computing for Underrepresented Students (FOCUS)

Aims to increase the number of underrepresented minorities interested in computing disciplines

Currently has 50 underrepresented undergraduates:
• 23 in CS
• 27 in CSE

http://ceed.ucla.edu

2007 summer research poster competition

The first prize winner

Outreach Partner: Science Mathematics Achievement and Research Technology for Students (SMARTS)

A six-week summer college preparation program at UCLA
• Engages underrepresented students in science, technology, engineering, and math training

SMARTS activities
• Course-related activities
  • Math courses (Intro to Statistics and AP Calculus Readiness)
  • SAT preparation
• Research activities

CDSC faculty and graduate students will be involved to serve as mentors and provide projects

This year, the SMARTS program has over 80 applicants
• 30-35 will be admitted (due to limited funding)

Knowledge Transfer

Main outcomes of the project
1. CHP prototypes
2. Compilation and runtime system for CHP mapping
3. Application drivers – original source code & modified code with domain-specific modeling
4. General methodology for customizable computing (mainly through publications)
Items 1-3 will be shared with the research community via the web as they become available

Industrial partners
• Altera, IBM, Intel, Magma, Mentor Graphics, Nvidia, Xilinx
• More will be contacted and included if the project is officially funded

Campus partners
• UCLA Institute of Digital Research and Education (IDRE)
• Institute of Pure and Applied Mathematics (IPAM)
• UCLA Wireless Health Institute (WHI)

Technology transfer experience
• Impact via industrial partners: IBM, Intel, Xilinx …
• Startups: Aplus (acquired by Magma in 2003), AutoESL (Magma and Xilinx were investors)

Why an Expedition

Address a fundamental problem – energy efficient computing
• What’s beyond parallelization?
• Our proposal – a transformative approach using customization

Many challenging research topics
• Domain-specific modeling/specification
• Novel architecture & microarchitecture for customization
• Compilation and runtime software to support intelligent customization
• New research in testing, verification, reliability, etc. in customizable computing

Integrated effort in modeling, HW, SW, & application development

Demonstration in a critical application domain
• Healthcare has a significant impact on the economy and society
• Can greatly benefit from customizable domain-specific computing